PCR-Primers

Objective

This page describes primer design for different PCR applications. Primer3 is used for primer calculations.

The PrimerDesign Script

If you have no special requirements, run PrimerDesign.pl with only the required -i argument:

PrimerDesign.pl -i target.fasta 

The script has many options. To view them, run the script with no arguments, or with -h for help:

PrimerDesign.pl -h
------------------------------------------------------------
PrimerDesign.pl
Designs general purpose PCR primers for a single sequence.
Usage: PrimerDesign.pl -i input <options>
Required arguments:
        -i input:       a FASTA file containing a single sequence to be amplified.
Options:
        -l (length):    ideal amplicon length (bp)
        -s (position):  beginning of region to consider for primer design
        -e (position):  end of region to consider for primer design
        -p (length):    ideal primer length (bp)
        -g (percent):   minimum percent GC content
        -h (percent):   maximum percent GC content
        -n (number):    number of primers to design
        -v (percent):   maximum percent deviation from ideal amplicon size

------------------------------------------------------------
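The -l and -v options together imply a range of acceptable amplicon sizes. The Python sketch below shows that arithmetic under the assumption that -v is applied symmetrically around -l; the usage message does not state this explicitly, and the function name amplicon_size_range is hypothetical rather than part of PrimerDesign.pl.

```python
import math


def amplicon_size_range(ideal_length, max_percent_deviation):
    """Return (min_bp, max_bp) for an ideal amplicon length and a percent deviation.

    Assumption: the deviation applies equally above and below the ideal length.
    """
    delta = ideal_length * max_percent_deviation / 100
    # Round inward so both bounds stay within the allowed deviation.
    return math.ceil(ideal_length - delta), math.floor(ideal_length + delta)


# Illustrative values only: -l 550 -v 30
print(amplicon_size_range(550, 30))  # (385, 715)
```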


Primers for Sanger sequencing

Sanger sequencing produces reads of ~700 bp, so amplicons should be designed within that range.

The first ~40 bp of a Sanger read are typically unusable, so it's a good idea to include a 40-50 bp buffer on one end of the amplicon and sequence from that end.

For example, to sequence a 500 bp region of the target from positions 200 to 700, with a 50 bp buffer upstream:

PrimerDesign.pl -i target.fasta -l 550 -s 150 -e 700 -v 30
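The coordinates in the command above follow from simple arithmetic, sketched here in Python. This is an illustration only and not part of PrimerDesign.pl; the function name sanger_design_args and the default 50 bp buffer are assumptions chosen to match the example.

```python
def sanger_design_args(region_start, region_end, buffer_bp=50):
    """Return -s, -e and -l values for a Sanger amplicon covering a target region.

    Assumption: the buffer is added upstream of the region, and the amplicon
    is sequenced from that end.
    """
    if region_start - buffer_bp < 1:
        raise ValueError("buffer extends past the start of the sequence")
    start = region_start - buffer_bp   # -s: region start minus the buffer
    end = region_end                   # -e: end of the region of interest
    amplicon_len = end - start         # -l: ideal amplicon length
    return {"s": start, "e": end, "l": amplicon_len}


args = sanger_design_args(200, 700)
print("PrimerDesign.pl -i target.fasta -l {l} -s {s} -e {e} -v 30".format(**args))
# PrimerDesign.pl -i target.fasta -l 550 -s 150 -e 700 -v 30
```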

Primers for qPCR

For optimal qPCR efficiency, amplicons should ideally be 100-200 bp in length.

For RT-qPCR, it's best to design amplicons near the 3' end of the transcript (especially if using oligo-dT primed cDNA).

If you're working with a de novo transcriptome assembly, your transcripts may represent sense or antisense strands. If you're uncertain, try this:

StrandCheck.pl -q target -d db -s minscore -w option

If the sequence matches a protein in your chosen DB, this will give feedback like:

The + strand of comp387481_c0_seq1 matched Q9U6Y7.
comp387481_c0_seq1 already represents the sense strand and doesn't require modification.

If your sequence has no match in the chosen DB, try another DB, or accept that it's not possible to know the answer and move on to primer design.

If your sequence is the sense strand, target the last 500 bp. If it's antisense, target the first 500 bp. If you're unsure of the sequence length, try:

FastaStats.pl target.fasta

With this information in hand, run PrimerDesign.pl as:

PrimerDesign.pl -i target.fasta -l 100 -s 898 -e 1398 -p 20

This command designs primers for 100 bp amplicons between positions 898 and 1398 of target.fasta (the last 500 bp of a 1398 bp sense-strand transcript), with an ideal primer length of 20 bp.
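The choice of -s and -e from sequence length and strand can be sketched in Python as follows. This is a minimal illustration, not a replacement for FastaStats.pl or StrandCheck.pl; the function names fasta_lengths and three_prime_window are hypothetical, and the 1398 bp length is inferred from the example command.

```python
def fasta_lengths(path):
    """Return {sequence_name: length_in_bp} for every record in a FASTA file."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                header = line[1:].split()
                name = header[0] if header else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def three_prime_window(seq_length, strand, window=500):
    """Return (start, end) of the region nearest the transcript's 3' end.

    strand is "+" if the sequence is the sense strand, "-" if antisense.
    """
    if strand == "+":
        # Sense strand: the 3' end of the transcript is the end of the sequence.
        return max(1, seq_length - window), seq_length
    if strand == "-":
        # Antisense: the 3' end of the transcript is the start of the sequence.
        return 1, min(window, seq_length)
    raise ValueError("strand must be '+' or '-'")


start, end = three_prime_window(1398, "+")
print(f"PrimerDesign.pl -i target.fasta -l 100 -s {start} -e {end} -p 20")
# PrimerDesign.pl -i target.fasta -l 100 -s 898 -e 1398 -p 20
```

For sequences shorter than the window, the sketch falls back to the whole sequence rather than returning a negative start position.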

Primers for high-throughput exon sequencing

These platforms offer shorter read lengths (100-250 bp paired-end reads, for 200-500 bp total), so you have to design one or more amplicons accordingly. These read lengths are reasonably well suited to exon sequencing.

To identify exons from a transcript in gene.fasta, find the genomic scaffold containing this gene using blastn:

blastn -db database -query gene.fasta -out report.br

View the BLAST report and, if the match shown appears to be your target gene, copy the name of that genomic scaffold into a text file (here called list).

Extract this scaffold from the genome assembly as:

GetSeq.pl list database scaffold.fasta

Then make a splice-aware alignment between the transcript and genomic scaffold using exonerate:

exonerate --model est2genome -s 100 --bestn 1 --showtargetgff TRUE gene.fasta scaffold.fasta > alignment.txt

You can then extract the list of exons as:

grep -P "\texon\t" alignment.txt

The output will look something like this:

adi_Scaffold2700        exonerate:est2genome    exon    31816   31971   .       -       .       insertions 0 ; deletions 0
adi_Scaffold2700        exonerate:est2genome    exon    31577   31694   .       -       .       insertions 0 ; deletions 0
adi_Scaffold2700        exonerate:est2genome    exon    30964   31074   .       -       .       insertions 0 ; deletions 0
adi_Scaffold2700        exonerate:est2genome    exon    29768   29830   .       -       .       insertions 0 ; deletions 0
adi_Scaffold2700        exonerate:est2genome    exon    27516   27633   .       -       .       insertions 0 ; deletions 0
adi_Scaffold2700        exonerate:est2genome    exon    26734   26861   .       -       .       insertions 0 ; deletions 0
adi_Scaffold2700        exonerate:est2genome    exon    25320   25594   .       -       .       insertions 0 ; deletions 0

where the 4th and 5th columns show the beginning and end of predicted exons.

Use this output to identify start and end positions for a series of amplicons, each including a single exon, with primers in the adjacent introns. To amplify a single one of these exons, try something like:

PrimerDesign.pl -i scaffold.fasta -l 165 -s 31796 -e 31991 -v 10
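The -s and -e values for each exon can be derived from the exon coordinates, as in this Python sketch. It is not part of the lab's scripts; the function name exon_design_regions is hypothetical, and the 20 bp flank on each side is an assumption inferred from the example command above. It assumes GFF-style lines like those printed by the grep command.

```python
def exon_design_regions(gff_lines, flank=20):
    """Return (scaffold, exon_start, exon_end, region_start, region_end) per exon.

    region_start and region_end extend the exon by `flank` bp on each side,
    so that primers can sit in the adjacent introns.
    """
    regions = []
    for line in gff_lines:
        fields = line.split()
        # Skip blank lines, comments, and any feature that is not an exon.
        if len(fields) < 5 or fields[0].startswith("#") or fields[2] != "exon":
            continue
        exon_start, exon_end = int(fields[3]), int(fields[4])
        region_start = max(1, exon_start - flank)
        # Note: region_end is not checked against the scaffold length here.
        region_end = exon_end + flank
        regions.append((fields[0], exon_start, exon_end, region_start, region_end))
    return regions


example = [
    "adi_Scaffold2700\texonerate:est2genome\texon\t31816\t31971\t.\t-\t.\tinsertions 0 ; deletions 0",
    "adi_Scaffold2700\texonerate:est2genome\texon\t31577\t31694\t.\t-\t.\tinsertions 0 ; deletions 0",
]
for scaffold, exon_start, exon_end, start, end in exon_design_regions(example):
    exon_len = exon_end - exon_start + 1
    print(f"{scaffold} exon {exon_start}-{exon_end} ({exon_len} bp): -s {start} -e {end}")
# adi_Scaffold2700 exon 31816-31971 (156 bp): -s 31796 -e 31991
# adi_Scaffold2700 exon 31577-31694 (118 bp): -s 31557 -e 31714
```

The sketch deliberately leaves -l and -v to the user, since the ideal amplicon length depends on the exon size and the read length of the platform.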

Formatting primers for ordering

To select primers from the list of designed primers (here all_primers.tab) and name them for ordering, run:

PickPrimers.pl -i all_primers.tab

This will produce something like:

Sequence1_F1    AACCTACATCATATGCG
Sequence1_R1    TTCATATTAGGGCACTT

If you'd like to give the sequences more convenient names for ordering, make a tab-delimited text file with the names, like this:

Sequence1       Name1
Sequence2       Name2
etc

Then pass this file to the script with -t:

PickPrimers.pl -i all_primers.tab -t names.tab

This will name the primers as:

Name1_F1    AACCTACATCATATGCG
Name1_R1    TTCATATTAGGGCACTT 

The choice of name is up to the user, but it wouldn't hurt to make these names informative. A good convention is to use a 3-4 letter species code followed by a 3-5 character gene symbol when possible, e.g. Pdae_Hsp70.
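The effect of the names file on primer names can be illustrated with a short Python sketch. This shows only the renaming described above and is not the implementation of PickPrimers.pl; the function name rename_primers is hypothetical, and it assumes primer names end in a suffix such as _F1 or _R1.

```python
def rename_primers(primer_rows, name_map):
    """Replace the sequence-ID prefix of each primer name using name_map.

    primer_rows: list of (primer_name, primer_sequence) tuples, where
    primer_name looks like "<SequenceID>_F1" or "<SequenceID>_R1".
    name_map: dict from sequence ID to preferred name (as in names.tab).
    """
    renamed = []
    for primer_name, primer_seq in primer_rows:
        # Split on the last underscore so IDs containing "_" stay intact.
        seq_id, sep, suffix = primer_name.rpartition("_")
        if not sep:
            # No suffix found: leave the name unchanged.
            renamed.append((primer_name, primer_seq))
            continue
        new_id = name_map.get(seq_id, seq_id)
        renamed.append((f"{new_id}_{suffix}", primer_seq))
    return renamed


primers = [
    ("Sequence1_F1", "AACCTACATCATATGCG"),
    ("Sequence1_R1", "TTCATATTAGGGCACTT"),
]
for name, seq in rename_primers(primers, {"Sequence1": "Pdae_Hsp70"}):
    print(f"{name}\t{seq}")
```

Sequence IDs without an entry in the mapping keep their original names in this sketch.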

History

Created 14:42 Feb 26, 2017 By: EliMeyer

Last updated 04:45 Jun 26, 2019 By: Admin