Gene Expression Measurement by Real-Time PCR
Exercise : Real-time Quantitative Polymerase Chain Reaction (Real-Time qPCR)
Polymerase Chain Reaction (PCR)
The polymerase chain reaction is an enzyme-based DNA amplification process that uses one of several thermostable DNA polymerases derived from thermophilic bacteria such as Thermus aquaticus (hence Taq polymerase). A range of thermostable polymerases is now available for this purpose, including some with proof-reading ability, which confers greater fidelity (accuracy), although these tend to have lower processivity. An increasing number of the enzymes used in Real-Time quantitative PCR have "FAST" characteristics and tolerate very short extension times.
The thermostable polymerases used in PCR are characterised by their DNA-dependent DNA polymerase activity, which requires a partially double-stranded molecule in which the template strand forms a 5' overhang. They elongate in a 5' to 3' direction, extending from the 3' end of the molecule that is usually formed by annealing a single-stranded primer to the template after the DNA in the reaction has been denatured.
The ability to undergo multiple rounds of denaturation, primer annealing and elongation, i.e. heating and cooling, is an essential characteristic conferred by the enzyme's high thermal stability. The primers used in standard PCR are usually chemically synthesised oligonucleotides between 18 and 30 bases in length, and are specific and complementary to the 3' ends of the forward and reverse strands of the double-stranded target template sequence to be amplified. The primers are therefore referred to as the forward and reverse primers. However, primers designed for standard PCR often perform sub-optimally and cannot usually be used in Real-Time quantitative PCR.
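The orientation of the forward and reverse primers described above can be sketched in a few lines of Python. This is only an illustration: the template sequence is invented, the function names are hypothetical, and the primers are simply taken from the two ends of the template with no attempt at proper design.

```python
# Illustrative only: the template below is an invented sequence.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def end_primers(template: str, length: int = 20) -> tuple[str, str]:
    """Take fixed-length primers flanking the whole template (sense strand, 5' to 3').

    The forward primer is identical to the first `length` bases of the sense
    strand, so it anneals to the 3' end of the complementary strand.
    The reverse primer is the reverse complement of the last `length` bases,
    so it anneals to the 3' end of the sense strand.
    """
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


template = "ATGGCTAGCTAGGATCCGATCGATCGGCTAAGCTTAGGCTAGCTAGGCTTAACG"
fwd, rev = end_primers(template, length=18)
print("forward 5'-" + fwd + "-3'")
print("reverse 5'-" + rev + "-3'")
```

Both primers are written 5' to 3', which is why the reverse primer has to be reverse-complemented rather than copied from the sense strand.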
Diagrammatic representation of three-step PCR cycling, appropriate to a three-step SYBR Green based assay
PCR has become one of the standard molecular biology tools for amplifying segments of DNA and manipulating them for cloning and many other applications. However, since the polymerases used are DNA-dependent enzymes, PCR will not efficiently use RNA as a template, and amplification of specific RNA sequences requires that the RNA is first converted to complementary DNA (cDNA) using reverse transcriptase (RT).
The effective and specific amplification of a DNA segment depends on the specificity of the primer sequences used. Primer design is therefore a critical step in the process: poor primer design leads to non-specific amplification of multiple DNA segments, amplification of the wrong segment, or failure to amplify anything at all.
The broad criteria used in the design process include:
- Both primers should be designed to have a specified, matched melting temperature (Tm)
- Primers should lack self-complementarity; if present, it may produce intramolecular secondary structure that has an adverse effect on the amplification
- Primers should lack primer-primer complementarity (3' end complementarity in particular should be avoided, because it results in the amplification of primer-dimer products in preference to the amplicon)
- Each primer binding site should be unique within the target molecule
- The combination of primer binding sites should be unique within the population of molecules being used, and the two sites should have opposite orientation to each other. Primers can be searched against database entries using BLAST or a similar program
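Some of the criteria above can be checked with very simple code. The sketch below is a rough illustration under stated assumptions: it estimates Tm with the Wallace rule (2 °C per A/T and 4 °C per G/C), which is only a crude approximation and not the detailed thermodynamic calculation that dedicated design software performs, and it flags 3' complementarity by an exact match of the last few bases. The function names, the example primers and the default thresholds are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Crude Tm estimate (Wallace rule): 2 per A/T plus 4 per G/C."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def three_prime_overlap(a: str, b: str, n: int = 4) -> bool:
    """True if the last n bases of `a` could base-pair with a stretch of `b`."""
    tail = a.upper()[-n:]
    return reverse_complement(tail) in b.upper()


def check_pair(fwd: str, rev: str, max_tm_diff: float = 2.0, n: int = 4) -> dict:
    """Summarise Tm matching and 3' complementarity for a primer pair."""
    tm_f, tm_r = wallace_tm(fwd), wallace_tm(rev)
    return {
        "tm_forward": tm_f,
        "tm_reverse": tm_r,
        "tm_matched": abs(tm_f - tm_r) <= max_tm_diff,
        "forward_self_3prime": three_prime_overlap(fwd, fwd, n),
        "reverse_self_3prime": three_prime_overlap(rev, rev, n),
        "primer_dimer_3prime": three_prime_overlap(fwd, rev, n)
        or three_prime_overlap(rev, fwd, n),
    }


# Invented example primers, both written 5' to 3'.
report = check_pair("AGCTGACCTGAAGTCATCGA", "TCGGTACCATTGCAGGTCAA")
for name, value in report.items():
    print(name, value)
```

Uniqueness of the binding sites cannot be judged from the primers alone, which is why the last two criteria are checked against sequence databases with BLAST.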
In practice, primer design is done using software. Many different packages are available, but one of the most commonly used non-commercial packages is Primer3. Primer3 can be accessed over the internet and is a very versatile package for designing oligonucleotides for various applications including PCR, real-time PCR and sequencing.
Real-Time PCR technical tutorial produced by Margaret Hunt at University of South Carolina: http://www.med.sc.edu:85/pcr/realtime-home.htm
Steve Rozen and Helen J. Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp 365-386
Real-time Quantitative Polymerase Chain Reaction (Real-Time qPCR)
One of the applications of PCR is to measure the level of expression of a given gene. However, the basic PCR technique does not have a quantitative output and cannot be used to determine template copy number. This is because PCR reactions rapidly become rate limited by the conditions within the reaction, so the end point does not reflect the initial concentration of template. We therefore use modifications of the technique that provide a measurable output during the exponential, non-rate-limited phase of the amplification. This can be done in "real time", hence the term "real-time PCR", but the technique is better referred to as real-time quantitative PCR (qPCR).
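A toy simulation can illustrate why the end point is uninformative while the exponential phase is not. The model below is an assumption made purely for illustration (per-cycle efficiency falls linearly as product approaches a fixed ceiling); it is not a description of real reaction kinetics, and the copy numbers, ceiling and detection level are invented.

```python
def simulate_pcr(initial_copies, cycles=40, efficiency=1.0, ceiling=1e12):
    """Return product copies after each cycle under a simple saturating model.

    Assumed model: copies grow by a factor (1 + efficiency) while product is
    scarce, and the gain shrinks to zero as copies approach `ceiling`.
    """
    copies = [float(initial_copies)]
    for _ in range(cycles):
        n = copies[-1]
        copies.append(n * (1.0 + efficiency * (1.0 - n / ceiling)))
    return copies


def first_cycle_above(copies, level):
    """First cycle at which product reaches `level`, or None if it never does."""
    for cycle, n in enumerate(copies):
        if n >= level:
            return cycle
    return None


low = simulate_pcr(1e3)    # 1,000 starting template copies
high = simulate_pcr(1e5)   # 100-fold more starting template

# End point: both reactions have plateaued at essentially the same level.
print("end-point ratio (high/low):", high[-1] / low[-1])

# Exponential phase: the reaction with more template crosses a fixed
# detection level several cycles earlier.
print("cycle reaching 1e9 copies, low :", first_cycle_above(low, 1e9))
print("cycle reaching 1e9 copies, high:", first_cycle_above(high, 1e9))
```

In this sketch the 100-fold difference in starting template disappears at the end point but shows up clearly as a difference in the cycle at which a fixed product level is reached.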
Exercise 1. Primer design for a SYBR Green real-time qPCR assay
For many applications, including diagnostics, the sensitivity of PCR has been further extended and made quantitative by the use of a fluorescent intercalating dye such as SYBR Green. This matters where sensitivity and specificity are important alongside a diagnostic threshold, and where the amount of target nucleic acid is also critical to the clinical response, as when measuring viral load in assays for viruses such as HIV.
On the other hand, one of the basic requirements for gene expression analysis is to be able to differentiate between genomic DNA and mRNA in a sample. For eukaryotes this is simply done by designing the primers to sit within two different exons, so that they straddle a large intron in the genomic sequence. The consequence is that amplification of the mRNA-derived sequence is massively favoured over the genomic sequence, from which the primers will not amplify a product to any significant extent because of the intervening intron. In organisms that do not have introns this is much more of a problem, and extreme care and appropriate controls are necessary.
Intercalating dyes such as SYBR Green are used in real-time qPCR to quantitatively detect accumulating product, usually in a traditional three-step PCR reaction.
Primer Design for HPV-6 PCR using Primer3plus web interface.
More than 100 different HPV types are known, each identified by a number. Some types are more clinically important than others. For example, types 6 and 11 cause around 90% of genital warts and are also associated with oral papillomas. There are two vaccines currently in use: Cervarix, which provides protection against two HPV types linked to cervical neoplasias, and Gardasil, which in addition provides protection against HPV 6 and 11. Cervarix is the current vaccine of choice adopted by the NHS in England and Wales. The HPV vaccination programme started in September 2008.
- For the first exercise we will design primers to amplify a target region of HPV 6 which sequence alignment has indicated provides some specificity for that virus type. The rationale behind this choice was introduced previously; all we need to know for the time being is that the region is located in the middle of the genome as represented below, covering a region before the end of E5a and after the start of E5b.
From the NCBI HPV Type 6 Genome entry (there is a link to the NCBI genome viewer here)
- Open the text file HPV6seq.txt by clicking on the hyperlink. Then return to this page without closing the browser tab/window.
- Open the primer3plus web interface. Note that the interface has a series of tabs (Main, General Settings, Advanced Settings .....); ignore these for the time being, as we will use the default values. Copy the sequence and paste it into the text box in the primer3plus interface. Notice that on the upper left there is a drop-down box whose default is "Detection". You do not need to change this, but take a look at the options. The page should look something like that below.
- Now run primer3 by clicking on "Pick Primers".
- The page will show the sequence with the first pair of primers highlighted, i.e. the forward primer and the complement of the reverse primer. The "best" primers and their statistics are given at the top of the page. Up to 5 primer pairs may be listed, with warnings/comments below. If no suitable primer pairs are found, none are listed.
- The results page is divided into sections relating to each pair of primers produced by the software. Take a look at the sequences produced: all of the primers are actually very similar. For the top pair, both primers have runs of contiguous identical nucleotides, and in particular the left primer contains a run of t's at its 3' end. This is undesirable. Can you guess why this might be so? Click the "back" button in the interface on the upper left.
- Now study the sequence. In the centre of the first line of sequence you will see a run of four t's. Highlight these and click on the button marked "<>" underneath the text box. This inserts angle brackets around the highlighted region; they can also be typed in at the appropriate places in the sequence. The angle brackets mark regions of the sequence to be excluded by the software for the purpose of primer design. Identify and mark any other runs of 4 or more identical contiguous nucleotides. The interface should now look something like that below.
- We are now also going to adjust some of the parameters the software uses, such as the product size range. Click on the "General Settings" tab. In the text box at the top labelled "Product Size Ranges", replace the content with "100-200".
- Now click "Pick Primers"
- You will notice that the "Any" value has fallen from 6.0 to 3.0 for the left primer, and that the primers contain fewer contiguous runs of the same nucleotide and, in particular, are more heterogeneous at their 3' ends. One of the problems here is that the sequence is so short that there is very little scope for shifting the primers within it, which restricts our options.
- The next problem is how to address the specificity of the primers. We have picked a region that is specific to HPV 6; however, that does not mean that there is no local homology elsewhere corresponding to the location of the primers, and therefore the possibility of mis-priming. The best way to address this is to identify whether the primers can prime other sequences by asking the following:
- Do they have high similarity, especially 3' complementarity, with other sequences?
- Do both primers have similarity to the same segment of DNA, on opposite strands and in close proximity, such that they could amplify a segment of DNA?
- If either of these is true, different primers need to be chosen from the list or designed.
- This can be tested by carrying out a sequence similarity search, essentially a BLAST search. Now go back to the Primer3 results and click on the "Send to Primer3Manager" button on the left of the page in the section for "pair 1".
- A new window will open showing the primers, something like that below. Choose the first primer and click on the hyperlink "BLAST" on the right hand side.
- This opens a new window containing the BLAST interface at the NCBI.
- The default database is "other"; this is essentially all sequence contained within GenBank.
- We will do a Megablast query, so click on the radio button towards the bottom of the page, "Highly similar sequences (megablast)". This search is optimised for finding near-identical matches, which is what matters when checking primers, and generally works better for this type of comparison.
- Next to the BLAST button check the box "show results in new window"
- Only NOW click the BLAST button
- The results page will eventually open. This page is divided into three parts:
- A graphical representation of the top 50 or so hits. Because there are so many HPV sequences in the database this is less useful in this case, but where there is less redundancy in the database it can be a very useful summary. Complementarity to the 3' end of the primer can be most easily seen from the graphic. The horizontal bars represent the hits aligned against the length of the primer (red bar with black numbering), which is numbered at the top of the graphic. A gap toward the right hand end therefore indicates a mismatch at the 3' end of the aligned sequences.
- The next section is a list of matching entries and associated scores for the individual alignments. Complementarity over the entire length is given by the column headed "Query coverage", not "Max ident" (which refers to the % identity over the complementary part of the alignment).
- The lower section has the actual alignments. Note each alignment states values for the score, expect, identities, gaps and STRAND. This latter information tells you if the match is against the sense or complementary (minus) strand in the database.
- Note which hits/sequences have 100% complementarity. You may need to scroll down to the alignments and check these in reverse order. For a pair of primers to amplify a product, one must be identical to the sense strand and the other identical to the minus strand, or at the very least each primer must show near identity and 3' complementarity. The images below are examples of primer design from the human CD40 gene, but illustrate the concepts surrounding primer specificity.
- Now go back to primer3Manager and repeat this for the reverse primer. Compare the matches having 100% complementarity for both primers and those that have 3' complementarity. If the primers are "good", the hits in common should only include HPV 6. Some of the sequences of HPV 6 will not have 3' complementarity because the region provided spans two open reading frames, E5a and E5b.
- Another consideration for a diagnostic is the possibility of host/patient nucleic acid sequences contaminating the sample. Matches to chimpanzee, pig, dog and other more exotic sequences are therefore of less relevance to the specificity of a diagnostic test, but matches to human sequences would be important because they could give false positives. If the primer pairs did not satisfactorily meet the criteria, one would return to identify new sequences and/or primers and repeat the process.
- Congratulations, you have designed primers which could be used to detect HPV 6 infection and thus differentiate between HPV 6 and other types of human papilloma virus.
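The step in this exercise where runs of four or more identical nucleotides are marked by hand can also be done programmatically before pasting the sequence into the interface. The following is a minimal sketch: the function name is hypothetical, the example sequence is invented (it is not the HPV 6 region), and it assumes the "<" and ">" exclusion notation behaves as described in the steps above.

```python
import re


def mark_homopolymer_runs(seq: str, min_run: int = 4) -> str:
    """Wrap every run of `min_run` or more identical bases in angle brackets."""
    # One base followed by at least (min_run - 1) repeats of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_run - 1), re.IGNORECASE)
    return pattern.sub(lambda m: "<" + m.group(0) + ">", seq)


# Invented example sequence, lower case as in many sequence text files.
sequence = "atgcattttgcatgccccctagcatcgaaatgc"
print(mark_homopolymer_runs(sequence))
```

Runs shorter than `min_run` (such as the "aaa" in the example) are left unmarked, matching the "4 or more" rule used in the exercise.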
There are a number of different probe-based qPCR technologies; in the practical we will look at only one of the common detection chemistries, "Taqman". Historically this was first commercialised by a company using the name Taqman because of the real-time instrument originally used in the detection. The technique uses a third oligonucleotide sequence, internal to the primer pair in the PCR. The detection system is the fluorescence released by the degradation of this "internal hybridisation" probe during the amplification process. In its undegraded state the probe has a fluorescent molecule attached to the 5' end which is excited by light of a specific wavelength but passes the energy through the molecule (termed Förster resonance energy transfer, or FRET) to another fluor at the 3' end; this energy transfer prevents the 5' fluor from emitting light with its specific spectrum. Degradation of the probe by the polymerase during amplification releases the 5' fluor, allowing it to emit light on excitation, which can be detected by a photomultiplier.
The amplification plot above shows a typical sigmoidal curve, characteristic of the increasing effect of multiple factors limiting the efficiency of the reaction with each cycle. The vertical axis represents the fluorescence measured after each cycle (horizontal axis). This measure directly reflects the accumulation of product in the reaction; the rate of accumulation decreases and eventually ceases altogether.
Because the polymerase cannot use RNA as a template, the first step requires the conversion of the target mRNA into cDNA using reverse transcriptase. The second requires the quantification of the copy number of a specific cDNA present in a complex mixture using the particular detection method involving PCR as described above.
Exercise: Primer & Taqman style Probe Design for Real-Time qPCR using Primer3plus web interface
- We will design primers and probes to detect and amplify the extracellular and transmembrane region of CD40 variant 1 (ENSEMBL transcript ID ENST00000372285). CD40 is located on chromosome 20 (this is given on the ENSEMBL page for CD40). The start codon (ATG) can be identified by colouring the transcript sequence in ENSEMBL using the Exons and Codons option in the drop-down box at the bottom of the page; it should look like that shown below.
- From the UniProt database entry (there is a link to this from ENSEMBL), the cytoplasmic domain of the protein corresponds to amino acids 216-277, i.e. the transmembrane and external domain runs from the 1st to the 215th codon. A quick calculation gives the 645th nucleotide of the coding region (215 x 3), or the 722nd nucleotide of the transcript (645 + 77 of the 5' UTR).
- One of the basic requirements for gene expression analysis is to be able to differentiate between genomic DNA and mRNA in a sample. For eukaryotes this is simply done by designing the internal oligonucleotide hybridisation probe to straddle an exon-exon junction in the mRNA and the primers to sit within two different exons. The consequence is that the hybridisation probe will not form a stable hybrid with the genomic sequence, and the primers will not amplify a product because of the intervening intron. For this exercise we will design primers and a "taqman" internal hybridisation probe to amplify the extracellular and transmembrane region of CD40 variant 1 (ENSEMBL transcript ID ENST00000372285).
- If you have closed Primer3, open the primer3plus web interface again. Click on the drop-down box on the upper left and select "Detection". Go back to the CD40 page in ENSEMBL and identify the exon structure. Choose an appropriate exon-exon junction and note where it lies within the CDS. Copy the entire CDS and paste it into the text box for primer3plus.
- The image above shows only part of the sequence, starting at the atg and going into the second exon of the CDS. Having identified the exon-exon junctions and chosen one, mark it with the "[ ]" brackets; this identifies for primer3 a target segment to be included in the internal oligonucleotide. In the above example the junction lies between the two Gs.
- The next step is to restrict the region available for placing the primers. This is done using the exclude notation "< >". Any number of excluded regions can be marked, but since we want the primers to produce a small amplicon, exclude the sequence beyond about 50-60 base pairs either side of the exon junction, i.e. two regions something like that above.
- Click on the check box for "Pick hybridization probe". We now need to change the settings specific to the requirements of taqman chemistry. Click on
- the "General Settings" tab: in the boxes adjacent to Primer Tm, set Min to 57, Opt to 59, Max to 63 and the Max Tm Difference to 2.0
- the "Advanced Settings" tab: click on the check box Use Product Size Input ...... and, next to Product size, set Min to 70, Opt to 80 and Max to 120
- the "Internal Oligo" tab: adjacent to Hyb Oligo Tm, set Min to 67, Opt to 69 and Max to 73 (note these are 10 degrees higher than the primer Tm settings)
- Click Pick Primers
- Next we need to make some manual checks specific to Taqman probe chemistry.
- The probe should not have more G's than C's
- Avoid runs of 3 or more of the same nucleotide, especially G's
- There should be no guanines at the 1st or 2nd position at the 5' end; these act as quenchers, and since the nucleolytic degradation by the polymerase leaves the 5' reporter dye attached to the first base and sometimes subsequent bases, such a molecule would give a poor readout
- When the probe and primers anneal to the target, the 5' end of the probe should be as near as possible to the 3' end of the primer on the same strand (a maximum of 10-12 bases apart). This tends to lead to more efficient displacement and degradation of the probe
- The next problem is how to address specificity. Specificity can be checked using a BLAST comparison via primer3Manager just as before, and is particularly important for the probe sequence to ensure the signal obtained is specific to the target gene. The best way to do this is to identify whether the primers can prime other sequences by asking the following:
- Do they have high similarity, especially 3' complementarity, with other sequences?
- Do both primers have similarity to the same segment of DNA, on opposite strands, such that they could amplify a segment of DNA?
- Does the probe have high similarity or complementarity with other sequences?
- If any of these are true, different primers or probes need to be chosen from the list or redesigned.
- This can be tested by carrying out a sequence similarity search, essentially a BLAST search. Now click on "Send to Primer3Manager" at the top of the page.
- A new window will open showing the primers, something like that below. Choose the first primer and click on the hyperlink "BLAST" on the right hand side.
- This opens a new window containing the BLAST interface at the NCBI.
- Click on the radio button "Human genomic and transcript"
- Click on the radio button "Highly similar sequences (megablast)"
- Next to the BLAST button, check the box "show results in new window"
- Only NOW click BLAST
- The results page will eventually open. Scroll down to the list, which is divided into two sections, "Transcripts" and "Genomic Sequences". Note which transcript hits/sequences have 100% complementarity (given in the column headed "Query coverage", not "Max ident", which refers to the % identity over the complementary part of the primer) and which have complementarity to the 3' end of the primer; this can be most easily seen from the graphic above the table. The black horizontal bars represent the hits aligned against the length of the primer (red bar with black numbering), which is numbered at the top of the graphic (position 0 to 18, a length of 19). The actual alignments are found at the bottom, below the table. Note that each alignment states values for the score, expect, identities, gaps and STRAND. This last item tells you whether the match is against the sense or complementary (minus) strand in the database. For a pair of primers to amplify a product, one must be identical to the sense strand and the other identical to the minus strand (leave this window open).
- Now go back to primer3Manager and repeat this for the reverse primer. Compare the matches having 100% complementarity for both primers and those that have 3' complementarity. If the primers are "good", the hits in common should only include CD40 variants 1 and 2.
- Now go back to the Primer3 Manager window. The last question we need to consider is "are there any known common variants of the gene where there are SNPs corresponding to the primer or probe sequences we have chosen?". If there are, the primers and/or probe may not work on all the samples we wish to compare. This can be checked using a tool called SNPCheck. Open SNPCheck in a new tab.
- Copy and paste the pairs of sequences into the query window, one pair per line (to do more than one at a time you must register), preceded by a name for the pair (this name cannot contain any spaces) and followed by the chromosome identifier, in this case chromosome "20". Set the maximum amplicon size to 1000 bp. Now click on the SNPCheck button, as below:
- The output is a table with a graphic, shown below. It shows the positions and region bounded by the sequences, the number of matching/mismatching bases, and lists any SNPs found within the oligonucleotides submitted.
- The program provides a warning that the probe and forward primer are on the same strand; this is OK. It also indicates that no known variants occur within any of the sequences. However, it does highlight a possible difference between the sequence held in the SNPCheck human genome database and the ENSEMBL sequence. This difference consists of two mismatched bases which are close to the intron-exon junction. This is a concern, as we do not know which sequence is correct. The discrepancy highlights some of the problems we have with the quality of data held in the sequence databases; because of it, our probe might not work, and we would be better to avoid this region of CD40.
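The manual Taqman probe checks listed earlier in this exercise (G versus C content, runs of identical bases, no G in the first two 5' positions, and the primer-to-probe distance) are simple enough to script. The sketch below is an illustration only: the function names are hypothetical, the probe sequence and coordinates are invented rather than taken from CD40, and positions are assumed to be 0-based starts on the same strand.

```python
import re


def check_taqman_probe(probe: str) -> dict:
    """Apply the sequence-only manual checks to a probe written 5' to 3'."""
    p = probe.upper()
    return {
        "not_more_g_than_c": p.count("G") <= p.count("C"),
        "no_run_of_3_or_more": re.search(r"([ACGT])\1{2,}", p) is None,
        "no_g_in_first_two_bases": "G" not in p[:2],
    }


def primer_probe_gap(primer_start: int, primer_length: int, probe_start: int) -> int:
    """Number of bases between the primer 3' end and the probe 5' end.

    Assumes 0-based start positions on the same strand of the target.
    """
    return probe_start - (primer_start + primer_length)


# Invented probe and coordinates, for illustration only.
probe = "CACCTGCATCAGTCTGCTCAGTC"
for name, passed in check_taqman_probe(probe).items():
    print(name, passed)

gap = primer_probe_gap(primer_start=10, primer_length=20, probe_start=35)
print("gap between primer and probe:", gap, "bases; within 12:", gap <= 12)
```

A probe that fails any of these checks would be replaced by another candidate from the Primer3 output before moving on to the BLAST and SNPCheck steps.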