Bioinformatics :


Gene Expression Measurement by Real-Time PCR

Exercise : Real-time Quantitative Polymerase Chain Reaction (Real-Time qPCR)

Polymerase Chain Reaction (PCR)

The polymerase chain reaction is an enzyme-based DNA amplification process that uses one of several thermostable polymerases derived from thermophilic bacteria such as Thermus aquaticus (hence Taq polymerase). A range of other thermostable polymerases is now used for this purpose, including some with proof-reading ability, which confers greater fidelity (accuracy), although these tend to have lower processivity. An increasing number of the enzymes used in Real-Time quantitative PCR have "FAST" characteristics and tolerate very short extension times.

The thermostable polymerases used in PCR are characterised by their DNA-dependent DNA polymerase activity, which requires a partially double-stranded molecule with a 5' overhang that provides the template strand. They elongate in the 5' to 3' direction, extending the 3' end of a molecule that is usually formed by annealing a single-stranded primer to the template following denaturation of the DNA within the reaction.

The ability to undergo multiple rounds of denaturation, primer annealing and elongation (i.e. heating and cooling) is an essential characteristic conferred by the enzyme's high thermal stability. The primers used in standard PCR are usually chemically synthesized oligonucleotides of between 18 and 30 bases in length, and are specific and complementary to the 3' ends of the forward and reverse strands of the double-stranded target template sequence to be amplified. The primers are therefore sometimes referred to as the forward and reverse primers. However, primers designed for standard PCR tend to perform sub-optimally and cannot usually be used in Real-Time quantitative PCR.
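The relationship between a target sequence and its two primers can be sketched in a few lines of Python. This is only an illustration: the function names, the default primer length of 20 and the example sequence are assumptions made here, and real primer design involves many more checks than simply taking the ends of the target.

```python
# Minimal sketch: derive a forward/reverse primer pair from the two ends of a target.
# Function names, the default length and the example sequence are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(target: str, length: int = 20) -> tuple[str, str]:
    """Take primers from the ends of `target` (the sense strand, written 5' to 3').

    The forward primer is identical to the first `length` bases of the sense
    strand, so it anneals to the 3' end of the antisense strand. The reverse
    primer is the reverse complement of the last `length` bases, so it anneals
    to the 3' end of the sense strand.
    """
    if not 18 <= length <= 30:
        raise ValueError("primer length is usually between 18 and 30 bases")
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Arbitrary 60-base sequence used purely for illustration.
    target = "ATGCGTACCTGAAGCTTGGATCCAGTCAGGCTAACGTTAGCCATGGTACCGAGCTCGAAT"
    fwd, rev = primer_pair(target)
    print("forward primer:", fwd)
    print("reverse primer:", rev)
```

Both primers are written 5' to 3', which is why the reverse primer has to be reverse-complemented rather than copied directly from the sense strand.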


Diagrammatic representation of the three-step cycling of PCR, appropriate to a three-step SYBR Green based assay

PCR has become one of the standard molecular biology tools for amplifying segments of DNA and manipulating them for cloning and many other applications. However, since the polymerases used are DNA-dependent enzymes, PCR will not efficiently use RNA as a template, and amplification of specific RNA sequences requires that the RNA is first converted to complementary DNA (cDNA) using reverse transcriptase (RT).

The effective and specific amplification of a DNA segment depends upon the specificity of the primer sequences used. Primer design is therefore a critical step in the process: poor primer design will lead to non-specific amplification of multiple DNA segments, amplification of the wrong segment, or failure to amplify anything at all.

The broad criteria used in the design process include:

In practice primer design is done using software. Many different packages are used, but one of the most commonly used non-commercial packages is Primer3. Primer3 can be accessed over the internet and is a very versatile package for designing oligonucleotides for various applications, including PCR, real-time PCR and sequencing.


Real-time Quantitative Polymerase Chain Reaction (Real-Time qPCR)

One application of PCR is to measure the level of expression of a given gene. However, the basic PCR technique does not have a quantitative output and cannot be used to determine template copy number. This is because PCR reactions rapidly become rate limited by the conditions within the reaction, so the end point does not reflect the initial concentration of template. We therefore use modifications of the technique that provide a measurable output during the exponential, non-rate-limited phase of the amplification. This can be done in real time, hence the term "real-time PCR", but the technique is better referred to as real-time quantitative PCR (qPCR).

There are a number of different qPCR technologies. In the practical we will look at only one of the common detection chemistries, "Taqman". Historically, this was first commercialised by a company using the name Taqman because of the real-time instrument originally used in the detection. The technique uses a third oligonucleotide whose sequence is internal to the primer pair used in the PCR. Detection relies on the fluorescence released by the degradation of this "internal hybridisation" probe during the amplification process. In its undegraded state the probe has a fluorescent molecule attached to the 5' end, which is excited by light of a specific wavelength but passes the energy through the molecule (termed Förster resonance energy transfer, or FRET) to another fluor at the 3' end. This energy transfer prevents the 5' fluor from emitting light with its specific spectrum. Degradation of the probe by the polymerase during amplification releases the 5' fluor, allowing it to emit light on excitation, which can be detected by a photomultiplier.


The amplification plot above shows a typical sigmoidal curve, characteristic of the increasing effect of multiple factors limiting the efficiency of the reaction with each cycle. The vertical axis represents the fluorescence obtained after each cycle (horizontal axis). This measure directly reflects the accumulation of product in the reaction; the rate of accumulation decreases and eventually ceases altogether.


Measuring expression therefore takes two steps. Because the polymerase cannot use RNA as a template, the first step is the conversion of the target mRNA into cDNA using reverse transcriptase. The second step is the quantification of the copy number of a specific cDNA present in a complex mixture, using the PCR-based detection method described above.

Exercise: Primer & Taqman-style Probe Design for Real-Time qPCR using the Primer3Plus web interface

  1. We will design primers and probes to detect and amplify the extracellular and transmembrane region of CD40 variant 1 (ENSEMBL transcript ID ENST00000372285). CD40 is located on chromosome 20 (this is given on the ENSEMBL page for CD40). The start codon (ATG) can be identified by colouring the transcript sequence in ENSEMBL using the "Exons and Codons" option in the drop-down box at the bottom of the page; it should look like that shown below.
  1. From the UniProt database entry (there is a link to this from ENSEMBL), the cytoplasmic domain of the protein corresponds to amino acids 216-277, i.e. the transmembrane and extracellular domains run from the 1st to the 215th codon. A quick calculation gives the 645th nucleotide of the coding region (215 x 3), or the 722nd nucleotide of the transcript (645 + 77 for the 5' UTR), as the end of this region.
  2. One of the basic requirements for gene expression analysis is to be able to differentiate between genomic DNA and mRNA in a sample. For eukaryotes this is simply done by designing the internal oligonucleotide hybridisation probe to straddle an exon-exon junction in the mRNA and the primers to sit within two different exons. The consequence is that the hybridisation probe will not form a stable hybrid with the genomic sequence, and the primers will not amplify a product from it because of the intervening intron. For this exercise we will design primers and a "Taqman" internal hybridisation probe to amplify the extracellular and transmembrane region of CD40 variant 1 (ENSEMBL transcript ID ENST00000372285).
  3. If you have closed Primer3, open the Primer3Plus web interface. Click on the drop-down box on the upper left and select "Detection". Go back to the CD40 page in ENSEMBL and identify the exon structure. Choose an appropriate exon-exon junction and note where it lies within the CDS. Copy the entire CDS and paste it into the text box in Primer3Plus.
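The coordinate arithmetic used to locate the end of the extracellular and transmembrane region can be written out as a short check. The codon number (215) and the 5' UTR length (77) are the values quoted in the steps above; the function names are invented here for illustration.

```python
# Convert a codon number into 1-based CDS and transcript coordinates.
# Function names are illustrative; the input values come from the exercise text.

CODON_LENGTH = 3


def cds_position_of_codon_end(codon_number: int) -> int:
    """1-based position in the CDS of the last nucleotide of the given codon."""
    return codon_number * CODON_LENGTH


def transcript_position(cds_position: int, five_prime_utr_length: int) -> int:
    """1-based position in the transcript of a 1-based CDS position."""
    return cds_position + five_prime_utr_length


if __name__ == "__main__":
    last_codon = 215   # last codon before the cytoplasmic domain (amino acids 216-277)
    utr_length = 77    # length of the 5' UTR of the transcript
    cds_end = cds_position_of_codon_end(last_codon)
    print("CDS position:", cds_end)                                          # 645
    print("transcript position:", transcript_position(cds_end, utr_length))  # 722
```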
  1. The image above shows only part of the sequence, starting at the ATG and going into the second exon of the CDS. Having identified the exon-exon junctions and chosen one, mark it with "[ ]" brackets; this identifies for Primer3 a target segment to be included in the internal oligonucleotide. In the above example the junction lies between the two Gs.
  2. The next step is to restrict the region available for placing the primers. This is done using the exclude notation "< >". Any number of exclude regions can be marked, but since we want the primers to produce a small amplicon, exclude the sequence beyond about 50-60 base pairs either side of the exon junction, i.e. two regions something like those above.
  3. Click on the check box for "Pick hybridization probe". We now need to change the settings specific to the requirements of Taqman chemistry. Click on
  4. Click "Pick Primers".
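Marking up a long sequence by hand is error-prone, so the bracket notation described in the steps above can also be generated programmatically. The sketch below is a hypothetical helper, not part of Primer3 or Primer3Plus. It assumes that the two bases flanking the junction form the "[ ]" target and that everything further than a chosen window from the junction is wrapped in "< >"; the window of 60 bases follows the 50-60 base pair suggestion in the text.

```python
# Hypothetical helper that adds "[ ]" target and "< >" exclude markers to a sequence
# before it is pasted into the Primer3Plus text box. Not part of Primer3 itself.


def mark_for_primer3plus(sequence: str, junction: int, window: int = 60) -> str:
    """Return `sequence` annotated with target and exclude markers.

    `junction` is the 0-based index of the first base of the downstream exon,
    so the junction lies between sequence[junction - 1] and sequence[junction].
    The two bases flanking the junction are wrapped in [ ], and any sequence
    further than `window` bases from the junction is wrapped in < >.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not 1 <= junction < len(sequence):
        raise ValueError("junction must fall inside the sequence")
    left = max(0, junction - window)
    right = min(len(sequence), junction + window)
    parts = []
    if left > 0:
        parts.append("<" + sequence[:left] + ">")          # upstream exclude region
    parts.append(sequence[left:junction - 1])
    parts.append("[" + sequence[junction - 1:junction + 1] + "]")  # junction target
    parts.append(sequence[junction + 1:right])
    if right < len(sequence):
        parts.append("<" + sequence[right:] + ">")          # downstream exclude region
    return "".join(parts)


if __name__ == "__main__":
    # Short made-up sequence with the junction between the two Gs.
    print(mark_for_primer3plus("AAAAAGGTTTTT", junction=6, window=3))
```

For the short made-up sequence in the example, the helper returns `<AAA>AA[GG]TT<TTT>`. The real junction position has to be read from the exon structure shown in ENSEMBL.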
  1. Next we need to make some manual checks specific to Taqman probe chemistry.
  2. The next issue is specificity. Specificity can be checked using a BLAST comparison via Primer3Manager, just as before, and is particularly important for the probe sequence, to ensure that the signal obtained is specific to the target gene. The best way to do this is to identify whether the primers can prime other sequences by asking the following:
  3. If any of these are true, different primers or probes need to be chosen from the list or redesigned.
  4. This can be tested by carrying out a sequence similarity search, essentially a BLAST search. Now click on "Send to Primer3Manager" at the top of the page.
  5. A new window will open showing the primers, something like that below. Choose the first primer and click on the hyperlink "BLAST" on the right-hand side.
  1. This opens a new window containing the BLAST interface at the NCBI.
  1. The results page will eventually open. Scroll down to the list, which is divided into two sections, "Transcripts" and "Genomic Sequences". Note which transcript hits/sequences have 100% complementarity (headed "Query coverage", not "Max ident", which refers to the % identity over the complementary part of the primer) and which have complementarity to the 3' end of the primer; this can be seen most easily from the graphic above the table. The black horizontal bars represent the hits aligned against the length of the primer (red bar with black numbering), which is numbered at the top of the graphic (positions 0 to 18, a length of 19). The actual alignments are found at the bottom, below the table. Note that each alignment states values for the score, expect, identities, gaps and STRAND. This last item tells you whether the match is against the sense or the complementary (minus) strand in the database. For a pair of primers to amplify a product, one must be identical to the sense strand and the other identical to the minus strand (leave this window open).
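The strand rule at the end of the previous step can be expressed as a small comparison of two hit lists. The sketch below assumes the hits have been copied by hand from the BLAST results into simple records; the `Hit` class, its fields and the example sequence names are invented for illustration and do not correspond to any BLAST output format or API. Partial matches involving only the 3' end of a primer still need to be inspected separately.

```python
# Sketch: find database sequences that both primers hit on opposite strands.
# The Hit record and the example data are hypothetical, entered by hand.

from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    subject: str      # name of the database sequence that was hit
    coverage: float   # query coverage in percent
    strand: str       # "plus" or "minus" strand of the database sequence


def amplifiable_subjects(forward_hits, reverse_hits, min_coverage=100.0):
    """Return sorted subjects hit by both primers with full coverage on opposite strands."""
    opposite = {"plus": "minus", "minus": "plus"}
    fwd = {(h.subject, h.strand) for h in forward_hits if h.coverage >= min_coverage}
    rev = {(h.subject, h.strand) for h in reverse_hits if h.coverage >= min_coverage}
    return sorted({subject for subject, strand in fwd
                   if (subject, opposite[strand]) in rev})


if __name__ == "__main__":
    forward_hits = [
        Hit("CD40 variant 1", 100.0, "plus"),
        Hit("CD40 variant 2", 100.0, "plus"),
        Hit("unrelated transcript", 100.0, "plus"),
    ]
    reverse_hits = [
        Hit("CD40 variant 1", 100.0, "minus"),
        Hit("CD40 variant 2", 100.0, "minus"),
        Hit("unrelated transcript", 100.0, "plus"),   # same strand: no product
    ]
    print(amplifiable_subjects(forward_hits, reverse_hits))
```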
  1. Now go back to Primer3Manager and repeat this for the reverse primer. Compare the matches having 100% complementarity for both primers and those that have 3' complementarity. If the primers are "good", the hits in common should include only CD40 variants 1 and 2.
  2. Now go back to the Primer3Manager window. The last question we need to consider is: are there any known common variants of the gene with SNPs in the primer or probe sequences we have chosen? If there are, the primers and/or probe may not work on all the samples we wish to compare. This can be checked using a tool called SNPCheck. Open SNPCheck in a new tab.
  3. Copy and paste the pairs of sequences into the query window, one pair per line (to do more than one at a time you must register). Each pair is preceded by a name for the pair (this name cannot contain any spaces) and followed by the chromosome identifier, in this case chromosome "20". Set the maximum amplicon size to 1000 bp, then click on the SNPCheck button, as below:
  1. The output is a table with a graphic, shown below. It shows the positions and region bounded by the sequences, the number of matching/mismatching bases, and lists any SNPs found within the oligonucleotides submitted.
  1. The program provides a warning that the probe and forward primer are on the same strand; this is OK. It also indicates that no known variants occur within any of the sequences. However, it does highlight a possible difference between the sequence held in the SNPCheck human genome database and the ENSEMBL sequence. This difference consists of two "mismatch" bases which are close to the intron-exon junction. This is a concern, as we don't know which sequence is correct. The discrepancy highlights some of the problems we have with the quality of data held in the sequence databases. Because of it, our probe might not work, and we would be better to avoid this region of CD40.
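The kind of check reported in the SNPCheck table, whether any known variant falls inside an oligonucleotide, amounts to an interval overlap test. The sketch below is not SNPCheck's implementation and does not use its data: the function name, the coordinates and the SNP positions are all made up to show the idea.

```python
# Sketch of an interval overlap test between oligo positions and SNP positions.
# All coordinates are invented; this is not how SNPCheck itself is implemented.


def snps_in_oligos(oligos: dict[str, tuple[int, int]],
                   snp_positions: list[int]) -> dict[str, list[int]]:
    """Map each oligo name to the SNP positions inside its inclusive (start, end) interval."""
    ordered = sorted(snp_positions)
    return {
        name: [pos for pos in ordered if start <= pos <= end]
        for name, (start, end) in oligos.items()
    }


if __name__ == "__main__":
    # Hypothetical positions along a chromosome (inclusive start and end).
    oligos = {"forward": (100, 119), "probe": (130, 154), "reverse": (180, 199)}
    snps = [50, 135, 199, 300]
    for name, found in snps_in_oligos(oligos, snps).items():
        status = "no known SNPs" if not found else f"SNPs at {found}"
        print(f"{name}: {status}")
```

An oligo with any SNP inside its interval would be a candidate for redesign, for the reason given above: it may not work on all the samples being compared.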