"Transcriptomics & Functional Genomics"

Welcome to Transcriptomics & Functional Genomics Newsletter No 8. This edition's focus is on functional genomics using the Illumina platform.

Currently only the focus topic from each newsletter is being made available on the internet (please note that this material is covered by copyright and permission should be sought to reproduce any content). The full newsletter is available internally via the intranet as a PDF. If you are interested in advertising a seminar or promotion via the newsletter, or in sponsorship, please contact Dr K Laing (Senior Scientist, Intracellular Pathogen Co-operative).

In Oct 2007 St George's allocated funding for new equipment to be purchased and housed within the Medical Biomics Facility. The first piece of equipment acquired under this plan is an Illumina 500GX, which became available to users between Dec 2007 and Jan 2008. The Illumina system offers state of the art microarray technology capable of both genotyping and gene expression, along with an increasing number of other capabilities. Within London there are three such systems, and the acquisition of this equipment puts St George’s in the vanguard, with research capacity matching the best and surpassing most. This edition aims to provide an insight into the technology, the system capability and some of the current applications.

I hope that this edition will provide a stimulus for everyone interested in the technology to explore its use within the scope of his or her current and future research focus.

If you are interested in topics covered in other newsletters go to:

http://www.sgul.ac.uk/depts/medmicro/newsletter.htm

Ken Laing Intracellular Pathogen Co-operative, Cellular & Molecular Medicine, St George’s University of London

The Illumina 500GX

Illumina Bead Scanner

The Illumina BeadChip System offers state of the art microarray technology capable of both genome-wide and custom genotyping and gene expression for Human, Mouse and Rat, along with an increasing number of other applications and organisms. The system gives St George’s users access to technology currently available in very few academic settings. Historically genotyping has been a particular strength of the system, and 2007 saw numerous association studies published worldwide using the technology, including three separate studies on diabetes published in Nature [1], Nature Genetics [2] and Science [3]. Other association studies have identified genetic determinants that explain variation in HIV-1 viral load [4] during the asymptomatic phase of disease, whilst yet other publications have identified genetic risk factors for prostate cancer [5], childhood asthma [6], bipolar disorder [7] and celiac disease [8], amongst others. Copy number variation is becoming increasingly important, and this application has recently been added to the Illumina stable using the existing Infinium assay, combining SNP and CNV analysis on one array [9].

Illumina as a company first started commercial operations in 2002 and was founded around the current BeadArray technology. Unlike many other microarray technologies, BeadArrays owe nothing to the patents held by Affymetrix or OGT, and Illumina is therefore free of licensing obligations to either company. The upside is that Illumina has seen very strong growth of its technology and has been relatively free to develop new applications.

The company's first offering was a “custom” genotyping platform using the GoldenGate® assay. Although this assay can multiplex 1536 SNPs in a single reaction, it is not capable of genome-wide coverage. Even so, the technology is estimated to have provided something like 70% of the phase 1 HapMap data. The collation of these data showed not only that the technology was popular with geneticists but also that it was robust, with call rates and reproducibility exceeding 99%.

Illumina subsequently developed the Infinium™ assay to allow genome-wide SNP typing. It was as a genotyping platform that Illumina earned its reputation; in 2003 it went on to expand the system's capability with genome-wide gene expression using the same Sentrix BeadChip and Array Matrix formats. In 2006 Illumina added Rat to Human and Mouse as off-the-shelf genome-wide products. The platform has further expanded into copy number variation (CNV), specialized assays for cryopreserved or paraffin-embedded samples, DNA methylation and microRNA applications. In 2007 the company introduced its 1 million SNP and CNV BeadChip and, having acquired Solexa, expanded into next-generation sequencing with RNA-seq and other applications. Towards the end of 2007 Illumina expanded its methylation capability to include a BeadChip interrogating more than 27,000 CpG loci using the Infinium assay. At the beginning of 2008 Illumina launched its 2.3 million SNP and CNV along with the 610-Quad, 1M Duo and exon550S-Duo HD BeadChips. In the latter part of 2008 the HumanHT-12 gene expression array format, based on the V3 genome-wide arrays, was also released; this uses the standard 3 micron bead size but with lower average bead redundancy. Illumina is also seeking FDA approval for its BeadXpress platform for in vitro diagnostics, which also uses bead technology. Illumina now ranks alongside Affymetrix as one of the world leaders in commercial microarrays.

Figure 1 Illumina BeadArray Composition

The technology (see figure 1) is based upon 2 or 3 micron beads (for HD and standard format chips respectively) which are coated with a nucleic acid reporter. For the gene expression platform this consists of a 29 base tag sequence followed by a 50 base unique reporter that is specific to a given gene or exon sequence. One of the unique features of this technology is the random self-assembly of a pool of beads labeled in this way into an array in one of two formats. Since the assembly of the array is a random process, the location of any given bead is initially unknown and every array is unique. Decoding [10] the array is thus a prerequisite to obtaining a functional array, and Illumina carries this out using the tag sequence prior to shipping the arrays, providing an electronic map for each array as a download or on a CD. The BeadStudio software automatically and seamlessly maps each reporter to its identity on each array. As a consequence of this necessary decoding process each array is quality assayed to ensure the content meets the required standard. Reporters have an average 20 or 30 fold feature redundancy, and even with this over-representation six human genome-wide arrays can be placed on a single slide using the standard array density; with the new HD expression arrays this has doubled to 12.
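The idea of random assembly followed by decoding can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the probe names and the figure of 3000 wells are invented for illustration, and real decoding is performed by hybridization rather than by reading a random number generator. It shows why a per-array map is needed and how replicate beads are averaged into one value per reporter.

```python
import random
from collections import defaultdict
from statistics import mean


def assemble_array(probe_ids, n_wells, seed=None):
    """Drop a randomly chosen bead type into every well.

    Returns the 'decode map' (well index -> probe id). On a real array this
    map is unknown after assembly and has to be established by decoding.
    """
    rng = random.Random(seed)
    return {well: rng.choice(probe_ids) for well in range(n_wells)}


def summarise_by_probe(decode_map, well_intensity):
    """Average the replicate beads of each probe using the decode map.

    Returns probe id -> (mean intensity, number of replicate beads).
    """
    grouped = defaultdict(list)
    for well, probe in decode_map.items():
        grouped[probe].append(well_intensity[well])
    return {probe: (mean(vals), len(vals)) for probe, vals in grouped.items()}


# Hypothetical example: 100 probe types spread over 3000 wells, which gives
# an average redundancy of 30 beads per probe.
probes = ["probe_%03d" % i for i in range(100)]
decode_map = assemble_array(probes, n_wells=3000, seed=1)

# Pretend every bead reports exactly the 'true' signal of its probe.
true_signal = {p: 100.0 + i for i, p in enumerate(probes)}
well_intensity = {w: true_signal[p] for w, p in decode_map.items()}

summary = summarise_by_probe(decode_map, well_intensity)
```

Because two arrays built with different seeds place the same probe in different wells, the intensities are meaningless without the map that accompanies each individual array.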

Illumina have developed a series of array formats (Sentrix/HD BeadChips and Sentrix Array Matrix), assays (DASL, GoldenGate, Infinium I and Infinium II) and applications (genotyping, CNV, gene expression, methylation & microRNA) that can be used in various combinations. Consequently the terminology is often unfamiliar and can be confusing; even some Illumina literature refers to the genotyping chips as Infinium BeadChips. Although there are two array formats, Sentrix/HD BeadChip and Sentrix Array Matrix (SAM), both are reliant upon the same bead technology (figure 1). The major difference between them lies in the support matrix in which the beads are “embedded”. The Sentrix/HD BeadChip is a slide format similar in appearance to a mirrored glass microscope slide. However, the slide has the coated beads partially embedded into wells etched into its surface, arranged either as a single array (Human 1M BeadChip) or as discrete arrays on a single slide (for example the Human 1M-Duo or HumanHap610-Quad BeadChips, which have two and four arrays respectively). In the latter case each array consists of over 550,000 tag SNPs plus 60,000 other genetic markers, whilst the Human-6 gene expression BeadChip consists of six arrays with more than 48,000 transcript probe sets in each. The second format is based not upon a flat surface such as a slide, but is constructed from bundled fibres. The coated beads are randomly located into the ends of the bundled fibres, which constitute pins in an array matrix. This is essentially an array of arrays with a 96 well microtitre plate spacing that can be placed over an appropriate plate with the bundled fibres/pins located within the wells of the plate. The advantage of this format is that high sample throughput can be achieved based on a plate assay. The downside of the SAM is the limited number of features available, with a maximum of 1536 probe sets.

Each of the two array formats (Sentrix BeadArrays and SAMs) can be used in combination with specific assays such as Infinium or GoldenGate®. However, the array content differs according to the particular assay and application. In particular the GoldenGate® assay uses primer pools containing an address sequence corresponding to specific beads within the pool; it is the address sequence which is hybridized to the array, and the primer-address combination therefore provides the specificity to the individual bead. One of the advantages of Illumina’s technology is that it adopts simple handling protocols that would be very familiar to anyone moving from glass slide based systems to Illumina.

Figure 2 Illumina’s standard approaches to gene expression using Sentrix arrays. A. Eberwine-based amplification using Ambion’s Illumina TotalPrep Amplification Kit and B. the DASL assay (cDNA-mediated Annealing, Selection, Extension and Ligation), a PCR-based amplification and ligation assay based on Illumina’s GoldenGate technology

In the case of standard gene expression profiling, Illumina have adopted a “direct” assay that is very similar to the approach used on other platforms. In this instance an Eberwine-based amplification (see the last edition, Newsletter No 7) incorporates biotin into an amplified cRNA/aRNA (figure 2a). Requirements depend on the amplification used: for example, single-round amplification using the Illumina TotalPrep kit (Ambion/Applied Biosystems) requires an input of around 50-500 ng of total RNA, and 750 ng to 1.5 micrograms of biotinylated product is needed for hybridization. Such a protocol should yield in excess of 5-10 micrograms of amplified RNA. Following hybridization to the array, labeled cRNA is detected with a streptavidin-Cy3 conjugate.
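As a rough planning aid, the quantities quoted above can be turned into a small calculation. This is a sketch only: the function names are invented here, and the numbers are simply the ranges given in the text, not values taken from a kit manual.

```python
# Input range for single-round amplification, as quoted in the text (ng total RNA).
MIN_INPUT_NG = 50
MAX_INPUT_NG = 500


def input_in_range(total_rna_ng):
    """True if the total RNA input lies within the quoted 50-500 ng range."""
    return MIN_INPUT_NG <= total_rna_ng <= MAX_INPUT_NG


def hybridisations_possible(crna_yield_ng, crna_per_array_ng):
    """Number of whole hybridisations a given cRNA yield can support."""
    if crna_per_array_ng <= 0:
        raise ValueError("crna_per_array_ng must be positive")
    return int(crna_yield_ng // crna_per_array_ng)


# Worst case from the text: 5 micrograms of product, 1.5 micrograms per array.
worst_case = hybridisations_possible(5000, 1500)
# Best case from the text: 10 micrograms of product, 750 ng per array.
best_case = hybridisations_possible(10000, 750)
```

On these figures a single amplification covers somewhere between 3 and 13 hybridisations, so one reaction would normally leave material in reserve for repeats.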

The DASL assay (cDNA-mediated Annealing, Selection, Extension and Ligation), on the other hand, is a PCR-based amplification and ligation assay based on Illumina’s GoldenGate® technology. The first step in DASL is conversion of the RNA to cDNA. For this a mixture of random and oligo dT primers is used, since one of the main applications for DASL is profiling from partially degraded samples recovered from formalin-fixed, paraffin-embedded (FFPE) tissue sections (figure 2b). Such starting material would invariably yield RNA of poor quality. RNA degradation results in the loss of the polyadenylated tail, and conventional oligo dT priming of cDNA synthesis would fail to yield meaningful signal, hence the use of both random and oligo dT priming. GoldenGate®, and thus DASL, relies upon a pool of gene-specific forward and reverse primers (Allele Specific Oligo, ASO; Locus Specific Oligo, LSO) containing a universal tag sequence or an array-specific address sequence for extension-ligation and a PCR-based amplification. With DASL the PCR can incorporate a Cy3 or Cy5-labelled primer for separate labeling and subsequent co-hybridisation of a genomic or reference sample in parallel to the sample undergoing profiling, and thus the application can be used in a dual sample “two-colour” mode.

The second major application area is genotyping using Single Nucleotide Polymorphisms (SNPs) [11], with applications in genome-wide and follow-on gene association studies [12] as well as linkage, loss of heterozygosity and copy number variation [9,13] (including SNP-based array CGH). These applications are all based on the detection of SNPs within the genome, and Illumina have adopted several different approaches to SNP/CNV detection.

For genome-wide SNP/CNV interrogation [12] Illumina use the Infinium® assay (Infinium I and Infinium II). These assays use isothermal whole-genome amplification followed by enzyme-controlled fragmentation as their starting point, from around 200-400 ng of DNA for the HD chips, or 750 ng, depending upon the array and therefore the hybridization volume. Single base discrimination, and therefore SNP detection, for Infinium I [12] is based upon the failure of DNA polymerase to extend a primer with a mismatched 3’ end. Hence the assay confers base extension specificity dependent upon the formation of a complete duplex and is therefore a single colour assay (figure 3a). The more recent addition to the Illumina portfolio is Infinium II [14], which is now the mainstay of genome-wide SNP/CNV detection. Infinium II performs a single base extension with dideoxynucleotides labeled with one of two haptens, biotin or 2,4-dinitrophenol (DNP). The nucleotides are labeled so that the two alleles of a bi-allelic substitution, such as A to C/G or T to C/G, carry different haptens, which are detected by streptavidin or anti-DNP immunoglobulin respectively. Signal amplification is used in combination with anti-streptavidin, or an antibody to the anti-DNP immunoglobulins, conjugated to a fluorescent reporter (figure 3b). Infinium II thereby halves the number of probes required in comparison with Infinium I and thus allows a doubling of the array content. Because of the assay requirements the oligonucleotide design differs between Infinium I and II. The former includes the SNP within the sequence attached to the bead, whilst the latter stops short of this position. In some assays a mixture of Infinium I and Infinium II features is required to cover all allele combinations.
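The reason a mixture of Infinium I and II features is sometimes needed can be made concrete with a short sketch. It rests on an assumption inferred from the examples above (A or T versus C or G): that A and T terminators share one hapten while C and G share the other. The exact hapten assignment in the table below and the function names are illustrative, not taken from Illumina documentation.

```python
# Assumed hapten carried by each labeled terminator (illustrative assignment).
HAPTEN = {"A": "DNP", "T": "DNP", "C": "biotin", "G": "biotin"}


def assay_for_snp(allele_1, allele_2):
    """Pick the Infinium design able to distinguish the two alleles.

    A single-base extension (Infinium II) only works when the two alleles
    carry different haptens; otherwise two allele-specific bead types
    (Infinium I) are needed.
    """
    a, b = allele_1.upper(), allele_2.upper()
    if a not in HAPTEN or b not in HAPTEN or a == b:
        raise ValueError("expected two different bases from A, C, G, T")
    return "Infinium II" if HAPTEN[a] != HAPTEN[b] else "Infinium I"


def bead_types_needed(snps):
    """Total bead types: one per Infinium II SNP, two per Infinium I SNP."""
    return sum(1 if assay_for_snp(a, b) == "Infinium II" else 2 for a, b in snps)


panel = [("A", "G"), ("C", "T"), ("A", "T"), ("C", "G")]
designs = [assay_for_snp(a, b) for a, b in panel]
total_beads = bead_types_needed(panel)
```

Under this assumption only A/T and C/G polymorphisms fall back to the two-probe design, which is consistent with Infinium II roughly halving the probe count for most content.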

The third approach to SNP genotyping is primarily a high-throughput, low-multiplicity assay (1536-plex) more usually used for follow-on or focused custom genotyping. This approach adopts Illumina’s GoldenGate® assay (figure 4), a modified form of which is used for the DASL assay we have already seen. Unlike DASL, the assay is an allele-specific PCR-based amplification and ligation assay that relies upon a pool of locus-specific (LSO) and allele-specific (ASO) primers, each containing one of three universal tag sequences (P1, P2 & P3) and an array-specific address sequence (which forms the duplex with the oligo attached to the bead, specifying the location within the array). It is the combination of these that allows extension, ligation and PCR-based amplification, conferring detection of the specific alleles. Because of this design GoldenGate® genotyping is a two-colour system.
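How a two-colour readout becomes a genotype can be sketched with a deliberately naive caller. This is not Illumina's calling algorithm, which clusters samples per locus; the fixed thresholds, the minimum signal and the function names below are all hypothetical and chosen only to illustrate the principle, together with the call rate quoted earlier for the platform.

```python
def call_genotype(signal_a, signal_b, min_signal=100.0):
    """Naive genotype call from the intensities of the two allele channels.

    Returns "AA", "AB", "BB", or None when no confident call can be made.
    """
    total = signal_a + signal_b
    if total < min_signal:
        return None  # too dim to call
    fraction_b = signal_b / total
    if fraction_b <= 0.2:
        return "AA"
    if fraction_b >= 0.8:
        return "BB"
    if 0.4 <= fraction_b <= 0.6:
        return "AB"
    return None  # ambiguous region between clusters


def call_rate(calls):
    """Fraction of attempted genotypes that received a call."""
    if not calls:
        raise ValueError("no genotypes supplied")
    return sum(c is not None for c in calls) / len(calls)


# Hypothetical channel intensities for four samples at one SNP.
raw = [(1000, 50), (500, 520), (40, 900), (10, 5)]
calls = [call_genotype(a, b) for a, b in raw]
rate = call_rate(calls)
```

A real caller would learn the cluster positions from many samples rather than fixing thresholds in advance, but the inputs and outputs have the same shape.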

The next of the main application areas is epigenetics, more specifically the interrogation of DNA methylation at well-known CpG loci. DNA methylation occurs throughout the genome within CG dinucleotides, primarily located within CpG islands. Methylation plays a role in the regulation of gene expression through gene silencing and, although often referred to as epigenetics, is closely allied to both genotyping and expression profiling. Gene silencing probably operates through several mechanisms, either directly or indirectly. Indirect inhibition of transcription may result from the interaction of methylcytosine with various binding proteins such as methylcytosine-binding protein (MBP); in turn the interaction of MBP and other proteins with the chromatin scaffold prevents the access of transcription factors to their binding domains. Evidence also exists for the specific inhibition of transcription factor binding by methylation of CG dinucleotides within or near promoters or cis-acting sites. As such, changes in the methylation pattern are an early marker of altered transcription, and both hypomethylation and regional hypermethylation states have been associated with specific diseases [15].

The largest of the current methylation BeadChips includes more than 27,000 loci and uses bisulfite conversion of the DNA combined with either the Infinium® or GoldenGate® assays. Methylation protects DNA from the conversion of cytosine to uracil by the action of bisulfite. Monitoring the chemical alteration of cytosine to uracil at unmethylated positions within the genome is similar to detecting a C to T polymorphism, and both can be done either by single base extension or by PCR-based primer extension, ligation and amplification assays, as we have already seen (figures 3 & 4). Both are dependent upon either complete duplex formation or 3’ primer mismatch. In the Infinium® or GoldenGate® assays the genomic DNA is treated with bisulfite to convert unmethylated cytosines to uracil, followed by isothermal amplification, fragmentation and concentration or further purification of the DNA. In the Infinium® assay primer extension takes place on the array, while in the GoldenGate® assay it takes place on paramagnetic beads. In both cases the data provide a ratio of the methylated to unmethylated DNA at each locus for which the assay is specifically designed.

The attraction of microarray platforms like the Illumina 500GX lies in their low cost, ease of use, the robust nature of the data, their flexibility and the wide range of end user applications spanning genomics, epigenetics and gene expression. As with all technology-driven methodologies, microarray-based applications, including those introduced above, are rapidly adapting in response to the scientific questions we would like to address and the ever-changing capacity of the technology. As a consequence new applications become available and array content changes rapidly, presenting both advantages and challenges to us as scientists.
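For readers who prefer to see the methylation assay's logic written out, the bisulfite step and the methylation ratio can be sketched in a few lines. Both functions are illustrative assumptions rather than a description of the BeadStudio implementation: the conversion works on a single strand and writes converted cytosines as T (as read after amplification), and the `offset` term is a stabilising constant assumed here to keep the fraction defined at low signal.

```python
def bisulfite_convert(sequence, methylated_positions=()):
    """In silico bisulfite conversion of one DNA strand.

    Unmethylated C becomes U, which reads as T after amplification;
    methylated C (0-based positions supplied by the caller) is protected.
    """
    protected = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in protected:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def methylation_fraction(methylated_signal, unmethylated_signal, offset=100.0):
    """Fraction of signal from the methylated allele, between 0 and 1."""
    m = max(methylated_signal, 0.0)
    u = max(unmethylated_signal, 0.0)
    return m / (m + u + offset)


# Hypothetical six-base fragment containing two CpG sites.
fragment = "ACGTCG"
fully_unmethylated = bisulfite_convert(fragment)
first_cpg_methylated = bisulfite_convert(fragment, methylated_positions=[1])

half_methylated = methylation_fraction(450, 450)
```

The two converted sequences differ only at the protected cytosine, which is exactly the C/T difference that the genotyping chemistry is then used to detect.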

References

  1. Hakonarson H et al. (2007) A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature 448(7153): 591-594.
  2. Steinthorsdottir V et al. (2007) A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nat Genet 39(6): 770-775.
  3. Scott LJ et al. (2007) A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 316(5829): 1341-1345.
  4. Fellay J et al. (2007) A whole-genome association study of major determinants for host control of HIV-1. Science 317(5840): 944-947.
  5. Gudmundsson J et al. (2007) Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nat Genet 39(8): 977-983.
  6. Moffatt MF et al. (2007) Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature 448(7152): 470-473.
  7. Baum AE et al. (2007) A genome-wide association study implicates diacylglycerol kinase eta (DGKH) and several other genes in the etiology of bipolar disorder. Mol Psychiatry May 8.
  8. van Heel DA et al. (2007) A genome-wide association study for celiac disease identifies risk variants in the region harboring IL2 and IL21. Nat Genet 39(7): 827-829.
  9. Peiffer DA et al. (2006) High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping. Genome Res 16(9): 1136-1148.
  10. Gunderson KL et al. (2004) Decoding randomly ordered DNA arrays. Genome Res 14: 870-877.
  11. Oliphant A et al. (2002) BeadArray technology: enabling an accurate, cost-effective approach to genotyping. BioTechniques 32: S56-S61.
  12. Gunderson KL et al. (2005) A genome-wide scalable SNP genotyping assay using microarray technology. Nat Genet 37: 549-554.
  13. Peiffer DA, Gunderson KL (2007) Analyzing copy number variation with Infinium whole-genome genotyping. Bio-IT March.
  14. Steemers FJ et al. (2006) Whole-genome genotyping with the single-base extension assay. Nat Methods 3: 31-33.
  15. Attwood JT, Yung RL, Richardson BC (2002) DNA methylation and the regulation of gene transcription. Cell Mol Life Sci 59: 241-257.
  16. Bibikova M et al. (2006) High-throughput DNA methylation profiling using universal bead arrays. Genome Res 16(3): 383-393.