Genomic sequencing gets ready to take off
CAMBRIDGE—The 1000 Genomes Project recently published an analysis of its completed pilot phase in an ongoing effort to produce better knowledge and tools for research into genetic contributors to human disease. Those results made it into the journal Nature in late October, with the public-private consortium providing what is said to be "the most comprehensive map of these genetic differences, called variations, estimated to contain approximately 95 percent of the genetic variation of any person on Earth."
Talk about the next steps was already hitting the various mainstream and pharma/biotech media within days, with Affymetrix Inc., for example, announcing that it will contribute genotyping data for a large set of validated rare and common genomic variants to the 1000 Genomes Project. That data will consist of genotyping information from more than 2 million single nucleotide polymorphisms (SNPs) and insertions/deletions (indels), many of which were not previously available from any source, according to Affymetrix.
The release includes common and rare variants genotyped on the four major HapMap populations: Caucasian, Yoruba/African, Chinese and Japanese, as well as on the larger HapMap Phase 3 sample set. The data will be incorporated into the 1000 Genomes public data repository and will be freely available.
"The 1000 Genomes Project is pleased to have Affymetrix join the project and contribute this large data set to complement existing genotyping data in the public domain," said Dr. David Altshuler, co-chair of the 1000 Genomes Project, in a prepared statement.
Cambridge, Mass.-based Altshuler, who is also an associate professor of genetics and of medicine at Harvard Medical School and deputy director of the Broad Institute of Harvard and MIT, added: "The genotypes that they have generated will be of interest to the project's participants, as well as to the broader research community."
Also, shortly after the 1000 Genomes Project data was published, media outlets were noting that the researchers plan to sequence 2,500 individuals at low coverage over the next two years using the Illumina Genome Analyzer and HiSeq 2000, as well as Life Technologies' SOLiD system. In addition, they will perform deep whole-exome sequencing for those 2,500 individuals.
The work thus far, and the work yet to come, are important because small genetic differences between individuals can help explain why some people have a higher risk than others for developing illnesses such as diabetes or cancer, notes Dr. Richard Durbin of the Wellcome Trust Sanger Institute, co-chair of the consortium. Like Altshuler, Durbin is based in Cambridge, though in his case it is the one in the United Kingdom.
1000 Genomes Project researchers produced their genetic map using next-generation DNA sequencing technologies to systematically characterize human genetic variation in 180 people in three pilot studies. The full scale-up from the pilots is already under way, with data collected from more than 1,000 people.
"The pilot studies of the 1000 Genomes Project laid a critical foundation for studying human genetic variation," Durbin says. "These proof-of-principle studies are enabling consortium scientists to create a comprehensive, publicly available map of genetic variation that will ultimately collect sequence from 2,500 people from multiple populations worldwide and underpin future genetics research."
Becoming part of the 1000 Genomes Project is important for Affymetrix because it allows the company to make significant contributions that will be of benefit not only to the project itself, but also to the broader genomics/disease research community, says Jay Kaufman, Affymetrix's vice president of strategic marketing, DNA.
"Affymetrix has historically leveraged its capabilities, technologies and innovations to advance science and to make enabling contributions for the benefit of researchers," Kaufman notes. "This ongoing commitment to the science is paramount to our corporate goal of keeping Affymetrix at the forefront of influential scientific endeavors such as the 1000 Genomes Project. We also anticipate that researchers will want to perform additional genome-wide association studies using this information, which can be facilitated via our Axiom myDesign custom genotyping capabilities, and our growing database of variants, which will approach 8 to 9 million in early 2011."
The wide access to genetic data is something at the forefront of Altshuler's mind as well, as he notes, "By making data from the project freely available to the research community, it is already impacting research for both rare and common diseases."
The 1000 Genomes Project has studied populations with European, West African and East Asian ancestry. Using the newest technologies for sequencing DNA, the project's nine centers sequenced the whole genome of 179 people and the protein-coding genes of 697 people. Each region was sequenced several times, so that more than 4.5 terabases of DNA sequence were collected.
The centers at which sequencing was performed were the Wellcome Trust Sanger Institute, BGI-Shenzhen, the Broad Institute, The Genome Center at Washington University, Baylor College of Medicine's Human Genome Sequencing Center, the Max Planck Institute for Molecular Genetics in Berlin, Illumina, Life Technologies and Roche's 454 Life Sciences.
"What really excites me about this project is the focus on identifying variants in the protein-coding genes that have functional consequences," says Dr. Richard Gibbs, director of Baylor's Human Genome Sequencing Center. "These will be extremely useful for studies of disease and evolution."
In addition to the more than 2 million variants to be submitted to the 1000 Genomes Project—probably before the end of this year—Affymetrix also reported that it will release a larger data set of approximately 5 million variants on its web site at about the same time. According to Kaufman, the new data includes approximately 3 million human SNPs discovered by the 1000 Genomes Project that were not found in the HapMap Project or the NCBI's Single Nucleotide Polymorphism Database (dbSNP, release 130). This information, which was generated by Affymetrix using the Axiom Genotyping Solution, will allow scientists to create customized Axiom Genotyping Arrays containing 50,000 to as many as 2.6 million markers.