4 April 2012. Sequencing a person’s entire genome will foretell many diseases only marginally better than gazing into a crystal ball, according to a report in the April 2 Science Translational Medicine online. Researchers at Johns Hopkins University in Baltimore, Maryland, modeled data from identical twins to determine if a person’s genome can predict disorders such as cancer, Alzheimer’s, and heart disease. The results sound neither a ringing endorsement nor a death knell for whole-genome sequencing. The analysis suggests that the average person’s genome would likely predict one future condition with a reasonable degree of accuracy. But it would also yield a host of negative results that are largely meaningless, because all they indicate is that the person’s risk of those conditions is no higher than the risk to the general population.
A few notable exceptions: Sequencing would predict many cases of Alzheimer’s disease, the scientists found, as well as thyroid autoimmunity, type 1 diabetes, and coronary heart disease in men. Ultimately, the value of whole-genome sequencing will be unique to each individual and his or her situation, said Nicholas Roberts, who was co-first author with Joshua Vogelstein. “We hope to start debate about the merits of personal genome sequencing based on this model or other models,” Roberts said. Bert Vogelstein, co-senior author with Victor Velculescu, presented the study on April 2 at the American Association for Cancer Research annual meeting in Chicago, Illinois.
Roberts and colleagues based their approach on sets of identical twins. Because the twins in each pair were assumed to have matching genomes, any difference in their disease status would be due to environmental factors or the random nature of some illnesses. The researchers did not actually sequence anyone for this study, nor did they need sequences for their analysis. Rather, they input into the computer model the number of pairs in which both twins had a disease, both were healthy, or only one was affected. The scientists culled these data from several twin studies performed in the U.S. and Europe, including Gatz et al., 1997, for Alzheimer’s and dementia and Tanner et al., 1999, for Parkinson’s.
The computer analyzed the matched and mismatched twin pairs, and tried to come up with different scenarios for how their genomes could result in the observed outcomes. For example, the Alzheimer’s data included two pairs in which both twins had AD, eight mismatched pairs, and 388 pairs in which both twins were healthy. Given that each pair had identical genomes, how could those genomes result in this particular distribution? The program assumed that each genome carried a specific genetic risk for Alzheimer’s, say, 7 percent or 15 percent or 70 percent. By randomly trying different risk levels for each of the genomes, the computer worked out several hypothetical scenarios by which the twins’ genes could lead to their disease states. It settled on the version most likely to predict disease, which the authors used as a model for how closely genomes and conditions might align.
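To make the idea of scoring scenarios against twin counts concrete, here is a minimal Python sketch. It is not the authors' program. It assumes that, given a shared genomic risk, the two twins fall ill independently, and it simply compares two hand-picked scenarios rather than searching over many. The function names and the risk numbers in both scenarios are hypothetical; only the pair counts (2, 8, 388) come from the example above.

```python
import math

def pair_probabilities(risk_levels, weights):
    """Probabilities that a twin pair is (both affected, discordant, both healthy).

    Each pair shares one genome whose disease risk r is drawn from
    risk_levels with the matching weight. Given r, the two twins are
    treated as independent (a simplifying assumption of this sketch).
    """
    both = sum(w * r * r for r, w in zip(risk_levels, weights))
    discordant = sum(w * 2 * r * (1 - r) for r, w in zip(risk_levels, weights))
    neither = sum(w * (1 - r) ** 2 for r, w in zip(risk_levels, weights))
    return both, discordant, neither

def log_likelihood(counts, risk_levels, weights):
    """Multinomial log-likelihood of the pair counts, up to a constant term."""
    probs = pair_probabilities(risk_levels, weights)
    return sum(n * math.log(p) for n, p in zip(counts, probs))

# Pair counts quoted in the text: both with AD, mismatched, both healthy.
ad_counts = (2, 8, 388)

# Scenario A (hypothetical): every genome carries the same risk,
# set to the observed share of affected twins (12 of 796).
uniform = ([12 / 796], [1.0])

# Scenario B (hypothetical numbers): 3 percent of genomes carry a 45 percent
# risk, and the remaining 97 percent carry a 0.2 percent risk.
two_level = ([0.45, 0.002], [0.03, 0.97])

ll_uniform = log_likelihood(ad_counts, *uniform)
ll_two_level = log_likelihood(ad_counts, *two_level)
print(f"uniform risk:   {ll_uniform:.2f}")
print(f"two risk tiers: {ll_two_level:.2f}")
```

A higher (less negative) log-likelihood means the scenario explains the observed pairs better. In this toy comparison the two-tier scenario scores higher, because a single shared risk level cannot easily produce two pairs in which both twins are affected.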
The study rests on two major assumptions, Roberts said, making it a “best-case scenario” rather than an accurate representation of genetic sequencing as it stands today. For one, the model presupposes that every genome sequence is 100 percent accurate; in reality, mistakes are currently inevitable. Second, the model assumes that researchers know every variant linked to a given disease, and how all the alleles work together to determine risk. While this information is not necessary for the modeling that Roberts and colleagues performed, it would be required for sequencing to actually predict disease as they envision. This is an ideal which science is likely to someday approach, but never reach, said Nathan Pearson, director of research for Knome, Inc., of Cambridge, Massachusetts. Knome helps researchers interpret genome data to further understanding of disease.
Peter Visscher of the University of Queensland in Brisbane, Australia, said the model is “unusual and unconventional” for studies trying to answer questions about genetic predictions. For one thing, the model assumes there are up to 20 discrete risk levels for each disease. Many diseases incorporate risk from hundreds or thousands of loci, which means the number of possible genotypes and risk levels is closer to infinity, Visscher said. Other researchers have modeled sequencing and risk with unlimited possible genomes (Janssens et al., 2006), noted Visscher. He prefers his own methods, which also do not restrict genotypes (Wray et al., 2010). In addition, Visscher said, in studies like this it is difficult to untangle the genes shared by twins from all the other things they share, such as upbringing and diet, meaning that the model is likely to overestimate genetic contributions to disease.
For most of the 24 diseases considered, sequencing would miss the majority of cases, Roberts said. And a negative result would not be a “free pass,” he added; it would only mean no genes point to a higher-than-average risk. “[It] is sobering,” wrote Svante Pääbo, of the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, in an e-mail to ARF. “I would have naïvely thought that monozygotic twins would be more similar in the diseases that afflict them.”
John Hardy of University College London, U.K., was not so surprised. He has identified a pair of twins who share the late-onset autosomal dominant Park8 mutation, G2019S, in leucine-rich repeat kinase 2 (LRRK2), but only one of them has Parkinson’s disease. The finding will be published in an upcoming issue of Movement Disorders. The phenotypic differences between monozygotic twins might be due to epigenetics, somatic mutations, or random chance, suggest Susanne Schneider of the University of Lübeck, Germany, and Michael Johnson of Imperial College London, U.K., in a related editorial that will be published with Hardy’s paper in the same journal.
In contrast to its predictions for the majority of diseases, the model predicted that genetic screening for AD would be highly sensitive: Sequencing would identify 80-90 percent of the people destined to develop Alzheimer’s. This might be because AD has a particularly strong genetic basis; alas, it is impossible to tell at this point how much of this prediction is due to specific genes such as ApoE. The team also analyzed Parkinson’s disease, for which genes could predict 20-30 percent of cases, and dementia in general (including both AD and vascular dementia), for which sequencing would identify 50-60 percent of people who would eventually have it. No other neurodegenerative conditions were examined.
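Sensitivity here means the share of eventual cases that a test flags in advance. The following Python sketch shows how that share could be computed from a distribution of genomic risk levels. The risk levels, their frequencies, and the 10 percent cutoff for calling a result "positive" are all hypothetical values chosen for illustration; none of them is taken from the study, and the function name is an assumption as well.

```python
def sensitivity(risk_levels, weights, threshold):
    """Share of eventual cases whose genomic risk is at or above threshold.

    risk_levels: disease risk attached to each hypothetical genome type.
    weights: fraction of the population carrying each genome type.
    threshold: risk at or above which a sequencing result counts as positive
               (an assumed cutoff, not one stated in the article).
    """
    all_cases = sum(w * r for r, w in zip(risk_levels, weights))
    flagged_cases = sum(
        w * r for r, w in zip(risk_levels, weights) if r >= threshold
    )
    return flagged_cases / all_cases

# Hypothetical population: 3 percent of genomes carry a 45 percent risk,
# and the remaining 97 percent carry a 0.2 percent risk.
risk_levels = [0.45, 0.002]
weights = [0.03, 0.97]

sens = sensitivity(risk_levels, weights, threshold=0.10)
print(f"sensitivity: {sens:.1%}")
```

In this made-up population most cases arise in the small high-risk group, so a positive result catches most of them. When risk is spread more evenly across genomes, the same calculation yields a low sensitivity, and that is the sense in which sequencing would miss the majority of cases.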
“This paper is, in some sense, a grain of salt to accompany the great expectations for whole-genome analysis,” Pearson said. “For many people, [sequencing] is going to provide limited insight into the risk for many common diseases.” On the positive side, he added, the study found that more than 90 percent of people would receive a useful prediction of above-average risk for at least one condition. The currently popular genomewide association studies to identify common but weak risk variants, Roberts said, remain valuable for the clues they provide to biological pathways involved in disease.
One positive interpretation of the study, Pääbo noted, is that “our destiny is not in our genes. Rather, many other things that we can influence, such as our lifestyle and doing medical checkups, may be much more important for reducing our risk of prematurely being affected by diseases.”—Amber Dance.
Roberts NJ, Vogelstein JT, Parmigiani G, Kinzler KW, Vogelstein B, Velculescu VE. The predictive capacity of personal genome sequencing. Sci Transl Med. 2012 Apr 2.