Previous genome studies suggest that all of us possess a few hundred non-functional genes. Is this number accurate, and if so, what does it mean for human health? In the February 17 Science, a large international group led by Daniel MacArthur at Massachusetts General Hospital, Boston, describes the first large-scale attempt to verify these predicted loss-of-function (LoF) gene variants. The group reports that each person more likely carries only about 100 non-functional genes, with around 20 of those being homozygous. This finding suggests that humans tolerate some gene loss surprisingly well, but more research is needed to determine if these non-functional proteins have subtle effects on health, the authors write. In total, the researchers identified a catalog of almost 1,300 “high-confidence” human LoF mutations, which they note includes most variants that occur at a population frequency of 1 percent or higher. About 70 percent of these are novel. This catalog could provide a starting point for geneticists hunting for disease-causing mutations, the authors suggest.
In 2008, an international consortium launched the 1000 Genomes Project, which sought to develop a comprehensive catalog of human genetic variation by sequencing the genomes of about 2,500 people from around the world. In the pilot phase, the project analyzed the genomes of 185 healthy people from four populations: Chinese, Japanese, people of European background living in Utah, and a West African ethnic group, the Yoruba. Initial data from this project, as well as from other genetic studies, suggested that each person carries around 200 to 300 LoF gene copies (see ARF related news story on 1000 Genomes Project Consortium, 2010; Pelak et al., 2010). However, this number was expected to contain a high percentage of false positives due to sequencing and annotation errors.
To clean up the data, MacArthur and colleagues took the nearly 3,000 putative LoF variants found by the pilot 1000 Genomes Project and reanalyzed the DNA to check for sequencing errors. They also identified mutations that would be unlikely to inactivate the gene product, for example, a stop codon that occurs near the end of an open reading frame. This validation process removed more than half the candidates, leaving a catalog of 1,285 high-confidence LoF variants. The authors note, however, that the true number is probably larger, as their methods could not detect some LoF mutations, such as variants that affect gene regulation or single-nucleotide changes that disrupt protein function. Notably, about one-third of these high-confidence LoF alleles are predicted to inactivate only some of the splice forms made from that gene, though these may still cause disease if the splice form is essential, the authors note (see, e.g., Uzumcu et al., 2006).
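The two-step cleanup described above (confirm the variant call, then discard changes unlikely to inactivate the gene product) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Candidate` fields, the `revalidated` flag, and the 95 percent cutoff for a stop codon "near the end" of the coding sequence are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A putative LoF variant (all field names are hypothetical)."""
    variant_id: str
    kind: str           # e.g. "stop_gained", "frameshift", "splice"
    cds_position: int   # 1-based position within the coding sequence
    cds_length: int     # total length of the coding sequence
    revalidated: bool   # True if re-analysis of the DNA confirmed the call


def is_high_confidence(c: Candidate, max_fraction: float = 0.95) -> bool:
    """Keep a candidate only if it survives both filters.

    max_fraction is an assumed cutoff: a premature stop codon lying
    beyond this fraction of the coding sequence is treated as unlikely
    to inactivate the gene product.
    """
    # Filter 1: drop calls that did not survive re-analysis.
    if not c.revalidated:
        return False
    # Filter 2: drop stop codons near the end of the open reading frame.
    if c.kind == "stop_gained" and c.cds_position / c.cds_length > max_fraction:
        return False
    return True


def filter_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Return the subset of candidates that pass both filters."""
    return [c for c in candidates if is_high_confidence(c)]


if __name__ == "__main__":
    demo = [
        Candidate("v1", "stop_gained", 100, 1000, True),   # early stop, confirmed
        Candidate("v2", "stop_gained", 990, 1000, True),   # stop near the end
        Candidate("v3", "frameshift", 500, 1000, False),   # failed re-analysis
    ]
    for c in filter_candidates(demo):
        print(c.variant_id)
```

In this toy input only the first candidate survives, mirroring how a validation pass can remove a large share of the initial list.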
How do these ineffective gene products influence human health? At least some of the LoF alleles carry serious consequences when homozygous. The authors identified 26 known disease-causing mutations in the catalog, as well as 21 novel variants that they predict will be harmful, given that other mutations in the same genes are known to cause disease. In addition, most of the LoF variants occur at less than 5 percent frequency in the population, suggesting that they are selected against, as would be expected of harmful alleles.
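To get a feel for how seldom a variant below 5 percent frequency would turn up in two copies, one can do a back-of-the-envelope calculation. The sketch below assumes Hardy-Weinberg proportions (random mating, no selection), which is an assumption of this example rather than something the study states, and it uses the 185-person pilot sample size only as an illustrative input.

```python
def expected_homozygotes(allele_freq: float, sample_size: int) -> float:
    """Expected number of homozygous carriers in a sample.

    Assumes Hardy-Weinberg proportions, under which the homozygote
    frequency is the square of the allele frequency.
    """
    return allele_freq ** 2 * sample_size


if __name__ == "__main__":
    # Allele frequencies chosen for illustration only.
    for q in (0.05, 0.01):
        n = expected_homozygotes(q, 185)
        print(f"allele frequency {q:.2f}: about {n:.2f} homozygotes expected in 185 people")
```

Under these assumptions, an allele at 5 percent frequency would be homozygous in roughly one person in 400, so fewer than one homozygote is expected in a sample of 185, and rarer alleles fall well below that.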
Nonetheless, for many of the genes in the LoF catalog, humans seem to cope with loss of their protein products. Previous work estimated that a typical person carries fewer than two recessive lethal mutations (see Bittles and Neel, 1994), suggesting that most LoF mutations occur in non-essential genes. In fact, around 20 percent of the LoF variants were homozygous in at least one person in the study without causing serious health consequences. The authors found that these “LoF-tolerant” genes differ in several ways from typical genes. They tend to be less conserved among species, which implies that they are less essential to survival. They often have closely related gene family members (paralogs), which may have redundant functions and help compensate for the loss of the gene. LoF-tolerant genes also have few interactions in gene networks, suggesting their functions are not part of key cellular pathways. And they tend to occur in non-essential systems, such as olfaction and taste, or in benign human differences such as blood groups. Based on these findings, the authors developed an algorithm to predict which genes will tolerate functional loss. This, they suggest, could be used to prioritize LoF candidates for follow-up in disease studies.
To see if the LoF variants might play a role in common, complex disorders, the authors analyzed genetic data from around 13,000 people with diseases such as Crohn’s and rheumatoid arthritis, as well as from about 3,000 healthy controls. People with diseases did not show any enrichment for LoF alleles, with the sole exception of a single mutation known to be associated with Crohn’s disease. This may be because most LoF variants occur at very low population frequencies, giving them a limited role in common diseases, the authors suggest. More research will be needed to see if rare, inactive genes have effects on human health, they add. Finally, some LoF mutations may be beneficial. For example, inactivation of a cholesterol-regulating protein protects against high cholesterol and heart disease (see Mayne et al., 2011), and loss of a specific cell-surface receptor in an African population may decrease the risk of getting malaria (see Fry et al., 2009). Although the authors did not find clear evidence for favorable mutations in their catalog, they identified 20 candidates that might be undergoing positive selection for further follow-up.—Madolyn Bowman Rogers
- 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010 Oct 28;467(7319):1061-73. PubMed.
- Pelak K, Shianna KV, Ge D, Maia JM, Zhu M, Smith JP, Cirulli ET, Fellay J, Dickson SP, Gumbs CE, Heinzen EL, Need AC, Ruzzo EK, Singh A, Campbell CR, Hong LK, Lornsen KA, McKenzie AM, Sobreira NL, Hoover-Fong JE, Milner JD, Ottman R, Haynes BF, Goedert JJ, Goldstein DB. The characterization of twenty sequenced human genomes. PLoS Genet. 2010 Sep;6(9) PubMed.
- Uzumcu A, Norgett EE, Dindar A, Uyguner O, Nisli K, Kayserili H, Sahin SE, Dupont E, Severs NJ, Leigh IM, Yuksel-Apak M, Kelsell DP, Wollnik B. Loss of desmoplakin isoform I causes early onset cardiomyopathy and heart failure in a Naxos-like syndrome. J Med Genet. 2006 Feb;43(2):e5. PubMed.
- Bittles AH, Neel JV. The costs of human inbreeding and their implications for variations at the DNA level. Nat Genet. 1994 Oct;8(2):117-21. PubMed.
- Mayne J, Dewpura T, Raymond A, Bernier L, Cousins M, Ooi TC, Davignon J, Seidah NG, Mbikay M, Chrétien M. Novel loss-of-function PCSK9 variant is associated with low plasma LDL cholesterol in a French-Canadian family and with impaired processing and secretion in cell culture. Clin Chem. 2011 Oct;57(10):1415-23. PubMed.
- Fry AE, Ghansa A, Small KS, Palma A, Auburn S, Diakite M, Green A, Campino S, Teo YY, Clark TG, Jeffreys AE, Wilson J, Jallow M, Sisay-Joof F, Pinder M, Griffiths MJ, Peshu N, Williams TN, Newton CR, Marsh K, Molyneux ME, Taylor TE, Koram KA, Oduro AR, Rogers WO, Rockett KA, Sabeti PC, Kwiatkowski DP. Positive selection of a CD36 nonsense variant in sub-Saharan Africa, but no association with severe malaria phenotypes. Hum Mol Genet. 2009 Jul 15;18(14):2683-92. PubMed.
- MacArthur DG, Balasubramanian S, Frankish A, Huang N, Morris J, Walter K, Jostins L, Habegger L, Pickrell JK, Montgomery SB, Albers CA, Zhang ZD, Conrad DF, Lunter G, Zheng H, Ayub Q, DePristo MA, Banks E, Hu M, Handsaker RE, Rosenfeld JA, Fromer M, Jin M, Mu XJ, Khurana E, Ye K, Kay M, Saunders GI, Suner MM, Hunt T, Barnes IH, Amid C, Carvalho-Silva DR, Bignell AH, Snow C, Yngvadottir B, Bumpstead S, Cooper DN, Xue Y, Romero IG, Wang J, Li Y, Gibbs RA, McCarroll SA, Dermitzakis ET, Pritchard JK, Barrett JC, Harrow J, Hurles ME, Gerstein MB, Tyler-Smith C. A systematic survey of loss-of-function variants in human protein-coding genes. Science. 2012 Feb 17;335(6070):823-8. PubMed.