Like baseball coaches selecting top prospects for their team, geneticists studying Alzheimer disease face a formidable challenge: how to identify the true stars on a playing field now sprawling with hundreds of genes that might contribute to AD risk. At present, ApoE tops the roster, with early onset AD genes APP, PSEN1, and PSEN2 rounding out the short list of established AD genetic risk factors. However, since only 1 to 5 percent of Alzheimer cases are early onset, ApoE is the field’s only “major league” gene. At least that’s how baseball fan Rudy Tanzi, Massachusetts General Hospital, Boston, sees it. “If ApoE is a major leaguer, the other genes on the list would be Little Leaguers,” wrote Tanzi, who spoke at the Alzheimer disease conference held 24-29 March in Keystone, Colorado, and corresponded with this reporter via e-mail after the meeting.

So what can be said about these so-called Little League genes? At this point, not much. According to Alzgene, a publicly available database that keeps a running scorecard of more than 500 genes implicated thus far as contributing to AD risk, all but ApoE increase disease risk just 1.1- to 1.4-fold. By comparison, the ApoE4 allele boosts AD risk about threefold when one copy is inherited, more than 10-fold if two copies are inherited (Corder et al., 1993).

It’s unlikely that any Little League genes will even come close to ApoE in their impact on AD risk. Most have relative risk ratios so small that researchers cannot reliably replicate them. Scaling up sample sizes from several hundred to tens of thousands of cases should help overcome this challenge, as it has in recent studies in other fields such as diabetes and cancer, and in an AD study of more than 17,000 gene variants in 4,000 volunteers conducted by researchers at Cardiff University, United Kingdom (Grupe et al., 2007).
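To see why sample size matters so much for genes of small effect, consider a rough power calculation. The sketch below is illustrative only: it assumes a simple two-sided z-test on allele frequencies, equal numbers of cases and controls, and a hypothetical risk-allele frequency of 0.3 in controls. None of these settings comes from the studies cited above.

```python
from math import sqrt
from statistics import NormalDist

def allelic_test_power(n_cases, n_controls, odds_ratio,
                       control_freq=0.3, alpha=0.05):
    """Approximate power of a two-sided z-test comparing allele frequencies.

    control_freq (risk-allele frequency in controls) is a hypothetical value.
    """
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    odds_cases = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds_cases / (1 + odds_cases)
    # Each person contributes two alleles to the comparison.
    se = sqrt(case_freq * (1 - case_freq) / (2 * n_cases)
              + control_freq * (1 - control_freq) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Normal approximation; the negligible opposite tail is ignored.
    return NormalDist().cdf(abs(case_freq - control_freq) / se - z_crit)

# Compare a study of a few hundred cases with one of tens of thousands.
for n in (300, 3000, 30000):
    print(n, round(allelic_test_power(n, n, odds_ratio=1.2), 3))
```

Under these assumptions, an odds ratio of 1.2 is detected less than half the time with 300 cases and 300 controls, while the same effect is found almost every time with 30,000 of each.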

In the meantime, a growing number of scientists in the AD field are looking toward integrative approaches that don’t consider genes individually but rather as groups of factors that together shape disease risk. “If you combine information on many gene variants, you get a better grasp on risk,” said Elizabeth Corder, Duke University, Durham, North Carolina, lead author on the 1993 Science paper reporting the discovery of ApoE4 as an AD risk factor (Corder et al., 1993). In a phone conversation with this reporter, Corder likened the identification of AD genes to the process by which physicians diagnose disease. Fever is a symptom common to many disorders and, though important, is not that informative in and of itself, Corder explained. “But when it is used in combination with other symptoms, doctors can make a more accurate diagnosis.”

An ongoing challenge in the study of complex, polygenic disorders such as AD concerns the development of informative case definitions. “What we're trying to do is look at alternate ways of clustering genes and phenotypes,” said Deborah Blacker, a geriatric psychiatrist and epidemiologist working on Alzheimer’s genetics at Harvard Medical School, Boston. “If you consider 20,000 ways to define things, you have to account for the fact you could find something by chance by massaging the data. People are slogging along to find a better way to do this. The field is somewhat stuck now. What [Corder] is doing is one thing that could be brought to bear.”
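Blacker's caution about chance findings can be made concrete with a little arithmetic. The sketch below assumes the 20,000 definitions are tested independently at a conventional 0.05 significance level, and it uses the Bonferroni correction as one standard adjustment. The article does not say which correction, if any, the researchers would apply.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Number of tests expected to pass by chance alone if no effect is real."""
    return n_tests * alpha

def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the familywise error rate near alpha."""
    return alpha / n_tests

n = 20_000  # the figure Blacker mentions
print(expected_false_positives(n))
print(familywise_error_rate(n))
print(bonferroni_threshold(n))
```

With 20,000 looks at the data, roughly 1,000 would clear the 0.05 bar by chance, which is why a far stricter per-test threshold is needed before any single result is believed.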

At the Alzheimer disease Keystone meeting in March, Corder presented data from a paper about an alternative approach to analyzing AD genetic data (Licastro et al., 2007). With colleagues at the University of Bologna and the University of Palermo in Italy, she applied a statistical method called Grade-of-membership (GoM) to analyze 260 AD patients and 190 controls. Combining a slew of information—including genetic profiles for ApoE and a handful of genes involved in inflammation (IL-6, IL-10, IL-1α, IL-1β, TNFα), along with other factors such as gender and age of disease onset—first author Federico Licastro and colleagues grouped the subjects into four risk sets that differ according to their likelihood of developing AD. For each risk set, each person received a membership score (i.e., a value between 0 and 1) reflecting his or her probability of belonging to that set. According to this analysis, variations among the genes related to inflammation—especially IL-10 and IL-1β—turned out to be more informative in identifying the risk sets than was ApoE.
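For readers unfamiliar with GoM, the core idea is that each person is described as a weighted blend of a few "pure type" profiles, with weights (the membership scores) that are non-negative and sum to 1. The toy sketch below shows only that blending step. The two pure types, the variable names, and all the probabilities are hypothetical, and the fitting procedure that estimates scores from real data is not shown.

```python
# Hypothetical pure-type profiles: probability of each level of each variable.
# The published analysis used four risk sets and more variables.
PURE_TYPES = [
    {"APOE4_carrier": {"yes": 0.7, "no": 0.3},   # hypothetical high-risk type
     "IL10_variant": {"yes": 0.8, "no": 0.2}},
    {"APOE4_carrier": {"yes": 0.2, "no": 0.8},   # hypothetical low-risk type
     "IL10_variant": {"yes": 0.1, "no": 0.9}},
]

def response_prob(scores, variable, level):
    """GoM response probability: membership-weighted mix of pure-type values."""
    if any(s < 0 for s in scores) or abs(sum(scores) - 1) > 1e-9:
        raise ValueError("membership scores must be non-negative and sum to 1")
    return sum(s * t[variable][level] for s, t in zip(scores, PURE_TYPES))

def profile_likelihood(scores, profile):
    """Likelihood of an observed profile, treating variables as independent
    given the membership scores."""
    likelihood = 1.0
    for variable, level in profile.items():
        likelihood *= response_prob(scores, variable, level)
    return likelihood

person = {"APOE4_carrier": "yes", "IL10_variant": "yes"}
for scores in ([1.0, 0.0], [0.5, 0.5], [0.0, 1.0]):
    print(scores, profile_likelihood(scores, person))
```

In a real analysis, the membership scores and the pure-type profiles are both estimated from the data, typically by maximizing this kind of likelihood across all subjects.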

Corder and colleagues (Licastro et al., 2007) have used the same GoM method to identify genetic risk sets for heart attack. As it turns out, the risk factors for AD and heart attack are quite similar. A manuscript in preparation will describe these findings, Corder said. “While these models are not perfect or complete, I think they represent a major improvement by being inclusive,” she told Alzforum, noting constraints of single-gene studies. “They're more biologically realistic.”

As it has been used to identify overlapping risk profiles for AD and heart attack, Corder said, GoM could be similarly applied in future studies involving comparisons of risk sets—for example, those that would distinguish AD from dementia with Lewy bodies (DLB).

Developed in the early 1990s by Duke mathematician Max Woodbury for social science applications, GoM has rarely been applied in disease research, let alone AD genetics. Many AD geneticists who corresponded with this reporter for the story were familiar with neither the statistical approach nor the findings Corder presented at Keystone earlier this year. Chris Carter, a UK-based systems biologist who formerly led the neuroscience genomics group at a European pharmaceutical company, was among the few who knew of Corder’s work and was impressed by it. It “injects some hard statistics into the complex reality of multi-factorial polygenic diseases and shows that risk is better matched to sets of relevant gene variants than to any particular gene polymorphism,” he wrote via e-mail (see also Carter comment below).

Hilkka Soininen and colleagues at the University of Kuopio, Finland, have used the GoM approach to analyze single nucleotide polymorphisms of the apolipoprotein D gene in AD patients. In this study of 394 Finnish AD patients and 470 controls (Helisalmi et al., 2004), traditional analyses of SNP data and GoM analysis “provided comparable results, suggesting that GoM might be useful in profiling AD subpopulations,” Soininen wrote in an e-mail to ARF.

Rosalind Neuman, a mathematician at Washington University, St. Louis, has compared GoM with a related statistical approach called the latent class model in analyses of other conditions such as attention deficit hyperactivity disorder and alcoholism. “These methods may be very useful for trying to refine phenotypes in biological systems that are complicated,” Neuman told Alzforum.

Separate work published last month in the journal BMC Medical Genetics strengthens the push for more AD studies that look at gene-gene interactions rather than single-locus effects. In this study of 200 late-onset AD patients and controls, researchers led by Craig Atwood, University of Wisconsin, Madison, uncovered a surprising reversal of risk: males with an ApoE4 allele who also carried a luteinizing hormone receptor intronic variant had almost no risk for AD (Haasl et al., 2008). “While the study is small,” wrote Atwood via e-mail to ARF, “it may help explain why some studies see an interaction with one gene, but then another study does not (i.e., all the hundreds of studies in AlzGene).” Despite ApoE4’s strong overall effect on AD risk, far from every person who inherits even two copies of this allele develops AD.

Amid growing recognition that new statistical approaches could more effectively cull top prospects from the swarms of minor genes, other researchers instead place their bets on genome-wide association studies. The typical AD human genetics study involves only hundreds of cases, too few to provide adequate statistical power, said Lars Bertram of Massachusetts General Hospital, who also is the lead investigator and scientific coordinator for Alzgene and related databases. But with concerted efforts to pool samples for systematic analyses of specific gene interactions, he said, even genes with very small effects on AD risk will come to the forefront. (For more about how data sharing can increase the power of human genetics studies, see ARF Live Discussion.)
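A simple way to picture the gain from pooling is a fixed-effect, inverse-variance meta-analysis, in which each study's log odds ratio is weighted by its precision. The sketch below is a minimal illustration under that assumption. The three input studies are invented for the example, and it is not a description of how Alzgene computes its summary estimates.

```python
from math import exp, log, sqrt

def pool_fixed_effect(odds_ratios, standard_errors):
    """Inverse-variance pooling of log odds ratios.

    standard_errors are the standard errors of the log odds ratios.
    Returns (pooled odds ratio, pooled standard error, z-score).
    """
    weights = [1 / se ** 2 for se in standard_errors]
    pooled_log_or = (sum(w * log(o) for w, o in zip(weights, odds_ratios))
                     / sum(weights))
    pooled_se = sqrt(1 / sum(weights))
    return exp(pooled_log_or), pooled_se, pooled_log_or / pooled_se

# Three hypothetical small studies, each reporting OR = 1.2 with SE = 0.15.
pooled_or, pooled_se, z = pool_fixed_effect([1.2, 1.2, 1.2], [0.15, 0.15, 0.15])
single_z = log(1.2) / 0.15
print(pooled_or, pooled_se, z, single_z)
```

In this made-up case, no single study reaches the conventional z = 1.96 cutoff on its own, but the pooled estimate does, because combining the studies shrinks the standard error.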

Citing genome-wide association studies in cancer and diabetes involving tens of thousands of samples (Sladek et al., 2007; Hung et al., 2008), Alison Goate, Washington University, St. Louis, Missouri, agreed, “At that point, you really can determine whether something was a real effect. That is where AD needs to go.”—Esther Landhuis


  1. This is exactly what needs to be done in AD genetics and in other complex disorders where multiple, physiologically plausible, gene candidates have been identified that are in many cases relevant to the complex multifactorial pathologies of these diseases (see Alzgene, SchizophreniaGene and Polygenic Signaling Pathways). Combarros et al. have reviewed the evidence for statistical epistasis between candidate genes in AD and illustrate that dozens of genes can influence the risk-promoting effects of several others. Such studies usually report on gene pair interactions (for example ApoE4 + one other), but when so many genes are implicated in these polygenic disorders (over 200 in late-onset AD), it is clear that a more complex permutational approach is needed. Single candidate gene association studies are notoriously inconsistent in these types of polygenic disorders, but given the number of genes and pathological processes involved, and the multitude of potential interactions involved (epistatic and others), this is perhaps not too surprising.

    Elizabeth Corder's pioneering work injects some hard statistics into the complex reality of multifactorial polygenic diseases and shows that risk is better matched to sets of relevant gene variants than to any particular gene polymorphism. A similar argument has been proposed for bladder cancer, another polygenic disorder plagued by inconsistency in single gene association studies (Wu et al., 2006). These observations surely simply reflect the complex multifactorial nature of such diseases, in which genetic polymorphisms are but one way to control the malfunction of several different pathological processes, and the efficiency of their counteracting networks.

    This type of systems biology approach should be extremely useful in the analysis of the wave of whole-genome association studies currently underway.


    Combarros et al. Epistasis in sporadic Alzheimer's disease. Neurobiol Aging. 2009 Sep;30(9):1333-49. PubMed.

    Wu et al. Bladder cancer predisposition: a multigenic approach to DNA-repair and cell-cycle-control genes. Am J Hum Genet. 2006 Mar;78(3):464-79. PubMed.

  2. Regarding the value of identifying all of the minor genetic risk factors for AD and the role of systems biology approaches in AD research, both are important endeavors and essential to elucidating the biological pathways involved in the etiology and pathogenesis of AD. Using a baseball analogy, if ApoE is our only "major league" gene, the next best AD genes would have to be considered “Little Leaguers,” i.e., those that have only minor effects on risk, but yield statistically significant p-values by meta-analyses on Alzgene (see "top Alzgene results"). The “top Alzgene results” (total of 29) are routinely updated and ranked by Lars Bertram and colleagues according to greatest effects on AD risk. ApoE is, predictably, number one. One copy of ApoE4 increases risk for AD by ~threefold, and two copies, by more than 10-fold. In contrast, the other 28 genes listed in the “top Alzgene results” increase risk by [...]. Some have questioned the value of determining the identity of these “Little Leaguers.” I would argue that although the “Little League” genes exert only minor effects on AD risk, if we can statistically confirm enough of them, e.g., by testing in multiple independent samples and employing meta-analyses, we will eventually be able to take this growing list and begin searching for biological pathways in which these genes functionally intersect. Bioinformatic and systems biology approaches on a large set of genes with minor, but statistically significant effects on AD risk should provide valuable clues to the etiology and pathogenesis of AD.

    So, I believe that we need to establish the full list of “Little League” genes, including the vast majority with minor effects, by meta-analyses (as determined on Alzgene), and then determine the biological pathways in which they may functionally intersect. In this way, we should be able to discover novel biological pathways involved in the disease. The key is to perform the genetic studies first, so as to establish the full team of relevant players, even if they are mainly “Little Leaguers.” If we assemble a big enough team, we can then search for the biological pathways in which they functionally intersect.

    I would also argue that systems biology approaches best follow genetic studies, not drive them. The full set of genetic risk factors for AD can be established by the “biased” approach of testing biologically plausible candidate genes, or by “unbiased” genetic analyses, e.g., genomewide association screens. The strongest hits emerging from these studies must be subjected to replication testing in multiple independent samples and meta-analyses, e.g., on Alzgene, to determine which ones carry the highest probability of being real risk factors. The statistically confirmed AD genes, the vast majority of which will exert only minor effects on risk, can then be compared to the "x, y, and z" variables that must be integrated into a large and complex algebraic equation. You cannot solve (or in this case, even formulate) the equation until you start to replace the many "variables" with "givens.” Testing multiple genes in multiple independent samples and then conducting meta-analyses across all samples to see which ones statistically pass muster can reveal these "givens.” Subsequently, systems biology can be employed to better formulate and ultimately solve the equation.

    The history of the AD field has shown that genetics has best informed us as to the identity of the relevant biological factors involved in the etiology and pathogenesis of this disease. Today, it is difficult to carry out a relevant experiment in AD without working on one of the four established AD genes (APP, PSEN1, PSEN2, ApoE) revealed by genetics studies. These four genes have been estimated to account for only 30 percent of the genetic variance of AD. “Sporadic” AD is heavily influenced by genetic factors; twin studies have revealed that up to 80 percent of AD is caused by inherited factors. Dozens of labs around the world are attempting to identify novel AD genes. Most use the biased approach of testing biological candidate genes. A growing number are using the unbiased approach of genomewide association screens. For the past two years, our lab has been conducting the "Alzheimer's Genome Project" (supported by the Cure Alzheimer’s Fund), in which we are employing genomewide association screening of more than 400 late-onset AD families (NIMH sample) with replication testing in more than 900 additional late-onset AD families (NIA, NCRAD, and CAG samples) to search for population-based AD genetic risk factors for late-onset AD of all effect sizes, beyond ApoE.

    We now realize that there are no more “major league” AD genes like ApoE. However, the field continues to obtain and publish novel genetic hits every week. Our manuscript presenting some of the strongest hits from our family-based genomewide association screen is currently under review. Ultimately, only meta-analyses of these hits in multiple independent samples will reveal whether they will join the growing list of “top Alzgene results” and be worthy of inclusion in bioinformatic and systems biology analyses. Every newly confirmed gene, even those with minor effects, will contribute to the elucidation of novel biological pathways, providing new clues for effectively treating and preventing AD.

  3. What about the concept that AD, instead of being a common disease with a number of "risk factors,” might be a set of numerous rare diseases with a more or less common phenotype, but each with a different cause? That way, there would be no Little Leaguers within, say, the U.S., but only Major Leaguers, each one within its own small country. The problem is partly semantic: "risk factor" in a statistical, non-Bayesian context means a greater-than-chance association with, say, a disease, which does not entail a causal relationship, whereas "factor,” meaning etymologically "agent,” connotes a cause.

    The Tanzi "Little League,” the early onset AD (EOAD) genes APP, PSEN1, and PSEN2, provide a ready-made model (Bruni et al., 1992) for the proposed concept of AD as a multigenic, as distinct from polygenic, entity. The only needed postulate is that late-onset AD (LOAD) genes/mutations are expressed stochastically like the EOAD ones, but much later. Death from other causes before expression of the mutation masks the familial transmission, giving the appearance of "sporadic" disease to LOAD.


    Bruni et al. Alzheimer's disease: a model from the quantitative study of a large kindred. J Geriatr Psychiatry Neurol. 1992 Jul-Sep;5(3):126-31. PubMed.


Webinar Citations

  1. Whole Genome Study for Parkinson Disease

Paper Citations

  1. Corder et al. Gene dose of apolipoprotein E type 4 allele and the risk of Alzheimer's disease in late onset families. Science. 1993 Aug 13;261(5123):921-3. PubMed.
  2. Grupe et al. Evidence for novel susceptibility genes for late-onset Alzheimer's disease from a genome-wide association study of putative functional variants. Hum Mol Genet. 2007 Apr 15;16(8):865-73. PubMed.
  3. Licastro et al. Genetic risk profiles for Alzheimer's disease: integration of APOE genotype and variants that up-regulate inflammation. Neurobiol Aging. 2007 Nov;28(11):1637-43. PubMed.
  4. Licastro et al. Acute myocardial infarction and proinflammatory gene variants. Ann N Y Acad Sci. 2007 Nov;1119:227-42. PubMed.
  5. Helisalmi et al. Genetic variation in apolipoprotein D and Alzheimer's disease. J Neurol. 2004 Aug;251(8):951-7. PubMed.
  6. Haasl et al. A luteinizing hormone receptor intronic variant is significantly associated with decreased risk of Alzheimer's disease in males carrying an apolipoprotein E epsilon4 allele. BMC Med Genet. 2008;9:37. PubMed.
  7. Sladek et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature. 2007 Feb 22;445(7130):881-5. PubMed.
  8. Hung et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008 Apr 3;452(7187):633-7. PubMed.

External Citations

  1. Alzgene
