Looking outside the Brain for Early Signs of AD
The use of gene expression profiling to characterize Alzheimer disease pathology has, unsurprisingly, focused on changes in brain tissue (see ARF related news story), but the search for biomarkers that might allow early diagnosis is moving outside of that anatomical box. Reaching for a more easily attainable tissue, researchers from Sumitomo Pharmaceuticals in Osaka, Japan, and the Karolinska Institute in Sweden chose fibroblasts as the starting point for comparing global gene expression patterns between people with FAD mutations and their wild-type siblings. Their results show a clear gene expression signature that can distinguish FAD gene carriers from non-carriers long before signs of dementia appear. Interestingly, the changes in gene expression caused by three different FAD genes (the Swedish and Arctic APP mutations and PSEN1 H163Y) were all similar, suggesting that mutations in either APP or PS1 can cause common changes in the physiology of cells outside the brain long before clinical disease sets in. The research appears online in the September 29 PNAS Early Edition.
For the study, first author Yosuke Nagasaka and colleagues probed the genomewide expression of cultured fibroblasts from skin biopsies. The investigators found 56 genes that were most highly differentially expressed and 200 that showed smaller but significant differences. The gene expression profile using 200 genes predicted FAD status with 97 percent accuracy, regardless of whether the subjects displayed signs of dementia or not. Other factors, such as age, ApoE status, or gender did not seem to contribute to the difference in gene expression observed.
The authors did not list the identities of the differentially expressed genes in the paper but note they will make them available on request. The Alzheimer Research Forum has made such a request. The authors characterize the significance of their finding as “a unique gene expression signature for FAD caused by three different mutations in two different genes…that…can be detected in fibroblasts, which may seem to be an organ completely unrelated to the tissue affected by the disease.” Within the 56 or 200 genes may lie smaller sets or even individual genes that represent potential biomarkers for early AD.
The elephant on the chip is the open question of whether the gene signature exposed in FAD fibroblasts will translate in any way to sporadic AD. The similarities of the profiles with three mutations suggest that underlying changes in gene expression might be central to all types of AD. But fibroblasts express APP and PS1 proteins, putting them directly in line for FAD-related effects. The question of whether similar alterations, or any at all, occur in fibroblasts in sporadic AD will be of considerable interest for the development of surrogate markers for brain pathology.—Pat McCaffrey
Updated 5 October 2005:
See Supplemental Article with gene list (.pdf)
Q&A with Toru Kimura and Caroline Graff. Questions by Pat McCaffrey.
Q: Of possible tissues, why did you choose fibroblasts for your analysis?
A: We agree that the much more common and easy procedure of sampling peripheral blood would make lymphocyte or lymphoblast studies preferable. In fact, in parallel with the fibroblast biopsies, peripheral lymphocytes (blood) were sampled from the same family members, followed by microarray hybridizations. Unfortunately, the inter-individual variation was very high and in some cases (more often than in the fibroblasts), the RNA quality was too poor for us to interpret the hybridization signals. Our experience from this study is that skin biopsies appear to be less sensitive than lymphocytes to differences in external conditions directly after sampling; lymphocytes require handling directly after sampling.
Besides the poor RNA quality of our lymphocyte samples, there are several other reasons why fibroblasts are more attractive than lymphoblasts. First, we would like to challenge the appropriateness of using lymphoblasts since these cell lines are established after immortalization, typically with Epstein-Barr virus transformation. This in itself changes the genetic make-up of the cells, leading to uncontrolled changes in the genome which theoretically may lead to spurious changes in gene expression. Second, the use of RNA from peripheral blood lymphocytes may be an attractive alternative, but these cells are very reactive to acute stimuli such as nutritional status, fever, infections, and drug treatment; this makes them less attractive. Moreover, if it were possible to make a presymptomatic diagnosis using fibroblasts and there were drugs which could delay the onset of symptoms, we believe that a skin biopsy would be tolerated and requested by most patients. For example, many muscle dystrophies and myopathies can only be diagnosed based on results from muscle biopsies, which are routinely taken from patients for this purpose.
Q: Your gene expression profiles clearly distinguish FAD mutation carriers from wild-type siblings, but what about sporadic AD? Do you expect to see the same differences in the absence of a causative mutation?
A: Naturally we are interested in investigating the gene expression profile in patients with the common, sporadic forms of AD. It can be anticipated that there will be shared changes on the gene expression level between sporadic and familial AD since the diseases are clinically indistinguishable except for the age at onset and the family history. However, it is also plausible that the gene signature we have identified is related to the biochemical pathways perturbed by the specific FAD mutations included in this study. If this is true, we may find a similar profile in other AD-causing APP, PSEN1, and PSEN2 mutations. Therefore our next step will be to characterize family members with other FAD-causing mutations. If we can validate the gene signature in additional FAD mutation carriers, it may be possible to use the signature in order to identify sporadic AD patients who share the same gene expression profile. That is, the heterogeneous nature of sporadic AD suggests that the etiology is also heterogeneous. This makes it unlikely that all sporadic AD patients will share the same gene expression profile. However, such gene expression classification may serve as a tool to subcategorize the disease etiologies in the common forms of the disease.
Q: Your paper doesn't talk about which genes were affected by the mutations. Did these fall into any interesting classes, for example, cell cycle genes, or other groups?
A: Actually, the 200 differentially expressed genes were subjected to functional classification based on their known functions using the FatiGO program. FatiGO is a Web interface which carries out simple data mining using Gene Ontology for DNA microarray data. The FatiGO results showed that effects were seen in virtually all biological processes. The three largest functionally categorized groups are those of metabolism, cellular growth, and/or maintenance, as well as cell communication.
Q: Do your results suggest any good candidate genes for stand-alone analysis as biomarkers?
A: We intend to follow up the study in order to validate our findings and, if possible, reduce the number of informative genes. At this point it is unlikely to expect a single gene or a handful of genes to be sufficient. However, it may be true that some of the gene products could be used as biomarkers to categorize the heterogeneous sporadic AD patients into more homogeneous subgroups.
Q: What happens next to validate or further develop the gene expression profiles as a diagnostic tool? Will you be pursuing that, and if so, how?
A: Yes, we intend to pursue this approach of gene expression analysis of RNA from fibroblasts. First, we are planning to analyze more FAD samples, independent of the 30 samples we analyzed in this study, in order to see if the same expression differences can be observed. Then we will assess if the expression differences can be observed in sporadic AD. We consider this study exploratory, and our ambition is to carry out a large and carefully performed validation study with additional samples. We are continuously collecting samples from these rare FAD mutation families; however, it is a very slow and time-consuming process, and we hope that the data from this paper will encourage further collaboration and perhaps thereby also speed up the follow-up validation. Besides collecting samples from FAD mutation families, we are planning to validate the gene signature in sporadic AD patients and against other neuropathological conditions.
References
Further Reading
Papers
- de Leon MJ, Desanti S, Zinkowski R, Mehta PD, Pratico D, Segal S, Rusinek H, Li J, Tsui W, Saint Louis LA, Clark CM, Tarshish C, Li Y, Lair L, Javier E, Rich K, Lesbre P, Mosconi L, Reisberg B, Sadowski M, DeBernadis JF, Kerkman DJ, Hampel H, Wahlund LO, Davies P. Longitudinal CSF and MRI biomarkers improve the diagnosis of mild cognitive impairment. Neurobiol Aging. 2006 Mar;27(3):394-401. PubMed.
- Blalock EM, Geddes JW, Chen KC, Porter NM, Markesbery WR, Landfield PW. Incipient Alzheimer's disease: microarray correlation analyses reveal major transcriptional and tumor suppressor responses. Proc Natl Acad Sci U S A. 2004 Feb 17;101(7):2173-8. PubMed.
Primary Papers
- Nagasaka Y, Dillner K, Ebise H, Teramoto R, Nakagawa H, Lilius L, Axelman K, Forsell C, Ito A, Winblad B, Kimura T, Graff C. A unique gene expression signature discriminates familial Alzheimer's disease mutation carriers from their wild-type siblings. Proc Natl Acad Sci U S A. 2005 Oct 11;102(41):14854-9. PubMed.
Comments
University of Kentucky
This paper describes an innovative and interesting use of gene microarrays for Alzheimer disease (AD) research. Prior microarray studies of AD have focused on identifying genes that are expressed differentially in the postmortem brains of idiopathic AD and control subjects, in attempts to elucidate the pathobiology of the disease. In contrast, the authors here use fibroblasts from living familial AD mutation bearers (most of whom are presymptomatic) to identify differentially expressed genes. In addition, they turn the identification process around and show that these genes also can discriminate subjects bearing three known familial AD (FAD) mutations from their wild-type siblings. To do this, the authors first employ Allen’s cross-validation test (CV) to identify 200 genes expressed differentially in fibroblasts from FAD and wild-type subjects. They then apply two discriminant methods, hierarchical clustering and principal components analysis, using these 200 genes, to accurately classify all of the same subjects.
The novel features of this work include the use of peripheral tissues, the study of expression profiles linked to FAD, and the study of FAD subjects who were presymptomatic. Another strength is that the group sample sizes (n) were sufficiently large (30 total samples) to provide good statistical power. However, given these sample sizes, it is a little surprising that the authors relied on the somewhat arcane CV test rather than on more familiar and robust statistical tests for differential gene identification. With the information presented, it’s difficult to estimate the expected false discovery rate. Nonetheless, the authors used confirmatory statistical approaches (e.g., leave one out validation) and the 200 genes identified, though likely containing some false positives, appear to reliably discriminate the classes.
Because the study focuses on diagnostic potential, its full implications may not become apparent until additional key studies are performed. That is, FAD carriers can also be discriminated at present by a genotyping test. Thus, as stated in the paper, it will be important to determine whether the identified signature can be extended to presymptomatic diagnosis of the common forms of idiopathic AD (which would suggest that similar pathways and programs become activated as AD develops, regardless of whether it arises from mutations or idiopathic factors).
Another question of particular interest, for several theoretical and experimental reasons, is whether there are similarities in the profiles expressed in FAD fibroblasts and those expressed in early-stage idiopathic AD brains. Consequently, a minor disappointing aspect is that the authors chose not to present their gene list or discuss its biochemical significance in this paper. At this point, therefore, we do not know whether there is greater overlap than expected by chance between their fibroblast list and lists identified in AD brains, for example, in our study that found over 600 hippocampally expressed genes correlated with incipient AD (Blalock et al., 2004) or in studies of other brain regions. However, the authors offer to make their gene list available upon request, so this question will presumably be resolved fairly soon.
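Whether two gene lists overlap more than chance would predict is commonly framed as a hypergeometric tail probability. The sketch below is only an illustration of that idea: the universe of about 11,000 evaluable genes, the assumption that both lists are drawn from that same universe, and the observed overlap of 20 genes are all hypothetical placeholders, not values reported for these studies.

```python
from math import comb

def overlap_p_value(universe, list_a, list_b, overlap):
    """P(X >= overlap) when list_b genes are drawn at random from the universe
    and X counts how many of them fall in list_a (hypergeometric upper tail)."""
    total = comb(universe, list_b)
    upper = min(list_a, list_b)
    tail = sum(comb(list_a, k) * comb(universe - list_a, list_b - k)
               for k in range(overlap, upper + 1))
    return tail / total

# Hypothetical inputs: ~11,000 evaluable genes, a 200-gene fibroblast list,
# a 600-gene brain list, and an invented observed overlap of 20 genes.
universe, fibro, brain, observed = 11000, 200, 600, 20
expected = fibro * brain / universe  # overlap expected by chance alone
print(f"expected by chance: {expected:.1f}")
print(f"P(overlap >= {observed}) = {overlap_p_value(universe, brain, fibro, observed):.4f}")
```

In practice the universe should be restricted to genes that could have appeared on both lists, since an inflated universe makes any overlap look more surprising than it is.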
Thus, in summary, this study clearly provides an innovative and important proof of principle. However, determining its full ramifications will likely require further assessment of the relevance of FAD fibroblast expression patterns to the development of incipient idiopathic AD.
References:
Blalock EM, Geddes JW, Chen KC, Porter NM, Markesbery WR, Landfield PW. Incipient Alzheimer's disease: microarray correlation analyses reveal major transcriptional and tumor suppressor responses. Proc Natl Acad Sci U S A. 2004 Feb 17;101(7):2173-8. PubMed.
Barrow Neurological Institute
The search for a biomarker that distinguishes AD from other neurological dementias is a fertile research area for clinical, basic, and biotech investigators. In this article, findings are presented demonstrating the ability of gene array technology to identify differences in the genetic signature between those carrying one of three FAD gene mutations (Swedish and Arctic APP mutations and PSEN1 H163Y) and wild-type siblings lacking these mutations. Unlike many other studies, which have used brain tissue, these experiments were performed on cultured skin fibroblasts. The choice of fibroblasts is interesting, as they are an easily accessible source of cells to investigate gene differences in familial AD. These investigators demonstrated that fibroblast genetic signatures could distinguish FAD gene carriers from non-carriers prior to the onset of dementia. The observation that alterations in gene expression induced by the three different FAD genes overlapped suggests that mutations in either APP or PS1 cause a common physiologic cellular response, which can be detected in non-brain tissue prior to the clinical expression of the disease. The researchers report that 56 genes were most highly differentially expressed and 200 displayed smaller but significant differences. The gene expression profile using 200 genes predicted FAD status with 97 percent accuracy, independent of signs of dementia. Other factors, such as age, ApoE status, or gender did not seem to contribute to the difference in gene expression observed.
A major question arising from this research is what are the genes, and are the same genes related to onset of sporadic AD? This is an important issue since only 2-5 percent of AD cases are familial. Thus, the genetic signature reported may not be predictive of sporadic AD but unique to the FAD form of the disease. Since the precise genes were not listed in the article, it is difficult to know whether they are directly related to the neuropathologies of AD or are random sequences with unclear relation to the disease state. This will be determined once the genes are made available to the public. In this regard, the authors mention that they will make the gene list available upon request and that they are interested in investigating whether the gene patterns occur in sporadic AD. Even if they do, it will be important to determine how they affect the structural and functional impairment seen during the early stages of AD, including in people with mild cognitive impairment. Moreover, it would be interesting to determine whether the differentiating genetic signatures are a characteristic of single neurons in areas of the brain selectively vulnerable to neuronal dysfunction, or are a general genetic pattern found in all human cells. The study by Nagasaka and colleagues is an intriguing jumping-off point, but it is still too early to slow the quest for the development of a biomarker for the determination of sporadic AD and, more importantly, a marker for the prodromal stages of AD.
Uppsala University
Expression arrays in Alzheimer disease mutation-carriers: a common biochemical pathway?
This paper, “A unique gene expression signature discriminates familial Alzheimer’s disease mutation carriers from their wild-type siblings,” published in the October issue of Proc Natl Acad Sci USA by Nagasaka et al., is an interesting example of how gene expression array techniques can be applied in Alzheimer research. The use of this technology has been hampered by some fundamental problems. Most importantly, array experiments have mainly been performed on brain autopsy tissue, comparing samples from cases affected by dementia with those from individuals without any brain disorder. This design is problematic, as the results necessarily reflect the end stage of a disease process that typically has been ongoing for several decades.
The present study represents an attempt to circumvent this problem. By analyzing lymphocytes and fibroblasts from a few rare families with dominant mutations in the APP and presenilin genes, the investigators asked whether there are characteristic signatures in the transcriptome already at a presymptomatic stage of the disease. Subjects with and without mutations, representing APPSwe, APPArc, and presenilin H163Y mutations from three Swedish families were investigated in the study.
The lymphocyte analyses were unsuccessful, but an interesting expression profile emerged for the mRNA derived from cultured fibroblasts. After reverse transcription, the resulting cDNA was subjected to the Affymetrix Human Genome U133A GeneChip array, containing more than 22,000 gene probes. After excluding samples and genes with weak signal intensities, the expression of approximately 11,000 genes could be evaluated. When performing SVM and PCA, two different statistical analyses to discriminate between sample classes, the authors could conclude that non-mutation carriers formed one cluster, whereas mutation carriers, regardless of mutation type and regardless of whether they were at a presymptomatic or symptomatic disease stage, clustered together. Importantly, all three mutations seem to give rise to the same pattern of expression changes, indicating a common biochemical pathway. This pathway is most likely downstream of Aβ-induced changes.
A few minor mistakes can be found in the paper:
First, the authors have misunderstood our investigations concerning Aβ metabolism in the case of the Arctic mutation. In the original publication (1), we described low Aβ levels both in plasma from carriers of the Arctic mutation and in media from transfected cells, as measured by ELISA. In a recent paper (2), we have further investigated this question and found that ELISA is not well-suited for the measurement of Aβ, especially for aggregated peptides. Recent data from a mouse model with the Arctic mutation further indicate that Aβ accumulates inside the cell before resulting in extracellular deposits (3).
Second, the authors discuss the clinical and neuropathological features of the Arctic mutation family and point out that one case that came to autopsy at Huddinge Hospital had a peculiar neuropathology with ringlike plaques. However, in another case with the same mutation, a more “traditional” neuropathology was reported with cored plaques. We probably have to realize that within the AD spectrum a wide variety of neuropathological and clinical features may occur. In this context, it is important to remember the presenilin-1 mutations leading to cotton-wool plaques (4), which also do not fulfil traditional criteria of AD.
Third, the procedures for taking the biopsies have not been correctly described in the paper. For example, the tissues were not always processed the same day as the biopsy was performed, the reason being that the patients often had to be investigated far away from the laboratory. We know this because, in fact, the actual study was initiated by us some years ago, with the idea to investigate both primary fibroblast cell lines and lymphocytes from peripheral blood in order to identify transcriptional differences between non-affected family members and mutation carriers at presymptomatic and symptomatic stages. Over several years, we investigated and sampled these families and conducted a large number of arduous tours around Sweden to collect this unique material. The findings are an important contribution to the understanding of Alzheimer disease pathogenesis, interestingly pointing towards common downstream disease mechanisms. However, studies such as this depend upon good ideas for clinical research, identifying the mutations, contacts with patients and relatives, and then the collection of patient material. This is a painstaking effort that often requires years of devoted work. Such initiatives will be less likely to happen in the future if those behind the accomplishments are not acknowledged properly.
References:
1. Nilsberth C, Westlind-Danielsson A, Eckman CB, Condron MM, Axelman K, Forsell C, Stenh C, Luthman J, Teplow DB, Younkin SG, Näslund J, Lannfelt L. The 'Arctic' APP mutation (E693G) causes Alzheimer's disease by enhanced Abeta protofibril formation. Nat Neurosci. 2001 Sep;4(9):887-93. PubMed.
2. Stenh C, Englund H, Lord A, Johansson AS, Almeida CG, Gellerfors P, Greengard P, Gouras GK, Lannfelt L, Nilsson LN. Amyloid-beta oligomers are inefficiently measured by enzyme-linked immunosorbent assay. Ann Neurol. 2005 Jul;58(1):147-50. PubMed.
3. Lord A, Kalimo H, Eckman C, Zhang XQ, Lannfelt L, Nilsson LN. The Arctic Alzheimer mutation facilitates early intraneuronal Abeta aggregation and senile plaque formation in transgenic mice. Neurobiol Aging. 2006 Jan;27(1):67-77. PubMed.
4. Steiner H, Revesz T, Neumann M, Romig H, Grim MG, Pesold B, Kretzschmar HA, Hardy J, Holton JL, Baumeister R, Houlden H, Haass C. A pathogenic presenilin-1 deletion causes abberrant Abeta 42 production in the absence of congophilic amyloid plaques. J Biol Chem. 2001 Mar 9;276(10):7233-9. Epub 2000 Nov 17. PubMed.
Banner Research Institute
In this paper, Nagasaka et al. extracted total RNA from cultured, frozen, thawed, and recultured fibroblasts from 33 individuals from two families with mutations in APP (Swe or Arc) and one family with PSEN1H163Y. Wild-type siblings (N = 11) formed a comparison group to the 19 mutation carriers. (Samples from three individuals were discarded due to data criterion issues.) Affymetrix U133A chips were used to obtain array data. Allen’s cross validation (CV) criterion identified 200 individual genes [sic] whose intensities were different between mutation carriers and wild-type siblings. Further data analysis was by clustering and by multivariate Principal Components Analysis. These 200 transcripts were also used as input to a “powerful supervised machine learning method” which was able to “perfectly separate the samples into two classes: one with 19 mutation carriers and the other with the remaining 11 wild-type controls.” With the same probe sets they were unable to distinguish the carriers of the three different mutations from each other; neither were they able to distinguish demented from nondemented mutation carriers.
Although neither the genes of interest nor the classes of functions they represent are identified in the paper, the authors state the list of 200 genes is available from the authors upon request. The PNAS paper does not contain any link to Supplementary Data presenting the gene list or an explanation of Allen’s CV criterion, as one would normally expect. This information has, however, been made available through the Alzheimer Research Forum website. It is surprising that PNAS and the responsible member did not ask that this information be posted as Supplementary Material. In the list provided by the authors to the Alzforum website, direction of change is not given, i.e., whether expression of each of the 200 genes was decreased or increased in mutation carriers. In the paper itself there is no synthesis of the possible meanings of the gene classes found to change or any comparison with any of the other human array data available from a variety of sources.
The comparison of gene expression profiles of APP and presenilin mutation-bearing individuals to wild-type siblings presents an outstanding research opportunity. The collection of families for such studies is a daunting task that almost necessarily results in studies with small N. The use of small N, as is the case in this paper, imposes certain limitations on analysis and interpretation, especially when the research is not directed by any pre-specified, explicit hypotheses.
Almost 50 percent of the 22,238 probe sets on the chips used were called absent in all 30 samples used (an unusually large percentage). Of the remaining 11,138 probe sets, the reader must ask how many probe sets would be found to be significantly different between the two samples by chance. Two hundred probe sets were determined to be differentially expressed between mutation-carrying and wild-type individuals. These 200 probe sets emphasized in this paper were determined by an obscure unreferenced method (Allen’s cross validation criterion) whose “details are available from the authors upon request” (or at the Alzforum website). It is not possible to assess whether 200 is more, less than, or equal to a chance finding on the basis of the information provided. No external validation data are provided. Nor is there any attempt to deal with potential selection bias (see Ambroise and McLachlan, 2002). Unfortunately, this leaves the casual reader in the difficult position of trying to understand the meaning of 200 genes that may represent a chance finding obtained by an obscure method.
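The selection bias described by Ambroise and McLachlan can be illustrated on pure noise. The sketch below is not the authors' pipeline: the nearest-centroid classifier, the mean-difference gene score, the 1,000 simulated genes, and the 20 selected genes are all assumptions chosen for brevity. Only the group sizes (19 and 11) follow the study. It compares leave-one-out accuracy when genes are selected once on all samples against accuracy when selection is repeated inside every fold.

```python
import random

def top_genes(data, labels, k):
    """Indices of the k genes with the largest absolute difference in class means."""
    def score(g):
        a = [row[g] for row, y in zip(data, labels) if y == 1]
        b = [row[g] for row, y in zip(data, labels) if y == 0]
        return abs(sum(a) / len(a) - sum(b) / len(b))
    return sorted(range(len(data[0])), key=score, reverse=True)[:k]

def predict(train, train_labels, sample, genes):
    """Nearest-centroid prediction restricted to the chosen genes."""
    best, best_dist = None, float("inf")
    for cls in (0, 1):
        rows = [r for r, y in zip(train, train_labels) if y == cls]
        dist = sum((sample[g] - sum(r[g] for r in rows) / len(rows)) ** 2
                   for g in genes)
        if dist < best_dist:
            best, best_dist = cls, dist
    return best

def loo_accuracy(data, labels, k, select_inside):
    """Leave-one-out accuracy; gene selection is either repeated per fold
    (select_inside=True) or done once on all samples (the biased variant)."""
    fixed = None if select_inside else top_genes(data, labels, k)
    hits = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        genes = top_genes(train, train_labels, k) if select_inside else fixed
        hits += predict(train, train_labels, data[i], genes) == labels[i]
    return hits / len(data)

rng = random.Random(0)
labels = [1] * 19 + [0] * 11  # group sizes as in the study
data = [[rng.gauss(0, 1) for _ in range(1000)] for _ in labels]  # pure noise
biased = loo_accuracy(data, labels, 20, select_inside=False)
unbiased = loo_accuracy(data, labels, 20, select_inside=True)
print(f"selection outside the loop: {biased:.2f}")
print(f"selection inside the loop:  {unbiased:.2f}")
```

Because the simulated data carry no real signal, any apparent accuracy from the first variant reflects the left-out sample having already influenced which genes were chosen.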
The list of 200 genes contains a number of the usual suspects: transcripts associated with mitochondria, ubiquitin, and MAP kinases, to mention the more frequently represented classes. Many of the expected suspects are missing or only very sparsely represented. These include those related to the cell cycle/cell death and inflammatory system, all of which have been described by others as changed in AD peripheral non-neuronal cells. The possible reason(s) for the absence of these expected genes is not discussed.
What does it really tell us to know that the expression profile of 200 genes can be used to distinguish one small group from another small group? Many questions are left unanswered. Are the findings presented as merely the result of chance, obtained by analyzing a large set of 11,000-plus probe sets? Would the expression profile of these same 200 genes distinguish other mutations of other diseases or would another 200 genes be required? Why did the 200 genes used in the present study not distinguish demented from nondemented carriers? Indeed, what new information can these 200 genes provide about the cellular and molecular pathophysiology of Alzheimer disease? The authors have cast an extremely broad net whose catch they have examined with a variety of statistical methods, none of which satisfactorily convinces the reader of the generality of the results presented. The result is a strange creature that may be the only specimen of its kind or that may provide clues that merit further study. In the absence of any external validation, the present paper does not allow a decision either way. The authors could best address this by posting their original, raw data on the Web so that it may be analyzed by others using more rigorous methods.
References:
Ambroise C, McLachlan GJ. Selection bias in gene extraction on the basis of microarray gene-expression data. Proc Natl Acad Sci U S A. 2002 May 14;99(10):6562-6. PubMed.
University of Rochester Medical Center
1. In this study, it is unclear how the results of clustering were used to attain the ends of this study, namely, the construction of a diagnostic signature.
2. One can only guess what kind of likelihood was used in the authors' procedure. It is highly plausible that the authors assumed a univariate normal distribution for the independent model and a bivariate normal for the dependent model.
3. The non-parametric likelihood is clearly infeasible with such small samples.
4. No rationale for parametric assumptions has been given. Furthermore, it is a well-known fact that normality of gene expression cannot be adopted as a general assumption for all genes. This is even more so in the bivariate case.
5. The multiple testing aspect of the preliminary selection of feature variables is completely ignored.
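To make the multiple testing concern concrete, a minimal sketch follows. It assumes each of the 11,138 evaluable probe sets is tested once at a nominal per-test threshold (the 0.01 used here is an arbitrary illustration) and shows the Benjamini-Hochberg step-up procedure as one standard way to control the false discovery rate. The p-values in the usage lines are invented, and nothing here reproduces the authors' own selection procedure.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing by chance if no gene truly differs."""
    return n_tests * alpha

def benjamini_hochberg(p_values, q=0.05):
    """Indices of hypotheses rejected at false discovery rate q (step-up rule)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value sits under its threshold.
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    return sorted(order[:cutoff])

print(expected_false_positives(11138, 0.01))  # about 111 probe sets by chance
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]  # invented
print(benjamini_hochberg(toy_p, q=0.05))
```

With roughly 111 chance hits expected at that threshold, a list of 200 selected probe sets cannot be interpreted without knowing the selection criterion.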
Karolinska University Hospital
Reply to comment by Eric Blalock and Philip Landfield
Regarding our choice of statistics to select differentially expressed genes, we have shown the formula used to calculate the CV values, and explained the reasoning behind the CV criterion, in the supplemental data posted on the ARF website, linked below the news summary. A more qualitative explanation for our method is as follows. Each statistic has its own characteristics, and some kinds of distributions are distinguished more easily with one statistic than with another. In our study, 200 genes were chosen based on their CV value; i.e., the 200 genes are the ones with the largest CV values. When we compared our 200 genes with the 200 genes selected by Welch’s t-test (a commonly used parametric statistic) or by Mann and Whitney’s U test (a widely used non-parametric statistic), about half of ours were included in the 200 genes selected with either one of these commonly used methods. However, we chose to continue our calculations based on the CV values, and as shown in the paper, this generated a robust predictive tool to distinguish the mutation carriers from the wild-type siblings.
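The comparison described here, ranking genes by one statistic and counting how many are shared with a list ranked by another, can be sketched as follows. This uses Welch's t statistic only; Allen's CV criterion is not reproduced because its formula is not given in this text. The gene names, sample names, and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def top_k_by_welch(expr, carriers, controls, k):
    """expr maps gene -> {sample: value}; returns the k genes with largest |t|."""
    def score(gene):
        a = [expr[gene][s] for s in carriers]
        b = [expr[gene][s] for s in controls]
        return abs(welch_t(a, b))
    return sorted(expr, key=score, reverse=True)[:k]

def shared(list_a, list_b):
    """Number of genes present in both lists."""
    return len(set(list_a) & set(list_b))

# Hypothetical toy data: three carriers (c1-c3) and three wild-type siblings (w1-w3).
expr = {
    "geneA": {"c1": 5.0, "c2": 6.0, "c3": 7.0, "w1": 1.0, "w2": 2.0, "w3": 3.0},
    "geneB": {"c1": 2.0, "c2": 3.0, "c3": 4.0, "w1": 2.5, "w2": 3.5, "w3": 2.0},
}
by_t = top_k_by_welch(expr, ["c1", "c2", "c3"], ["w1", "w2", "w3"], k=1)
other_list = ["geneA"]  # stand-in for a list chosen by a different statistic
print(by_t, shared(by_t, other_list))
```

A gene with zero variance in both groups would make the denominator zero, so real data would need a guard for that case.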
One commentary by Eric Blalock and Philip Landfield was that we chose not to present the list of genes or discuss the biochemical significance in the paper. The list of genes was not published as supplementary information because the editor of the journal asked us to remove it. To remedy this lack of information we have submitted the list of genes to the ARF website and thus anyone can use the information for comparative analyses of their own gene expression data, for example, gene expression data on RNA from AD brains, to see if there is a greater-than-expected overlap. Regarding the biochemical implications of the list of genes, we refer to the answers posted above; i.e., we performed a FatiGO gene ontology data mining and found that the three largest functionally categorized groups are those of metabolism, cellular growth and/or maintenance, as well as cell communication.
We have also investigated potential associations between the identified genes and functional pathways by an automated literature-to-gene search using PubGene analysis (1). However, we believe one should be cautious in the interpretation of such an analysis since the algorithms we used to identify differential gene expression are effective in identifying classifiers, but the gene lists generated may have little or no biological coherence. Further, before any biological meaning can be established, the gene products should be followed up with experimental cell culture studies. Therefore, we choose at this point not to go into such details about the functional aspects of the gene products. However, for those interested, the PubGene procedure generated a literature network consisting of 32 of the 200 genes, where the signal transduction molecule MAPK1 was positioned in the center of the network. In addition, the indirect upstream activator of MAPK1, RAF1, was included in the 32-gene network.
References:
1. Jenssen TK, Laegreid A, Komorowski J, Hovig E. A literature network of human genes for high-throughput analysis of gene expression. Nat Genet. 2001 May;28(1):21-8. PubMed.
Karolinska University Hospital
Reply to comment by Paul Coleman
We expect that most of the comments would be resolved by a careful reading of our paper and the supplemental data posted on the ARF website. The supplemental data were submitted to PNAS together with our manuscript, but unfortunately, the PNAS editor decided not to post them on the PNAS website and instead asked us to provide them on request. The criticism is based on Coleman’s opinion that the method we used is an obscure, unreferenced method. However, we have already provided the reference and formula.
We understand that the number of samples we analyzed in this study is not large, but we believe it is enough to show the potential of the approach. We are currently planning to perform a validation study with a larger number of additional samples.
Dr. Coleman suggests that the findings are merely the result of chance. As we described in the paper, we performed bootstrap analysis to assess the probability of observing this kind of difference in expression by chance. Only 1 percent of the 10,000 replicates generated a greater expression difference between two groups of samples with randomized genotypes. Naturally, we cannot completely exclude the possibility that the findings we observed are merely the result of chance. However, we think that a 1 percent error rate is sufficiently low for this kind of innovative study. As we mentioned above, in order to provide more reliability to our results, we are performing another study with a larger number of additional samples.
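The resampling idea described above, randomizing genotype labels and counting how often the randomized groups differ at least as much as the real ones, can be sketched as below. Shuffling labels in this way is usually called a permutation test. This is a simplified, single-measurement illustration with made-up numbers; the paper's analysis used an expression difference across the selected genes, and the function names and data here are hypothetical.

```python
import random
from statistics import mean


def group_difference(values, labels):
    """Absolute difference in mean between carriers (1) and wild-type (0)."""
    carriers = [v for v, g in zip(values, labels) if g == 1]
    wild_type = [v for v, g in zip(values, labels) if g == 0]
    return abs(mean(carriers) - mean(wild_type))


def permutation_p_value(values, labels, n_replicates=10000, seed=0):
    """Fraction of label shuffles whose difference is >= the observed one."""
    rng = random.Random(seed)
    observed = group_difference(values, labels)
    shuffled = list(labels)
    at_least_as_large = 0
    for _ in range(n_replicates):
        rng.shuffle(shuffled)  # randomize genotype assignment
        if group_difference(values, shuffled) >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_replicates


# Made-up expression values for 5 carriers followed by 5 wild-type siblings.
values = [5.1, 5.3, 4.9, 5.2, 5.0, 3.0, 3.2, 2.9, 3.1, 3.3]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p = permutation_p_value(values, labels, n_replicates=2000)
print(f"estimated permutation p-value: {p:.4f}")
```

A small fraction means that randomized genotype assignments rarely reproduce a difference as large as the observed one. It does not by itself replace validation in an independent sample.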
With regard to biological implications for the 200 genes, see our reply to Blalock and Landfield’s comment.
The following comment by Dr. Coleman may result from a misunderstanding: “Almost 50 percent of the 22,238 probe sets on the chips used were called absent in all 30 samples used (an unusually large percentage).” The present-call ratio we observed in this study is quite normal. We would ask the commentator to give an example of what he considers to be a standard present-call ratio, if it is much different.
Karolinska University Hospital
Reply to comment by Martin Ingelsson and Lars Lannfelt
The first comment suggests that we have misunderstood the investigations made by Dr. Lannfelt on Aβ metabolism of the APParc mutation. “In the original publication (1) we describe low Aβ levels in media and transfected cells as measured by ELISA,” Ingelsson and Lannfelt write. In a recent paper (2), Stenh et al. find that “ELISA is not well suited for the measurement of Aβ, especially for aggregated peptides.” Still, paper (2) describes a reduction of Aβ by 30-70 percent in cells transfected with APPswearc compared with cells transfected with APPswe alone as measured by ELISA, and a 40 percent increase of Aβ levels in APPswearc compared with APPswe when measured by Western blot. We interpret this as an overall relative increase in Aβ level in APPswearc according to the method recommended by the authors, i.e., denaturing Western blotting. Furthermore, Stenh et al. did the same measurements on in vivo tissue, i.e., brain homogenates from 2- to 3-month-old transgenic (Tg) mice, and reported that ELISA detects a 50 percent decrease of Aβ in APPswearc Tg mice compared with APPswe Tg mice. Again, the denaturing Western blot technique detects a 40 percent increase of Aβ in the APPswearc Tg mice compared with APPswe Tg mice. It is surprising to us that this group, having made such a strong case for the need to use denaturing Western blotting, presented in a more recent paper (3) only Aβ levels as detected by ELISA on the same Tg mice as above. We were even more surprised when these authors reported that there is no difference in Aβ levels between APPswearc Tg and APPswe Tg mice. We would welcome clarification by Dr. Lannfelt and co-workers as to what they think the APParc mutation does to APP metabolism and Aβ levels.
The second point refers to the neuropathology of the APParc mutation carriers. Nenad Bogdanovic, who has performed the neuropathological examination on the Swedish patient with the APParc mutation, will reply separately to the commentary made on the neuropathology.
The third comment refers to how biopsies are handled. In studies on disease-associated gene expression, perhaps even more so in surrogate tissues, it is of utmost importance to exclude confounding caused by factors not associated with the disease. Age, gender, culture conditions, and number of passages are examples of such potential confounders. As pointed out in the paper, these factors were not significantly different between the mutation carrier and wild-type sibling groups, or did not result in significantly different gene expression levels. We regret the way we described the handling of the fibroblast samples in the paper. That wording actually referred to the handling of the lymphocyte samples, which we do not present in the article. Instead, we should have phrased the sentence as follows: There was no significant difference in the time period between sampling and culturing of the skin biopsies in the mutation carriers compared with the wild-type siblings. A reply to the final commentary will be posted separately.
References:
1. Nilsberth C, Westlind-Danielsson A, Eckman CB, Condron MM, Axelman K, Forsell C, Stenh C, Luthman J, Teplow DB, Younkin SG, Näslund J, Lannfelt L. The 'Arctic' APP mutation (E693G) causes Alzheimer's disease by enhanced Abeta protofibril formation. Nat Neurosci. 2001 Sep;4(9):887-93. PubMed.
2. Stenh C, Englund H, Lord A, Johansson AS, Almeida CG, Gellerfors P, Greengard P, Gouras GK, Lannfelt L, Nilsson LN. Amyloid-beta oligomers are inefficiently measured by enzyme-linked immunosorbent assay. Ann Neurol. 2005 Jul;58(1):147-50. PubMed.
3. Lord A, Kalimo H, Eckman C, Zhang XQ, Lannfelt L, Nilsson LN. The Arctic Alzheimer mutation facilitates early intraneuronal Abeta aggregation and senile plaque formation in transgenic mice. Neurobiol Aging. 2006 Jan;27(1):67-77. PubMed.
Karolinska University Hospital
Reply to comment by Andrei Yakovlev
Although most of the comments are answered in our supplemental data, we provide brief answers to each comment below.
1. “In this study, it is unclear how the results of clustering were used to attain the ends of this study, namely, the construction of a diagnostic signature.” As Dr. Yakovlev guesses, we did not intend to use the results of clustering for construction of a diagnostic signature, but we used the data just to demonstrate an overall expression difference of the selected 200 or 56 genes between FAD carriers and wild-type siblings.
2. “One can only guess what kind of likelihood was used in the authors' procedure. It is highly plausible that the authors assumed a univariate normal distribution for the independent model and a bivariate normal for the dependent model.” The commentator’s guess is partly correct: We assumed univariate and bivariate distributions, but we did not assume normality for either of them.
3. “The non-parametric likelihood is clearly infeasible with such small samples.”
In general, non-parametric likelihood may be infeasible with small samples, but we think that there is no rationale to conclude that the non-parametric likelihood estimation as used by us would not be applicable to the case described in this study.
4. “No rationale for parametric assumptions has been given. Furthermore, it is a well-known fact that normality of gene expression cannot be adopted as a general assumption for all genes. This is even more so in the bivariate case.”
We agree that normality of gene expression cannot be adopted as a general assumption. That is why we employed a kind of non-parametric likelihood to select 200 genes.
5. “The multiple testing aspect of the preliminary selection of feature variables is completely ignored.”
We do not think multiple testing was involved in feature selection in our study. Before selection of probe sets, we only removed three outlier samples regardless of genotype, and discarded about half of the original probe sets because their level of expression was at the level of noise.
Banner Research Institute
Rather than focus on details, I would like to emphasize the main point that the authors have analyzed the relative expression levels of a large number of genes using a (necessarily) small number of cases. The authors correctly saw the need for validation, but the method they used was based on the same data from the same subjects. There was no independent or external validation.
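One way to see why validation on the same data can be optimistic when genes greatly outnumber cases is the sketch below. It runs leave-one-out validation on pure noise in two ways: selecting genes once from all samples, and re-selecting genes inside each fold without the held-out sample. This is an illustration only and makes no claim about how the study's validation was actually performed; the nearest-centroid classifier, the selection score, and all sizes and names are hypothetical.

```python
import random


def average(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


def select_genes(data, labels, k):
    """Top-k gene indices ranked by absolute difference of group means."""
    def score(g):
        a = [row[g] for row, y in zip(data, labels) if y == 1]
        b = [row[g] for row, y in zip(data, labels) if y == 0]
        return abs(average(a) - average(b))
    return sorted(range(len(data[0])), key=score, reverse=True)[:k]


def predict(train, train_labels, sample, genes):
    """Nearest-centroid prediction using only the selected genes."""
    def squared_distance(label):
        rows = [r for r, y in zip(train, train_labels) if y == label]
        return sum((sample[g] - average([r[g] for r in rows])) ** 2 for g in genes)
    return 1 if squared_distance(1) < squared_distance(0) else 0


def loo_accuracy(data, labels, k, select_inside):
    """Leave-one-out accuracy; genes are chosen inside each fold if requested."""
    fixed = None if select_inside else select_genes(data, labels, k)
    hits = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        genes = select_genes(train, train_labels, k) if select_inside else fixed
        hits += predict(train, train_labels, data[i], genes) == labels[i]
    return hits / len(data)


# Pure noise: 20 samples, 100 genes, labels carry no real signal.
rng = random.Random(1)
noise = [[rng.gauss(0, 1) for _ in range(100)] for _ in range(20)]
noise_labels = [1] * 10 + [0] * 10
outside = loo_accuracy(noise, noise_labels, k=10, select_inside=False)
inside = loo_accuracy(noise, noise_labels, k=10, select_inside=True)
print(f"selection on all samples: {outside:.2f}, selection inside folds: {inside:.2f}")
```

Because the labels here are unrelated to the data, any accuracy well above one half when genes are selected from all samples reflects selection on the held-out cases rather than real signal, which is the concern that an independent sample addresses.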
There may be several ways in which it would be possible to engender confidence in array data, especially data in which there is a large disparity between number of genes and number of cases.
1. One could use alternate methods applied to the same samples to validate results. Quantitative RT-PCR has been a method of choice, but other methods such as some quantification of in situ hybridization or immunohistochemistry may be informative. Since the correspondence between message expression and levels of corresponding protein is not always linear, protein-based methods may not validate transcript-based methods. On the other hand, it is generally the protein that does the work of the cell.
2. One could repeat the study using a completely different sample. In my opinion, this is the only way to satisfy the requirement of independent validation. However, that may not always be possible.
3. Confidence in the data would also be increased if, prior to data collection, there had been stated an explicit hypothesis to be tested by the data. In a study such as the one reported here, it could have been possible to make selected predictions (hypotheses) regarding subsets of transcripts on the basis of what has already been described for peripheral cells in AD. Or, absent any prior prediction or hypothesis, confidence could also be increased by an analysis of data that showed internal coherence. Thus, if expression of gene X was found to be altered, one might expect expression of gene Y to be altered in predictable ways. Such examination of the data would, hopefully, demonstrate biological (or molecular) coherence in the data. Since biological systems usually involve complex interactions among multiple pathways, the crucial word in the preceding sentence is "hopefully."
Of course, the approaches outlined above are by no means mutually exclusive. In fact, the most informative approach would be one that used all of the above, at least with regard to a subset of gene products that were pertinent to a hypothesis.