8 August 2011. In the fall of 2009, a large quality control (QC) program started comparing measurements of cerebrospinal fluid (CSF) biomarkers from labs across the world (see ARF related news story). The aim was to come up with standard procedures that would ensure consistent results. Two years later, the initiative’s main accomplishment has been to recruit more than 60 labs to participate in the project, but consistency across labs remains elusive, according to presentations at the 11th annual Alzheimer’s Association International Conference (AAIC, formerly ICAD) held 16-21 July 2011, in Paris, France.
Achieving consistency across labs has emerged as a key challenge in the field, which in the past two decades has amassed strong evidence that CSF markers can predict AD. Consistency is necessary if this evidence is ever to be applied to multicenter treatment trials, and especially in broad clinical settings beyond a few leading academic medical centers. Brain imaging faces its own standardization challenges. Compared to CSF biochemistry, this field is at an even earlier stage in that regard, as large-scale standardization initiatives are only just beginning to form.
Three times a year, the Alzheimer’s Association Cerebrospinal Fluid Quality Control Program, headquartered at the Sahlgrenska University Hospital in Mölndal, near Göteborg, Sweden, sends three CSF samples to each participating lab for testing. Two of the samples are unique to each round of testing and serve to gauge the variability in measurements among labs; the third sample is the same every time, to assess the longitudinal stability of measurements. At the same time, four reference laboratories process several additional copies of every CSF sample to determine the variability in measurements within a single lab (see ARF related news story). The samples are analyzed for the three most established biomarkers: amyloid-β42 (Aβ42), total-tau (t-tau) and phosphorylated tau (p-tau).
At AAIC, Kaj Blennow at Sahlgrenska University Hospital, who heads the QC program, presented the results of the first six rounds of testing (Mattsson et al., 2011). He showed scatter plot graphs of biomarker measurements, indicating variation among laboratories in the range of 13 to 36 percent. Variation of less than 5 to 10 percent is the goal. “You can see quite marked drift among labs,” said Blennow.
Variation was similar irrespective of the assay kit used for the test (i.e., INNOTEST ELISAs, Luminex xMAP or MSD). Furthermore, the group did not see differences in variability among rounds of testing; in other words, variability is not yet going down with repeat testing. “Variability for tau is a bit less than for Aβ42 but still higher than we need,” said Blennow’s colleague Henrik Zetterberg, also at the Sahlgrenska University Hospital.
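To make the reported percentages concrete, here is a minimal sketch of how between-lab variation for one shared sample could be summarized. It assumes the figures are coefficients of variation (standard deviation divided by mean), which the presentation as described here does not spell out. The readings and the function name are hypothetical, not data from the QC program.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample coefficient of variation as a percentage."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical Abeta42 readings (pg/mL) from five labs for the same CSF sample.
# These numbers are illustrative only and do not come from the QC program.
lab_readings = [480.0, 520.0, 610.0, 450.0, 540.0]

cv = coefficient_of_variation(lab_readings)

# The stated goal is variation below 5 to 10 percent; use the upper bound here.
meets_goal = cv <= 10.0

print(f"Between-lab CV: {cv:.1f}% (meets 10% goal: {meets_goal})")
```

The same function applied to one lab's repeated measurements of the longitudinal sample would give a within-lab figure, which is the comparison the reference laboratories are set up to make.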
The left panel shows results from all labs doing the INNOTEST amyloid-β42 (Aβ42) ELISA on one of the unique CSF samples sent to them, with one lab indicated in color. The panel on the right shows the percent deviation in measurements obtained by the same lab for the longitudinal sample over six rounds of testing (two measurements performed for each round). Both panels show marked variation. Although some of the reference labs that are part of the QC initiative are able to obtain reproducible results time after time, most participating laboratories do not yet produce consistent results. Image credit: Kaj Blennow
But despite this seeming lack of progress, Zetterberg said there is reason to be optimistic. “The reference laboratories that are part of the initiative are collecting longitudinal data, and those labs do produce stringent results that are reproducible time after time, except that we sometimes see changes due to batch variation among kit lots,” he said. “This means that standardization across labs should also be possible.”
There are many possible sources of variation in measuring CSF biomarkers, which have been previously discussed on Alzforum (see ARF related news story). To pinpoint the main ones and eliminate them from future rounds of testing, the Sahlgrenska group has started asking participating labs to fill out a checklist for each analytical technique. The checklists ask about assay reagents and instruments, details of sample handling and storage, such as which kinds of pipette tips or plates the technician used, as well as internal control samples, assay conditions, settings for data analysis, and so on. The checklists will be available on the program’s website.
The QC program has already developed standardized protocols for participating labs to follow when measuring CSF biomarkers, but they are “not detailed enough,” Zetterberg told ARF. “We are collecting more detailed information and in the long-run will be able to reduce variability.” The group will conduct hands-on practical courses to train lab personnel on how to carry out the standardized protocols (see ARF related news story).
Moreover, the program is pushing for better assay kits on the market. “Biomarkers in other fields of medicine (i.e., troponin-T and cholesterol) have gone through similar standardization procedures, and biotech companies producing this type of assay have put in the time, effort, and money it takes to make highly validated tests. We hope that the same will come true for these AD CSF biomarkers,” wrote Blennow in an e-mail to ARF.
Zetterberg stressed that the observed variability in CSF biomarker measurements among labs does not diminish the value of these markers in research. “A lab that has established rigorous methods of its own can reliably monitor relative changes in biomarkers,” he said. As an example, at AAIC, Zetterberg presented a study showing that relative changes in biomarker levels can be used to predict with high certainty which patients with mild cognitive impairment (MCI) will go on to develop AD. His group measured CSF biomarker levels in 137 people with MCI at the start of the study and then followed these people clinically for more than nine years. Previous studies have reported that CSF markers predict AD risk at the MCI stage, but not with follow-up that long. During those nine years, 54 percent developed Alzheimer’s, and those who did had lower CSF Aβ42 levels and higher t-tau and p-tau at baseline than those who did not. “Aβ is already down nine to 10 years before progression to AD,” said Zetterberg. The ratio of baseline Aβ42/p-tau predicted the development of AD within 9.2 years with a sensitivity of 88 percent and specificity of 90 percent.
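For readers less familiar with the two metrics, the sketch below shows how sensitivity and specificity are computed from a table of predictions versus outcomes. The counts are hypothetical round numbers chosen only to reproduce 88 and 90 percent; they are not the study's data, and the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of people who developed AD that the marker flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of people who stayed AD-free that the marker cleared."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (not from the study): 50 converters and 50 non-converters.
true_pos, false_neg = 44, 6   # converters with / without an abnormal ratio
true_neg, false_pos = 45, 5   # non-converters with / without a normal ratio

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)

print(f"Sensitivity: {sens:.0%}, specificity: {spec:.0%}")
```

Both values depend on the cutoff applied to the Aβ42/p-tau ratio, which is one reason cutoffs established in one lab cannot simply be transferred to another while measurements differ between labs.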
Eventually, researchers would like to have biomarkers that determine whether someone who has no clinical symptoms of MCI is at risk for developing AD—somewhat akin to taking a cholesterol measurement to ascertain one’s risk for cardiovascular disease. At AAIC, Zetterberg said his group just completed a study of 86 healthy people who were followed for 13 years. Fourteen of them developed AD during this time. Preliminary data from his lab indicate that those individuals with CSF Aβ42 levels lower than 700 pg/mL are at higher risk for developing AD. “But I do not advocate screening at this stage,” said Zetterberg. “These data are purely for research purposes.”
In the future, if researchers do find reliable CSF biomarkers for preclinical AD, consistent results across labs will be essential for instituting universal cutoff levels that can be used to determine risk or establish a diagnosis. Standardization will also be necessary for comparing results of studies from different labs.
Think Standardizing CSF Is Hard? Try Hippocampal MRI
In the wake of the CSF Quality Control Program, other initiatives are springing up to make diagnostic markers for the early detection of AD robust across centers. Giovanni Frisoni of the San Giovanni di Dio Fatebenefratelli Hospital in Brescia, Italy, and Clifford Jack of the Mayo Clinic in Rochester, Minnesota, are heading an initiative formally named A Harmonized Protocol for Hippocampal Volumetry: An EADC-ADNI Effort. (EADC stands for the European Alzheimer’s Disease Consortium, and ADNI is the U.S.-based Alzheimer’s Disease Neuroimaging Initiative).
Hippocampal volumetry has proved its value in aiding AD diagnosis and in tracking disease progression (see ARF related news story on Erickson et al., 2011). Before it can move into wider clinical use “we have to agree precisely on what to measure,” said Frisoni. “Measurements must be standardized so that they can be used in all memory clinics all over the world.”
As a first step in this process, Frisoni’s group surveyed all published protocols for assessing hippocampal volume. “If you look at the hundreds of publications on the subjects, you'll find tens of different ways of segmenting the hippocampus. Of these, 12 are the most popular among scientists,” said Frisoni. Segmentation is the way outlines of brain structures are drawn on magnetic resonance images to delineate those structures; using different segmentation techniques leads to different estimates of hippocampal volume.
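A toy calculation shows why the choice of segmentation protocol matters. It assumes a volume is estimated by counting the voxels inside the traced outline and multiplying by the size of one voxel. The two masks, the voxel dimensions, and the function name below are hypothetical and do not correspond to any of the published protocols.

```python
def mask_volume_mm3(mask, voxel_dims_mm):
    """Volume of a binary 3-D mask: number of labelled voxels times voxel size."""
    dx, dy, dz = voxel_dims_mm
    n_voxels = sum(value for mri_slice in mask for row in mri_slice for value in row)
    return n_voxels * dx * dy * dz


# Two hypothetical tracings of the same structure on the same two MRI slices.
# 1 marks a voxel inside the traced outline, 0 a voxel outside it.
protocol_a = [
    [[0, 1, 0], [1, 1, 0], [0, 0, 0]],
    [[0, 1, 0], [1, 1, 0], [0, 0, 0]],
]
protocol_b = [
    [[0, 1, 1], [1, 1, 0], [0, 0, 0]],  # includes one extra border voxel per slice
    [[0, 1, 1], [1, 1, 0], [0, 0, 0]],
]

voxel_dims_mm = (1.0, 1.0, 1.2)  # assumed voxel size, for illustration only

volume_a = mask_volume_mm3(protocol_a, voxel_dims_mm)
volume_b = mask_volume_mm3(protocol_b, voxel_dims_mm)

print(f"Protocol A: {volume_a:.1f} mm^3, protocol B: {volume_b:.1f} mm^3")
```

In this toy case a disagreement over a single border voxel per slice changes the estimate by a third. Real hippocampal tracings span far more voxels, but the same mechanism is what makes volumes from different protocols hard to compare.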
Frisoni’s group took the 12 most popular protocols and evaluated the information each one provides and its reliability in measuring AD-related atrophy. These data were then fed to an international task force of experts who were asked to harmonize currently used segmentation protocols and come up with a single standard procedure for everyone to use. The task force is made up of principal investigators at EADC and ADNI centers as well as other imaging centers across the world, more than 30 groups in all.
The task force is currently developing and testing the harmonized protocol. Once it has been defined in minute detail, researchers at different imaging labs participating in this effort will have a chance to compare it to the segmentation procedures they currently use. Following this validation step, Frisoni and Jack plan to create a Web portal for people to obtain certification as an expert Qualified Hippocampal Tracer, wrote Frisoni in an e-mail to ARF. Tracers are people in imaging labs with expertise in brain anatomy; for the certification, they will need to manually trace the boundaries of the hippocampus slice by slice on a high-resolution computer screen. In addition, the group plans to develop educational material on how to use the harmonized protocol and hippocampal probability maps that will provide a reference for the tracings.
The finalized guidelines should be made available to the research community in September 2012, said Frisoni. Once available, they will be open to discussion and validation. More information and updates on the program can be found on the EADC-ADNI Initiative website.—Laura Bonetta.