This is Part 1 of a two-part series. See also Part 2.
22 May 2009. On April 26, two days after the Human Amyloid Imaging (HAI) conference drew Alzheimer imaging aficionados to Seattle (see ARF related news story), some of them joined the 135-plus who crammed into a room a few blocks west, at the city’s luxurious Fairmont Olympic Hotel, for a peek at the latest data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). And at the 61st Annual Meeting of the American Academy of Neurology (AAN), held in Seattle that same week, a session on ADNI drew a standing-room-only crowd. Launched in fall 2004 and set to conclude next year, this $64 million tour de force is comparing imaging methods and fluid biomarkers in the same set of people to determine which measures can best predict and track clinical change over time. The project is approaching the home stretch of data collection. By fall 2010, ADNI scientists will have collected three years of longitudinal data from more than 800 participants (about 200 normal, 400 with mild cognitive impairment (MCI), and 200 with Alzheimer disease) at 59 U.S. and Canadian sites. The Seattle meetings featured preliminary analysis of the one-year data. By and large, ADNI has helped identify a number of precise, clinically meaningful biomarkers that should be able to stand in for slower-to-budge cognitive measures, slashing time and cost from AD drug trials. “I think that ADNI is not only on track for meeting the goals we set out with, but we’ve added so many additional goals,” said principal investigator Michael Weiner, University of California, San Francisco, in a post-meeting phone conversation with this reporter. “The whole project has become more ambitious and is having more impact.”
Last fall, the Alzforum ran a six-part series introducing the movers and shakers behind ADNI’s core components—MRI, FDG-PET, CSF biomarkers—and its add-on studies in amyloid imaging and genetics (see ARF related news story). Reams of data from these brain scans and spinal taps have made their way onto the ADNI website, where they are freely available to the scientific community at large. All researchers who had downloaded data from this site received an invitation to the ADNI Data Presentations meeting at the Fairmont last month. ADNI-funded investigators spoke in the morning, and the afternoon session featured presentations by scientists who have pulled ADNI data into their research but do not receive direct funding from the public-private consortium. This story highlights broad themes that emerged from the ADNI core updates. Visit the ADNI-info website for abstracts [.pdf] and presentation slides from this meeting.
Weiner kicked things off by presenting data from his own group on regional rates of brain atrophy, and how they correlate with cognitive decline and with CSF biomarkers. To identify the brain areas that shrank the fastest and also tracked with cognitive decline, his team examined rate of change in tissue volume across 96 brain regions in 135 healthy elderly, 223 people with MCI, and 79 AD patients. All participants had three consecutive clinical and MRI assessments within a year, and about half also had baseline CSF phospho-tau and Aβ1-42 measurements. In statistical analyses to determine how useful these regional MRI volumetry measures would be in an AD drug trial, the researchers found that the hippocampal region had the greatest atrophy rate and the greatest power. “To detect a 25 percent reduction in rate of cognitive decline…we didn't find that it really helped to look at any other volumes,” Weiner reported. Further winnowing of participants using baseline covariates—that is, choosing only people with low hippocampal volumes or at-risk CSF profiles (e.g., low Aβ1-42, high tau, high phospho-tau)—boosted the statistical power even more. All told, the data suggest that AD prevention trials of healthy elderly and MCI patients could use CSF Aβ42 and phospho-tau as predictors, and MRI-detected atrophy rates as outcome measures, Weiner said.
A major issue that kept popping up throughout the day’s discussion concerns a new class of imaging approaches. These data-driven statistical methods appear quite promising. Yet their success in the ADNI dataset comes clouded with uncertainty as to whether they will gain traction in the field and win favor with the U.S. Food and Drug Administration in future drug trials. Traditionally, scientists have analyzed imaging data by simply focusing on structures in the brain that succumb to disease. However, these anatomical approaches have limitations, said Eric Reiman, who heads the Arizona Alzheimer’s Consortium and directs the state’s Banner Alzheimer’s Institute in Phoenix. Their statistical power depends on how well the a priori-defined regions of interest (ROI) happen to correspond with actual patterns of change, and this can be non-uniform within brain structures, he said. In Seattle, Reiman presented initial findings from a fundamentally different strategy for analyzing FDG-PET data. Rather than extracting information from pre-specified brain structures, this approach uses statistics to probe brain images for individual clusters of voxels that reflect the greatest difference between clinical groups, or that show the greatest change within a group. After defining this set of change-prone voxels in a training cohort, the scientists assemble them into a “mask” that they place on an independent group of participants to extract information from the same places.
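The train-then-apply logic of such a data-driven mask can be sketched in a few lines. This is only a toy illustration on synthetic numbers, not the analysis pipeline Reiman's group used: the voxel count, the planted "affected" voxels, the effect-size score, and every function name here are assumptions made for the example.

```python
import random
import statistics

def effect_size(group_a, group_b):
    """Absolute standardized mean difference between two groups at one voxel."""
    pooled_sd = ((statistics.variance(group_a) + statistics.variance(group_b)) / 2) ** 0.5
    return abs(statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

def define_mask(train_patients, train_controls, n_voxels_in_mask):
    """Return the indices of the voxels that best separate the training groups."""
    n_voxels = len(train_patients[0])
    scores = []
    for v in range(n_voxels):
        a = [scan[v] for scan in train_patients]
        b = [scan[v] for scan in train_controls]
        scores.append((effect_size(a, b), v))
    top = sorted(scores, reverse=True)[:n_voxels_in_mask]
    return sorted(v for _, v in top)

def apply_mask(scan, mask):
    """Summarize one scan as the mean signal inside the mask."""
    return statistics.mean(scan[v] for v in mask)

# Synthetic data (hypothetical): 50 voxels, three of which are lower in patients.
rng = random.Random(0)
N_VOXELS = 50
AFFECTED = [3, 17, 42]

def fake_scan(is_patient):
    scan = [rng.gauss(10.0, 1.0) for _ in range(N_VOXELS)]
    if is_patient:
        for v in AFFECTED:
            scan[v] -= 5.0
    return scan

patients = [fake_scan(True) for _ in range(50)]
controls = [fake_scan(False) for _ in range(50)]

# Define the mask on a training subset, then apply it unchanged to the rest.
n_train = 50 * 40 // 100
mask = define_mask(patients[:n_train], controls[:n_train], len(AFFECTED))
test_patient_scores = [apply_mask(s, mask) for s in patients[n_train:]]
test_control_scores = [apply_mask(s, mask) for s in controls[n_train:]]
```

The key design point is that the test scans play no part in choosing the voxels, so the masked summary measured in the test group is not biased by the selection step.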
To demonstrate the effectiveness of this voxel-based approach, Reiman and colleagues used 40 percent of their ADNI subjects as the training cohort and the remaining 60 percent as the test set. They calculated that a 12-month multi-center AD trial using FDG-PET with these statistical ROIs as a primary endpoint would need on the order of 100 patients per arm to detect a 25 percent treatment effect with 80 percent power. By comparison, trials using standard cognitive tests (i.e., ADAS-Cog or MMSE) as the primary outcome measure would need well more than 400 subjects per arm to achieve the same power. For most other FDG-PET approaches, the required sample size fell between 300 and 1,000, according to an analysis by the ADNI biostatistics core that compared the FDG-PET methods side-by-side.
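For readers who want to see where per-arm figures of this kind come from, the following is a minimal sketch using the textbook normal-approximation formula for comparing two group means. It is not the ADNI biostatistics core's actual calculation, which is not described here; the function name, the two-sided 5 percent significance level, and the example inputs are all assumptions.

```python
from math import ceil
from statistics import NormalDist

def patients_per_arm(mean_change, sd_change, effect=0.25, power=0.80, alpha=0.05):
    """Approximate patients per arm needed to detect a fractional slowing
    (`effect`) of the mean 12-month change in an outcome measure.

    mean_change: mean change in the untreated group (hypothetical input)
    sd_change:   standard deviation of that change (hypothetical input)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_power = NormalDist().inv_cdf(power)
    detectable_difference = effect * abs(mean_change)
    return ceil(2 * ((z_alpha + z_power) * sd_change / detectable_difference) ** 2)

# A measure whose change is noisy relative to its mean needs far more patients
# than a measure whose change is precise relative to its mean.
noisy_measure = patients_per_arm(mean_change=1.0, sd_change=1.0)
precise_measure = patients_per_arm(mean_change=1.0, sd_change=0.5)
```

Under this formula the required sample size scales with the square of the ratio of standard deviation to mean change, which is why a more precise marker can cut trial size so sharply.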
Other research groups have used similar voxel-based strategies in their MRI studies. Paul Thompson’s group at the University of California, Los Angeles, is doing tensor-based morphometry using data-driven ROIs to construct three-dimensional maps showing rates of brain tissue loss in healthy/MCI-stable patients (less than 0.5 percent/year), MCI patients who converted to AD (2 percent/year), and AD patients (2.5 percent/year). Applying Reiman’s statistically guided approach in this context—that is, focusing on voxels that show the highest atrophy rates—reduced the number of MCI patients required to detect 25 percent slowing of disease in a clinical trial from 132 to 85, Thompson reported in Seattle. Colin Studholme of the University of California, San Francisco, described diagnostic (e.g., converters vs. stable MCI) and genetic (e.g., with or without ApoE4 allele) factors that account for the considerable variability in how each subject loses brain tissue. These factors not only affect cognitive performance but can also confound voxel choice in the data-driven approaches, he said. Data presented by Charles DeCarli of the University of California, Davis, suggested another potentially confounding factor: cardiovascular disease burden, revealed by MRI as white matter hyperintensity. DeCarli reported that greater increases in white matter burden were associated with greater declines on the MMSE and on several measures of executive function and working memory.
Gene Alexander heads a group at the University of Arizona that does MRI voxel-based morphometry of gray matter. He reported that his method can help identify people within the heterogeneous MCI group who are on the verge of converting to AD. His analysis also seemed to indicate that people are declining cognitively even before they convert. Nick Fox at University College London measures brain atrophy by calculating ventricular boundary shift integrals. He found that voxel scaling errors resulting from AD-related brain atrophy introduce variability into all MRI measures, and that correcting for them can reduce sample sizes by 10 to 12 percent. James Brewer, at the University of California, San Diego, reported that he could pick up changes at just six months using a fully automated algorithm that assesses baseline MR volumetry. When trained to identify subregions that best discriminate between AD patients and healthy controls, the method was able to identify people within the MCI group who would decline quickly over the next 12 months, and those who would remain stable.
Will Data-driven Approaches Hold Water?
On the whole, data-driven approaches fared well in the ADNI biostatistics core’s head-to-head comparisons of FDG-PET and MRI approaches. This makes intuitive sense, especially for the PET measures, since they assess glucose metabolism, which reflects brain function, said Danielle Harvey in a post-meeting interview with ARF. Harvey, a biostatistician at the University of California, Davis, presented the ADNI biostatistics core analyses in Seattle. “There may be functional networks that are different than the anatomy of the brain. The more data-driven methods are capturing something you might lose if you were only looking at a particular region of the brain,” she told ARF.
That said, a big question mark looms. “It’s unclear exactly what the FDA’s opinion is about these sorts of data-driven approaches and whether [the agency] would approve something that had been developed in one set and then applied in a different set of subjects,” Harvey said. However, she said that the voxel-based methods have already shown decent mettle across the ADNI training and test cohorts. If the masks that were defined on the ADNI training set were to be used in future trials, they could essentially be considered a priori-defined regions, she said. There would be one mask for the AD group, one for the MCI group, one for a certain treatment duration or set of entry criteria, and so forth.
In the MRI comparisons, hippocampal and ventricular measures came up winners. The ADNI biostatistics core determined this using two criteria—precision and correlation with cognitive change. Both of these are critical, Harvey stressed. “Just because you can measure something with very little noise (i.e., high precision) doesn’t mean it has to do with the disease process going on,” she said. This is crucial for drug trials. “For the marker, we’re hoping it can detect something that is changing, for instance, in an image, and that these changes correlate with meaningful clinical benefit.”
On the phone with this reporter, Reiman emphasized that the ADNI data suggest the use of a multi-modal approach in AD clinical trials. “[The ADNI data] are encouraging the use of multiple endpoints so we can get a better understanding of which ones are going to serve our needs in evaluating treatments,” he said. “ADNI has provided wonderful information about progression and power. What we need next is to embed the biomarkers in clinical trials and see which ones predict benefit.”
In Seattle, Harvey discussed a third parameter that the ADNI biostatistics core is beginning to incorporate into its analyses: quality control. For example, one imaging method requires registration of MRI scans to calculate tissue loss between various time points. Even when individual scans pass muster, if the scans do not align adequately with one another, the accuracy of the resulting atrophy measure can suffer. Method-specific quality control thus becomes important, because high fail rates can inflate the sample sizes required to get an appropriate number of viable scans. Thus far, only a few labs have supplied this sort of information; the ADNI biostatistics core hopes more groups will do so in the future.
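The arithmetic behind that inflation is simple and can be sketched as follows. The function and its inputs are hypothetical; no method-specific fail rates were reported at the meeting, so the figures in the example are placeholders.

```python
def enrollment_needed(n_viable_required, n_scans_checked, n_scans_failed):
    """Participants to enroll so that, at the fail rate observed in a QC sample,
    the expected number of viable scans still reaches the required count.
    Integer arithmetic avoids floating-point rounding surprises."""
    n_passed = n_scans_checked - n_scans_failed
    if n_passed <= 0:
        raise ValueError("no scans passed quality control")
    # Ceiling division: required * (checked / passed), rounded up.
    return -(-n_viable_required * n_scans_checked // n_passed)

# Placeholder example: 100 viable scans needed, 10 of 50 pilot scans failed QC.
inflated = enrollment_needed(100, 50, 10)
```

A method that looks efficient on paper can lose that advantage if a large share of its scans cannot be used.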
The meeting brought a degree of closure to long-standing concerns about whether the field’s gradual shift from 1.5- to 3-tesla MRI machines was compromising data reliability. To gain insight into any differences that might arise when switching to machines with a higher field strength magnet, the biostatistics core studied 200 ADNI subjects who had been scanned on both types of scanner. A comparison of the precision of 1.5T and 3T MRI approaches used by five different research groups showed that each scanner type seems to have some relative advantages and disadvantages, observationally speaking. However, “we’re not actually seeing any statistical differences,” Harvey said. “It might be all right to have studies of mixed field strengths (i.e., some sites using 1.5T scanners, others using 3T) as long as people who were scanned at one field strength stay at that field strength for the duration of the study,” she said.—Esther Landhuis.