This is a two-part story. See also Part 1.
Identifying Predictors of Placebo Decline in AD Trials
29 July 2008. Are oddly behaving placebo groups scuppering clinical trials, or are potential treatments simply not living up to expectation? As discussed in Part 1 of this two-part story, some Alzheimer disease clinicians worry that cognitive function among placebo groups is not declining as quickly as it used to. If so, treatment effects could be harder to spot. For example, the phenserine study data reported at the 2005 AD/PD conference in Sorrento, Italy (see ARF companion news story), and several other large AD trials—of galantamine (Brodaty et al., 2005) and rosiglitazone (Risner et al., 2006)—have failed to show a difference between treatment and placebo groups in at least one primary endpoint. “When you looked at the data and at why you didn't pick up a treatment effect, what became a recurrent theme was that you didn't see a decline in the placebo groups,” said Michael Gold, vice president of neurology at GlaxoSmithKline, Research Triangle Park, North Carolina. “That prompted the question of whether something had changed in the patients, or in the way we conducted our trials.” These issues were addressed in a recent meta-analysis by Gold (Gold, 2007), and in an ICAD poster presented today by Michael Irizarry and other GlaxoSmithKline colleagues.
For his study, Gold collected published data from prospective, placebo-controlled, double-blind, parallel group studies that included subjects with mild to moderately severe probable AD and that used the ADAS-Cog as an endpoint. Emerging from his analysis of 69 AD trials, which met these criteria and were published between 1992 and 2006, were several broad observations: patients enrolling in trials are progressively older, and trials are becoming longer. In fact, trial length came up as the most robust predictor of decline in placebo groups. Gold acknowledged that this finding was no surprise, as greater deterioration would be expected in longer studies of a progressive disease. However, this could become a sticking point for more recent AD clinical studies that have bucked the overall trend toward longer trials—a move Gold attributes to the intensifying need to cut costs and gain approval by institutional review boards (IRBs) that are increasingly wary of side effect risks in prolonged studies.
His analysis also identified baseline dementia severity as a predictor of decline—another no-brainer, Gold conceded, given that milder AD patients tend to progress more slowly (Morris et al., 1993). However, his finding becomes relevant in light of the field’s greater emphasis on early intervention, which requires testing experimental drugs on patients with fewer cognitive symptoms. “If patients are coming in milder and you have IRBs that are reluctant to do longer studies, then you’ll have a problem detecting significant decline,” he said.
Throughout the 1990s, pivotal AD drug trials tended to be six-month studies. Toward the late 1990s, some trials went out to 12 months with the aim of showing longer-lasting drug benefit. As disease-modifying drugs entered the clinical pipeline in the early 2000s, companies began launching 18-month trials in hopes of being able to demonstrate disease modification.
But longer is not necessarily better. In Gold’s analysis, lengthier trials were associated with greater cognitive decline, but they also tended to involve more evaluations. Gold found, unexpectedly, that more evaluations were linked with less placebo decline, possibly due to practice effects that kick in with increased exposure to psychometric testing. In a separate discussion, Lon Schneider of the University of Southern California, Los Angeles, who presented a study deflating the idea that placebo groups have declined less in more recent AD trials (see ARF companion news story), noted that longer studies can also pose problems because their data sets are often riddled with features that make them harder to analyze. Examples include more dropouts and broader standard deviations, the latter because individual patients vary more widely in the extent and trajectory of their cognitive decline.
A bottom-line problem, Gold told ARF, is that “clinical trials have become prohibitively expensive, so everyone is trying to cut costs.” Add to this the skyrocketing demand for research subjects that comes from testing a growing number of experimental AD drugs, and companies have started going into areas of the world where trials are cheaper to conduct. However, increasing the number of trial sites often means smaller sample sizes per location, more raters, and greater diagnostic and treatment variability, all factors that can decrease a study’s statistical power. It is not just “a matter of more noise in the ADAS-Cog instrument; there's also more noise in the populations,” said Gold, whose analysis found that larger numbers of investigational sites were associated with less decline among placebo subjects. If the higher “noise” within patient populations in multicenter trials arises from more frequent misdiagnosis (a scenario Gold finds plausible, especially for global trials), cognitive decline would be blunted in trials with more sites. Perhaps related to Gold’s finding that more sites predict less decline are Schneider’s data, which showed an association between less worsening and trials with non-English-speaking sites (see ARF companion news story).
Gold’s meta-analysis of placebo populations was extended in an analysis to be presented at ICAD tomorrow in a poster by GlaxoSmithKline colleagues Michael Irizarry and others at several of the company’s U.K. sites. Using pooled data from 773 placebo participants in six AD trials conducted in 1996-1997, Irizarry’s team performed multivariable linear regression analysis to identify patient characteristics that predict cognitive decline. When the number crunching was finished, baseline cognitive status and screening-to-baseline change in the ADAS-Cog turned up as the strongest predictors of 24-week ADAS-Cog change. In other words, lower MMSE scores at baseline were associated with greater ADAS-Cog decline at 24 weeks, as was marked worsening on the ADAS-Cog during the four-week stretch between screening and baseline. If investigators detect a large cognitive drop in a patient at the end of those interim weeks—a period during which medications get tweaked and caregiver education takes place—“that’s a red flag,” Gold said.
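For readers unfamiliar with the method, the following is a minimal sketch of a multivariable linear regression of the kind described, not the GlaxoSmithKline team's actual analysis. It assumes two predictors (baseline MMSE and screening-to-baseline ADAS-Cog change) and one outcome (24-week ADAS-Cog change, where higher means more worsening). The function name, the six "patients," and the coefficients used to generate them are all invented for illustration and do not come from the poster.

```python
def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one tuple of predictor values per patient.
    y:    outcome per patient.
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0, *r] for r in rows]          # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic, noise-free data (hypothetical): each row is
# (baseline MMSE, screening-to-baseline ADAS-Cog change).
rows = [(14, 2), (18, -1), (22, 3), (25, 0), (20, 4), (16, 1)]
# Outcome generated from made-up coefficients so the fit can be checked.
y = [10.0 - 0.3 * mmse + 0.5 * delta for mmse, delta in rows]

intercept, b_mmse, b_delta = fit_ols(rows, y)
print(f"MMSE coefficient: {b_mmse:.3f}, pre-baseline change coefficient: {b_delta:.3f}")
```

In this toy setup, a negative MMSE coefficient corresponds to the reported pattern of lower baseline scores going with greater 24-week decline, and a positive coefficient on pre-baseline change corresponds to the "red flag" Gold describes. A real analysis would use actual trial data, include noise, and report confidence intervals.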
Future AD Trials: What Should Change?
These findings should help guide the design of AD drug studies, which often struggle to demonstrate significant efficacy at Phase 2. “There’s a tendency to do Phase 2 studies with too few patients and, if the results are negative, to kill it and blame it on cholinesterase inhibitors,” Schneider said.
Eric Siemers, medical director for the Alzheimer disease research team at Eli Lilly and Company in Indianapolis, agreed that lack of statistical power hinders many Phase 2 studies. Phase 2 trials typically involve between 30 and 300 patients. “So even with 300 people, if it’s a three-arm study (1 placebo, 2 treatment), you’d be at just 100 per arm,” he said. That is too small to measure cognitive decline reliably, by Schneider’s standards. “If you use cognitive measures as your primary outcome, you’re always going to be left with this statistical variability. There are some potentially good drugs that you’re going to kill because you don’t see the effect when in fact it was there,” Siemers said. With cognitive tests, “it takes longer and more people to see decline.”
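To make the sample-size concern concrete, here is a minimal sketch of a power calculation for comparing one treatment arm against placebo. It uses a normal approximation to a two-sided, two-sample test. The standardized effect size of 0.2 is a hypothetical value chosen for illustration and does not come from Siemers, Schneider, or any trial discussed here; the function names are likewise illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def power_two_arm(n_per_arm, effect_size, alpha=0.05):
    """Approximate power of a two-sided test comparing two equal-sized arms.

    effect_size is the standardized mean difference (difference / SD).
    """
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_arm / 2)
    # Probability of rejecting in either tail under the alternative
    return _Z.cdf(shift - z_crit) + _Z.cdf(-shift - z_crit)


def n_per_arm_for_power(effect_size, power=0.80, alpha=0.05):
    """Approximate patients per arm needed to reach the target power."""
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    z_pow = _Z.inv_cdf(power)
    return ceil(2 * ((z_crit + z_pow) / effect_size) ** 2)


# Hypothetical scenario: 100 patients per arm, assumed effect size of 0.2
assumed_effect = 0.2
print(f"Power at 100 per arm: {power_two_arm(100, assumed_effect):.2f}")
print(f"Patients per arm for 80% power: {n_per_arm_for_power(assumed_effect)}")
```

Under these assumed inputs, 100 patients per arm gives power well below the conventional 80 percent, which illustrates the scenario Siemers describes of an effective drug being dropped because the study could not see its effect. A larger assumed effect would change the picture, so the numbers should be read as an illustration only.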
This difficulty with cognitive measures is what led Eli Lilly to design Phase 2 trials that rely less on cognitive decline and more on biomarkers, which generally require fewer subjects to show statistical significance. Last year, Siemers presented data from a Phase 2 study of Eli Lilly’s γ-secretase inhibitor LY450139 that used plasma and cerebrospinal fluid Aβ levels as its endpoint (see ARF Washington news story). However, the risk of using biomarkers as Phase 2 readouts is that they might not translate into the cognitive responses needed in Phase 3 trials. Ideally, Phase 2 studies would use surrogate markers—biomarkers with a demonstrated ability to substitute for clinically meaningful outcomes (for more on biomarkers and surrogates, see ARF related news story). Very few biomarkers are able to meet validated surrogate criteria outlined previously (Temple, 1999), but “from a drug development standpoint, it's not so important that something is a validated surrogate marker,” Siemers told ARF. “What you’re really looking for is evidence that your drug has the right mechanism, the right immediate biochemical effect.”
Time will tell whether biomarkers will increasingly replace cognitive measures as endpoints in Phase 2 trials of AD therapeutics. Meanwhile, the studies of placebo decline described in this two-part story highlight several concerns the field will continue to confront. As Rachelle Doody, Baylor College of Medicine, Houston, Texas, sees it, “the issues to be grappled with are—either develop compounds that are expected to improve people over baseline as well as slow their decline, or deal with the fact that in a one-year trial, if all you’re expecting is for your drug to hold people flat, you won’t have a lot of differentiation from placebo.”
If AD trials using cognitive decline as a primary readout are further constrained by recruitment of milder patients, the challenges intensify. “Either we do longer studies, or the standards for calling something efficacious has to change,” Gold said. In March, a workshop involving AD activist group leaders, U.S. Food and Drug Administration representatives, clinicians, and industry leaders explored the latter possibility (see ARF related news story).
Further insight into the problem of unpredictable placebo decline could come from a large-scale effort—led by Marilyn Albert of Johns Hopkins University in Baltimore, Maryland, and Ron Petersen of the Mayo Alzheimer's Disease Research Center in Rochester, Minnesota—to analyze placebo data from AD and MCI clinical trials. Plans for collecting and analyzing such data were put on the table at a workshop last year (see ARF workshop report) and discussed further at a 21 July meeting in Washington, DC.—Esther Landhuis.