Presenting the most eagerly awaited data at this year’s Clinical Trials on Alzheimer’s Disease meeting (CTAD) held December 8-10 in San Diego, Lawrence Honig, Columbia University, New York, reported that the Expedition 3 Phase 3 trial of Eli Lilly’s solanezumab was not a total bust. The company had given up on development of the therapy for mild AD last month following release of the topline result that treatment failed to slow cognitive decline (see Nov 2016 news). Still, researchers want a breakdown of the data to see if they sustain hope for ongoing prodromal and prevention trials that are further testing this therapeutic Aβ antibody.

After intense discussion, many CTADeers took solace in positive secondary outcomes and a trend on the primary outcome. Others consider this data to be a bellwether of the amyloid hypothesis. “I do not see this as a refutation of the amyloid hypothesis, but a confirmation of it,” said Paul Aisen, Alzheimer’s Therapeutic Research Institute at the University of Southern California, San Diego. Aisen co-organizes CTAD and is a principal investigator on the A4 secondary prevention trial that is still enrolling to test solanezumab in asymptomatic but biomarker-positive participants. He took heart in similar trends seen across all Expedition trials. “All three show separation of curves, a consistent story, showing that treatment slows decline by a small amount,” he noted during a panel discussion. Others were less convinced. “Results from these three negative solanezumab trials neither prove nor disprove the amyloid hypothesis,” said Lon Schneider, University of Southern California, Los Angeles.

Expedition 3 recruited 2,100 people with mild AD at 210 sites in 11 countries. Volunteers had to test positive for brain amyloid by either CSF analysis or a florbetapir PET scan. Half received intravenous infusions of 400 mg solanezumab every four weeks for 18 months; the other half got mock infusions. Considering how invasive and burdensome this procedure is, Honig was impressed that about 85 percent of participants in both arms completed the trial.

How, then, did the data break down? For the primary outcome, the ADAS-Cog14, patients on solanezumab declined 11 percent less than did those on placebo (p=0.095). The curves showed a statistically significant separation between the active and placebo arms at week 28, and this continued at weeks 40, 52, and 64, but not at week 80.

Secondary outcome measures showed similar trends, and a few were statistically significant at the end of the trial. On the MMSE by week 80, people on solanezumab had declined 13 percent less than those on placebo (p=0.014). On the ADCS-iADL, the active group did better starting at week 64 (p value by week 80=0.019). On the CDR-SB, they declined 15 percent less by week 80 (p=0.004). The separation was weaker on the Functional Activities Questionnaire (FAQ). Aisen said this was no surprise because this is a relatively new scale; still, the trend favored the treatment arm at weeks 52 and 80. “In summary, the primary and all secondary [clinical] outcomes directionally favored solanezumab, though the magnitudes of the differences were small,” said Honig. He did not present data on seven other secondary markers because analysis is ongoing, he said.

Despite their disappointment at the overall result, scientists at CTAD lauded Lilly for persevering with the therapeutic strategy. Rachelle Doody, who recently moved from Baylor College of Medicine, Houston, to Roche/Genentech, thanked Eric Siemers and his colleagues at Eli Lilly for helping the field understand how to run clinical trials and for showing that amyloid is a valid treatment target. Based on the pooled analysis of patients with mild AD in the Expedition 1 and 2 trials, many had expected a positive result. Lilly powered Expedition 3 to detect a significant slowing of cognitive decline by doubling the number of patients with mild AD relative to Expedition 1 and Expedition 2, and by using amyloid scans to weed out the 30 percent or so of people who had been clinically diagnosed with mild AD but turned out to have something else.

So what went wrong? In the end, the drug was simply too weak at that dose, experts agreed, and that showed up in the statistics. “Expedition 3 missed [its endpoint] because when the effect size is small, statistical significance typically varies from one trial to another,” said Aisen. Some wondered if the choice of primary endpoint was wrong, given that the CDR-SB showed a stronger effect. (This, incidentally, was also true in the negative LipiDiDiet trial presented later by Tobias Hartmann, Saarland University, Homburg, Germany.)

Honig rejected that idea. “Cognition sometimes looks better than function and other times it’s the reverse,” he said. “What we really want is clinical meaningfulness.” Aisen agreed. “In mild AD, the ADAS-Cog and the CDR-SB have similar power to detect treatment effects, and either may seem better than the other. It really depends on effect size,” he said. “The current standard is to rely on cognition first,” he added. Interestingly, Siemers noted that the placebo group declined faster than expected, which may reflect the Aβ-selection criteria and more active disease, he said, but he did not address whether that affected the outcome of the trial.

Others, noting that the trial came oh-so-close, wondered if the dose was right. Siemers agreed a higher dose may have been effective, but countered that safety was a major factor in settling on 400 mg. When asked if Lilly would have filed for FDA approval had the primary outcome been positive, Siemers said that would have been a major discussion. When Lilly scientists presented data on the persistence of the Expedition 1/2 small benefit on 3½-year follow-up to FDA statisticians at the 2015 AAIC conference, they received a rather pointed reminder to instead aim for a large effect size (see Aug 2015 conference news). With a small effect size, they would have had to address the question of clinical meaningfulness, Siemers said.

All things considered, some researchers at the meeting thought the outcome a blessing in disguise, since approval of a first drug, even if weak, can douse enthusiasm for developing other drugs that might turn out to be much stronger. Moreover, an expensive biologic drug of questionable effect size can attract unwelcome controversy about drug pricing and cost versus benefit.

Where does this leave the ongoing trials of solanezumab in prodromal AD? Nick Fox, University College London, England, suggested the dose be raised given that the antibody appears safe—there were no serious adverse events in Expedition 3. Only a few minor ones reached significance, such as nasal congestion and vitamin D deficiency, and some of those affected the placebo group more than the active group. Siemers said there is a lot of discussion around raising the dose but that it is not straightforward and no decision has yet been made.

Aisen said he expects the treatment effect to be larger at earlier stages of the disease. If true, this would bode well for the DIAN, A4, and ExpeditionPro trials. CTAD co-organizer Bruno Vellas, University Hospital of Toulouse, France, agreed that earlier will be better, but emphasized the medical need of the millions of people currently living with mild AD. “We have to continue to develop treatments for mild AD for the sake of those patients,” he insisted.

Some took a more cautionary stance. Steve Salloway, Brown University, Providence, Rhode Island, warned against putting much faith into the secondary outcome measures since the statistics were not corrected for multiple comparisons. Honig argued that correction was not required, since Lilly was not trying to claim efficacy based on those secondary measures. Suzanne Hendrix, president and CEO of Pentara Corporation, Salt Lake City, split the difference. “When the primary endpoint is not significant, one should interpret any secondary with caution,” she said, “but at some point the level of evidence of the secondary is good enough that you don’t have to question if it is real.” The secondaries in Expedition 3 were all correlated because they all measure the same underlying disease process, hence one doesn’t have to correct as much, she said. “All the evidence together says we have changed the disease process [in Expedition 3], but by only a small amount,” Hendrix concluded.
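
Salloway's caution about multiple comparisons can be made concrete with a small calculation. The sketch below applies two standard corrections, Bonferroni and Holm, to the three secondary-outcome p values quoted above (MMSE, ADCS-iADL, CDR-SB). Treating these three as the whole family of comparisons is an assumption made only for illustration; the function names are hypothetical, and this is not an analysis that Lilly or any of the speakers presented.

```python
def bonferroni(pvalues, family_size=None):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = family_size if family_size is not None else len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment: the smallest p value is multiplied by m,
    the next by m - 1, and so on, keeping the adjusted values monotone."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


# Secondary-outcome p values at week 80 as reported in the article.
names = ["MMSE", "ADCS-iADL", "CDR-SB"]
pvals = [0.014, 0.019, 0.004]

for name, raw, b, h in zip(names, pvals, bonferroni(pvals), holm(pvals)):
    print(f"{name}: raw p={raw:.3f}  Bonferroni={b:.3f}  Holm={h:.3f}")
```

Both corrections remain valid when outcomes are correlated, but they become conservative in that case, which is in line with Hendrix's remark that correlated secondaries need less correction. Passing a larger `family_size` to `bonferroni` shows how the picture tightens if more outcomes are counted in the family.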

Others did not buy that the disease process was altered, because the biomarker data were weak. While a 500- to 800-fold increase in plasma Aβ40 and Aβ42 in the active group indicated the antibody bound and held the peptide in the blood, the brain biomarker data were murkier. Brain volume shrank with both drug and placebo at about the same rate, and brain ventricular volumes expanded by the same amount. Florbetapir PET at baseline and week 80 hinted that solanezumab lowered the cortical SUVR a tad, but the difference was not significant. Honig showed no CSF Aβ data because its analysis is ongoing.

The tau data troubled people the most. Both CSF total tau and p-tau rose more in the treatment group than in the placebo group, with the former just missing significance (p=0.06). Tau PET using AV1451 also hinted at higher levels in the treatment group, though again this was not significant. “The tau data is definitely concerning,” said Reisa Sperling, Brigham and Women’s Hospital, Boston, co-principal investigator on the A4 trial with Aisen. “It is impossible to know what that means in this small substudy, but I would like to see more data in the subgroup who started with lower levels of pathology,” she told Alzforum. Anton Porsteinsson, University of Rochester, New York, agreed. “That there’s no difference in density of fibrillar Aβ, and that the trend for tau is in the wrong direction is more worrying than the clinical effect size,” he said.

Honig cautioned that the data are difficult to interpret. “A drug might work and not affect plaques, or it might affect plaques and not work,” he said. “The best scenario would be if the biomarkers and clinical outcomes move in the same direction, but we don’t have complete data and because the effect sizes are small, it is too hard to judge,” he said. Aisen stressed that this antibody was not designed to attack plaques, but to target soluble Aβ. “For this antibody, the best marker may be plasma Aβ, because it shows you have tied up the soluble peptide,” he said.

Observers agreed that this trial does not reduce the likelihood that other anti-Aβ immunotherapies will succeed. “Solanezumab has no bearing on aducanumab. Different studies, different targets,” said Schneider. “I’d expect any of the anti-Aβ antibodies might work better at a preclinical stage when neurodegeneration is less extensive,” Aisen concluded.—Tom Fagan



Comments on this content

  1. A calculated perspective on solanezumab’s outcomes

    Much discussion about Expedition 3 treats the p values as though they are measures of magnitudes of effects. For example, since the p values range from .004 to .14 and four of the six outcomes presented are less than .02, this is considered by some as evidence for a consistent, strong, clinically meaningful effect. Moreover, as the CDR-sb showed a very low p value at .004 and has been used as the sole primary composite outcome in other trials, it’s been suggested that it showed a large effect and could serve as the (admittedly, post hoc) primary outcome. Along similar lines, two of the three EMA-required outcomes for Alzheimer disease trials were statistically significant at p = .009 and .004 for ADLs and the CDR-sb, respectively. If significance on two of three key outcomes was good enough for the FDA to approve memantine in 2003, then solanezumab could be approved on this basis as well.

    These opinions, however, conflate p values with effect sizes; p values are probability statements about the randomness of the distributions of the outcomes, but are not measures of magnitude of effect. A less impressionistic and more nuanced approach to interpreting the clinical significance of solanezumab is to examine the effect sizes of the outcomes. The Cohen’s d effect size is commonly used, easily calculated from the data Lilly presented, and expressed as the mean difference between solanezumab and placebo divided by the pooled standard deviation of change of the outcome scale. The effect sizes in Expedition 3 for the ADAScog14, ADL, and CDR-sb are d = 0.07, 0.11, and 0.10 standard deviation units, respectively. These numbers mean that the distributions of the solanezumab and placebo outcomes for each scale overlap by 96 to 97 percent, and that there is only a 52 to 53 percent probability of superiority for solanezumab over placebo for any scale. For example, a patient picked at random from the solanezumab group will have scored better on the CDR-sb than a patient randomly chosen from the placebo group only 53 percent of the time. Notably, these very small effects exist with low p values < .01 in two cases. We can go further, however, and calculate what the effect size would have been for the ADAScog14 if the outcome had resulted in a p = .001 instead of .095. Here, keeping the sample size and standard deviation the same, the effect size would have been d = 0.14, twice as large as that actually observed but still very small, implying 94.4 percent overlap. (By comparison, the effect sizes for many donepezil trials with sample sizes of about 450 patients are about d = 0.25 or greater, 90 percent overlap. That is still a small effect, and is part of the reason why the effectiveness of donepezil has been so controversial.)

    In summary, we might be seeing in the Expedition 3 trial, with four of six statistically significant outcomes, p < .02, and very small effect sizes, d ≤ 0.11, the results of a very large sample size trial with substantial clinical heterogeneity, heterogeneity of outcomes, measurement error, and possibly unknown, small systematic errors contributing as well. It will be difficult to judge whether or not there is a clinically meaningful effect or, indeed, whether any compelling solanezumab-responsive subgroups emerge from post hoc analyses, as these subgroups likely will be identified on the basis of low p values and small effects.
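
The overlap and probability-of-superiority figures in this comment can be reproduced with a few lines of standard-library Python. This is a minimal sketch that assumes both arms are normally distributed with equal variance and that each arm had about 1,050 participants (2,100 split evenly, ignoring dropout). The commenter's exact method is not stated, and the function names here are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def overlap(d):
    """Overlapping coefficient of two equal-variance normal curves
    whose means differ by d standard deviations."""
    return 2 * STD_NORMAL.cdf(-abs(d) / 2)


def prob_superiority(d):
    """Chance that a random treated patient scores better than a
    random placebo patient, given effect size d."""
    return STD_NORMAL.cdf(d / sqrt(2))


def d_for_p(p_two_sided, n_per_arm):
    """Effect size that would yield the given two-sided p value in a
    two-arm trial with n_per_arm patients per arm (z approximation)."""
    z = STD_NORMAL.inv_cdf(1 - p_two_sided / 2)
    return z * sqrt(2 / n_per_arm)


# Effect sizes quoted in the comment above.
for scale, d in [("ADAScog14", 0.07), ("ADL", 0.11), ("CDR-sb", 0.10)]:
    print(f"{scale}: d={d:.2f}  overlap={overlap(d):.1%}  "
          f"P(superiority)={prob_superiority(d):.1%}")

# Hypothetical scenario from the comment: p = .001 on the ADAScog14.
N_PER_ARM = 1050  # assumption: 2,100 participants split evenly
d_hyp = d_for_p(0.001, N_PER_ARM)
print(f"p=.001 would need d of about {d_hyp:.2f}, overlap {overlap(d_hyp):.1%}")
```

Under these assumptions the helper functions give values close to the ones quoted: overlaps of roughly 96 to 97 percent for the three scales, and an effect size of about 0.14 for the hypothetical p = .001 case.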

  2. These data should serve to remind us all that it is always much easier to start an avalanche than it is to stop one, and that cleaning up the debris field cannot undo the reality of the damage left in its wake. Unfortunately, while the idea is certainly enticing, we must recognize too that increasing the dose may also have untoward off-target effects.

  3. Drugs of course fail for many reasons—wrong mechanism, wrong target, wrong patient population, wrong dose, etc. In the case of solanezumab, if one accepts that the drug was "active," albeit with a very modest effect size (an assumption I share but that is certainly debatable), then one really needs to assess the adequacy of the dose used. Effective and safe drug doses are generally determined by systematic dose ranging Phase 1 (safety) and Phase 2 (safety and efficacy) clinical studies, but in the case of solanezumab these dose-ranging studies were very limited in scope and there was no real evidence that the dose subsequently used in the pivotal Expedition trials (a dose of 400 mg or about 5.7 mg/kg monthly) was by any means optimal.

    Although "hindsight is always 20/20," Lilly might have been better served by conducting more extensive dose-ranging Phase 2 trials. Given recent results with other Aβ antibodies, it is quite possible that an effective dose of solanezumab (assuming it works) might be as high as 10-60 mg/kg (two to 10 times the dose used!), and similar to the doses now being explored for other Aβ antibodies, including aducanumab and crenezumab. Given the very favorable safety profile of solanezumab, it is unfortunate that much higher doses weren't adequately explored in Phase 2.

    Finally, it should be emphasized that the encouraging data from the Phase 2 study of aducanumab included a large number of prodromal/MCI patients, another critical variable in comparing results of various Aβ antibodies. One can only wonder what the treatment effect size of solanezumab might have been had the doses tested been considerably higher and the patient population less advanced. 

  4. Steven Paul raises some interesting points, but surely the earlier Phase 1 work would have been the primary basis, toxicology aside, for the choice of dose?

  5. The choice of dose requires both a dose-ranging safety assessment (done with both single and multiple doses in Phase 1) and some reasonably definitive measure of efficacy (usually accomplished by dose-ranging studies in Phase 2, and not usually done in Phase 1). Importantly, however, these require some readout of efficacy, e.g., lowered amyloid plaque, improvement or slowing of cognitive impairment, etc. These Phase 1/2 studies were very short in duration and because the timeline for determining efficacy for a drug like solanezumab requires many months of continuous dosing (the pivotal trials were 18 months in duration), such dose-ranging studies to assess efficacy were never really done for solanezumab, and only two doses were ever tested (400 and 800 mg). The decision to advance the 400 mg dose was not made based on any real efficacy readout/assessment. In my opinion, given that both doses were well tolerated, in hindsight, higher doses should have been explored (much as was done with the recently published Phase 2 trial of aducanumab which examined four doses of active drug vs. placebo over 12 months). These critical dose ranging Phase 2 studies for both safety and efficacy were never really carried out for solanezumab. Of course, it's also possible that higher doses of solanezumab would have been ineffective in slowing cognitive impairment as well ... but my point is that we will now never know. 



Therapeutics Citations

  1. Solanezumab
  2. Aducanumab

News Citations

  1. Lilliputian Effect Size Fells Phase 3 Trial of Solanezumab, Leaving Its Future Uncertain
  2. Aducanumab, Solanezumab, Gantenerumab Data Lift Crenezumab, As Well
