25 September 2008. Showing that an experimental Alzheimer’s therapy delivers meaningful cognitive change is hard enough. But increasingly, AD drug sponsors face an additional challenge: making sure the tests capturing that change are up to snuff. For well over a decade, investigators have clung to the ADAS-Cog, an 11-item portion of the Alzheimer Disease Assessment Scale that takes less than an hour to administer and seems to cover the key cognitive domains that fade in AD.
But AD trials are evolving, and so must their cognitive assessment scales, some say. As new imaging and biochemical tools reveal the brain unraveling at the molecular level years in advance of symptoms, consensus is building around the idea that intervention must start earlier to be effective at all. As a result, more AD trials are recruiting less impaired subjects, who are believed to stand a greater chance of benefiting from the latest batch of experimental drugs. But milder AD patients tend to decline more slowly (Morris et al., 1993), forcing longer, larger, and costlier studies to show clear separation between drug and placebo (see ARF related news story).
This shift has caused the field to take a hard look at whether the time-tested ADAS-Cog is in fact the best instrument for measuring cognition in these newer trials. Though the test has been the primary outcome measure in late-stage trials of all AD drugs approved by the U.S. Food and Drug Administration (FDA) thus far, there is growing dissatisfaction with its ability to gauge mental decline in early-stage dementia patients. To track changes more reliably in these populations, clinicians are working to devise newer instruments, some of which are computerized. Though none have yet come close to dethroning the ADAS-Cog as the de facto clinical endpoint for AD trials, some wonder whether the field could be poised for an evolution in the near future.
One instrument that appears headed for wider use in AD drug studies is the Neuropsychological Test Battery (NTB). This 40-minute, paper-and-pencil test was developed by Elan Corporation by patching together parts of other commonly used scales of memory and executive function. It has been correlated with a number of cognitive scales widely used in AD studies (ADAS-Cog, Mini-Mental State Exam (MMSE), and Clinical Dementia Rating—Sum of Boxes [CDR-SB]) and validated by its makers as a reliable measure of cognitive change in mild and moderate AD patients (Harrison et al., 2007). The NTB also seems useful at earlier disease stages. An Elan-led team presented a poster at the International Conference on Alzheimer’s Disease, held 26-31 July in Chicago, suggesting that the test distinguishes MCI patients from healthy subjects and those with AD.
In completed AD trials, the results have been mixed. According to recently released Phase 2 results of Elan/Wyeth’s humanized monoclonal anti-Aβ antibody, AAB-001 (aka bapineuzumab), the NTB seemed to fare just as poorly as the ADAS-Cog at separating treatment and placebo groups (see ARF related news story). But the battery has shown promise in trials of several other AD drugs—for example, Elan’s AN1792 vaccine (Gilman et al., 2005) and Prana Biotechnology’s metal-protein interaction-attenuating compound, PBT2 (see ARF related news story). In both cases, drug-treated participants who showed no significant change using other cognitive measures, including the ADAS-Cog, had measurable gains when assessed by the NTB.
But increased sensitivity is a double-edged sword. “The more sensitive your test is, the less assurance you have that any change you see means anything,” said Russell Katz, director of the FDA's Division of Neurology Products. “Nobody's interested in finding a drug that has no value to the patient but can make you remember one more word in five minutes.”
Nevertheless, Elan and Wyeth have forged ahead with the NTB, using it as the primary endpoint for ongoing multinational Phase 3 trials of their passive AD immunotherapy. When asked whether the NTB would be accepted as an alternative to the ADAS-Cog, Katz told ARF, “I believe we've told companies that the NTB would be adequate as the cognitive measure.”
From Face-to-Face Q&A to Computer Mouse Clicks?
However, the NTB, ADAS-Cog, and other paper-and-pencil tools have inherent limitations. “It's not only important how well a patient responds to various tasks, but how quickly,” said Ely Simon of NeuroTrax Corporation, Newark, New Jersey, maker of the Mindstreams® computerized cognitive assessment tool. “We can measure in milliseconds how quickly a patient can make transitions.”
The ability to gauge response times quantitatively is a key feature of computerized batteries—and one with presumed appeal in this era of disease-modifying AD drugs for which trials have become exceedingly expensive. Tests such as Mindstreams® “will be a tremendous cost savings to the industry,” Simon contends. “If you're able to have a more sensitive instrument with more consistent results, then you'll need fewer patients to demonstrate efficacy.”
All responses in this 30-minute battery are captured by mouse clicks or keypad entries. Used thus far by more than 6,000 patients in research studies and 45,000 in the clinic, Mindstreams® covers similar domains to the ADAS-Cog but appears more sensitive for detecting mild cognitive impairment (MCI) (Dwolatzky et al., 2004).
Computer-based cognitive instruments may be starting to catch on with some AD drug companies. Epix Pharmaceuticals of Lexington, Massachusetts, for example, used the NeuroTrax test alongside the ADAS-Cog as outcome measures in Phase 2a trials of its serotonin receptor agonist PRX-03140. According to data presented by J. Thomas Megerian earlier this year at the Keystone conference on AD and at ICAD, subjects taking the Epix drug in combination with donepezil, a cholinesterase inhibitor widely used in AD, showed no change on the ADAS-Cog but had measurable improvement on visual spatial and memory indices of the computer-based assessment (see ARF Keystone story).
As for why Mindstreams® was chosen as an endpoint for those trials, Megerian offered several reasons that apply to computer-based tools in general. One is better standardization. It doesn’t matter if one person gets the test on day 1 and another on day 20, he said. “The interviewer effect is removed when you use a computerized test.” Another reason is that the ADAS-Cog spits out a single composite score, without the breakdowns that many computer-based tests offer for performance in subcategories such as executive function, verbal memory, and so forth. Moreover, Megerian said, some ADAS-Cog components are completely subjective—for example, language fluency.
Susan De Santi, scientific consultant for Cogtest, Inc., Newark, Delaware, which makes another computerized cognitive testing system, agrees that administration and scoring methods could account for disparities between outcomes assessed by the ADAS-Cog and computer-based scales. At ICAD, De Santi, a psychiatrist at New York University, and Roger Bullock of Kingshill Research Centre, Swindon, U.K., presented a poster on a 28-day trial of a nicotinic acetylcholine receptor agonist in mild AD patients. Cogtest’s word list memory test detected a significant change from baseline; the analogous ADAS-Cog section did not. “You got a signal when you used Cogtest but didn’t see it in the ADAS-Cog,” De Santi said. “It could be the way the tests are administered, or the way they’re scored. With ADAS-Cog, you’re coding not what the person is learning but what they didn’t learn—the errors they made.”
In the Cogtest word list memory test, individuals are asked to recall words from a list of 16 presented by computer audio. The examiner records the subjects’ responses on a special touch screen, and results are instantly accessible on a Web-reporting system.
Other features that differentiate paper-and-pencil and computerized tests may actually favor the former. “A computerized test doesn’t give an impressionistic sense of a patient as much as the ADAS-Cog,” Megerian said. There is “human-to-human stuff that a computer test can’t get at.” An example is the ADAS-Cog component that asks subjects to address mock letters to themselves.
“Most of these computerized tests simplify everything into a push of a button,” Megerian said. “That’s not real life.” As he sees it, one of the biggest knocks against computerized testing—especially for the older populations sought for AD trials—is that it can be hard to gauge how much dysfunction derives from a participant’s lack of familiarity with computers.
But this concern will likely dissipate with time, Megerian said. He believes a hybrid approach involving both traditional and computer-based cognitive tools may become more prevalent in AD trials. Linda Berkowitz, Cogtest’s director of business development, sees early Phase 2 studies as a context in which a blend of test formats might be particularly strategic. “When you’re working with such a blunt instrument as the ADAS-Cog, if you can get a signal from other instruments, at least you know whether you want to move on or not,” she told ARF.
While this seems to make sense for early-stage clinical studies, using computerized tests as endpoints in later pivotal trials is riskier, in Megerian’s view. Despite the apparent benefits of the Mindstreams® computerized assessment in Phase 2a trials of Epix’s serotonin receptor agonist, the company chose not to use the NeuroTrax test in its ongoing Phase 2b studies. “Every time you add an endpoint, it adds money,” Megerian said. “Mindstreams® looks great on paper, but we had to pick four endpoints. We ended up going with the four most published (i.e., ADAS-Cog, CIBIC-plus, NTB, and ADCS-ADL).”
To clarify the FDA’s stance on cognitive scales for AD trials, Katz told ARF in a recent phone interview that the agency has “never gone on the record saying you must use the ADAS-Cog. We’ve never said this is the gold standard. It’s just evolved that way.” The EMEA, Europe’s analogous agency, stated in its recent draft dementia drug guidelines that it has approved the ADAS-Cog and NTB for use in AD trials but also indicated that in its view, there is “no ideal measurement.”
Katz said the FDA is “certainly open to some alternative measures” but re-emphasized the importance of using secondary global outcomes to validate effects seen on cognitive assessments—a requirement for AD drug approval since the early 1990s. “We want to make sure the drug’s effect is clinically meaningful,” he said. “That’s the overarching principle.” (What’s considered “clinically meaningful” could be evolving, though. For more on this, see ARF related news story).
Despite the FDA’s purported willingness to consider newer endpoints, there is a certain inertia in drug development—a tendency to go with a suboptimal test simply because it is tried and true. “First you've got to convince the FDA that you have good results. Then you have to convince them your tests were valid,” Megerian said. By using an alternate cognitive scale, “you're asking yourself to go through two hurdles instead of one.”
One reason most newer cognitive tools have not enjoyed wider use in AD research may be the inherent difficulty of validating these measures against the 15-year go-to assessment, especially when the alternate tests are believed to be more sensitive. “If you think the old test is the gold standard,” Megerian said, you won’t know if your test isn’t matching because it’s better or because it’s not working properly. “Anything more sensitive than the gold standard will look bad.”
Combining EEG With Cognitive Testing
Two North Carolina-based companies are taking computerized cognitive assessment a step further by adding an electrophysiology component. Cognitive test maker CNS Vital Signs and Neuroscan, a major developer of electroencephalogram (EEG) hardware and software, have joined forces to offer a customizable tool whereby a 32-electrode “bonnet” measures brain electrical activity while the wearer takes a 30-minute computerized test. The system was on display among the exhibit booths at the recent ICAD meeting in Chicago. It tracks not only right and wrong responses, but also “how quickly they do it and what part of the brain was stressed,” said Alan Boyd, president and CEO of Chapel Hill-based CNS Vital Signs. The cognitive test and EEG applications “are synched up so we know to the millisecond what they’re seeing and responding to.” The system randomizes test questions at each sitting and offers age-matched scoring.
EEG has seen only sparse use in AD thus far because standard recordings have not picked up decline deep in the brain where early AD damage occurs. However, recent data on network dysfunction in AD mouse models suggest the possibility of subclinical seizures (see ARF related news story). Such studies have renewed interest in improving EEG so that it might more directly reflect the activity changes of AD.
The first step would be to document on EEG that these “silent” seizures in fact occur in AD patients (not just rodents), and for this the CNS/Neuroscan system could be helpful, said Beth Leeman, a neurologist at Massachusetts General Hospital in Boston. “The most useful thing about a system like this would be to have an interface between the EEG and the cognitive task presentation software. That's very difficult to create independently,” Leeman said. “This would be very useful—to be able to correlate abnormalities on EEG with the exact time that a stimulus is presented on a cognitive test.”
The small handful of newer cognitive scales mentioned in this story is by no means comprehensive, and it remains to be seen whether any will be ready for prime time soon—or ever. “If companies think they have a scale that has better properties or is more useful for certain purposes, they should come to us with it. There isn't that much that people have come to us with as alternatives,” Katz said. “We'd actually like to see different things, quite frankly. We've been looking at the ADAS-Cog for 15 years.”—Esther Landhuis.