When COVID forced much of the world into virtual work, dementia scientists sped up their ongoing adaptation of cognitive assessments to digital versions (Dec 2021 conference news). How are they performing now? At the Clinical Trials in Alzheimer’s Disease conference, held October 24-27 in Boston, scientists described smartphone apps that flagged amyloid positivity after just a few days, discerned early AD as well as in-clinic exams, and pinpointed areas of brain atrophy from narrated stories. Others optimized automated scoring algorithms to the point where they beat humans at rating cognitive tests.

  • Four days with the BRANCH app indicated amyloid positivity.
  • The Mobile Toolbox app detected AD as well as in-clinic PACC5.
  • AI-driven software outperformed raters at scoring story recall.
  • Storytelling skills tracked with AD diagnosis, atrophy in language regions.

“Many digital tools are not ready for prime time, but they are getting there,” Dorene Rentz, Massachusetts General Hospital, Boston, told Alzforum. “Within a year or two, we will see an explosion in this area.”

Why digitize testing? The push across the field is driven by indications that asking a person to complete tasks regularly on their own device at home might better reflect their state of mind and memory than paper-and-pencil tests given during rare, stressful clinic visits. Rentz envisions using digital tests in clinical trials to probe cognition more frequently than is typical with in-person tests, so researchers can more accurately detect change over time.

Sussing Out Preclinical Alzheimer's in Days
How well an older person learns seems to signal the presence of AD pathology and subtle cognitive decline. That’s what Kate Papp at Brigham and Women’s Hospital, Boston, had figured out when she asked cognitively healthy people to take the tablet-based Computerized Cognitive Composite (C3), which includes the Cogstate brief battery, face-name associations, and an object-recall task. Participants took the C3 each month for a year, and at CTAD 2021, Papp reported that all got better at it, showing practice effects. However, after three months, people who were amyloid- or tau-PET-positive, or who slipped fastest on the in-clinic Preclinical Alzheimer's Cognitive Composite (PACC5), improved less on the C3 than did amyloid- or tau-negative participants or people who declined slowly on the PACC5. To Papp, such diminished practice effects signal a failure to consolidate memories.

Might daily testing spot this more quickly? To find out, Papp and colleagues developed a smartphone app called the Boston Remote Assessment for Neurocognitive Health. BRANCH includes face-name association, digit-symbol pairing, and grocery item and price recall tasks (image below; Papp et al., 2021).

At this year’s CTAD, Papp reported results from 164 cognitively normal adults enrolled in the Harvard Aging Brain Study (HABS), of whom 36 had positive amyloid scans. For one week, they spent an average of 12 minutes each day completing the same BRANCH tasks. “We often try to limit practice effects by using alternate forms or increasing the time between assessments,” Papp said at CTAD. “With BRANCH, we want to embrace the practice effect and use it over a short interval to detect individuals with subtle memory consolidation difficulties.”

Practice Makes Perfect? People using the BRANCH app repeat the same tasks each day for a week to measure how well they learn. [Courtesy of Roos Jutten, Massachusetts General Hospital.]

Participants found the app easy to use, with 97 percent completing all seven days of testing (Weizenbaum et al., 2023). This suggests that older people can fit these assessments into their daily lives.

Lo and behold, after only four days, a subtle difference between amyloid-positive and -negative participants emerged: The former completed tasks about 8 percent less accurately than the latter.
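
One way to picture how a week of repeated testing might be boiled down to a single number is to fit a straight line to each person's daily accuracy and compare the slopes. The Python sketch below is an illustration only, not the BRANCH team's analysis. The function name `learning_slope` and all accuracy values are hypothetical, and a real analysis would likely use more elaborate models.

```python
from statistics import mean


def learning_slope(daily_accuracy):
    """Least-squares slope of accuracy against day number (1, 2, ..., n).

    A larger positive slope means more improvement per day, i.e. a
    stronger practice effect. This is an illustrative summary only.
    """
    n = len(daily_accuracy)
    if n < 2:
        raise ValueError("need at least two days of scores")
    days = list(range(1, n + 1))
    day_mean = mean(days)
    acc_mean = mean(daily_accuracy)
    numerator = sum(
        (d - day_mean) * (a - acc_mean) for d, a in zip(days, daily_accuracy)
    )
    denominator = sum((d - day_mean) ** 2 for d in days)
    return numerator / denominator


# Hypothetical seven-day accuracy curves, invented for illustration.
# They are NOT data from the study described above.
steep_learner = [0.60, 0.68, 0.74, 0.79, 0.82, 0.84, 0.85]
shallow_learner = [0.60, 0.63, 0.66, 0.68, 0.69, 0.70, 0.70]

print("steep learner slope:", round(learning_slope(steep_learner), 3))
print("shallow learner slope:", round(learning_slope(shallow_learner), 3))
```

In this toy setup, a flatter slope would stand in for the diminished practice effect described above.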

Repeated testing on the BRANCH app accomplished what a single in-person test could not, as baseline PACC5 scores did not discern amyloid status. This suggests that the app might become more useful for clinical trial recruitment than a single in-clinic assessment.

Poor learning on BRANCH also predicted future cognitive slippage and worsened learning problems, albeit with a moderate R-value of 0.54. People who had learned the least after a week of using the app declined the most on the PACC5 one year later. “Not only can we see differences in seven days of performance, but learning curves may also change from baseline among those with elevated AD biomarkers,” noted Rebecca Amariglio of MGH, who co-developed BRANCH.

Roos Jutten, a postdoc in Papp’s lab, found that when amyloid-positive people repeated new versions of the BRANCH tasks six months and a year after their original sessions, they were less accurate each time. “Ongoing research into how these learning curves change over time will determine whether they can be used to track short-term cognitive changes and be useful as an outcome measure in AD secondary prevention trials,” Jutten told Alzforum.

To validate BRANCH in larger, more diverse cohorts, Papp's team added the app to the Alzheimer Prevention Trials webstudy. They plan to give BRANCH to people who finished the U.S. POINTER study and are being followed in an extension, and they are testing a Spanish version of BRANCH.

In-Clinic Tests—No Longer Gold Standard?
In another effort to track cognitive change remotely, Jutten and Rentz are testing how the NIH’s Mobile Toolbox app stacks up against the pencil-and-paper PACC5. Through eight tasks, the MTB tests episodic memory, processing speed, executive function, spelling, and vocabulary. The PACC5 uses five tasks to gauge episodic memory, processing speed, semantic memory, and global cognition. The scientists thought four MTB tasks—face-name associations, picture sequence memory, number-symbol match, and a vocabulary test—might make for a fair PACC5-MTB comparison.

At CTAD, Jessa Burling of MGH presented preliminary results from 80 cognitively healthy HABS participants who spent 30 minutes taking the entire MTB on their smartphones. She compared their results to their PACC5 scores and amyloid and tau PET scans. PACC5 scores modestly correlated with composite scores of the four “PACC-like” MTB tasks, and with composites that added combinations of the other tasks, clocking coefficients of 0.29 to 0.40. The PACC5 and all MTB composites correlated weakly with tangles, but not with plaques. This aligns with Papp’s prior finding that the PACC5 does not discern a person's amyloid status.

Not all efforts to digitize in-person assessments have worked out. Rachel Nosheny of the University of California, San Francisco, reported that the electronic Clinical Dementia Rating (eCDR) underperformed relative to its in-clinic counterpart (Nosheny et al., 2023). She compared scores from 163 cognitively healthy older adults and 43 with MCI from UCSF's Brain Health Registry and three clinical sites across the U.S., who had taken both versions within two weeks.

Of 52 items shared between the in-person CDR and the eCDR, 46 showed greater than 70 percent concordance. Almost all discordant items fell into the memory domain, a key component of assessing incipient AD. Yan Li at Washington University had previously reported similar results (Dec 2021 conference news). This discordance translated to false-positive and false-negative rates of 8 and 55 percent, respectively. The CDR's detailed interviews with the participant and his or her study partner may be challenging to render online, Nosheny concluded.
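
For readers unfamiliar with these terms, the arithmetic behind false-positive and false-negative rates can be sketched in a few lines of Python. This assumes the in-clinic rating is treated as the reference. The function name `error_rates` and the counts are hypothetical, chosen only so that the output matches the percentages reported above. They are not the study's data.

```python
def error_rates(true_pos, false_neg, false_pos, true_neg):
    """Return false-positive and false-negative rates for a screening test.

    "Positive" here means the reference test (assumed to be the in-clinic
    rating) says impairment is present.
    """
    reference_positives = true_pos + false_neg
    reference_negatives = false_pos + true_neg
    if reference_positives == 0 or reference_negatives == 0:
        raise ValueError("need at least one reference positive and negative")
    return {
        # Share of truly unimpaired people the digital test flags anyway.
        "false_positive_rate": false_pos / reference_negatives,
        # Share of truly impaired people the digital test misses.
        "false_negative_rate": false_neg / reference_positives,
    }


# Toy counts per 100 reference positives and 100 reference negatives,
# invented purely to illustrate the calculation.
rates = error_rates(true_pos=45, false_neg=55, false_pos=8, true_neg=92)
print(rates)
```

A false-negative rate of this size means that, in such a toy example, more than half of the people the reference test considers impaired would be missed by the digital version.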

The eCDR's global score matched its in-person counterpart only modestly, with an area under the curve of 0.79 out of 1. Nosheny cautioned that an optimized version would be needed before it is ready for clinical use.

Scary Story Time
The way you tell a tale can reveal to an astute assessor whether your verbal cognition is starting to slip. Researchers at London-based Novoic have developed a web app that aims to capture this. Called Storyteller, it uses an automated recall task, where a proband listens to a 120-word story and immediately retells it, does the same with a different story, and then recalls the first story. Storyteller records and transcribes the narrations, scores them based on how closely they match the original version, and sums the recalls into a final score.
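
How might software score a retelling by how closely it matches the original? Novoic's actual method is not described here, so the Python sketch below swaps in a deliberately crude stand-in: the fraction of distinct words from the story that reappear in the retelling, summed over the recalls. The function names `recall_score` and `total_score` and the example sentences are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def recall_score(story, retelling):
    """Fraction of the story's distinct words that appear in the retelling."""
    story_words = tokenize(story)
    if not story_words:
        return 0.0
    return len(story_words & tokenize(retelling)) / len(story_words)


def total_score(story_retelling_pairs):
    """Sum the per-recall scores, e.g. two immediate recalls and one delayed."""
    return sum(recall_score(s, r) for s, r in story_retelling_pairs)


# Invented example text, not material from the Storyteller app.
story = "Anna walked to the market and bought three red apples"
retelling = "Anna went to the market and bought some apples"
print(round(recall_score(story, retelling), 2))
```

A word-overlap measure like this ignores paraphrase and word order, which is one reason real systems would be expected to rely on more sophisticated language models.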

In an early study called Amyloid Prediction in Early Stage Alzheimer's Disease From Acoustic and Linguistic Patterns of Speech (AMYPRED-UK), Storyteller pegged people diagnosed with mild cognitive impairment as well as did the in-clinic PACC5 (Dec 2021 conference news).

Since then, Novoic has deployed new AI methods to improve the app. At CTAD, Caroline Skirrow of Novoic pitted man versus machine at scoring an in-person verbal story recall task. Participants included 84 cognitively normal adults and 89 people with MCI from AMYPRED-UK and AMYPRED-US, a sister study based in California. For each, she compared a consensus score set by a group of raters—considered the gold standard for this study—to the score given by one of 15 highly trained raters and an automated score generated by the new Novoic software. Called AccuRater, it rates the participant’s recorded, uploaded, and auto-transcribed response based on standardized scoring instructions.

Compared to the individual rater scores, AccuRater scores correlated a bit more tightly with the consensus score, bumping the correlation coefficient from 0.94 to 0.98 (image below). This was because raters tended to make more mistakes and give slightly lower scores than the software.

AI Beats Human. For a story recall task, automated scores calculated by Novoic’s AccuRater software (right) correlated more tightly with scores assigned by a committee of raters than did scores by individual raters (left). [Courtesy of Caroline Skirrow, Novoic.]
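
The comparison above rests on correlation coefficients between each scorer and the consensus. As a rough illustration, assuming a Pearson correlation was the measure used (the type of coefficient is not specified above), the calculation looks like this in Python. The function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / sqrt(var_x * var_y)


# Invented scores for five recordings, not data from the study.
consensus = [12, 18, 7, 22, 15]
single_rater = [11, 18, 6, 20, 16]
automated = [12, 17, 7, 22, 15]

print("rater vs. consensus:", round(pearson_r(single_rater, consensus), 3))
print("software vs. consensus:", round(pearson_r(automated, consensus), 3))
```

A coefficient closer to 1 means a scorer's ratings rise and fall more nearly in step with the consensus.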

Skirrow thinks AccuRater will enable consistent detection of cognitive impairment for screening clinical trial participants or tracking changes over time. She said Novoic is incorporating the AccuRater algorithm into Storyteller to improve its scoring. The app is already being used to screen 20,000 participants into the latest iteration of the Alzheimer’s Disease Neuroimaging Initiative (ADNI; Oct 2022 news).

Rentz asked if AccuRater works on audio from people speaking with accents or in languages other than English. Skirrow replied that Novoic's transcription software is multilingual and performs better than Google Speech-to-Text. As to applying the scoring system to other languages, Skirrow said this is possible but hasn't been done yet. “We lack validation data,” she said.

A different speech analysis storytelling app by Roche is starting to fill this gap. Story Time is a smartphone app where people narrate a story depicted in cartoon strips, then retell it immediately, and again a day later without seeing the pictures. The app measures word count, noun rate, and speech duration. In a study shown at CTAD, 32 cognitively normal adults, 31 with subjective cognitive decline but no amyloid plaques on PET, 30 with amyloid-PET-positive SCD, and 30 with early AD completed nine Story Time sessions over a month. Half of the participants spoke Spanish.

Irma Kurniawan and colleagues at Roche assessed how well the app detected early AD and how storytelling performance mapped to brain atrophy on MRI. Compared to controls, people with AD used more words and paused less while telling a story the first time, yet spoke fewer words when recalling it. Participants who used the most words during the first narration had the most atrophy in their left angular gyrus and middle temporal lobe. People who recalled little of the story the next day had atrophy in different areas, such as their left perirhinal cortex and hippocampi (image below). Notably, the left hemisphere regions are involved in language processing, while the hippocampus is crucial for memory, possibly giving a functional explanation for the storytelling deficits.

Anatomy of Narration. Talking more and pausing less while telling a story correlated with atrophy in the left angular gyrus and middle temporal lobe (left) while poor recall associated with atrophy in other areas (right). [Courtesy of Thanneer Perumal, Roche.]

Thanneer Perumal of Roche told Alzforum that the company plans to use the Story Time app in upcoming clinical trials, but would not disclose which ones.

Cognition isn’t the only component of dementia that can be better measured by technology, Rentz noted. She also wants to see digital measures of passive activities, such as sleep, gait, and physical activity, with the ultimate goal of using them as outcome measures in clinical trials. Cognition drives function but the reverse can also be true. For example, restored physical energy, perhaps thanks to better sleep, in turn can benefit cognition. “Cognition is important, but function is the be-all and end-all,” she told Alzforum.—Chelsea Weidman Burke

Comments

  1. The evaluation of cognitive tests for their predictive abilities requires, and seems to routinely receive, close attention to the testing context. Subjects are all familiar enough with the computing device to rule out computer sophistication effects, especially when using the simplest touching method on an easily viewed screen. Generalizing from the Boston intellectual volunteer community to the general public seems feasible. Another, more insidious, confounding variable is the social context of home testing without researchers present. Granted, it is burdensome to get in for a clinic visit for testing, but the enhanced validity of results may be worth it. For example, our evidence shows that even for computer-administered tests in a controlled office environment, spouses cannot be present for valid testing.

    Full disclosure: I developed and commercialized the Computer-Administered Neuropsychological Screen for Mild Cognitive Impairment, CANS-MCI, acknowledged for geriatric usability (Wild et al., 2008) and biomarker predictive ability (Barber et al., 2018).

    References:

    Wild et al. Status of computerized cognitive testing in aging: a systematic review. Alzheimers Dement. 2008 Nov;4(6):428-37. PubMed.

    Barber et al. CSF Markers of Preclinical Alzheimer's and Deficits on a Self-Administered Computerized Test Battery, the CANS-MCI. AAIC 2018, Chicago. Abstract ID: 25554. ResearchGate.

References

News Citations

  1. Bringing Alzheimer’s Detection into the Digital Age
  2. The Rain in Spain: Move Over Higgins, AI Spots Speech Patterns
  3. In Its Latest Iteration, ADNI Broadens Diversity

Paper Citations

  1. Papp et al. Unsupervised mobile cognitive testing for use in preclinical Alzheimer's disease. Alzheimers Dement (Amst). 2021;13(1):e12243. Epub 2021 Sep 30. PubMed.
  2. Weizenbaum et al. Capturing learning curves with the multiday Boston Remote Assessment of Neurocognitive Health (BRANCH): Feasibility, reliability, and validity. Neuropsychology. 2024 Feb;38(2):198-210. Epub 2023 Nov 16. PubMed.
  3. Nosheny et al. Evaluation of the Electronic Clinical Dementia Rating for Dementia Screening. JAMA Netw Open. 2023 Sep 5;6(9):e2333786. PubMed.

External Citations

  1. Alzheimer Prevention Trials webstudy
  2. U.S. POINTER
  3. Mobile Toolbox
