Could the way a person speaks hint at early cognitive trouble? My Fair Lady’s Eliza Doolittle would bristle at the question, but artificial intelligence might say it can, at least in people with frontotemporal dementia and Alzheimer’s disease. At the Clinical Trials on Alzheimer's Disease conference held online and in Boston, November 9-12, researchers presented apps based on artificial intelligence that detect subtle speech differences in the cognitively impaired.
- Speech composite tracks subtle language decline in FTD.
- AI-based recall task might detect mild cognitive impairment.
- Automated phone task to detect AD will be tested in 1,200 people.
- Scientists will build a database of speech files, cognitive scores, biomarkers.
One, from the Canadian company Winterlight Labs, picked up deterioration in how people with FTD describe pictures. Another, from U.K.-based Novoic, predicted mild cognitive impairment due to AD based on how well a person recalled a story they had heard. Other scientists devised an automated phone survey that assesses memory and executive function based on speech patterns, and are testing it in early AD. To fuel this type of research, the Alzheimer’s Drug Discovery Foundation will create a database of speech files, cognitive scores, and fluid and imaging biomarker measurements from 3,000 participants in six Alzheimer’s disease research centers.
“Speech biomarkers is an exciting field—imagine understanding if a person is at higher risk of FTD or AD just from the way they talk,” Foteini Orfaniotou, F. Hoffmann-La Roche, Basel, Switzerland, told Alzforum. Hiroko Dodge of Oregon Health & Science University, Portland, who is analyzing language patterns on recorded video chats with people 75 and older, agreed. “Speech biomarkers are very promising,” she said. Eric Siemers of Siemers Integration LLC thinks that figuring out who is amyloid-positive without a PET scan is the real promise of digital biomarkers, be they speech or cognitive tests.
Some people with FTD develop aphasia. They struggle to recall words, often resort to basic vocabulary, and construct simple sentences devoid of non-essential words. They pause or stutter. Jessica Robin from Winterlight Labs in Toronto said clinicians notice subtle language problems early in disease but have no tools to investigate what this means. Her company’s speech-assessment app aims to fill that gap, supplying an objective measure to complement cognitive rating scales.
Robin and colleagues developed algorithms that scored more than 500 vocal features across five categories. Language variables include: acoustics, such as pitch, pitch variability, and speed; lexical features, such as the types of words used, how common or evocative the words are, and the proportion of nouns to verbs; and syntax, including type and complexity of sentence construction. Other categories deal with the content of speech, including higher-order features such as how well a person stays on theme rather than bounces around verbally or repeats themselves, and fluency.
The researchers combined these scores into a composite that included vocal acoustics, how talkative people were, how often they paused, how often they used nouns, their sentence complexity, and the amount of correctly described picture content.
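Winterlight has not published the exact formula for its composite, but a score of this kind is typically built by z-scoring each feature against a healthy normative sample, flipping the sign of features where higher values indicate impairment, and averaging. The sketch below is only an illustration of that general approach; the feature names, normative values, and weighting are invented, not Winterlight’s.

```python
import statistics

# Hypothetical feature values for one participant; the real app scores 500+ features.
features = {
    "speech_rate_wps": 1.4,    # acoustics: words per second
    "pause_rate": 0.22,        # fluency: pauses per word
    "noun_ratio": 0.31,        # lexical: proportion of nouns
    "mean_clause_depth": 1.8,  # syntax: sentence complexity
    "content_units": 9,        # content: picture elements correctly described
}

# Normative (mean, SD) pairs from a healthy control sample — invented numbers.
norms = {
    "speech_rate_wps": (1.8, 0.3),
    "pause_rate": (0.15, 0.05),
    "noun_ratio": (0.35, 0.04),
    "mean_clause_depth": (2.2, 0.4),
    "content_units": (12, 2.5),
}

# Features where a higher raw value signals impairment get their sign flipped,
# so a lower composite always means more decline.
higher_is_worse = {"pause_rate"}

def composite(features, norms):
    zs = []
    for name, value in features.items():
        mean, sd = norms[name]
        z = (value - mean) / sd
        if name in higher_is_worse:
            z = -z
        zs.append(z)
    return statistics.mean(zs)

print(round(composite(features, norms), 2))  # → -1.19, i.e., below the normative mean
```

An equal-weighted average is the simplest choice; a real pipeline might instead weight features by how reliably each one separates patients from controls.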
At CTAD, Robin presented results from an observational study of 36 older people with FTD. About two-thirds had the behavioral subtype; in them, language problems develop late in the disease. With the help of a caregiver, participants accessed the Winterlight app on a tablet or smartphone, and the app asked them to describe pictures in detail. They did this once a month for three months, then again at six, nine, and 12 months (see image below). The volunteers could take as much time as they liked, and provide as much or as little information as they wanted. “The open-ended format provides a rich snapshot of how people speak naturally,” Robin said.
Over the year, participants’ explanations of the pictures they were shown became briefer. They also paused more in their descriptions. This, along with changes in the other variables that make up the composite, lowered their scores (see image at right). “We were pleasantly surprised to pick up a number of speech changes over time, even in people with the behavioral variant,” Robin told Alzforum. Composite scores for 41 age-matched healthy controls held steady over six months.
Robin said she will study this app in larger cohorts, and compare scores to the “CDR plus NACC-FTLD,” a validated and publicly available composite measure that combines the original cognitive and functional domains of the CDR with behavior and language domains tailored to track FTD symptoms (Mar 2021 conference news). Robin also plans to correlate the speech composite with fluid and imaging biomarker changes. The Winterlight app is already used as an exploratory endpoint in a few AD clinical trials.
Do You Sound Amyloid Positive?
Researchers at London-based Novoic are developing an AI-driven speech app to detect not only MCI, but also amyloid positivity. The company’s Emil Fristed presented results from a study dubbed Amyloid Prediction in Early Stage Alzheimer's Disease From Acoustic and Linguistic Patterns of Speech. AMYPRED-UK pits the Novoic app against in-clinic cognitive assessments and amyloid biomarker analyses to see if the digital task can peg who has memory troubles and who has plaques (Fristed et al., 2021). The team recruited 71 cognitively normal older people and 62 with MCI or mild AD. Half of the participants had had a positive amyloid PET scan or CSF analysis within the past five years, while the rest had tested negative within the previous 2.5 years.
During a baseline telemedicine call, participants took the Preclinical Alzheimer’s Cognitive Composite 5 (PACC5) test and completed an automated story recall task (ASRT) created by Novoic. For the ASRT, they listened to a 100- to 200-word story, immediately retold it, and repeated this with two new stories. The app recorded and auto-transcribed the reiterations, then scored them based on how closely they matched the original (Skirrow et al., 2021). Average scores from the three recalls correlated with MCI diagnosis about as well as did the PACC5 scores.
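Novoic’s published scoring (Skirrow et al., 2021) uses more sophisticated matching than shown here, but the core idea of scoring a retelling against the source text can be sketched with a crude proxy: the fraction of the original story’s unique words that reappear in the retelling. The function name and example story below are invented for illustration.

```python
import re

def tokens(text):
    # Lowercase and split into word tokens, keeping apostrophes.
    return re.findall(r"[a-z']+", text.lower())

def recall_score(original, retelling):
    # Crude proxy for recall fidelity: share of the original's unique words
    # that appear anywhere in the retelling. The real ASRT scoring matches
    # content more flexibly, e.g., allowing paraphrase.
    orig = set(tokens(original))
    retold = set(tokens(retelling))
    return len(orig & retold) / len(orig)

story = "The farmer lost his red hat in the storm and found it by the river."
retell = "A farmer lost his hat during a storm and later found it near the river."
print(round(recall_score(story, retell), 2))  # → 0.77
```

A set-overlap score like this ignores word order and synonyms, which is why production systems lean on richer semantic matching before averaging across the three stories.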
What about amyloid status? In the entire sample and in the MCI participants, both the app and the PACC5 correlated with amyloid no better than did demographics such as age, sex, and years of education. However, in cognitively normal people, the algorithm correlated with amyloid positivity better than did the PACC5 or demographics, though the effect was modest (see image below).
App Beats Doc, PET? The Novoic app story recall task (blue) predicted MCI (top left) as well as baseline PACC5 score (pink), while demographics alone (gray) did no better than chance. App and PACC5 similarly predicted amyloid positivity in the entire sample (top right) or MCI participants (bottom left), but the app better predicted positivity in the cognitively normal (bottom right). [Courtesy of Fristed et al., medRxiv, 2021.]
Fristed said they will wrap up analysis of all U.K. data and data from AMYPRED-US, a sister study being conducted in Santa Ana, California, by the end of this year.
You Rang, AD?
How early in Alzheimer's disease does a person's speech start to suffer? Alexandra König, University Hospital Center in Nice, France, Nicklas Linz at the start-up ki:elements in Saarbruecken, Germany, and colleagues want to answer this question over the phone. At CTAD, Linz described the Population-based Screening Over Speech for Clinical Trials in AD study. PROSPECT-AD uses a speech-based machine learning model to tease out subtle vocal changes and will test how they correlate with cognition, biomarkers, genetics, and family history of AD. The goal is to eventually use the phone as a remote screening tool for AD trials.
The researchers plan to enroll 1,200 participants 50 years and older who have subjective cognitive impairment or MCI and known amyloid and/or tau status. They will come from four longitudinal observational cohorts: BioFINDER in Sweden, INSIGHT-preAD in France, DESCRIBE in Germany, and the EPAD site in Scotland. Enrollment is slated to begin early next year.
Participants will undergo in-clinic cognitive testing at baseline and after one year. Every quarter for 18 months, volunteers will receive an automated phone call that will guide them through the Rey auditory verbal learning and semantic-fluency tasks to assess memory and executive function. Answers will be recorded, and software will analyze more than 200 variables toward a composite score. Linz anticipates interim analysis in fall 2022 and final analysis by fall 2023. He said the phone test is already being used to prescreen participants in a Phase 2 AD trial, though he did not say which one.
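The study protocol does not spell out how the fluency responses are scored, but semantic fluency is conventionally scored by counting unique, valid category exemplars while discarding repetitions and out-of-category intrusions. The sketch below illustrates that convention with an invented function and a toy word list; it is not ki:elements’ actual scoring code.

```python
def fluency_score(responses, category_set):
    # Count unique valid category exemplars from a transcribed response list.
    seen = set()
    valid = 0
    for word in (w.strip().lower() for w in responses):
        if word in seen:
            continue  # repetition: not counted again
        seen.add(word)
        if word in category_set:
            valid += 1  # intrusions (words outside the category) score nothing
    return valid

# Toy category lexicon; a real system would use a large curated list.
ANIMALS = {"dog", "cat", "horse", "lion", "tiger", "sparrow", "salmon"}
print(fluency_score(["dog", "cat", "dog", "chair", "lion"], ANIMALS))  # → 3
```

Beyond the raw count, speech-based systems can also mine the recording itself, for instance timing the gaps between words, which is presumably where many of the 200-plus variables come from.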
Why a phone test rather than an app on a mobile device? “To capture speech, a phone call makes sense,” König said. The phone also enables wider screening because older people are likelier to pick up an occasional phone call and chat for 10 minutes than download and use an app regularly.
Building That Comprehensive Dataset
All these new speech-marker strategies are currently hampered by a lack of data on which to test their wares. Scientists are trying to build AI-driven algorithms capable of analyzing and picking out patterns among thousands of variables, but existing cohort studies do not come close to satisfying AI’s insatiable appetite for data on well-characterized study participants.
To plug this hole, the Alzheimer’s Drug Discovery Foundation will collect voice files through their Diagnostics Accelerator’s (DxA) Speech and Language Consortium. This longitudinal, observational study will recruit participants from ADRCs at Boston University, Emory University in Atlanta, and the University of Pennsylvania in Philadelphia, and from the Barcelona Brain Health Initiative and Barcelonaβeta in Spain, and Lund University in Sweden. Scientists will annotate the speech files with clinical, demographic, and biomarker data from the volunteers, collected annually. “We will have one common database for use as a single reference point,” Lampros Kourtis of ADDF said.
On a tablet provided by the study, 3,000 older people, ranging from cognitively normal to those with dementia due to AD, FTD, vascular dementia, or Parkinson’s disease, will complete Novoic’s ASRT, complete a picture-description task, and answer a list of open-ended questions every quarter for at least three years. Responses will be in English, Spanish, Catalan, and Swedish, so software can eventually detect speech variables that change with cognition and are common across languages and accents.
A pilot study at BU and Emory is set to start later this year; all sites should be enrolling by next year. The dataset will be available in three to five years. “This is exactly what we need. The field as a whole will benefit from having a large dataset,” Linz said.—Chelsea Weidman Burke
- Dubois B, Epelbaum S, Nyasse F, Bakardjian H, Gagliardi G, Uspenskaya O, Houot M, Lista S, Cacciamani F, Potier MC, Bertrand A, Lamari F, Benali H, Mangin JF, Colliot O, Genthon R, Habert MO, Hampel H, INSIGHT-preAD study group. Cognitive and neuroimaging features and brain β-amyloidosis in individuals at risk of Alzheimer's disease (INSIGHT-preAD): a longitudinal observational study. Lancet Neurol. 2018 Apr;17(4):335-346. Epub 2018 Feb 27 PubMed.
- Fristed E, Skirrow C, Meszaros M, Lenain R, Meepegama U, Cappa S, Aarsland D, Weston J. Evaluation of a speech-based AI system for early detection of Alzheimer’s disease remotely via smartphones. medRxiv, October 24, 2021.
- Skirrow C, Meszaros M, Meepegama U, Lenain R, Papp KV, Weston J, Fristed E. Validation of a novel fully automated story recall task for repeated remote high-frequency administration. medRxiv, October 18, 2021.