As the mountain of single-nucleus RNA-sequencing data grows taller, how can scientists extract meaning from it? One way is pseudotime analysis. In essence, this algorithm orders cells on a virtual timeline based on the similarity of their gene-expression patterns. “Cells that are alike are placed near each other along the spectrum of transcriptional changes,” explained Laura Heath of Sage Bionetworks in Seattle. Heath presented one of several pseudotime analyses currently being done in the Alzheimer's field at the Alzheimer's Association International Conference, held last month in San Diego, California.
- "Later" pseudotime correlates with worse neuropathology, rise in disease-associated glia.
- Pseudotime says microglia change before astrocytes do during AD pathogenesis.
- Relating pseudotime to multi-omics data illuminates pathways of progression.
The resulting diagrams look like trees. Scientists call branches healthy or diseased based on their expression of known markers. This, in turn, places each cell along the health to disease trajectory, exposing sequential gene expression patterns.
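To make the idea concrete, here is a minimal toy sketch of one way to order cells by transcriptional similarity. It is not the algorithm any of the presenters used; the article does not name their software. The sketch assumes each cell is a short vector of expression values, links similar cells with a minimum spanning tree, and defines pseudotime as the distance along that tree from a chosen "root" cell. The data, the function names, and the choice of root are all hypothetical.

```python
import math


def euclidean(a, b):
    """Distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm; returns an adjacency list {i: [(j, dist), ...]}."""
    n = len(points)
    adjacency = {i: [] for i in range(n)}
    # best[j] = (distance from j to the growing tree, closest tree node)
    best = {j: (euclidean(points[0], points[j]), 0) for j in range(1, n)}
    while best:
        j = min(best, key=lambda k: best[k][0])
        dist, parent = best.pop(j)
        adjacency[parent].append((j, dist))
        adjacency[j].append((parent, dist))
        # The newly added cell may be a closer attachment point for others.
        for k in best:
            d = euclidean(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return adjacency


def pseudotime_from_root(points, root):
    """Pseudotime = path length along the tree from the root cell."""
    adjacency = minimum_spanning_tree(points)
    pseudotime = {root: 0.0}
    stack = [root]
    while stack:
        node = stack.pop()
        for neighbor, dist in adjacency[node]:
            if neighbor not in pseudotime:
                pseudotime[neighbor] = pseudotime[node] + dist
                stack.append(neighbor)
    return [pseudotime[i] for i in range(len(points))]


# Hypothetical toy data: five cells, two genes each. Cell 0 is assumed to be
# the "healthy" root; cells 3 and 4 sit on two diverging branches.
cells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
pseudotime = pseudotime_from_root(cells, root=0)
print(pseudotime)
```

In this toy case the tree forks after the third cell, so the last two cells receive the same pseudotime on separate branches. That fork is the kind of tree-like shape described above.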
Pseudotime analysis allows scientists to turn cross-sectional data into “faux” longitudinal data to understand how cells change over time. This is important for Alzheimer’s, a disease that unfolds over the course of 30 years. Postmortem tissue offers but a snapshot of one time point, making it hard to discern when and how disease markers develop. Most brain transcriptomic data come from postmortem samples and are likewise difficult to interpret because it is hard to know if gene-expression changes are due to AD pathogenesis or organ damage associated with the end of life.
“Pseudotime trajectories offer a computational approach to model [transcriptomic changes], which can serve as a starting point for more detailed studies,” wrote Gregory Carter, Jackson Lab, Bar Harbor, Maine. Maria Wörheide agreed. She works at the Helmholtz Center Institute of Bioinformatics and Systems Biology in Munich. “Manifold learning algorithms, such as pseudotime, applied to cross-sectional data, have shown potential to provide novel insights into AD, although their robustness and scalability will require further investigation,” she wrote (full comments below).
At AAIC, four scientists showed how they use pseudotime analysis to wrangle RNA-Seq data. Two followed transcriptomic changes in astrocytes, describing a continuum of homeostatic to reactive cells in both healthy aging and AD. One plotted entire prefrontal cortex transcriptomes onto pseudotime trees, and one tied transcriptomic pseudotime to metabolomic changes in the AD brain.
As a test of the methodology, Heath and colleagues first performed a pseudotime analysis on bulk RNA-Seq data from postmortem brain tissue of healthy and AD cases in the Religious Orders Study and Memory and Aging Project (ROSMAP) cohort (Mukherjee et al., 2020). “The modeled trajectory beautifully recapitulated neuropathology and clinical disease states, such that control samples were near the beginning and AD samples at the end,” she explained at AAIC.
Next, Heath used the algorithm on published single-nucleus RNA-Seq data from 80,500 prefrontal cortex cells of 48 ROSMAP participants, half controls and half AD, as well as 1.2 million cells from the middle temporal gyrus of 84 people ranging from healthy to AD in the new Seattle Alzheimer’s Disease Brain Cell Atlas (May 2019 news; see Part 17 of this series).
Aiming to draw pseudotime trajectories for each cell type, Heath first focused on astrocytes. She collected the transcriptomes of 3,400 astrocytes from the 48 ROSMAP participants and 47,000 astrocytes from nine controls and 47 AD cases in SEA-AD. Heath said she didn’t analyze all 84 SEA-AD participants to avoid bogging down the algorithm with too much data.
Heath identified about 3,000 differentially expressed genes in astrocytes from AD samples compared to controls, then plugged these DEGs into the pseudotime algorithm to generate proxy disease trajectories for both datasets. “Late” pseudotime correlated with a high degree of AD pathology via Braak and CERAD scores, though only in the ROSMAP dataset. Despite its much larger size, the SEA-AD data contained many more AD cases than controls, and Heath believes this might have obscured the initial branching off of a temporal pattern.
That said, astrocytic changes were consistent in both datasets. As pseudotime “went by,” the glia progressed through six distinct phenotypes (see image below). Heath called the first homeostatic and the sixth reactive because the former highly expressed genes essential to astrocytic function, such as APOE, clusterin, and glutamine synthetase-encoding GluL, while the latter barely expressed those genes. Also, astrocytes from three SEA-AD participants without AD pathology, men aged 29, 42, and 50, matched the first group, supporting the categorization.
Sprouting Subtypes. Pseudotime trajectories (left to right) of astrocyte transcriptomes (dots) from ROSMAP (top) and SEA-AD (bottom) datasets. Colors denote six states; the first is called homeostatic (red circle). [Courtesy of Laura Heath, Sage Bionetworks]
Notably, astrocytes from controls and AD cases seemed much the same. Though controls had slightly more homeostatic astrocytes and cases had a few more reactive ones, each participant had astrocytes in all six states at widely varying proportions (see image below). At first, this surprised Heath. “Given how essential astrocytes are to maintaining neuronal health, and how responsive they are to all kinds of signals occurring during aging, there must be a need for multiple types of reactive astrocytes in all or most aging brains regardless of overt neuropathology,” she reasoned.
Everyone Has Every Type. All six astrocyte subpopulations (colors) were present in varying proportions among controls (left) and AD cases with low (middle) and high (right) pathology. [Courtesy of Laura Heath, Sage Bionetworks]
Sudeshna Das, Massachusetts General Hospital, Charlestown, reinforced Heath’s findings. Her pseudotime analysis also rendered a continuous spectrum of change from homeostatic to reactive astrocytes. Her MGH colleagues, in collaboration with Abbvie Inc., sequenced single nuclei from the prefrontal, entorhinal, visual cortex, and inferior temporal gyrus of 32 participants from the Massachusetts Alzheimer’s Disease Research Center. Averaging 80 years old, they ranged from Braak stages 0 to VI; controls were defined as Braak 0, I, or II without amyloid plaques, intermediates as Braak II or III with plaques, and AD cases as Braak V or VI with plaques. The scientists did not present the pre-mortem clinical diagnosis in their study, only the neuropathology data.
About those “intermediate” astrocytes. Do they represent cells in a continuum between homeostatic and reactive? Das ran a pseudotime analysis organizing the cells from the former to the latter. Six clusters of genes with similar expression patterns appeared: two whose expression rose together from homeostatic to reactive, one whose expression fell, and three whose expression peaked somewhere in between. “This suggests that the different astrocyte subpopulations may not be specialized cells, but rather transcriptional states in a trajectory from homeostatic to reactive,” Das said. Again, Heath agreed, noting that she sees similar intermediate astrocyte clusters in her data.
Grouping People, Not Cells
Gilad Green, Hebrew University of Jerusalem, took a different approach. He ran pseudotime analyses on the combined transcriptomes of all brain cells from each participant. Heath noted that pseudotime modeling is flexible enough to work on noisy data like that.
First, Green created a sizable snRNA-Seq database of 1.6 million cells from prefrontal cortex tissue of 478 ROSMAP participants and defined 96 distinct cell populations (see Part 17). Then he calculated the proportion of each cell population for each person, combined them into one composite value, and used a pseudotime algorithm to plot each value based on how similar it is to others. This created a forked trajectory, much like Heath’s above, where each data point represents a person, rather than a cell. Green declined Alzforum's request to share a representative image.
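The first step of this person-level approach, turning labeled nuclei into per-participant proportions, can be sketched as follows. This is an illustration only: the participant IDs, population labels, and function name are hypothetical, and the sketch treats the vector of proportions as each person's profile, which is an assumption about how the composite was built.

```python
from collections import Counter

# Hypothetical toy input: one (participant_id, cell_population) pair per nucleus.
nuclei = [
    ("P1", "homeostatic_astrocyte"), ("P1", "homeostatic_astrocyte"),
    ("P1", "homeostatic_microglia"), ("P1", "reactive_astrocyte"),
    ("P2", "homeostatic_astrocyte"), ("P2", "DAM"),
    ("P2", "DAM"), ("P2", "DAA"),
]


def population_proportions(nuclei):
    """Return (population names, {participant: proportion vector})."""
    counts = {}
    for participant, population in nuclei:
        counts.setdefault(participant, Counter())[population] += 1
    # Fix one column order so every participant's vector is comparable.
    populations = sorted({population for _, population in nuclei})
    profiles = {}
    for participant, counter in counts.items():
        total = sum(counter.values())
        profiles[participant] = [counter[p] / total for p in populations]
    return populations, profiles


populations, profiles = population_proportions(nuclei)
for participant, profile in profiles.items():
    print(participant, dict(zip(populations, profile)))
```

Each profile could then be handed to a trajectory-inference method, so that every point on the resulting trajectory stands for a person rather than a cell.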
In Green’s trajectory, a single mass of points, which he believes represent homeostatic cells, diverged into two, presumably disease-related, paths. As pseudotime “passed” in each path, the proportion of homeostatic glia decreased in each person, just as Heath and Das had found. One path became enriched with reactive, GFAP-positive astrocytes, the other with disease-associated microglia (DAM) and disease-associated astrocytes (DAA). Green named these cells in this way because of their strong upregulation in the presence of amyloid plaques and neurofibrillary tangles.
To relate these paths to AD, Green consulted neuropathological and clinical data. He matched each person's degree of amyloid or tau pathology, as determined by immunohistochemistry, and their rate of decline on the ROSMAP cognitive composite, with their placement on the pseudotime paths.
People situated on the DAM/DAA path had more plaques and tangles, and faster cognitive decline, than those on the reactive astrocyte path. The farther along in pseudotime a person was, the worse their neuropathology and cognition had been. Moreover, the proportion of disease-associated microglia was highest at “early” pseudotime, while that of disease-associated astrocytes was highest at “later” pseudotime. This aligned with Green’s previous data that a “DAM” microglial response precedes “DAA” astrocytes (see Part 17). He concluded that this path modeled AD progression.
Then what is the reactive astrocyte branch? Green assumes it is not normal aging, because the people on it include those with AD and a wide range of pathologies. He hypothesizes that it may represent people with slow progression or mixed dementia.
To Pseudotime and Beyond
If pseudotime is not sci-fi enough, watch Wörheide take exploration of omics space a step further. Wörheide used Heath’s published pseudotime analysis of bulk RNA-Seq and related it to metabolic change. This identified how metabolites link up with disease progression.
Wörheide analyzed mass-spectrometry concentrations of 667 metabolites from prefrontal cortex tissue of 154 ROSMAP participants in Heath’s RNA-Seq pseudotime analysis. The metabolites ranged from lipids and carbohydrates to amino acids and nucleotides.
Because “later” pseudotime meant worse AD, Wörheide correlated pseudotime to metabolite level. The concentration of 89 molecules rose or fell in lockstep with pseudotime. The largest group, 36, were amino acids and their metabolites, followed by 17 types of lipid and 10 nucleotides and their metabolites.
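As a rough illustration of what correlating pseudotime to metabolite level can look like, the sketch below computes a Spearman rank correlation between each sample's pseudotime and each metabolite's concentration. The article does not say which statistic, covariates, or significance threshold Wörheide used, so the choice of Spearman is an assumption, and the metabolite names and values are invented toy data.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: pseudotime for five samples and three metabolites.
sample_pseudotime = [0.1, 0.4, 0.5, 0.8, 1.0]
metabolites = {
    "metabolite_A": [1.0, 1.5, 2.1, 3.0, 3.2],  # rises with pseudotime
    "metabolite_B": [5.0, 4.1, 3.9, 2.0, 1.5],  # falls with pseudotime
    "metabolite_C": [2.0, 3.0, 1.0, 3.5, 2.5],  # no clear trend
}

correlations = {
    name: spearman(sample_pseudotime, levels)
    for name, levels in metabolites.items()
}
for name, rho in correlations.items():
    print(f"{name}: rho = {rho:.2f}")
```

A real analysis would also need a significance test with correction for the 667 metabolites tested, which this sketch leaves out.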
Wörheide then correlated each metabolite to data in the AD Atlas, a database her group created from genomic, transcriptomic, proteomic, metabolomic, and clinical data from ROSMAP (Wörheide et al., 2021). This atlas boasts such AD phenotyping on more than 20,000 protein-coding genes, 8,000 proteins, and nearly 1,000 metabolites.
Of the 89 statistically significant metabolites, the AD Atlas already contained 50. Thirty-four correlated with 619 genes mapped onto pathways such as amino acid metabolism and neurotransmitter transport. Wörheide then searched the atlas's transcriptomic and proteomic data for differential expression of those genes and their resulting proteins. She dredged up 193 DEGs and 39 differentially expressed proteins in AD cases versus controls. Differences in transcription were biggest in the temporal cortex.
Twenty-seven of these 34 AD-linked metabolites have already been correlated with plaque and tangle pathology, brain glucose uptake, or cognition (Batra et al., 2022). All in all, Wörheide believes that relating pseudotime to other omics data can shed new light on pathways that lead to AD dementia.
Heath agreed. “It is exciting when you see a convergence of signals toward a similar pathway among the different data types, because it strengthens the biological relevance of the pathway,” she said.—Chelsea Weidman Burke
- When It Comes to Alzheimer’s Disease, Do Human Microglia Even Give a DAM?
- RNA-Seq from 2.8 Million Cells Yields New Clues About Alzheimer's
- Mukherjee S, Heath L, Preuss C, Jayadev S, Garden GA, Greenwood AK, Sieberts SK, De Jager PL, Ertekin-Taner N, Carter GW, Mangravite LM, Logsdon BA. Molecular estimation of neurodegeneration pseudotime in older brains. Nat Commun. 2020 Nov 13;11(1):5781.
- Wörheide MA, Krumsiek J, Nataf S, Nho K, Greenwood AK, Wu T, Huynh K, Weinisch P, Römisch-Margl W, Lehner N, The AMP-AD Consortium, The Alzheimer’s Disease Neuroimaging Initiative, The Alzheimer’s Disease Metabolomics Consortium, Baumbach J, Meikle PJ, Saykin AJ, Doraiswamy PM, van Duijn C, Suhre K, Kaddurah-Daouk R, Kastenmüller G, Arnold M. An Integrated Molecular Atlas of Alzheimer’s Disease. medRxiv 2021.09.14.21263565.
- Batra R, Arnold M, Wörheide MA, Allen M, Wang X, Blach C, Levey AI, Seyfried NT, Ertekin-Taner N, Bennett DA, Kastenmüller G, Kaddurah-Daouk RF, Krumsiek J, Alzheimer's Disease Metabolomics Consortium (ADMC). The landscape of metabolic brain alterations in Alzheimer's disease. Alzheimers Dement. 2022 Jul 13.