For amyloid imaging to become widely useful in clinics beyond a small number of research settings, a nuclear medicine physician should be able to look at a person’s scan and know if the scan is positive or negative. No number crunching, no data plots—just a quick, so-called binary read. That instantaneous interpretation comes with an intangible human element. Can it be done correctly and reliably? This question came up last January at an FDA advisory committee meeting. There, concern over variability among readers prompted the FDA to direct the radiotracer’s sponsor first to develop a reader training program and prove that it works before requesting that florbetapir be approved for clinical use (see ARF related news story). At the 6th Annual Human Amyloid Imaging Conference held 12-13 January 2012 in Miami, Florida, Mark Mintun of Avid Radiopharmaceuticals in Philadelphia, Pennsylvania, presented data to address this charge (Mintun et al., 2012).
Mintun and colleagues developed a binary read method for florbetapir scan images displayed in black and white. The method is based on visually judging the extent of contrast at the brain’s white matter-gray matter boundary. For a scan to be called positive, it has to have reduced contrast along an inverted gray scale in at least two brain areas. Mintun presented a visual read and testing program where the goal was to call correctly whether the scan is positive or negative. The scientists used several series of scans; some had predetermined right or wrong answers based on postmortem pathology read with CERAD criteria for the frequency of neuritic plaques, while other series were from clinical practice and judged against the clinical diagnosis.
The training starts with a lecture; the physicians then practice on five demonstration cases and seven practice cases in an interactive session, and finally assess what they have learned on 20 more cases. This takes three hours, either in person with a trainer present or remotely with a DVD. Mintun and colleagues tested this training regimen in three studies, one using 35 autopsy cases and nine readers, one using 59 autopsy cases and five other readers, and one using 59 autopsy cases with five more readers. The median sensitivity and specificity were in the low 90s, Mintun reported. How well one reader agrees with another is typically expressed by a measure called Fleiss’ kappa; 1.00 signifies perfect agreement, and in this series of tests it ranged from 0.75 to 0.85.
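For readers who want to see what these summary statistics measure, here is a minimal Python sketch of the standard textbook definitions of sensitivity, specificity, and Fleiss’ kappa for binary reads. The function names and the toy reads are illustrative assumptions; they are not Avid’s analysis code, and the numbers are not study data.

```python
def sensitivity_specificity(reads, truth):
    """Compare one reader's binary calls against a reference standard.

    reads, truth: equal-length lists of booleans (True = amyloid positive).
    Returns (sensitivity, specificity).
    """
    tp = sum(r and t for r, t in zip(reads, truth))
    fn = sum((not r) and t for r, t in zip(reads, truth))
    tn = sum((not r) and (not t) for r, t in zip(reads, truth))
    fp = sum(r and (not t) for r, t in zip(reads, truth))
    return tp / (tp + fn), tn / (tn + fp)


def binary_reads_to_counts(reads_by_scan):
    """Turn per-scan lists of reader calls into [n_positive, n_negative] rows."""
    return [[sum(calls), len(calls) - sum(calls)] for calls in reads_by_scan]


def fleiss_kappa(counts):
    """Fleiss' kappa for a table of category counts.

    counts: one row per scan, one column per category; every row must sum
    to the same number of readers (at least two).
    """
    n_scans = len(counts)
    n_readers = sum(counts[0])
    if any(sum(row) != n_readers for row in counts):
        raise ValueError("every scan needs the same number of reads")
    n_categories = len(counts[0])

    # Share of all reads that fall into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_scans * n_readers)
        for j in range(n_categories)
    ]
    # Observed agreement among reader pairs, per scan, then averaged.
    p_scan = [
        (sum(c * c for c in row) - n_readers) / (n_readers * (n_readers - 1))
        for row in counts
    ]
    p_observed = sum(p_scan) / n_scans
    # Agreement expected by chance alone.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when every read is identical")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example (made-up reads): four scans, five readers each.
toy_reads = [
    [True, True, True, True, True],
    [True, True, True, True, False],
    [False, False, False, False, False],
    [True, False, False, False, False],
]
toy_kappa = fleiss_kappa(binary_reads_to_counts(toy_reads))
```

In this toy set, readers split four to one on two of the four scans, and kappa falls well below 1.00 even though most individual calls agree. That is because kappa discounts the agreement that would be expected by chance.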
In-person training yielded slightly better results than did remote training, where the physicians clicked through the material but did not have an expert in the room to ask questions. The nuclear medicine physicians came from academic and private practice backgrounds and did not need experience with brain PET, Mintun told the audience. When readers called a scan incorrectly, it tended to happen on the same brains. For example, one person whose case was positive had died two years after the scan, and most readers read that scan as negative; other readers had trouble with atrophy or when the signal seemed to them to be right on the border, Mintun said during audience discussion.
Mintun noted that when non-autopsy cases of clinically diagnosed Alzheimer’s, mild cognitive impairment, and controls were mixed into an autopsy series, agreement among the readers rose. He addressed a concern that had arisen a number of times in previous sessions at HAI, that is, whether MCI might be harder to read and generate more borderline results. “In our reader training, that was not true,” Mintun said. Reader agreement for 92 clinical MCI cases was 98 percent, Mintun reported, adding that the readers also expressed more confidence reading clinical MCI cases than interpreting the autopsy cases. Autopsy cases can seem ambiguous because they show brains of people who were very ill at the time of their scans, Mintun said in discussion. Others countered that, while clinical research cases of MCI may indeed be less ambiguous than autopsy cases, day-to-day patients in routine clinical settings may be trickier to read because they may have strokes, white matter disease, and other comorbidities. Overall, though, the audience at HAI saw the results of this training program as reassuring.
Importantly, nuclear medicine researchers tend to pursue a different goal than the FDA, which acts on behalf of the general public being treated by non-specialist clinicians. Researchers prefer quantitative, nuanced measurements, whereas the FDA wants to be convinced that a robust, thumbs up-down binary read will serve the public, said Keith Johnson, who co-organizes the HAI Conference. At the conference, several presentations explored ways of setting appropriate thresholds above which to call a scan positive. For example, Ann Cohen of the University of Pittsburgh Medical School compared threshold-setting methods and tested them with a handful of independent, blinded readers (Cohen et al., 2012). Gil Rabinovici of the University of California, Berkeley, compared more liberal and more stringent published thresholds in a pathology series of early-stage patients to check whether any might be set too low. He reported that even with the liberal threshold, once people’s scans read positive, they already had abundant β amyloid in their brains, alleviating concern about potential false positives (Rabinovici et al., 2012).
Relationships among postmortem CERAD diagnosis, quantitative PIB threshold (blue line = liberal, red line = conservative), and visual reads. All scans read as positive showed frequent CERAD plaques. Image credit: Gil Rabinovici, William Jagust
A slew of posters showcased academic-industry collaborations to formally standardize amyloid PET for robust performance in multicenter studies. Overall, scientists agreed with Robert Koeppe’s advice that the field will need both visual reads and quantitative analysis. The former allows an up-down determination of whether a scan is positive, while the latter can pick up early-stage deposition in individual regions and subtle changes over time or in response to drug.—Gabrielle Strobel.
- Committee Shoots Down Florbetapir, Raising Bar for Field at Large
- News Focus: 2012 Human Amyloid Imaging Conference
- Miami: Amyloid PET in the Clinic: What Are the Issues?
- Miami: Scan and Tell? Amyloid Imaging Confronts Disclosure Dilemma
- Miami: When Does Amyloid Deposition Start in Familial Alzheimer’s?
- Miami: Age and Amyloid—What Has ApoE Got to Do With It?
- Miami: Longitudinal Amyloid PET Data Start Converging
- Miami: Diagnosis and Amyloid Scan Can Be at Odds
- Miami: Scientists Angle for Way to Image Tangle
- Mintun M, Clark C, Pontecorvo M, Lu M, Krautkramer M, Arora A, Joshi A, Veeraraj C, Skovronsky D. Evaluation of a Binary Read Methodology for Florbetapir-PET Images. Human Amyloid Imaging Abstract. 2012 Jan 1.
- Cohen A, Bi W, Weissfeld L, Aizenstein H, McDade E, Mountz J, Nebes R, Saxton J, Snitz B, Price J. Classification of Amyloid-Positivity in Controls: Comparison of Approaches. Human Amyloid Imaging Abstract. 2012 Jan 1.
- Rabinovici G, Ghosh P, Madison C, Corbetta C, Mormino E, Miller B, Grinberg L, Seeley W, Jagust W. PIB+ Scans in Dementia Patients are Associated with High Post-Mortem Amyloid Burden. Human Amyloid Imaging Abstract. 2012 Jan 1.