Mice on Trial? Issues in the Design of Drug Studies

Sean Scott
Tiny and timid though it may be, the humble house mouse is not only one of the most successful mammalian species on Earth, but is also the dominant model organism for medical research, especially because of the ease with which it can be genetically manipulated to express human disease-causing genes. Yet time and again, “cures” effected in mice have failed when tried in human patients. Is this simply because “mice just aren’t humans,” as researchers are wont to say, or is there something else going on?
A recent study by a team of investigators at the ALS Therapy Development Institute (ALS-TDI) suggests another reason why mouse studies fail to replicate in humans: uncontrolled biological variables in underpowered studies. The authors recommend a minimum study design to manage the inherent noise in the system. Similar issues may arise in mouse studies in other fields.
Sean Scott, president of ALS-TDI, presented the findings from the study. Other featured participants included Ben Barres, Mike Sasner, Lucie Bruijn, Greg Cox, Stan Appel, Cathy Lutz, Gene Johnson, Jeff Rothstein, and Jonathan Glass. We thank Melanie Leitner of Prize4Life for helping to organize this discussion.
View Transcript of Live Discussion — Posted 16 June 2008 View Comments By:
John Trojanowski, Virginia Lee — Posted 27 May 2008
Patrizia Fanara — Posted 29 May 2008
Tennore Ramesh — Posted 30 May 2008
Background Text
By Melanie Leitner, Prize4Life
This forum built on a recent study by Sean Scott and his colleagues at the ALS Therapy Development Institute, published in the journal Amyotrophic Lateral Sclerosis, which indicated that failure to control for biological variables, common in the design of mouse drug efficacy studies, can explain why promising drug studies in mice have resulted in dashed hopes when the compounds reached clinical trials in ALS patients. These observations raise the possibility that these common issues in study design may pose similar problems for other neurodegenerative diseases.
In the 1990s, a mutation in superoxide dismutase 1 (SOD1) was identified as the cause of a significant subset of familial amyotrophic lateral sclerosis (FALS) cases. This discovery led to the generation of transgenic rodent models of autosomal dominant SOD1 FALS. Mice carrying 23 copies of the human SOD1G93A transgene have become the standard model for FALS and ALS therapeutic studies. To date, there have been at least 50 publications describing therapeutic agents that extend the lifespan of this mouse. However, no therapeutic agent besides riluzole has shown corresponding clinical efficacy.
Using computer modeling and statistical analysis of over 5,000 SOD1G93A mice, Scott et al. quantified the impact of several critical confounding biological variables frequently present in transgenic mouse studies, and developed an optimal study design that controlled for these variables. When the authors retested various compounds previously reported to be efficacious in major animal studies using this optimal study design, the authors found no survival benefit in the SOD1G93A mouse for any of these compounds (including riluzole), all of which were administered by their previously reported routes and doses. The compounds retested in this way included minocycline, creatine, celecoxib, and sodium phenylbutyrate, all of which were followed up in ultimately unsuccessful human clinical trials.
The results of this paper suggest that historically there has been a profound and widespread problem in the design, and therefore the interpretation, of many drug efficacy studies in the most commonly used and widely accepted mouse model of ALS. The primary aim of this discussion forum is to invite experts and interested researchers to examine the implications of this study both for the ALS field itself and for related fields of neurodegenerative disease research. This discussion seeks to 1) ask whether this problem, rather than being unique to the G93A SOD1 model of disease, may be endemic to other transgenic mouse studies, particularly overexpression paradigms in transgenic mice on hybrid backgrounds, and 2) if the community deems this to be a widespread problem, develop ideas to minimize its impact and devise approaches that might facilitate more significant and reproducible results among all laboratories employing mouse models of neurodegenerative disease.
Questions for discussion:
1. Do the findings from the ALS-focused study presented here translate to other mouse models of neurodegenerative disease? What, if any, are the implications of these findings for other mouse models of neurodegenerative disease (specifically APP, α-synuclein, and other overexpression models)?
2. What are the implications of this study for earlier and ongoing mouse studies that do not follow these rigorous guidelines?
3. What, if any, are the obligations of the research community a) when reviewing articles for publication that do not follow these strict criteria for design, and b) when reviewing grant applications that do not follow these strict criteria for design? Should the NIH and/or other funders take a position on this issue?
Reference:
Scott S, Kranz JE, Cole J, Lincecum JM, Thompson K, Kelly N, Bostrom A, Theodoss J, Al-Nakhala BM, Vieira FG, Ramasubbu J, Heywood JA. Design, power, and interpretation of studies in the standard murine model of ALS. Amyotroph Lateral Scler. 2008;9(1):4-15.
Comments on Live Discussion
Comment by: Virginia Lee, ARF Advisor, John Trojanowski, ARF Advisor
Submitted 27 May 2008
Posted 27 May 2008
Picking the Right Model of Neurodegeneration for Drug Discovery for Patients With Sporadic Amyotrophic Lateral Sclerosis
Comment by John Q. Trojanowski and Virginia M.-Y. Lee
Despite significant heterogeneity within frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS), TDP-43 has emerged as the common pathological substrate linking FTLD with ubiquitin inclusions (FTLD-U) and ALS since the initial report describing ALS and FTLD-U as TDP-43 proteinopathies in 2006 (1). Subsequent studies support the hypothesis that FTLD-U and ALS represent two extremes of a clinico-pathological spectrum of TDP-43 proteinopathies. However, pathological TDP-43 inclusions are absent in familial ALS (FALS) with SOD1 mutations (SOD1-FALS) yet are present in all cases of sporadic ALS (SALS) and some cases of non-SOD1-dependent FALS. This implies that SOD1-FALS is not the familial counterpart of SALS nor of FALS cases caused by other genetic abnormalities (2).
Indeed, despite some early skepticism about this view, a flurry of recent reports establishes that point mutations in the TDP-43 gene (TARDBP), especially in the glycine-rich region that is essential for RNA splicing, cause FALS and SALS (3-7), and that TARDBP variants may be genetic risk factors for disease (8). Moreover, recent studies suggest that ALS is a multi-system TDP-43 proteinopathy rather than a disorder restricted to the pyramidal motor system. That is because neuronal and glial TDP-43 inclusions are present throughout the CNS, not just in upper and lower motor neurons; these TDP-43 lesions are always associated with loss of nuclear TDP-43, thereby resulting in a loss of TDP-43 nuclear functions (9,10).
In light of these and the more than 90 studies published on TDP-43 in the last 20 months, it is reasonable to ask whether transgenic SOD1 mice are models only of FALS due to SOD1 gene mutations. Do they perhaps fail to model other forms of ALS, including SALS and FALS caused by mutations in genes other than SOD1, because the underlying disease mechanisms differ between SOD1 FALS and other forms of ALS? If so, proof-of-concept studies of potential ALS therapies that target SOD1-mediated neurodegeneration may yield effective therapies for SOD1-dependent FALS, but not for other forms of ALS. This may be why such therapies have not shown efficacy in patients with SALS and are unlikely to work in patients with SOD1-independent forms of FALS.
Recognition that TDP-43 pathology underlies FTLD-U and ALS opens up new avenues for drug discovery focusing on TDP-43-related targets to develop mechanistically based therapies for these disorders. Many of us who work on ALS research are actively pursuing efforts to develop TDP-43 transgenic mouse models of ALS that all of us hope will accelerate the pace of drug discovery for this disorder and other TDP-43 proteinopathies.
References:
1. Neumann M, Sampathu DM, Kwong LK, Truax AC, Micsenyi MC, Chou TT, Bruce J, Schuck T, Grossman M, Clark CM, McCluskey LF, Miller BL, Masliah E, Mackenzie IR, Feldman H, Feiden W, Kretzschmar HA, Trojanowski JQ, Lee VM. Ubiquitinated TDP-43 in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Science. 2006 Oct 6;314(5796):130-3.
2. Mackenzie IR, Bigio EH, Ince PG, Geser F, Neumann M, Cairns NJ, Kwong LK, Forman MS, Ravits J, Stewart H, Eisen A, McClusky L, Kretzschmar HA, Monoranu CM, Highley JR, Kirby J, Siddique T, Shaw PJ, Lee VM, Trojanowski JQ. Pathological TDP-43 distinguishes sporadic amyotrophic lateral sclerosis from amyotrophic lateral sclerosis with SOD1 mutations. Ann Neurol. 2007 May;61(5):427-34.
3. Gitcho MA, Baloh RH, Chakraverty S, Mayo K, Norton JB, Levitch D, Hatanpaa KJ, White CL, Bigio EH, Caselli R, Baker M, Al-Lozi MT, Morris JC, Pestronk A, Rademakers R, Goate AM, Cairns NJ. TDP-43 A315T mutation in familial motor neuron disease. Ann Neurol. 2008 Apr;63(4):535-8.
4. Kabashi E, Valdmanis PN, Dion P, Spiegelman D, McConkey BJ, Vande Velde C, Bouchard JP, Lacomblez L, Pochigaeva K, Salachas F, Pradat PF, Camu W, Meininger V, Dupre N, Rouleau GA. TARDBP mutations in individuals with sporadic and familial amyotrophic lateral sclerosis. Nat Genet. 2008 May;40(5):572-4.
5. Sreedharan J, Blair IP, Tripathi VB, Hu X, Vance C, Rogelj B, Ackerley S, Durnall JC, Williams KL, Buratti E, Baralle F, de Belleroche J, Mitchell JD, Leigh PN, Al-Chalabi A, Miller CC, Nicholson G, Shaw CE. TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis. Science. 2008 Mar 21;319(5870):1668-72.
6. Van Deerlin VM, Leverenz JB, Bekris LM, Bird TD, Yuan W, Elman LB, Clay D, Wood EM, Chen-Plotkin AS, Martinez-Lage M, Steinbart E, McCluskey L, Grossman M, Neumann M, Wu IL, Yang WS, Kalb R, Galasko DR, Montine TJ, Trojanowski JQ, Lee VM, Schellenberg GD, Yu CE. TARDBP mutations in amyotrophic lateral sclerosis with TDP-43 neuropathology: a genetic and histopathological analysis. Lancet Neurol. 2008 May;7(5):409-16.
7. Yokoseki A, et al. TDP-43 mutation in familial amyotrophic lateral sclerosis. Ann Neurol. 2008.
8. Winton MJ, Van Deerlin VM, Kwong LK, Yuan W, McCarty-Wood E, Yu CE, Schellenberg GD, Rademakers R, Caselli R, Karydas A, Trojanowski JQ, Miller BL, Lee VM. A90V TDP-43 variant results in aberrant nuclear localization of TDP-43. FEBS Lett. 2008; in press.
9. Geser F, Brandmeir NJ, Kwong LK, Martinez-Lage M, Elman L, McCluskey L, Xie SX, Lee VM, Trojanowski JQ. Evidence of multisystem disorder in whole-brain map of pathological TDP-43 in amyotrophic lateral sclerosis. Arch Neurol. 2008 May;65(5):636-41.
10. Nishihira Y, Tan CF, Onodera O, Toyoshima Y, Yamada M, Morita T, Nishizawa M, Kakita A, Takahashi H. Sporadic amyotrophic lateral sclerosis: two pathological patterns shown by analysis of distribution of TDP-43-immunoreactive neuronal and glial cytoplasmic inclusions. Acta Neuropathol. 2008 May 15.
Comment by: Patrizia Fanara
Submitted 28 May 2008
Posted 29 May 2008
Thank you, Sean, for presenting yesterday to address the implications of your study.
I agree with you that the SOD1 mouse model can still be used to eventually achieve control over this disease.
We have recently published the use of “authentic,” i.e., disease-specific, biomarkers in preclinical animal models to reveal the actual dynamics of a biological system affected by disease and to develop novel mechanistically based therapies. Through this approach, we were the first to publish the null efficacy of riluzole. Our studies used riluzole as a negative control, showing what the absence of an effect looks like for a mechanistically based therapy.
Questions I'd suggest for further discussion:
- In addition to establishing more rigorous constraints and thus ensuring minimal variability in future preclinical studies, are we also considering the impact of using biological-based biomarkers (metrics that are intrinsically linked to the pathogenesis, progression, and reversal of the disease) to bridge the animal neurological score with molecular changes underlying this disease?
- Should we include in vivo readouts in animal models to 1) maximize certainty of therapeutic effect before human trials begin, 2) improve the validity of the SOD1 mouse model, and 3) develop novel mechanistically based therapeutic interventions?
References:
Fanara P, Banerjee J, Hueck RV, Harper MR, Awada M, Turner H, Husted KH, Brandt R, Hellerstein MK. Stabilization of hyperdynamic microtubules is neuroprotective in amyotrophic lateral sclerosis. J Biol Chem. 2007 Aug 10;282(32):23465-72.
Comment by: Tennore Ramesh
Submitted 30 May 2008
Posted 30 May 2008
Based on the large-scale analysis of survival in the Bl6/SJL mixed hybrid strain of SOD1 G93A (Gur high-copy) mice, it is clear that survival as a measure of drug efficacy is dogged by issues of appropriate N, gender, and litter matching. With all the necessary controls in place, it is possible to reliably detect a 3 percent effect on survival using N = 20 animals/group (litter matched). This is based on the only positive control in the study, which is the effect of gender on survival. This effect is reliable and reproducible in almost every study with N = 60/group and is statistically significant.
Regarding the authors' statement that the effects in most of the published studies are primarily due to inherent noise in the system: I disagree with this blanket statement. The figures on apparent effect are misleading because they measure the frequency of any overall effect (like tossing a coin), rather than the frequency of a statistically significant overall effect (which is what most published studies report). Although I agree that the t-test is not the appropriate test and is not stringent, I wanted to see whether a simple t-test would give a large number of false positives. When I reanalyzed the data by taking historic controls and randomly assigning them to control and treatment groups, I found a very different outcome. Analysis by randomization of the data as performed by SimLIMS (with 974 randomization experiments) shows that while an apparent positive effect can occur in a study 48 percent of the time, it is very unlikely that one would see a statistically significant effect repeatedly in the same direction (Table 1, Figure 1 [.pdf]). In fact, the random distribution of control animals into treated and untreated groups resulted in a statistically significant (p <0.05) positive effect 1 percent of the time with an N of 20 animals/group. Among all the mock experiments that gave a statistically significant p-value, only 23 percent were in the positive direction, indicating that these random effects would not be repeatable. Finally, the average treatment effect that one would see by chance, obtained by averaging the effects in all studies, is close to zero (Table 1). When the N was increased, this random statistically significant positive effect became even rarer (data not shown). This indicates that about 1 in 100 experiments with 20 animals/group can give a statistically significant positive drug effect by chance, and that such effects are seldom repeatable.
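The randomization described above can be sketched in a few lines of Python. This is a minimal illustration only, not SimLIMS and not the data behind Table 1: the pool of "historical control" survival times is simulated, and its mean (130 days), its standard deviation (10 days), the function names, and the hard-coded critical value are all assumptions made for the example. Each mock experiment draws 40 animals from the pool, splits them at random into a control group and a mock-treated group of 20, and applies a two-sided pooled t-test.

```python
import random
import statistics

N_PER_GROUP = 20
# Two-sided 5% critical value of Student's t with 38 degrees of freedom
# (two groups of 20 animals, pooled variance). Valid only for N_PER_GROUP = 20.
T_CRIT_DF38 = 2.024


def pooled_t(group_a, group_b):
    """Pooled-variance t statistic; positive when group_b outlives group_a."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (statistics.mean(group_b) - statistics.mean(group_a)) / std_err


def mock_experiments(pool, n_experiments=2000, seed=0):
    """Randomly split untreated animals into 'control' and 'mock-treated'."""
    rng = random.Random(seed)
    apparent_positive = significant = significant_positive = 0
    for _ in range(n_experiments):
        animals = rng.sample(pool, 2 * N_PER_GROUP)  # random order, no repeats
        control = animals[:N_PER_GROUP]
        mock_treated = animals[N_PER_GROUP:]
        t = pooled_t(control, mock_treated)
        if t > 0:
            apparent_positive += 1          # any apparent benefit, however small
        if abs(t) > T_CRIT_DF38:
            significant += 1                # p < 0.05, either direction
            if t > 0:
                significant_positive += 1   # p < 0.05 and apparent benefit
    return {
        "apparent_positive": apparent_positive / n_experiments,
        "significant": significant / n_experiments,
        "significant_positive": significant_positive / n_experiments,
    }


# Hypothetical pool of untreated survival times in days (assumed values).
pool_rng = random.Random(1)
historical_controls = [pool_rng.gauss(130.0, 10.0) for _ in range(500)]

result = mock_experiments(historical_controls)
print(result)
```

Because the assumed pool is idealized, the printed fractions need not match the percentages reported in Table 1; the sketch only shows how apparent effects and statistically significant effects are counted separately.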
SOD1 G93A high-copy mice obtained from The Jackson Laboratory show gender differences in survival (1). The magnitude of this effect is small, about 3 percent. This small but consistent effect was tested in the SimLIMS randomization model. When a realistic gender effect was measured, 12 percent of the experiments showed significant gender effects (p <0.05), an almost 12-fold increase from the noise level of 1 percent seen by random chance. Most importantly, 99 percent of the experiments that gave a significant p-value (<0.05) were in the positive direction, indicating that although reaching p <0.05 is difficult (because the gender effect is small), any experiment that gives a significant effect would show the same direction of effect. Thus, even a 3 percent effect would be easily detectable, if it is a real effect.
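The same kind of simulation can be run with a small real effect built in. The sketch below is again an illustration under stated assumptions, not the SimLIMS model: survival times are drawn from normal distributions with an assumed mean of 130 days and standard deviation of 15 days, one group is given a 3 percent longer mean survival, and all names are hypothetical. It reports how often a two-sided pooled t-test reaches p <0.05 with 20 animals/group and, among those significant experiments, how often the effect points in the true direction.

```python
import random
import statistics

N_PER_GROUP = 20
# Two-sided 5% critical value of Student's t with 38 degrees of freedom.
T_CRIT_DF38 = 2.024


def pooled_t(group_a, group_b):
    """Pooled-variance t statistic; positive when group_b outlives group_a."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    std_err = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (statistics.mean(group_b) - statistics.mean(group_a)) / std_err


def small_effect_experiments(mean_days=130.0, sd_days=15.0, effect=0.03,
                             n_experiments=2000, seed=0):
    """Simulate experiments in which one group truly lives `effect` longer."""
    rng = random.Random(seed)
    significant = significant_true_direction = 0
    for _ in range(n_experiments):
        shorter_lived = [rng.gauss(mean_days, sd_days)
                         for _ in range(N_PER_GROUP)]
        longer_lived = [rng.gauss(mean_days * (1 + effect), sd_days)
                        for _ in range(N_PER_GROUP)]
        t = pooled_t(shorter_lived, longer_lived)
        if abs(t) > T_CRIT_DF38:
            significant += 1
            if t > 0:
                significant_true_direction += 1
    power = significant / n_experiments
    # Share of significant experiments pointing in the true direction.
    share = significant_true_direction / significant if significant else 0.0
    return power, share


power, true_direction_share = small_effect_experiments()
print(f"significant at p < 0.05: {power:.3f}")
print(f"of those, in the true direction: {true_direction_share:.3f}")
```

The fraction of significant experiments depends strongly on the assumed standard deviation, so these outputs should not be read as a reproduction of the 12 percent and 99 percent figures quoted above.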
I do not believe that all drug effects seen to date are just noise. For example, drug studies with Celebrex showed a 15-20 percent effect, and these studies had a high N (N = 40). Yet ALS-TDI was unable to reproduce this effect. This raises a fundamental question about the genetic background of each lab's mouse colony. Many researchers who do drug studies in ALS breed their own colony. One interpretation is that this closed breeding among mice in each laboratory can be enough to bring about such differences in drug effects.
There are other reasons why some published studies are not reproducible. They are:
1. segregation of drug metabolism genes and disease-modulating genes among different house-bred lines;
2. basing papers on just one experiment, rather than performing repeat experiments and demonstrating a similar trend in survival;
3. using a simple t-test instead of an appropriate survival analysis;
4. not using appropriate variables in statistical analysis (as discussed in the paper).
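As an illustration of point 3, a two-group log-rank test is one standard form of survival analysis that compares whole survival curves and can handle censored animals. The sketch below is a plain-Python version written for this example; the function name and the survival times are hypothetical, and it is not claimed to be the analysis used in the paper under discussion.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : survival or censoring time for each animal
    events_* : 1 if death was observed, 0 if the animal was censored
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        at_risk = at_risk_a + at_risk_b
        deaths = sum(1 for ti, e, _ in data if ti == t and e)
        deaths_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        # Expected deaths in group A if both groups share one hazard.
        observed_minus_expected += deaths_a - deaths * at_risk_a / at_risk
        if at_risk > 1:
            # Hypergeometric variance of the group A death count at time t.
            variance += (deaths * (at_risk_a / at_risk) * (at_risk_b / at_risk)
                         * (at_risk - deaths) / (at_risk - 1))
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical survival times in days; the last treated animal is censored.
control_days = [118, 121, 124, 126, 127, 129, 131, 133, 136, 140]
treated_days = [120, 123, 127, 130, 132, 134, 137, 139, 143, 150]
control_events = [1] * 10
treated_events = [1] * 9 + [0]

chi2, p_value = logrank_test(control_days, control_events,
                             treated_days, treated_events)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Unlike a t-test on mean survival, this comparison uses the number of animals still at risk at each death, so animals removed from the study for unrelated reasons contribute information up to the time they were censored.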
I hope that the guidelines from this discussion will be of value in designing and evaluating future studies.
See Table 1, Figure 1 [.pdf]
References:
1. Heiman-Patterson TD, Deitch JS, Blankenhorn EP, Erwin KL, Perreault MJ, Alexander BK, Byers N, Toman I, Alexander GM. Background and gender effects on survival in the TgN(SOD1-G93A)1Gur mouse model of ALS. J Neurol Sci. 2005 Sep 15;236(1-2):1-7.