Genomic analysis of the hierarchical structure of regulatory networks. Proc Natl Acad Sci U S A. 2006 Oct 3;103(40):14724-31. PubMed.




  1. This study clearly demonstrates what we have thought all along, and what various large-scale approaches have already shown: genomic data from reference standards of known mechanism or phenotype are vital for extracting full value from novel expression patterns. Two major obstacles to realizing this value have been the resources required to generate a sizable reference dataset and the difficulty of comparing results across expression platforms, cell types, models, or species. With the introduction of a simple, non-parametric test, well-conserved expression changes can now be compared across these contexts. Large public expression databases, such as the NCBI Gene Expression Omnibus, should now be mined systematically for additional mechanistic insight. This capability gives the field the opportunity to use gene expression profiles to test hypotheses, rather than in an exploratory mode, or in "fishing expeditions," as such analyses are cynically called.

    As highlighted in this paper, many gems have already been found. However, care must be taken, because misleading connections are likely to be made, particularly as the connectivity map grows. It is not clear from the paper how many misleading or incorrect connections the approach identified, particularly because it includes no probabilistic method for evaluating potential connections. Further development of statistical methods should reduce these errors, though not eliminate them entirely. A more rigorous approach using supervised classification models has already been developed to overcome these limitations and provides greater classification accuracy than ranking methods (Natsoulis et al., 2005). These methods should prove more valuable in dissecting diagnostic signatures of drug action, pathology, or disease states.

    As the authors suggest, a more comprehensive database covering more cell types should broaden the scope of the connectivity map to capture mechanisms that may be context-dependent. However, while cell-based models offer higher throughput and lower cost than in vivo models, single cell types cannot fully represent complex pathologies or disease states that involve interactions among multiple cell types or organs. We should therefore be cautiously optimistic about what can be identified with such an in vitro connectivity map. I think the greatest value will be in understanding drug action at the molecular level, a laudable goal. However, predicting complex phenotypes, such as adverse side effects in humans, should be approached cautiously, using a weight-of-evidence approach.


    Natsoulis et al. Classification of a large microarray data set: algorithm comparison and analysis of drug signatures. Genome Res. 2005 May;15(5):724-36. PubMed.

    View all comments by Mark Fielden
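
The comment above contrasts a simple rank-based (non-parametric) comparison with the absence of any probabilistic evaluation of connections. A minimal sketch of how the two could fit together is given below. It is an illustration only, not the method of the paper or of Natsoulis et al.: the score (mean normalized rank of the query's down-regulated genes minus that of its up-regulated genes), the permutation null, the function names `connectivity_score` and `permutation_p_value`, and the toy gene lists are all assumptions made for this example.

```python
import random


def connectivity_score(ranked_genes, up_genes, down_genes):
    """Hypothetical rank-based score in [-1, 1].

    ranked_genes: reference profile, most up-regulated gene first.
    The score is positive when the query's up genes sit near the top of
    the reference ranking and its down genes sit near the bottom.
    """
    n = len(ranked_genes)
    if n < 2:
        raise ValueError("need at least two ranked genes")
    position = {gene: i for i, gene in enumerate(ranked_genes)}

    def mean_position(genes):
        # Normalized positions: 0.0 = top of the ranking, 1.0 = bottom.
        hits = [position[g] / (n - 1) for g in genes if g in position]
        if not hits:
            raise ValueError("no query genes found in the reference profile")
        return sum(hits) / len(hits)

    return mean_position(down_genes) - mean_position(up_genes)


def permutation_p_value(ranked_genes, up_genes, down_genes,
                        n_perm=2000, seed=0):
    """Two-sided permutation p-value for the score (assumed null:
    the reference ranking is unrelated to the query signature)."""
    rng = random.Random(seed)
    observed = connectivity_score(ranked_genes, up_genes, down_genes)
    shuffled = list(ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = connectivity_score(shuffled, up_genes, down_genes)
        if abs(permuted) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data (invented for illustration): 20 genes ranked by a reference
# treatment, and a query signature that matches the extremes of the ranking.
reference = [f"g{i}" for i in range(20)]
up = ["g0", "g1", "g2"]
down = ["g17", "g18", "g19"]

score, p_value = permutation_p_value(reference, up, down)
print(f"score={score:.3f}, permutation p={p_value:.4f}")
```

Attaching a permutation p-value to each score is one simple way to flag connections that could plausibly arise by chance. As the reference database grows, such p-values would also need correction for multiple comparisons.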


This paper appears in the following:


  1. New Database Connects Gene Expression, Disease