Like any model of the world, our view of the cell is inescapably bound by the time and place in which we live. Over the years, different schools have fashioned the cell in a variety of forms: bags of enzymes, metabolic channels, feedback circuits, complex systems, gels, and self-modifying software programs. A model that has pervaded cell biology for the past fifteen years is the so-called “network” view (Figure 1A), which has bloomed in parallel with the emergence of man-made networks such as the Internet and Facebook. This view treats cells as containers for vast networks of “nodes” (genes, gene products, metabolites, or other biomolecules) connected by “links” (physical interactions or functional associations). Network representations of the cell flow directly from the ability to characterize not only genes and proteins in isolation, but also their functional similarities and physical binding partners, a major outcome of transcriptomics and proteomics approaches. Analysis of network information, whether biological or man-made, is an active field that has produced algorithms for detecting nodes with strategic positions within a network and for identifying modular structures (a topic of earlier research in my lab).
While incredibly influential, the network is likely not the ultimate representation of a cell, for two reasons. First, network diagrams do not visually resemble the contents of cells. Nowhere in the cell do we observe actual wires running between genes and proteins, unlike the Internet, which is truly a network of wires among processing units. Rather, the cell involves a multi-scale hierarchy of components that is not readily captured by basic network representations. For example, the proteasome has been mapped extensively to identify its key genes and interactions, but the network visualization of these data (Figure 1A) is very different from the proteasome’s spatial appearance (Figure 1B). The interactions making up the proteasome factor into a regulatory particle and a core, which, in turn, factor into a base and a lid, and an alpha and a beta subunit, respectively. This hierarchical structure is obscured by the network visualization of pairwise relationships between gene products. We shall address this shortcoming by using molecular networks and other ’omics data to build hierarchical models of the cell parallel to the Gene Ontology.
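The contrast between the two representations can be made concrete with a minimal sketch. It encodes only the proteasome hierarchy named above; the nested-dictionary layout, the function names, and the all-pairs flattening are illustrative assumptions, not a method used in this work.

```python
from itertools import combinations

# Hierarchy as described in the text. Each leaf stands in for a group of
# gene products; no real gene names are used (illustrative assumption).
PROTEASOME = {
    "proteasome": {
        "regulatory particle": {"base": {}, "lid": {}},
        "core": {"alpha subunit": {}, "beta subunit": {}},
    }
}

def leaves(tree):
    """Return leaf names in depth-first order."""
    out = []
    for name, children in tree.items():
        if children:
            out.extend(leaves(children))
        else:
            out.append(name)
    return out

def pairwise_links(tree):
    """Flatten the hierarchy into an all-pairs network of its leaves.

    Linking every pair is a simplifying assumption; it only illustrates
    that the flat view carries no grouping information.
    """
    return sorted(combinations(sorted(leaves(tree)), 2))

def paths(tree, prefix=()):
    """Map each leaf to the chain of assemblies that contain it."""
    result = {}
    for name, children in tree.items():
        here = prefix + (name,)
        if children:
            result.update(paths(children, here))
        else:
            result[name] = here
    return result

if __name__ == "__main__":
    for link in pairwise_links(PROTEASOME):
        print("link:", link)
    for leaf, chain in paths(PROTEASOME).items():
        print(leaf, "<-", " > ".join(chain))
```

In the flattened list, a link between the base and the lid looks the same as a link between the base and the alpha subunit; only the hierarchical form records that the first pair shares the regulatory particle.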
Second, many of the molecular networks published to date, including many from my lab, are descriptive maps of physical or functional connectivity rather than predictive models. For example, technologies such as yeast two-hybrid, protein affinity purification, and chromatin immunoprecipitation are often used to define and draw large networks of protein-protein and protein-DNA interactions, but these static maps do not, by themselves, predict cell behavior. Although we and many others in the field of systems biology have inferred networks capable of predicting gene function or phenotypic responses, these efforts have tended to focus on a specific class of predictions, such as gene expression level or cell growth rate. Assembling a model of the cell that would predict a range of phenotypes, rather than only one type of outcome, requires understanding how cellular phenotypes are related to one another. Here again a hierarchy comes into play, since cellular organization involves a multi-scale hierarchy not only in structure but also in function. For example, the proteasome is a central component of ubiquitin-mediated protein degradation, which, depending on an intricate set of inputs and rules, can result in cellular homeostasis, differentiation, death, and other fates. This multi-scale hierarchy of processes is, again, simply not exposed by a standard pairwise network representation. We will address this shortcoming by developing methods to ‘functionalize’ the Gene Ontology, so that it is not merely a static description of the contents of cells, but an active framework for predicting phenotype from genotype.
The most direct representations of data are not always the most desirable for meaningful interpretation of those data. In X-ray crystallography, the most direct representations of X-ray diffraction patterns are two-dimensional images. However, when many such images are integrated and analyzed, exquisite 3D structural models of proteins emerge which, in turn, enable accurate predictions of protein dynamics and function. Similarly, from many molecular measurements and interaction data sets the higher order structure and function of the cell might emerge, if only we could figure out how to assemble these images properly.