Constructing the Physiome

Introduction

Transport in Biological Systems covers an essential element for integrating the knowledge we have of biology. We are in the midst of an evolutionary phase, emerging from the tumult of newly discovered and still mainly disconnected data at the molecular level into a phase of consolidation and organization. The idea of “one gene – one disease” isn’t wrong, as exemplified by diseases like cystic fibrosis or long Q-T syndrome, but it is no longer central to the gene–phenotype relationship, which we now see as just part of the picture. There are undoubtedly many more “single gene anomalies” to be pin-pointed, and these nicely defined “experiments of nature” serve beautifully to illustrate cause-and-effect relationships. But our society’s modern plagues are more complex diseases: cancer, diabetes, heart disease, Alzheimer’s disease, and hypertension have diverse causes and are presumably not attributable to mutations in single genes. Auto-immune diseases are “abnormal” responses to “injuries” to the “system”, in the sense that existing pathways are put out of kilter and at some point provide responses that hurt more than help. Understanding such diseases is excruciatingly difficult, requiring ever deeper knowledge of the functions of intricate mazes of metabolic, signaling and regulatory pathways. (Surely neural pathways are involved in these as well.)

A quantitative and integrative portrayal of a functioning biological organism must start with elements that are not obvious in the genome: the anatomy, the functional physiology, the responses to interventions, the experiments of nature (mutations, diseases, and responses to injury). Mathematical descriptions of function have often aided its elucidation: Harvey’s calculations of blood volume per heart beat forced the conclusion that the blood circulates; Krebs’ observation that tricarboxylic intermediates were being transformed at identical rates demonstrated the cyclic nature of the reactions (another circulation, this one carrying substrates to create energy rather than oxygen from air to organs). Because it is invaluable to have a quantitative understanding of the biology, this text emphasizes mechanistic characterizations of processes and illustrates how to frame them in computational terms for clarity, for reproducibility, and as targets for refutation. Each “model” is a hypothesis, set forth so that it can be examined qualitatively as a defined process and quantitatively as a reasoned and useful descriptor: a working definition.
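Harvey’s argument can be reproduced as a back-of-the-envelope calculation. The numbers below (stroke volume, heart rate, total blood volume) are modern illustrative estimates, not Harvey’s own figures, but the logic is the same: the heart pumps the entire blood volume in about a minute, so the blood must recirculate.

```python
# Back-of-the-envelope version of Harvey's argument that blood circulates.
# All values are modern illustrative estimates (assumptions), not Harvey's
# original figures.

stroke_volume_ml = 70.0          # blood ejected per beat (assumed)
heart_rate_per_min = 70.0        # beats per minute (assumed)
total_blood_volume_ml = 5000.0   # ~5 L in an adult human (assumed)

# Cardiac output: volume pumped per minute
output_per_min_ml = stroke_volume_ml * heart_rate_per_min

# Time for the heart to pump the entire blood volume once
minutes_to_pump_total = total_blood_volume_ml / output_per_min_ml

print(f"Cardiac output: {output_per_min_ml / 1000:.1f} L/min")
print(f"Entire blood volume pumped in ~{minutes_to_pump_total:.1f} min")
# The body cannot synthesize and dispose of ~5 L of blood every minute,
# so the same blood must circulate.
```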

Life requires transport. All organisms are parasitic, in that the means of sustenance come from outside. Diffusion may suffice as the transport mechanism for bacteria and single-cell eukaryotes, but multicellular organisms require transport systems and information systems. First and foremost, all organisms need to move materials across cell membranes; without this there is no life. And now a fundamental belief is being challenged: we have long accepted the idea that most small solutes (water, oxygen, fatty acids, metabolic end products) diffuse across cell membranes freely, but this is now in question, and it has to be considered in figuring out how cells and organisms function. Nothing crosses membranes without a transporter! This can’t be 100% true, but Douglas Kell (2015) makes the case that for drugs it is probably a good assumption that “phospholipid bilayer diffusion is negligible – ‘PBIN’ – (i.e. may be neglected, relative to transporter-mediated transmembrane fluxes).” He details the case in a series of strong papers over the last decade showing that therapeutic chemicals, drugs, generally cross membranes via existing transporters. This suggests potential applications for improving drug targeting, either by constructing a drug to use a specific transporter or by blocking transport into specific organs. Thus, Section I of this text outlines the roles of, and similarities among, binding sites, transporters, enzymes, ion channels, receptors, and reaction networks in the transfer of both materials and information.
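The quantitative contrast behind the PBIN assumption can be illustrated with a toy flux comparison: a saturable, carrier-mediated flux versus a passive flux that is linear in the concentration difference. All parameter values below are hypothetical, chosen only to show the form of the two flux laws, not measurements for any particular drug or transporter.

```python
# Toy comparison of transporter-mediated vs. passive bilayer flux for a
# solute crossing a cell membrane. All parameter values are hypothetical
# illustrations, not measurements for any real transporter or drug.

def transporter_flux(c_out, vmax, km):
    """Michaelis-Menten (saturable) carrier-mediated influx."""
    return vmax * c_out / (km + c_out)

def bilayer_flux(c_out, c_in, permeability):
    """Passive diffusion: linear in the concentration difference."""
    return permeability * (c_out - c_in)

c_out, c_in = 10.0, 0.0     # concentrations outside and inside (assumed units)
vmax, km = 100.0, 5.0       # transporter capacity and affinity (assumed)
p_bilayer = 0.01            # low bilayer permeability, per the PBIN view (assumed)

jt = transporter_flux(c_out, vmax, km)
jd = bilayer_flux(c_out, c_in, p_bilayer)

print(f"transporter flux: {jt:.2f}, bilayer flux: {jd:.2f}")
# With these assumed parameters the transporter route dominates by
# orders of magnitude, consistent with the PBIN assumption.
```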

Other kinds of transport mechanisms bring materials to the transmembrane transporters: fluid and gas convection, mechanical convection outside and inside cells (e.g. kinesin), diffusion and facilitated diffusion, transformation (transferring high energy via the PCr shuttle rather than by ATP diffusion), signaling cascades, electrical propagation, contraction and propulsion, and verbal, auditory, visual, tactile, and neural information flow. Integrative modeling of a cell system or an organ must therefore be rather finely detailed in order to provide a truly mechanistic picture of the functionality. Molecular behavior, protein folding, and molecular “breathing” (the fluctuations among many conformational states), while critical to understanding binding, docking, reaction facilitation and so on, need to be placed in position in determining how the cell functions. Three papers in the current issue of Science (19 December 2014) illustrate novel aspects of single molecules that impact overall cell and system behavior: (1) Fried et al (2014) find that the enzyme ketosteroid isomerase exerts an extremely strong electric field at the active site, rearranging a C=O bond in the enzyme’s rate-determining step, and that the magnitude of the field correlates with the catalytic rate; (2) Joh et al (2014) constructed a four-helical bundle as a Zn transporter, tested it in micelles and a lipid bilayer, and showed that both Zn and Co were transported (with some protons antiported) while Ca++ was excluded; (3) Song and Tezcan (2015) guided the construction in E. coli, starting from a monomeric redox protein, of a tetrameric assembly that catalyzes ampicillin hydrolysis at a rate 10^4 times the normal rate, enabling the bacterium to survive ampicillin. Such achievements at the molecular level exemplify the need for understanding at the whole-cell and whole-organism level to determine what else happens, for safety as well as for science.

Because of such issues it has gradually become recognized that a purposeful approach to integrative biology is needed. However it is structured, it is sure to be a long-term project. Following a phase of definition and discussion initiated by the Bioengineering Commission of the International Union of Physiological Sciences (IUPS) in 1993, the Physiome Project was organized and formally launched at a satellite symposium of the IUPS Congress in St Petersburg in 1997 (Bassingthwaighte 2000). A variety of funding agencies have supported developments in multi-scale analysis and modularity in biological systems, various approaches to mathematical analysis in biology, and some moderately large-scale efforts in particular areas of medical science. Funding efforts by the NIH/NSF/DOE on multiscale modeling, the coordinating efforts by the Interagency Modeling and Analysis Group of US Federal agencies, the framework provided by the EU-supported IUPS Physiome Project, and the MEXT effort in Japan were all designed to help with the understanding of complex physiological systems through the use of biophysically based mathematical models that link genes to organisms.

Organism models are inevitably multi-scale

One of the central principles is that complex systems like the heart or the renal excretory system are inevitably multi-scalar, composed of elements of diverse nature, constructed spatially in hierarchical fashion. This requires linking together different types of modelling at the various levels. It is neither possible nor explanatory to attempt to model at the organ and system levels in the same way as at the molecular and cellular levels. To represent the folding, within microseconds, of a single protein using quantum mechanical calculations requires days to weeks of computation on the fastest parallel computers (such as IBM’s Blue Gene). To analyse the internal dynamics of a single cell at this degree of detail is far out of sight. Even if we could do it, we would still need to abstract from the mountain of computation some explanatory principles of function at the cellular level. Furthermore, we would be completely lost within that mountain of data if we did not include the constraints that the cell as a whole exerts on the behaviour of its molecules. In multi-scalar systems with feedback and feed-forward loops between the scale levels, there may be no single level of causation (Noble, 2008a).

The impressive developments in epigenetics over the last decade (Bird, 2007) have reinforced this conclusion by revealing the nature and extent of some of the molecular mechanisms by which the higher-level constraints are exerted. In addition to regulation by transcription factors, the genome is extensively marked by methylation and binding to histone tails. It is partly through this marking that a heart cell achieves, with precisely the same genome, the distinctive pattern of gene expression that makes it a heart cell rather than, e.g. a bone cell or a liver cell. The marking is transmitted down the cell lines as they divide to form more cells of the same kind. The feedbacks between physiological function and gene expression that must be responsible are still to be discovered. Since fine gradations of expression underlie important regional characteristics of cardiac cells, making a pacemaker cell different from a ventricular cell, and making different ventricular cells have different repolarization times, this must be one of the important targets of future work on the cardiac physiome. We need to advance beyond annotating those gradients of expression to understanding how they arise during development and how they are maintained in the adult. This is one of the ways in which quantitative physiological analysis will be connected to theories of development and of evolution. The logic of these interactions in the adult derives from what made them important in the process of natural selection. Such goals of the physiome project may lie far in the future, but they will ultimately be important in deriving comprehensive theories of the ‘logic of life’.

A second reason why multi-scale analysis is essential is that a goal of systems analysis must be to discover at which level each function is integrated. Thus pacemaker activity is integrated at the cell level – single sinus node cells show all the necessary feedback loops that are involved. Below this level, it doesn’t even make sense to speak of cardiac rhythm. At another level, understanding fibrillation requires analysis at least at the level of large volumes of tissue and even of the whole organ. Likewise, understanding the function of the heart as a mechanical pump is, in the end, an organ-level property. Another way of expressing this point is to say that high-level functions are emergent properties that require integrative analysis and a systems approach. The word ‘emergent’ is itself problematic. These properties do not ‘emerge’ blindly from the molecular events; they were originally guided by natural selection and have become hard-wired into the system. Perhaps ‘system properties’ would be a better description. They exist as a property of the system, not just of its components.

A third reason why multi-scale analysis is necessary is that there is no other way to circumvent the ‘genetic differential effect problem’ (Noble, 2008b). This problem arises because most interventions at the level of genes, such as gene knockouts and mutations, do not have phenotypic effects. The system as a whole is very effective in buffering genetic manipulations at the level of DNA, through a variety of back-up systems. This is one of the bases of the robustness of biological systems. Moreover, when we manipulate a gene, e.g. through a mutation, even when phenotypic effects do result they reveal simply the consequences of the difference at the genetic level; they do not reveal all the effects of that gene that are common to both the wild and mutated gene. This is the reason for calling this the ‘genetic differential effect problem’. Reverse engineering through modelling at a high level that takes account of all the relevant lower-level mechanisms enables us to assign quantitatively the relative roles of the various genes/proteins involved. Thus, a model of pacemaker activity allows absolute quantitative assignment of contributions of different protein transporters to the electric current flows involved in generating the rhythm. Only a few models within the cardiac physiome project are already detailed enough to allow this kind of reverse engineering that succeeds in connecting down to the genetic level, but it must be a goal to achieve this at all levels.

This is the reason why top-down analysis, on its own, is insufficient, and it is the fundamental justification for employing the middle-out approach.

Modularity in the modeling of biological systems

A major principle in describing or constructing systems is to use modularity. A module represents a component of a system that can be relatively cleanly separated from other components. An example is a model of a time- and voltage-dependent ion channel, where the model represents kinetically the behaviour of a large number of identical channel proteins opening more or less synchronously under the same conditions. A model for a cellular action potential would be composed of an assemblage of such modules, each providing the current flow through a different channel type for different ions. Each module is linked to the same environment, but the modules interact with that environment each in their own way. The key to the separability of the modules is that they should be relatively independent of one another, though dependent on their common environment through the effects of each module’s behaviour on the environment itself. The separation of modular elements at the same level in the hierarchy works best when the changes in the extramodular environment (concentrations, temperature, pH) do not occur too rapidly, that is, more slowly than do the changes in the individual channel conductances. The reason is that, when the environmental conditions also change rapidly, the computational ‘isolation’ of a module becomes less realistic: the kinetic processes represented must extend beyond the module. Choosing the boundaries of modules is important, since a major advantage of modularization is that a limited number of variables are needed to define the interface between modules, relative to the number required to capture function within the module.
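The modular structure described above can be sketched in code. The following is a minimal Hodgkin-Huxley-style example in which each channel module exposes only its interface (membrane potential in, current out) while its gating kinetics stay hidden inside; the cell model assembles the modules and updates their shared environment. Class names and parameter values are illustrative assumptions, not any published cardiac model.

```python
import math

# Minimal sketch of modularity: each channel "module" sees only the shared
# environment (membrane potential V, in mV) and reports back its current.
# Gating kinetics are hidden inside the module. Parameters are illustrative
# (HH-squid-like), not from any published model.

class PotassiumChannel:
    """A time- and voltage-dependent K+ channel module (HH-style n gate)."""
    def __init__(self, g_max=36.0, e_rev=-77.0):
        self.g_max, self.e_rev = g_max, e_rev
        self.n = 0.32                      # internal gating state, hidden

    def step(self, v, dt):
        # first-order gating kinetics, invisible to other modules
        alpha = 0.01 * (v + 55.0) / (1.0 - math.exp(-(v + 55.0) / 10.0))
        beta = 0.125 * math.exp(-(v + 65.0) / 80.0)
        self.n += dt * (alpha * (1.0 - self.n) - beta * self.n)

    def current(self, v):
        # the module's entire interface: V in, current out
        return self.g_max * self.n ** 4 * (v - self.e_rev)

class LeakChannel:
    """A passive leak module: same interface, no internal state."""
    def __init__(self, g=0.3, e_rev=-54.4):
        self.g, self.e_rev = g, e_rev

    def step(self, v, dt):
        pass                               # no kinetics to update

    def current(self, v):
        return self.g * (v - self.e_rev)

# The cell model assembles modules through their common interface; it
# never looks inside them.
modules = [PotassiumChannel(), LeakChannel()]
v, c_m, dt = -65.0, 1.0, 0.01              # mV, uF/cm^2, ms
for _ in range(1000):                      # 10 ms of simulated time
    for m in modules:
        m.step(v, dt)
    i_total = sum(m.current(v) for m in modules)
    v -= dt * i_total / c_m                # shared environment updated once

print(f"V after 10 ms: {v:.2f} mV")
```

Swapping in another channel type, or substituting a different set of equations for one module (as in the ischemic-region example below), requires no change to the rest of the model as long as the interface is preserved.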

At another level, one might consider the heart, the liver and the lung, etc., as individual modules within a functioning organism, while their common environment (body temperature, blood composition and pressure) is relatively stable (homeostasis in Claude Bernard’s terms (Bernard, 1865, 1984)). At an intermediate level, a module might be composed to represent a part of an organ with a different functional state than other parts, for example, an ischemic region of myocardium having compromised metabolism and contractile function. Such a module, in an acute phase of coronary flow reduction, might be parametrically identical to the other, normal regions, but have a reduced production of ATP. At a later stage, the regional properties might change, stiffening with increasing collagen deposition, and requiring a different set of equations, so that there would be a substitution for the original module.