PSE Research Projects

Frog egg kinetic model
The modeling cycle

The living cell is a miniature, membrane-bound, biochemical machine that harvests material and energy from its environment and uses them for maintenance, growth, and reproduction. These processes are carried out by macromolecular machines (enzymes, ribosomes, transport proteins, structural proteins, motor proteins) whose structures are encoded in nucleotide sequences (DNA and mRNA). The activities of these macromolecules are controlled and coordinated by regulatory networks of great complexity and exquisite effectiveness. These networks collect information from inside and outside the cell, process the data, and direct cellular responses that foster the survival and reproduction of the cell. How these regulatory systems work is no more apparent from their network diagrams than the operation of a complex piece of electronics is from its schematic wiring diagram. Whereas electrical engineers create accurate mathematical representations of wiring diagrams and use these equations to design new devices, molecular biologists are not accustomed to this kind of approach. Employing quantitative modeling as a means for deeper scientific understanding and for more rational engineering of cellular responses will require a paradigm shift in the field.

To this end, DARPA's Bio-Computation Program is supporting development of BioSPICE (Simulation Program for Intra-Cellular Evaluation), a collection of interoperable programs for model development, simulation, analysis, and comparison to experimental data. The tools are developed by expert computer scientists, in close collaboration with experienced modelers, and used to build sophisticated models that accurately represent molecular control systems. The models are based on experimental results retrieved from data repositories, and specific predictions of the models are tested experimentally by molecular biologists within the program.

The Virginia Tech group is led by experienced modelers (John Tyson and Kathy Chen of Virginia Tech, Bela Novak of the Budapest University of Technology and Economics), experimentalists (Jill Sible of Virginia Tech, Fred Cross of The Rockefeller University, and Michael Mendenhall of the University of Kentucky Medical School), and computer scientists (Cliff Shaffer, Layne Watson, and Naren Ramakrishnan of Virginia Tech). The group's goals are to develop accurate models of cell growth and division, to conduct novel experimental tests of some of these models, and to create general-purpose software tools.

The cell division cycle, the sequence of events by which a cell replicates all its components and divides them equally between two daughter cells, is an ideal test case for the BioSPICE program. The cell cycle is a regulatory system of fundamental biological significance, governed (in eukaryotes) by a universal mechanism that has been characterized in great detail both genetically and biochemically. Realistic and accurate models are available, which make specific predictions that can be tested experimentally. However, cell cycle modeling has now reached the limit of what can be "hand-crafted," and the next level of computer simulation will require the type of tools envisaged by BioSPICE.

The modeling efforts center around two specific experimental systems: frog eggs and budding yeast cells. Frog eggs provide a convenient system for biochemical studies, especially in cytoplasmic extracts. By supplementing egg extracts with recombinant proteins, one can manipulate the regulatory network to almost any specifications. Budding yeast is an ideal organism for genetic characterization of molecular regulatory systems, and most of the genes encoding its cell cycle control system are now known.

JigCell is a domain-specific support environment for biological pathway modeling, intended ultimately to become a problem solving environment (PSE). It is designed for users in biology and related fields who are domain experts but do not have significant experience in formal modeling. It incorporates off-the-shelf components such as numerical libraries, visualization tools, and communications protocols where quality implementations exist and the technical specifics of the component can be hidden from the user.

JigCell consists of a Model Builder, a Run Manager, a Comparator, and a Parameter Estimator. The Model Builder creates a model specification that incorporates the wiring diagram, kinetic information, and discrete event model. A spreadsheet interface organizes the information of the wiring diagram and kinetics as a collection of chemical reaction equations. Each row of the Model Builder spreadsheet specifies a chemical reaction equation, including its substrates, products, kinetic rate law, and kinetic rate constants. Chemical equations are a natural representation for many biological processes of interest and are applicable to a wide variety of fields outside biological modeling.
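
To make the row-per-reaction idea concrete, here is a minimal Python sketch of how a table of reactions could be turned into rates of change for each species. It is not JigCell's implementation: the `Reaction` class, the `derivatives` function, the species name `X`, and the constants `k_syn` and `k_deg` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Concentrations = Dict[str, float]
Constants = Dict[str, float]


@dataclass
class Reaction:
    """One spreadsheet row: substrates, products, rate law, rate constants."""
    substrates: List[str]
    products: List[str]
    rate_law: Callable[[Concentrations, Constants], float]
    constants: Constants


def derivatives(reactions: List[Reaction], conc: Concentrations) -> Dict[str, float]:
    """Sum each reaction's contribution to d[species]/dt."""
    dydt = {name: 0.0 for name in conc}
    for rxn in reactions:
        v = rxn.rate_law(conc, rxn.constants)  # reaction velocity
        for s in rxn.substrates:
            dydt[s] -= v  # substrates are consumed
        for p in rxn.products:
            dydt[p] += v  # products are formed
    return dydt


# Hypothetical two-row model: constant synthesis and first-order degradation of X.
reactions = [
    Reaction([], ["X"], lambda c, k: k["k_syn"], {"k_syn": 2.0}),
    Reaction(["X"], [], lambda c, k: k["k_deg"] * c["X"], {"k_deg": 0.5}),
]
rates = derivatives(reactions, {"X": 1.0})
```

Passing `derivatives` to any ODE integrator would then simulate the model; the sketch assumes unit stoichiometry and omits the discrete event part of the specification.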

The Run Manager translates a model specification into an executable form. Each row in the Run Manager spreadsheet specifies how to simulate a certain experiment, including the model to use, the parameter and initial condition sets, and the appropriate simulator settings. A parameter set can contain a value for every parameter in the model, or only the values that differ from those in another parameter set.
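
A parameter set that records only its differences from another set can be resolved by walking back to the set it is based on and overlaying the changes. The following Python sketch shows one way this might work; the dictionary layout, the `based_on` key, the `resolve` function, and the set names are assumptions made for illustration, not the Run Manager's actual format.

```python
from typing import Dict, Optional


def resolve(name: str, parameter_sets: Dict[str, dict]) -> Dict[str, float]:
    """Return the full parameter values for `name`, applying inherited values first."""
    entry = parameter_sets[name]
    base: Optional[str] = entry.get("based_on")
    # Start from the parent's resolved values (if any), then overlay local changes.
    values = dict(resolve(base, parameter_sets)) if base else {}
    values.update(entry["values"])
    return values


# Hypothetical sets: a complete "wild_type" set and a "mutant" set that
# changes a single rate constant relative to it.
parameter_sets = {
    "wild_type": {"based_on": None, "values": {"k_syn": 2.0, "k_deg": 0.5}},
    "mutant": {"based_on": "wild_type", "values": {"k_syn": 0.0}},
}
mutant = resolve("mutant", parameter_sets)
```

Storing only the differences keeps each experiment's row short and makes it clear which parameters distinguish, say, a mutant simulation from the wild-type one.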

The Comparator and Compare are tools for model testing and evaluation. Tests in the Comparator are assertions about a model or comparisons between model performance and experimental data. A test evaluates either operational accuracy or the accuracy in transforming the model. Performance on each test is scored according to a user-defined objective function that represents the goodness-of-fit between the expected result and the model result.
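
As an illustration of a user-defined objective function, the Python sketch below scores a test with a weighted sum of squared errors between expected and simulated values. This is only one possible choice of goodness-of-fit measure; the function name `weighted_sse` and its arguments are hypothetical and are not taken from the Comparator.

```python
from typing import Optional, Sequence


def weighted_sse(expected: Sequence[float],
                 simulated: Sequence[float],
                 weights: Optional[Sequence[float]] = None) -> float:
    """Weighted sum of squared errors; 0.0 means a perfect match."""
    if len(expected) != len(simulated):
        raise ValueError("expected and simulated must have the same length")
    if weights is None:
        weights = [1.0] * len(expected)  # treat all observations equally
    return sum(w * (e - s) ** 2 for w, e, s in zip(weights, expected, simulated))


# Hypothetical test: two observed values compared against two simulated values.
score = weighted_sse([1.0, 2.0], [1.5, 2.0])
```

Lower scores indicate better agreement, and the weights let a user emphasize the observations they trust most.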

Tests in Compare measure the performance of the currently proposed model against a collection of other models. These models come from past revisions of the current model, independent models of the same system, and models with subsystems in common with the current model. Compare performs ranking and selection among these models based on the same criteria defined in the Comparator.
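
Ranking models by their test scores can be sketched in a few lines of Python. The example assumes that each model has one score per test, that lower scores are better, and that scores are simply summed; the `rank_models` function and the model names are hypothetical, and the actual selection criteria may differ.

```python
from typing import Dict, List, Sequence


def rank_models(scores_by_model: Dict[str, Sequence[float]]) -> List[str]:
    """Order model names from best (lowest total score) to worst."""
    totals = {model: sum(scores) for model, scores in scores_by_model.items()}
    return sorted(totals, key=lambda model: totals[model])


# Hypothetical per-test scores for three candidate models.
ranking = rank_models({
    "current": [0.2, 0.1],
    "previous_revision": [0.5, 0.4],
    "independent": [0.1, 0.1],
})
```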

The Parameter Estimator finds unknown rate constants by fitting the model to experimental data. The data are typically not a solution to a differential equation, but rather a complicated, nonlinear functional of the differential equation solution. Furthermore, both the dependent and independent variables involved in these functionals are subject to experimental error. The Parameter Estimator performs both global and local searches during optimization.
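
The combination of global and local search can be illustrated with a deliberately simple Python sketch: a coarse grid scan locates a promising region, and a golden-section search then refines the estimate. The sketch makes several simplifying assumptions that the Parameter Estimator does not: it fits a single rate constant, the data are a direct solution of the differential equation dx/dt = -k x, and only the dependent variable is treated as uncertain (ordinary least squares). All function names are hypothetical.

```python
import math
from typing import List, Tuple

Data = List[Tuple[float, float]]  # (time, observed value) pairs


def model(k: float, t: float, x0: float = 1.0) -> float:
    """Closed-form solution of dx/dt = -k * x with x(0) = x0."""
    return x0 * math.exp(-k * t)


def sse(k: float, data: Data) -> float:
    """Sum of squared errors between the model and the data."""
    return sum((model(k, t) - x) ** 2 for t, x in data)


def fit(data: Data, lo: float = 0.0, hi: float = 5.0,
        grid: int = 51, tol: float = 1e-6) -> float:
    # Global stage: scan a coarse grid over the whole allowed range.
    step = (hi - lo) / (grid - 1)
    best = min((lo + i * step for i in range(grid)), key=lambda k: sse(k, data))
    # Local stage: golden-section search in the bracket around the best grid point.
    a, b = max(lo, best - step), min(hi, best + step)
    inv_phi = (math.sqrt(5) - 1) / 2
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c, data) < sse(d, data):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


# Synthetic, noise-free data generated with k = 0.5.
data = [(t, math.exp(-0.5 * t)) for t in (0.0, 1.0, 2.0, 4.0)]
k_fit = fit(data)
```

The global stage guards against settling in a poor local minimum, while the local stage supplies precision that a coarse scan cannot. Handling errors in both the dependent and independent variables would call for a different objective than the plain least-squares one used here.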

Last modified: December 14, 2003