Brian Munsky’s groundbreaking advances in predictive biological modeling published in PNAS

The Munsky group of the Department of Chemical and Biological Engineering and their collaborators at Vanderbilt University are currently at the forefront of predictive modeling in Systems Biology research. Brian Munsky, School of Biomedical Engineering student Zachary Fox, and their co-investigators, Gregor Neuert and Guoliang Li of Vanderbilt University and Douglas Shepherd of the University of Colorado Anschutz Medical Campus, recently published advances in predictive biological models of gene regulation in the Proceedings of the National Academy of Sciences (PNAS). These high-profile results were made possible by new computational approaches developed in the Munsky Lab under Munsky’s National Institutes of Health Maximizing Investigators’ Research Award, granted last fall.

Figure 1. Experimental design. Yeast cells are exposed to osmotic stress through addition of salt. This triggers the Hog1 kinase to move to the nucleus, induce changes in chromatin, and activate transcription of target genes. Single-cell images over time measure transcription initiation, mRNA export from nucleus to cytoplasm, and mRNA degradation. Figure reproduced with permission from B. Munsky, et al., Proc. Natl. Acad. Sci. U.S.A., (2018). Copyright 2018, B. Munsky, et al.

Systems biology researchers have been using engineering methodologies to reproduce biological behaviors for more than two decades. Despite great strides in the ability of models to capture existing data, researchers continually find that these models fail to predict gene regulation behaviors in new situations. The origin of this discrepancy is not fully understood, and without reliable predictions, researchers in medicine and engineering have no choice but to guess what may happen under different drug treatments or genetic manipulations. The Munsky group and its collaborators seek to create more rational and systematic approaches to discovering predictive models, a goal that drove Munsky and Neuert to investigate the fundamental principles by which experiments and models are compared.

Experts in the field have speculated that the only ways to find predictive models would be to collect more and better data or to create more complex and detailed models. Munsky and his team considered another possibility: that even when data and models are sufficient, the issue may lie in flawed mathematical assumptions at the foundation of most model identification strategies.

To test their hypothesis, Munsky and Neuert applied multiple conventional statistical techniques, as well as an entirely different technique known as the finite state projection (FSP), in their efforts to capture and predict multiple spatial and temporal aspects of transcription regulation for a highly variable set of stress-response genes in yeast. Conventional techniques assume that the collected data are symmetrically or normally distributed and use only simple statistical summaries of the data, such as means and variances. In contrast, the FSP approach, which Munsky developed during his Ph.D. studies, makes no such assumptions and uses all of the collected data to build the model. All techniques provided excellent fits to the original data, but only the FSP approach could both fit the original data and predict new behaviors.
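To give a rough sense of what a finite state projection does, the sketch below applies the idea to a toy model that is not taken from the paper: a single mRNA species produced at a constant rate and degraded in proportion to its copy number. The function names, the rate parameters `k` and `g`, the truncation size `n_max`, and the use of a simple Runge-Kutta integrator are all assumptions made for illustration. The sketch keeps only the states with 0 to `n_max` molecules, integrates the probability of each retained state over time, and reports how much probability has leaked out of the truncated state space. The resulting full distribution can then be compared with every single-cell count rather than with only a mean and variance.

```python
import math


def fsp_birth_death(k, g, t_end, n_max, steps=2000):
    """Toy finite state projection for mRNA production (rate k) and decay (rate g * n).

    Only states n = 0..n_max are kept. Returns (p, err), where p[n] approximates
    P(n molecules at time t_end) starting from zero molecules, and
    err = 1 - sum(p) is the probability that left the truncated state space.
    """

    def rhs(p):
        # Truncated master equation: inflow from neighbors minus total outflow.
        dp = [0.0] * (n_max + 1)
        for n in range(n_max + 1):
            inflow = 0.0
            if n > 0:
                inflow += k * p[n - 1]            # production from state n - 1
            if n < n_max:
                inflow += g * (n + 1) * p[n + 1]  # decay from state n + 1
            # Production out of n_max is still counted as outflow, so that
            # probability is lost and shows up in the reported error.
            dp[n] = inflow - (k + g * n) * p[n]
        return dp

    p = [0.0] * (n_max + 1)
    p[0] = 1.0  # assumed initial condition: no mRNA at time zero
    h = t_end / steps
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(p)
        k2 = rhs([x + 0.5 * h * d for x, d in zip(p, k1)])
        k3 = rhs([x + 0.5 * h * d for x, d in zip(p, k2)])
        k4 = rhs([x + h * d for x, d in zip(p, k3)])
        p = [x + h * (a + 2 * b + 2 * c + d) / 6
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p, 1.0 - sum(p)


def log_likelihood(counts, p):
    """Log-likelihood of observed per-cell counts under the full distribution p."""
    return sum(math.log(p[n]) for n in counts)


if __name__ == "__main__":
    p, err = fsp_birth_death(k=10.0, g=1.0, t_end=2.0, n_max=40)
    print("truncation error:", err)
    print("log-likelihood of three hypothetical cells:", log_likelihood([7, 8, 9], p))
```

For this toy model the exact answer is a Poisson distribution, which makes the sketch easy to check. Realistic gene regulation models have more states and time-varying rates, and practical FSP implementations typically use matrix exponentials or stiff solvers rather than the fixed-step integrator assumed here.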

Figure 2. Examples of data used for fitting. Each image shows single-molecule fluorescence in situ hybridization of cells at the indicated times after exposure to 0.2 M NaCl. Nuclei are stained blue and outlined in white. Cells are outlined in gray. STL1 mRNA is in green, and CTT1 mRNA is in red. Bright nuclear spots indicate transcription sites. Figure reproduced with permission from B. Munsky, et al., Proc. Natl. Acad. Sci. U.S.A., (2018). Copyright 2018, B. Munsky, et al.

Munsky and the team examined the experimental data and the FSP-derived model more closely and discovered that both exhibited highly asymmetric distributions. This asymmetry violates standard modeling assumptions and explains why standard methods failed to discover predictive models. The finding demonstrated that it is not enough to fit a model to data, and that not all computational approaches for comparing models and data are created equal: models and data must be integrated in the appropriate manner to generate accurate quantitative predictions.

By developing advanced computational tools to recover precise, reproducible, and predictive biological models, Munsky and his collaborators have advanced the field of Systems Biology. Their approaches enable a new, predictive understanding of gene regulation and will help elevate the standards of data interpretation throughout the biological sciences.
