Data-Driven Modeling and Analysis in Mathematical Biology

Thursday, June 17 at 09:30am (PDT)
Thursday, June 17 at 05:30pm (BST)
Friday, June 18 at 01:30am (KST)

Thursday (Friday) during the "MS19" time block.
Note: this minisymposium has multiple sessions. The second session is MS20-DDMB.



Tomas Carino-Bazan (University of California, Santa Barbara, United States), Daniel Wilson (Boston University, United States)


Recent advances in data science and machine learning are providing novel ways to learn models and perform analysis of biological systems. This session brings together researchers to discuss recent developments in the field, advances in methodology and computational methods, and emerging application domains in the biological sciences. Topics include data-driven development of mechanistic and mechanical models in cell biology, analysis of genomic data with applications to disease progression and precision medicine, and statistical methods for investigating protein structure. The session aims to discuss both general methodological topics and specific motivating application domains.

Julie Hussin

(Université de Montréal, Montreal Heart Institute, Canada)
"Evolutionary approaches to detect epistasis in large-scale genomic data"
Whether gene-gene interactions, or epistasis, play a major or minor role for any given human trait in any population remains an open question, and analytical methods to detect epistasis have become very popular in the last decade. However, there are important computational and statistical challenges in identifying novel epistatic interactions in human genetics. To help address the paucity of uncovered epistasis in humans, we propose new approaches to characterize gene-by-gene interactions by studying signatures of co-evolution. The underlying model is that interacting genes will undergo compensatory genetic mutations to maintain their interaction, which will result in correlation of allelic frequencies between physically unlinked loci. In this talk, I will present data-driven projects on two distinct systems, interactions among Cytochrome P450 genes and co-evolution involving the cholesterol metabolism gene CETP, and their implications for precision medicine. Our studies also demonstrate how data from diverse human populations in genetic studies can be leveraged to uncover biological mechanisms of importance for worldwide population health.

Elana Fertig

(Johns Hopkins University, United States)
"Identifying therapeutic resistance mechanisms in cancer with single-cell data and transfer learning"
Tumors employ complex, multi-scale cellular and molecular interactions that evolve over the course of therapeutic response. The changes in these pathways enable tumors to overcome therapeutic regimens and ultimately acquire resistance. New molecular profiling technologies, notably single-cell technologies, provide an unprecedented opportunity to characterize these molecular relationships. However, interpreting the specific cellular and molecular pathways in therapeutic response requires complementary computational analysis methods. We developed an unsupervised learning method, CoGAPS, that employs Bayesian non-negative matrix factorization to disentangle distinct biological processes from high-throughput molecular data. Notably, this algorithm discovers dynamic compensatory signaling in acquired therapeutic resistance from time-course bulk RNA-seq data and novel NK cell activation in anti-CTLA4 response from post-treatment scRNA-seq data. To further demonstrate that the inferred pathways are biological rather than computational artifacts, we developed a complementary transfer learning method to relate learned patterns between datasets. We demonstrate that this approach identifies robust molecular processes between model systems and human tumors and enables multi-platform data integration to delineate the drivers of therapeutic response and resistance.

Tomas Carino-Bazan

(University of California, Santa Barbara, United States)
"Machine learning methods for fluid mechanics for learning low dimensional representations"
Many empirical studies and experiments in applications ranging from biophysics to engineering design yield partial information about the flow fields and related hydrodynamic responses. We develop data-driven methods for learning representations of hydrodynamic responses for inference tasks. From an analytic perspective, the field of fluid mechanics has traditionally used transformations such as vorticity to represent localized flow structures and for numerical simulations. For example, for inviscid flows this often yields a sparse representation. We seek to develop related machine learning methods that learn more general non-linear transformations that can featurize hydrodynamic flow data for making inferences about flow structure and dynamics. We discuss our progress toward studying hydrodynamic flows using auto-encoders with associated regularizations to learn smooth low-dimensional representations of flow structures.

Lorin Crawford

(Microsoft Research New England, United States)
"Statistical Frameworks for Discovering Biophysical Signatures in 3D Shapes and Images"
The recent curation of large-scale databases with 3D surface scans of shapes has motivated the development of tools that better detect global patterns in morphological variation. Studies that focus on identifying differences between shapes have been limited to simple pairwise comparisons and rely on pre-specified landmarks (that are often known). In this talk, we present SINATRA: a statistical pipeline for analyzing collections of shapes without requiring any correspondences. Our method takes in two classes of shapes and highlights the physical features that best describe the variation between them. We develop a rigorous simulation framework to assess our approach, which is itself a novel contribution to 3D image and shape analyses. Lastly, as case studies, we use SINATRA to (1) analyze mandibular molars from four different suborders of primates and (2) facilitate the visual identification of structural signatures differentiating between two protein ensembles.

Hosted by SMB2021
Virtual conference of the Society for Mathematical Biology, 2021.