From Machine Learning to Deep Learning Methods in Biology

Tuesday, June 15 at 09:30am (PDT)
Tuesday, June 15 at 05:30pm (BST)
Wednesday, June 16 at 01:30am (KST)

Tuesday (Wednesday) during the "MS07" time block.
Note: this minisymposium has multiple sessions. The second session is MS08-MFBM.


Erica Rutter (University of California, Merced, United States), Suzanne Sindi (University of California, Merced, United States)


As biological data becomes more detailed and ubiquitous, statistical and machine learning methods are needed to process and understand relationships in big data or to incorporate this data into existing mechanistic modeling frameworks. Here we present recent advances for machine learning and deep learning methodologies applied to a variety of biological processes, from single cell genomic analysis to population-wide disease spread. Methods of interest include biomedical image analysis via convolutional neural networks (CNNs), learning equations from data, and many more. The methods developed and discussed in this minisymposium span the range from purely statistical and machine learning models to hybridized mechanistic/machine learning models to data-driven mechanistic modeling.

Emilia Kozlowska

(Department of Systems Biology and Engineering, Silesian University of Technology, Poland)
"Application of mechanistic and machine learning modeling to predict long-term response to treatment for cancer patients"
The most common subtype of lung cancer is non-small cell lung cancer (NSCLC), which constitutes 80% of all lung cancer cases. NSCLC is usually diagnosed at an advanced stage because of non-specific symptoms, leading to high mortality. The standard treatment for NSCLC patients is a combination of chemotherapy and radiotherapy and, as an emerging type of treatment, immunotherapy. We collected clinical data from over 500 patients with non-small cell lung cancer. From the cohort, we extracted 50 patients who were treated only with platinum-based chemotherapy with palliative intent, i.e., under the assumption of failed cure. The clinical data, including, among others, short- and long-term response to chemotherapy and the number of chemotherapy cycles, were used to calibrate the mechanistic model using a machine learning approach. We developed a computational platform to find the best protocol for the administration of platinum-doublet chemotherapy in the palliative setting. The core of the platform is a mathematical model, in the form of a system of ordinary differential equations, describing the dynamics of platinum-sensitive and platinum-resistant cancer cells and interactions reflecting competition for space and resources. The model is simulated stochastically by sampling the parameter values from a joint probability distribution. The model simulations faithfully reproduce the clinical cohort at three levels: long-term response (overall survival, OS), initial response, and the relationship between the number of chemotherapy cycles and the time between two consecutive chemotherapy cycles. In addition, we investigated the relationship between initial and long-term responses. We showed that these two variables do not correlate, which means that we cannot predict patient survival based solely on the initial response. We also tested several chemotherapy schedules to find the best one for patients treated with palliative intent.
We found that the optimal treatment schedule depends, among other factors, on the strength of competition between sensitive and resistant subclones in a tumor.
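
The kind of model this abstract describes can be illustrated with a minimal sketch. The abstract does not give the equations, so everything below is an assumption: a Lotka-Volterra-style competition between a platinum-sensitive and a platinum-resistant population, a one-day drug pulse at the start of each cycle, and independent log-normal parameter draws standing in for the calibrated joint distribution. The function names (`simulate_tumor`, `sample_virtual_cohort`) and all parameter values are hypothetical.

```python
import random


def simulate_tumor(r_s, r_r, k, alpha, kill, cycle_days, n_cycles,
                   s0=0.5, r0=0.01, t_end=365.0, dt=0.05):
    """Forward-Euler integration of a hypothetical two-clone competition model.

        dS/dt = r_s * S * (1 - (S + alpha * R) / k) - kill * c(t) * S
        dR/dt = r_r * R * (1 - (R + alpha * S) / k)

    S and R are the platinum-sensitive and platinum-resistant burdens, k is
    the shared carrying capacity, and c(t) is 1 during the first day of each
    chemotherapy cycle and 0 otherwise. All names and values are illustrative.
    """
    s, r = s0, r0
    for step in range(int(round(t_end / dt))):
        t = step * dt
        cycle = int(t // cycle_days)
        on_drug = cycle < n_cycles and (t - cycle * cycle_days) < 1.0
        c = 1.0 if on_drug else 0.0
        ds = r_s * s * (1.0 - (s + alpha * r) / k) - kill * c * s
        dr = r_r * r * (1.0 - (r + alpha * s) / k)
        # Clamp at zero so numerical error cannot produce negative burdens.
        s = max(s + dt * ds, 0.0)
        r = max(r + dt * dr, 0.0)
    return s, r


def sample_virtual_cohort(n_patients, seed=0):
    """Simulate virtual patients with randomly drawn parameters.

    Assumption: independent log-normal and uniform draws stand in for the
    calibrated joint distribution, which the abstract does not specify.
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_patients):
        outcomes.append(simulate_tumor(
            r_s=rng.lognormvariate(-3.0, 0.2),   # median about 0.05 per day
            r_r=rng.lognormvariate(-3.5, 0.2),   # median about 0.03 per day
            k=1.0,
            alpha=rng.uniform(0.5, 1.5),         # competition strength
            kill=1.0,
            cycle_days=21.0,
            n_cycles=4,
        ))
    return outcomes
```

In this toy model, comparing `simulate_tumor(..., n_cycles=0)` with `simulate_tumor(..., n_cycles=5)` over 100 days shows the sensitive burden shrinking and the resistant burden growing under treatment, which is one way the strength of competition (`alpha`) can influence which schedule looks best.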

Sara Ranjbar

(Mathematical NeuroOncology Lab, Precision Neurotherapeutics Program, Mayo Clinic, Arizona, United States)
"MRI-based estimation of the abundance of immunohistochemistry markers in GBM brain using deep learning"
Glioblastoma (GBM) is a devastating primary brain tumor known for its heterogeneity and invasion. Despite uniformly aggressive therapies including surgery, radiation, and chemotherapy, median survival remains about 15 months. There are many targeted therapies in clinical trials; however, the eloquence of the location makes both the drug delivery and the local efficacy of any drug difficult to assess. Clinical imaging remains the primary modality to assess tumor response, but it is an obscured lens through which it is nearly impossible to distinguish between actual tumor growth and tumor cell death from a large immune response. Over the past decade, many studies have suggested that MRI reflects the underlying tumor biology. In this talk, we will discuss our group's approach to building a robust deep learning model to connect MRI patterns at GBM biopsied locations with cell proliferation abundance measured by immunohistochemistry staining. If successful, this model can provide a non-invasive readout of cell proliferation and reveal the effectiveness of a given cytotoxic therapy that targets cell proliferation, including standard-of-care radiotherapy.

Joan Ponce

(UCLA, United States)
"An integrated framework for building trustworthy data-driven epidemiological models: Application to the COVID-19 outbreak in New York City"
Epidemiological models can describe the dynamic evolution of a pandemic, but they rely on many assumptions and parameters that have to be adjusted over the course of the pandemic. However, the available data are often not sufficient to identify the model parameters and hence infer the unobserved dynamics. Here, we develop a general framework for building a trustworthy data-driven epidemiological model, consisting of a workflow that integrates data acquisition and event timeline, model development, identifiability analysis, sensitivity analysis, model calibration, model robustness analysis, and forecasting with uncertainties in different scenarios. In particular, we apply this framework to propose a modified susceptible–exposed–infectious–recovered (SEIR) model that includes new compartments and models vaccination, in order to forecast the transmission dynamics of COVID-19 in New York City (NYC). We find that we can uniquely estimate the model parameters and accurately predict the daily new infection cases, hospitalizations, and deaths, in agreement with the available data from the NYC government's website. In addition, we employ the calibrated data-driven model to study the effects of vaccination and of the timing of reopening indoor dining in NYC.
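
For readers unfamiliar with compartmental models, a minimal sketch of a basic SEIR model with a simple vaccination term follows. It is not the modified model from the talk: the abstract does not list the additional compartments, so this uses only the four standard ones, moves vaccinated susceptibles directly to the removed class at an assumed constant rate `nu`, and integrates with forward Euler. The function name `simulate_seir` and all parameter values are hypothetical.

```python
def simulate_seir(beta, sigma, gamma, nu, n_days,
                  s0=0.99, e0=0.0, i0=0.01, r0=0.0, dt=0.1):
    """Forward-Euler integration of a textbook SEIR model with vaccination.

        dS/dt = -beta * S * I - nu * S
        dE/dt =  beta * S * I - sigma * E
        dI/dt =  sigma * E - gamma * I
        dR/dt =  gamma * I + nu * S

    Compartments are population fractions. nu is an assumed constant
    per-day vaccination rate, not a quantity taken from the abstract.
    Returns the infectious fraction at the end of each day (starting with
    day 0) and the final (S, E, I, R) state.
    """
    s, e, i, r = s0, e0, i0, r0
    steps_per_day = int(round(1.0 / dt))
    infectious_by_day = [i]
    for _ in range(n_days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i
            new_infectious = sigma * e
            new_removed = gamma * i
            vaccinated = nu * s
            s += dt * (-new_exposed - vaccinated)
            e += dt * (new_exposed - new_infectious)
            i += dt * (new_infectious - new_removed)
            r += dt * (new_removed + vaccinated)
        infectious_by_day.append(i)
    return infectious_by_day, (s, e, i, r)
```

Running the sketch with and without vaccination, for example `simulate_seir(0.5, 0.2, 0.1, 0.01, 200)` against `simulate_seir(0.5, 0.2, 0.1, 0.0, 200)`, gives a scenario comparison in the same spirit as the vaccination study mentioned in the abstract, although calibration, identifiability analysis, and uncertainty quantification are all outside its scope.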

Emily Zhang

(North Carolina State University, USA)
"Deep Learning and Regression Approaches to Forecasting Blood Glucose Levels for Type 1 Diabetes"
Controlling blood glucose in the euglycemic range is the main goal of developing sensor-augmented pump therapy for type 1 diabetes patients. The pump therapy delivers an insulin dose determined from glucose predictions made by computational algorithms. A computationally efficient and accurate model that can capture the physiological nonlinear dynamics is critical for developing an accurate therapy device. Four data-driven models are compared, including different neural network architectures, a reservoir computing model, and a novel linear regression approach. Model predictions are evaluated over continuous 30- and 60-minute time horizons using real-world data from wearable sensor measurements, a continuous glucose monitor, and self-reported events through mobile applications. The four data-driven models are trained on around 32 days of data from 12 data contributors; 8 days of data are used for validation, with an additional 10 days of data for out-of-sample testing. Model performance is evaluated by the root mean squared error and the mean absolute error. A neural network model using an encoder-decoder architecture has the most stable performance and is able to recover missing dynamics in short time intervals. Regression models perform better at longer prediction horizons (i.e., 60 minutes) and at lower computational cost. The performance of several distinct models was tested on individual-level data from a type 1 diabetes data set. These results may enable a feasible solution with low computational costs for the time-dependent adjustment of pump therapy for diabetes patients.
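
The two error metrics named in this abstract are standard and can be sketched in a few lines. The persistence baseline below (predict that glucose stays at its current value) is an assumption added for illustration and is not one of the four models compared in the talk; the 5-minute sampling interval in the usage note is likewise assumed, and the helper name `persistence_forecast` is hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def persistence_forecast(series, horizon_steps):
    """Naive baseline: the forecast for time t + horizon is the value at t.

    Returns (targets, predictions), aligned so they can be scored directly.
    """
    if horizon_steps < 1 or horizon_steps >= len(series):
        raise ValueError("horizon_steps must be in [1, len(series) - 1]")
    targets = series[horizon_steps:]
    predictions = series[:len(series) - horizon_steps]
    return targets, predictions
```

With readings every 5 minutes, a 30-minute horizon corresponds to `horizon_steps=6` and a 60-minute horizon to `horizon_steps=12`; scoring `persistence_forecast` with `rmse` and `mae` gives a simple reference point against which any learned model can be compared.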

Hosted by SMB2021
Virtual conference of the Society for Mathematical Biology, 2021.