Predicting Early Brain Development via Deep Learning of DNA

2018 Award: $39,742

State-of-the-art data science, including advanced machine learning algorithms such as deep learning, gives us a new opportunity to uncover the genetic code underlying millions of genetic markers across the human genome. Our research project will build an advanced deep convolutional neural network that uses DNA markers to predict cognitive function at an early age. We will also adapt this deep learning framework to predict psychiatric disorders such as autism, schizophrenia, and bipolar disorder, so that the genetic code underlying mental disorders can be decoded.

Need/Problem: Innovative methodological strategies are urgently needed to predict short-term and long-term cognitive outcomes in early life, so that early intervention can begin before any cognitive delay appears.

Grant Summary: This grant supports the development of deep convolutional neural networks that predict early brain development from genome-wide DNA markers and developmental trajectories of cortical thickness and surface area.

Goals and Projected Outcomes: The goal of this study is to develop the first deep learning procedure for predicting early brain development, using data from two well-characterized population cohorts studied from birth through 2 years of age. If successful, the study will provide strong evidence that deep learning can effectively predict neurodevelopment and cognitive function from genome-wide DNA markers, which no competing method can currently do.

Kai Xia, PhD

Grant Details: Genome-wide association studies (GWAS) of adolescents and adults are transforming our understanding of how genetic variants impact brain structure and psychiatric risk. Far less is known about neurodevelopmental trajectories and cognitive maturation across the first two years of life, the most dynamic phase of postnatal brain development and one that is likely critical for understanding neurodevelopmental disorders, including autism and schizophrenia.

One significant challenge in addressing this research gap is identifying appropriate strategies for integrating data from multiple domains (genomic, neuroimaging, and cognitive assessments), each of which is high-dimensional in its own right. Another challenge is moving beyond population-level descriptions to individualized risk prediction and personalized medicine. The proposed research will address these issues using deep learning with convolutional neural networks. Deep learning combines lower-level representations (here, DNA markers) to yield higher-level representations (here, psychiatric phenotypes) of the input information. Unlike conventional machine learning techniques, which require handcrafted feature engineering, deep learning models learn their features automatically through tens or hundreds of layers of hierarchical representations. Deep learning has achieved state-of-the-art results in several areas, such as computer vision, natural language processing, and speech recognition, with applications in fields such as genomics and astronomy as well.
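The idea of stacking lower-level representations into higher-level ones can be illustrated with a toy forward pass. This is a minimal sketch, not the project's network: it assumes each DNA marker is encoded as a minor-allele count (0, 1, or 2), and the genotype values, the filter weights, and the function names are all hypothetical. In a real convolutional network the filter weights would be learned from data rather than written by hand.

```python
# Illustrative sketch only: a toy forward pass, not the project's network.
# Assumption: each DNA marker is encoded as a minor-allele count (0, 1, or 2).

def conv1d(signal, kernel):
    """Slide `kernel` over `signal` (no padding, stride 1)."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]

def relu(values):
    """Zero out negative activations."""
    return [max(0.0, v) for v in values]

def max_pool(values, size):
    """Keep the largest activation in each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# Hypothetical genotype for one individual across 8 neighbouring markers.
genotype = [0, 1, 2, 2, 0, 0, 1, 2]

# Hypothetical filter; in a real network these weights are learned.
kernel = [1.0, -1.0, 0.5]

# Lower-level features: one activation per window of three markers.
local_features = relu(conv1d(genotype, kernel))

# Higher-level features: a coarser summary of neighbouring windows.
pooled_features = max_pool(local_features, 2)
```

Each pooled value summarizes a wider stretch of markers than any single filter window, which is the sense in which deeper layers hold higher-level representations.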

The proposed research will combine state-of-the-art deep convolutional neural networks with biological indicators to predict psychiatric phenotypes. We expect this technique to predict cognitive outcomes more accurately than conventional machine learning techniques. After building the framework, we will focus on interpreting the features learnt by the model in order to decipher the biological indicators of the psychiatric phenotype. Building on that understanding, we will carry out post hoc analyses to identify the key contributors to the disorder. Through effective regularization and cross-validation, this deep learning model will predict cognitive performance at an early age from a group of genetic variants and neuroimaging phenotypes. This contribution will be significant because it would establish deep learning as a powerful approach for predicting short-term and long-term psychiatric outcomes from genetic and imaging data in early life. That is an essential step toward developing therapeutic interventions that correct adverse developmental trajectories and cognitive delays, ultimately preventing the onset of these disorders or reducing their severity.
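The role of regularization and cross-validation can be sketched in a few lines. To keep the example short, a one-feature ridge regression stands in for the deep network; this is a deliberate simplification, not the proposed model. The toy data, the candidate penalty strengths, and all function names are hypothetical. The sketch shows the general pattern: fit on the training folds with a given penalty, score on the held-out fold, and choose the penalty with the lowest average held-out error.

```python
# Illustrative sketch only: a one-feature ridge regression stands in for the
# deep network so that the cross-validation loop stays short and readable.

def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def fit_ridge(xs, ys, penalty):
    """Closed-form slope for y ~ w * x with an L2 penalty on w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

def mean_squared_error(xs, ys, weight):
    """Average squared prediction error of the fitted slope."""
    return sum((y - weight * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_validated_error(xs, ys, penalty, k):
    """Average held-out error over k folds for one penalty strength."""
    errors = []
    for train, test in k_fold_splits(len(xs), k):
        weight = fit_ridge([xs[i] for i in train], [ys[i] for i in train], penalty)
        errors.append(
            mean_squared_error([xs[i] for i in test], [ys[i] for i in test], weight)
        )
    return sum(errors) / len(errors)

# Hypothetical toy data: a marker-derived score (x) and a cognitive score (y).
xs = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 1.0, 2.0, 0.0, 1.0]
ys = [0.1, 1.2, 1.9, -0.2, 0.8, 2.3, 1.1, 1.8, 0.0, 0.9]

penalties = [0.0, 0.1, 1.0, 10.0]  # hypothetical candidate strengths
scores = {p: cross_validated_error(xs, ys, p, k=5) for p in penalties}
best_penalty = min(scores, key=scores.get)
```

Because every sample is held out exactly once, the averaged error estimates how the model would perform on individuals it has not seen, which is what matters for individualized prediction.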