Journal of Capital Medical University ›› 2019, Vol. 40 ›› Issue (5): 731-737. doi: 10.3969/j.issn.1006-7795.2019.05.013

Building a prediction model of brain tissues gene expression based on whole blood gene expression profiles

Xu Wenjian, Li Wei   

  1. Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute; MOE Key Laboratory of Major Diseases in Children; Genetics and Birth Defects Control Center, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing 100045, China
  • Received: 2019-03-13 Online: 2019-09-21 Published: 2019-12-16
  • Supported by:
    This study was supported by the Ministry of Science and Technology of China (2016YFC1000306).

Abstract: Objective Gene expression analysis is a powerful tool for explaining biological phenotypes and assisting disease diagnosis. However, collecting brain tissue samples for gene expression experiments is difficult, risky and expensive, so an alternative method is urgently needed to infer brain tissue expression profiles from more readily available samples. Methods Using whole blood gene expression profiles matched with brain tissue samples from the GTEx (Genotype-Tissue Expression) database as input features and the expression profiles of 13 brain tissues as targets, we mined many-to-many correlations between gene expression levels in whole blood and those in each specific brain tissue. We then constructed a regression model that predicts gene expression levels in unavailable brain tissue from whole blood gene expression profiles. Results A new low-dimensional feature dataset was constructed for each gene in each brain tissue by extracting the 15 most relevant gene expression features from whole blood, and a linear regression prediction model was constructed for all genes in the 13 brain tissues. The mean absolute error (MAE) of the prediction model was between 0.406 and 0.542, and the root mean square error (RMSE) was between 0.558 and 0.941. Conclusion A prediction model of gene expression in brain tissues based on whole blood gene expression profiles is proposed. The results indicate that gene expression in unsampled brain tissue can be predicted relatively accurately using whole blood expression profile data alone. Surgical sampling of brain tissue may therefore be avoidable in transcriptome research, providing an alternative for studying gene expression profiles in brain tissue-related diseases.
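
The per-gene workflow described in the abstract (select the 15 most relevant whole blood features, fit a linear regression, report MAE and RMSE) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state how relevance is measured, so ranking by absolute Pearson correlation is an assumption, as are the train/test split, the function names, and the synthetic data that stands in for GTEx profiles.

```python
import numpy as np

def select_top_features(X, y, k=15):
    """Indices of the k columns of X with the largest |Pearson r| against y.

    Assumption: "most relevant" is taken to mean highest absolute correlation.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant columns get correlation 0
    r = (Xc.T @ yc) / denom
    return np.argsort(-np.abs(r))[:k]

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, weights...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def evaluate_gene(blood_train, brain_train, blood_test, brain_test, k=15):
    """Select features on the training split only, then score on the held-out split."""
    idx = select_top_features(blood_train, brain_train, k)
    coef = fit_linear(blood_train[:, idx], brain_train)
    pred = predict_linear(coef, blood_test[:, idx])
    return idx, mae(brain_test, pred), rmse(brain_test, pred)

# Synthetic stand-in data (hypothetical): rows are matched donors,
# columns of `blood` are whole blood genes, `brain` is one gene in one brain tissue.
rng = np.random.default_rng(0)
n_samples, n_blood_genes = 120, 200
blood = rng.normal(size=(n_samples, n_blood_genes))
brain = 2.0 * blood[:, 3] - 1.5 * blood[:, 7] + rng.normal(scale=0.1, size=n_samples)

idx, gene_mae, gene_rmse = evaluate_gene(blood[:90], brain[:90], blood[90:], brain[90:])
print("selected blood genes:", sorted(idx.tolist()))
print("MAE:", gene_mae, "RMSE:", gene_rmse)
```

In a full pipeline this routine would presumably be repeated for every gene in each of the 13 brain tissues. Because RMSE weights large errors more heavily, it is never smaller than MAE on the same predictions, which is consistent with the ranges reported above.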

Key words: brain tissue, whole blood, gene expression, prediction model, feature selection
