Journal of Capital Medical University ›› 2022, Vol. 43 ›› Issue (4): 584-591. doi: 10.3969/j.issn.1006-7795.2022.04.012

• Medical Informatics: Application and Development •

A systematic approach to constructing DAG from knowledge graph

Bai Yongmei1,2,3, Sun Huage4, Du Jian2*   

    1. Institute of Medical Technology, Peking University Health Science Center, Peking University, Beijing 100191, China;
    2. National Institute of Health Data Science, Peking University, Beijing 100191, China;
    3. School of Public Health, Peking University, Beijing 100191, China;
    4. School of Mathematics and Statistics, The University of Melbourne, Melbourne 3010, Australia
  • Received: 2022-02-24 Online: 2022-08-21 Published: 2022-10-28
  • Contact: *E-mail: dujian@bjmu.edu.cn
  • Supported by:
    This study was supported by the National Natural Science Foundation of China (72074006), the Young Elite Scientists Sponsorship Program by China Association for Science and Technology (2017QNRC001), and Peking University Health Science Center (BMU2021YJ008).

Abstract: Causal inference, as opposed to correlation analysis, is the primary goal of observational research based on big data. Causal graphs, usually drawn as directed acyclic graphs (DAGs), visualize complex causal relationships by integrating a large amount of prior knowledge, and the DAG has become an important tool for developing causal inference strategies. However, constructing a causal graph for a specific research question currently relies heavily on expert knowledge and local experience. Extracting medical knowledge from existing publications is the basis for constructing DAGs systematically. This paper systematically introduces a structured medical knowledge platform built on the SemMedDB database of the National Institutes of Health. Taking an interdisciplinary perspective, the study defines a causal graph as a complex network linking the concepts involved in a research problem (the head and tail concepts) with all of their third-party variables, and on that basis attempts to provide a new strategy for systematically constructing DAGs. There are currently two main approaches to the systematic construction of causal graphs: (1) pruning a knowledge graph into a causal graph; (2) combining evidence claims, structured by the population-intervention/exposure-comparison-outcome (PI/ECO) framework, into causal graphs.

Key words: directed acyclic graph, knowledge graph, evidence synthesis, confounder, mediator, collider
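
The first approach named in the abstract, pruning a knowledge graph into a causal graph, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The triples are hypothetical toy data in a subject-predicate-object form rather than real SemMedDB records, the choice of which predicates count as causal is an assumption, and third-party variables are classified only by their direct edges to the head (exposure) and tail (outcome) concepts, which is a deliberate simplification.

```python
from collections import defaultdict

# Assumption: only these predicates are treated as causal when pruning.
CAUSAL_PREDICATES = {"CAUSES", "PREDISPOSES"}

# Hypothetical toy triples (subject, predicate, object); illustrative only,
# not medical claims and not taken from SemMedDB.
TRIPLES = [
    ("Diabetes", "CAUSES", "Stroke"),
    ("Obesity", "PREDISPOSES", "Diabetes"),
    ("Obesity", "PREDISPOSES", "Stroke"),
    ("Diabetes", "CAUSES", "Hypertension"),
    ("Hypertension", "CAUSES", "Stroke"),
    ("Diabetes", "CAUSES", "Hospitalization"),
    ("Stroke", "CAUSES", "Hospitalization"),
    ("Diabetes", "COEXISTS_WITH", "Depression"),  # non-causal, pruned away
]


def prune_to_causal_edges(triples, causal_predicates=CAUSAL_PREDICATES):
    """Keep only causal triples and return a parent -> children adjacency map."""
    children = defaultdict(set)
    for subj, pred, obj in triples:
        if pred in causal_predicates and subj != obj:
            children[subj].add(obj)
    return dict(children)


def all_nodes(children):
    """Collect every concept that appears as a parent or a child."""
    return set(children) | {c for cs in children.values() for c in cs}


def is_acyclic(children):
    """Kahn's algorithm: the graph is a DAG iff every node can be removed."""
    nodes = all_nodes(children)
    indegree = {n: 0 for n in nodes}
    for cs in children.values():
        for c in cs:
            indegree[c] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for c in children.get(node, ()):
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return visited == len(nodes)


def classify_third_variables(children, exposure, outcome):
    """Label third-party variables by their direct edges to exposure/outcome."""
    roles = {"confounder": set(), "mediator": set(), "collider": set()}
    from_exposure = children.get(exposure, set())
    from_outcome = children.get(outcome, set())
    for z in all_nodes(children) - {exposure, outcome}:
        from_z = children.get(z, set())
        if exposure in from_z and outcome in from_z:
            roles["confounder"].add(z)   # Z -> X and Z -> Y
        if z in from_exposure and outcome in from_z:
            roles["mediator"].add(z)     # X -> Z -> Y
        if z in from_exposure and z in from_outcome:
            roles["collider"].add(z)     # X -> Z <- Y
    return roles


if __name__ == "__main__":
    graph = prune_to_causal_edges(TRIPLES)
    print("acyclic:", is_acyclic(graph))
    print(classify_third_variables(graph, "Diabetes", "Stroke"))
```

A fuller treatment would follow indirect paths as well as direct edges and would need a rule for resolving cycles found after pruning; both are left out here to keep the sketch short.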
