Journal of Capital Medical University ›› 2024, Vol. 45 ›› Issue (5): 900-906. doi: 10.3969/j.issn.1006-7795.2024.05.023

• Basic Research •

Identification and verification study of early-onset and late-onset preeclampsia genetic biomarkers using bioinformatics analysis

Zhao Xuanyu, Jiang Yan, Sui Feng*

  1. Maternal Intensive Care Unit, Beijing Obstetrics and Gynecology Hospital, Capital Medical University / Beijing Maternal and Child Health Care Hospital, Beijing 100006, China
  • Received: 2024-02-29 Online: 2024-10-21 Published: 2024-10-18
  • Corresponding author: Sui Feng E-mail: suifeng@mail.ccmu.edu.cn

Abstract: Objective  To identify characteristic genetic biomarkers that can differentiate early-onset from late-onset preeclampsia and to evaluate their discriminative ability. Methods  Early-onset and late-onset preeclampsia data sets (GSE74341, GSE190639 and GSE4707) were obtained from the GEO database. GSE74341 and GSE190639 served as the experimental group and GSE4707 as the validation group for screening and verifying genes differentially expressed between early-onset and late-onset preeclampsia. Two machine learning methods, least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE), were used to select characteristic genes, and their discriminative ability was evaluated with receiver operating characteristic (ROC) curves. Results  Compared with early-onset preeclampsia, late-onset preeclampsia showed seven significantly upregulated genes and three significantly downregulated genes. Combining the two machine learning methods with differential expression analysis in the validation group yielded one characteristic gene (MME). The area under the ROC curve (AUC) was 0.975 (95% CI: 0.921–1.000) in the experimental group and 1.000 (95% CI: 1.000–1.000) in the validation group. Conclusions  MME may serve as a characteristic genetic biomarker for distinguishing early-onset from late-onset preeclampsia.
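As a rough illustration of the ROC step, the sketch below computes the AUC of a single marker and a percentile bootstrap confidence interval. The expression values are hypothetical, not data from the study, and the bootstrap is an assumption: the abstract does not say how the reported intervals were derived. It also shows why a perfectly separated validation set yields an interval of 1.000–1.000: every resample is perfectly separated as well.

```python
import random


def auc(pos, neg):
    """AUC as the probability that a random 'pos' sample scores higher
    than a random 'neg' sample (ties count as 0.5)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(pos, neg, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC, resampling each group
    with replacement (an assumed method, chosen for illustration)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        p = [rng.choice(pos) for _ in pos]
        n = [rng.choice(neg) for _ in neg]
        stats.append(auc(p, n))
    stats.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]


# Hypothetical marker expression values for two groups (not study data).
separated_pos = [8.1, 8.4, 8.9, 9.2]
separated_neg = [6.0, 6.5, 7.1, 7.3]
overlap_pos = [7.0, 8.4, 8.9, 9.2]  # one sample falls inside the other group

auc_separated = auc(separated_pos, separated_neg)        # 1.0
ci_separated = bootstrap_ci(separated_pos, separated_neg)  # (1.0, 1.0)
auc_overlap = auc(overlap_pos, separated_neg)            # 14 of 16 pairs: 0.875
ci_overlap = bootstrap_ci(overlap_pos, separated_neg)
```

With groups this small, a degenerate interval reflects complete separation in the sample rather than certainty about performance in new patients.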

Key words: early-onset and late-onset preeclampsia, bioinformatics analysis, machine learning, characteristic genetic markers

CLC number: