Identification of diagnostic candidate genes in COVID-19 patients with sepsis.
Coronavirus Disease 2019 (COVID-19) and sepsis are closely related. This study aims to identify pivotal diagnostic candidate genes in COVID-19 patients with sepsis.
We obtained a COVID-19 data set and a sepsis data set from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) and module genes were identified using Linear Models for Microarray Data (LIMMA) and weighted gene co-expression network analysis (WGCNA). Functional enrichment analysis, protein-protein interaction (PPI) network construction, and machine learning algorithms (least absolute shrinkage and selection operator (LASSO) regression and Random Forest (RF)) were then used to identify candidate hub genes for the diagnosis of COVID-19 patients with sepsis. Receiver operating characteristic (ROC) curves were generated to assess diagnostic value. Finally, the data set GSE28750 was used to verify the core genes and analyze immune infiltration.
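As a rough illustration of how LASSO narrows a candidate gene list, the sketch below fits the squared-error form of LASSO by cyclic coordinate descent. It is not the authors' pipeline: diagnostic studies of this kind usually fit a penalised logistic model with an established package, and the function names, the `penalty` parameter, and the assumption that features and outcome are already centred are all choices made for this example.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; this is what zeroes out weak features."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimise (1/2n) * ||y - Xb||^2 + penalty * ||b||_1 by cyclic updates.

    X is a list of sample rows. No intercept is fitted, so features and
    outcome are assumed (for this sketch) to be centred beforehand.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # feature j against the residual that ignores feature j
            norm = 0.0  # squared length of feature j
            for i in range(n):
                partial = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            coef[j] = soft_threshold(rho / n, penalty) / (norm / n) if norm else 0.0
    return coef
```

Features whose coefficient is exactly zero after fitting are dropped; the rest form the LASSO-selected candidates. A larger `penalty` keeps fewer features.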
The COVID-19 data set contained 3,438 DEGs, and 595 common genes were screened in sepsis. Sepsis DEGs were mainly enriched in immune regulation. The intersection of DEGs for COVID-19 and core genes for sepsis comprised 329 genes, which were also mainly enriched in the immune system. After developing the PPI network, 17 node genes were filtered and thirteen candidate hub genes were selected for diagnostic value evaluation using machine learning. All thirteen candidate hub genes had diagnostic value, and eight genes with an Area Under the Curve (AUC) greater than 0.9 were selected as diagnostic genes.
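The AUC cut-off described above can be sketched as follows. The AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counting one half. The function names, the input layout, and the decision to flip AUCs below 0.5 (so that down-regulated markers are treated like up-regulated ones) are assumptions for this example, not details taken from the study.

```python
def auc_from_scores(case_scores, control_scores):
    """Probability that a random case outranks a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def select_diagnostic_genes(expression, labels, threshold=0.9):
    """Keep genes whose single-gene AUC exceeds threshold.

    expression: dict mapping gene name -> list of expression values
    labels:     list of 1 (case) / 0 (control), aligned with the values
    """
    selected = {}
    for gene, values in expression.items():
        cases = [v for v, lab in zip(values, labels) if lab == 1]
        controls = [v for v, lab in zip(values, labels) if lab == 0]
        auc = auc_from_scores(cases, controls)
        # Assumption: direction is ignored, so a down-regulated marker
        # with AUC 0.05 is scored as 0.95.
        auc = max(auc, 1.0 - auc)
        if auc > threshold:
            selected[gene] = auc
    return selected
```

The pairwise loop is quadratic in sample count, which is acceptable for cohorts of this size; a rank-based formula gives the same value faster on large data.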
Five core genes (CD3D, IL2RB, KLRC, CD5, and HLA-DQA1) associated with immune infiltration were identified to evaluate their diagnostic utility in COVID-19 patients with sepsis. This finding contributes to the identification of potential peripheral blood diagnostic candidate genes for COVID-19 patients with sepsis.
Li J, Pu S, Shu L, Guo M, He Z, ...
《Immunity Inflammation and Disease》
The significance of long chain non-coding RNA signature genes in the diagnosis and management of sepsis patients, and the development of a prediction model.
Sepsis is a life-threatening organ dysfunction caused by a dysregulated host response to infection. It carries high clinical morbidity and mortality, endangering patients' lives and health. The purpose of this study was to determine the value of the long chain non-coding RNAs (LncRNAs) RP3_508I15.21, RP11_295G20.2, and LDLRAD4_AS1 in the diagnosis of adult sepsis patients and to develop a nomogram prediction model.
We screened the adult sepsis microarray datasets GSE57065 and GSE95233 from the GEO database, using GSE95233 as the training set and GSE57065 as the validation set. Differentially expressed gene (DEG) analysis, weighted gene co-expression network analysis (WGCNA), and three machine learning methods, namely random forest, least absolute shrinkage and selection operator (LASSO), and support vector machine (SVM), were used to identify characteristic genes. Boxplot statistical analysis and ROC analysis were then used to evaluate these genes and build the nomogram prediction model.
GSE95233 yielded a total of 1069 genes, 102 of which were sepsis-related and 22 of which were non-sepsis controls. GSE57065 yielded a total of 899 genes, with 467 up-regulated and 432 down-regulated, including 82 sepsis-related genes and 25 non-sepsis control genes. After outlier samples were excluded, WGCNA retained 2,029 genes for analyzing the relationship between LncRNA network modules and sepsis versus non-sepsis status. Venn diagrams of the differential genes against the genes in key WGCNA modules were then used to find their intersection. Machine learning identified the sepsis-related characteristic LncRNAs RP3-508I15.21, RP11-295G20.2, LDLRAD4-AS1, and CTD-2542L18.1. Boxplots of these screened genes were drawn for GSE95233 and GSE57065 separately. The p-value between the sepsis and non-sepsis groups was less than 0.05, indicating that the differences were statistically significant. In ROC analysis of dataset GSE57065, CTD-2542L18.1 had an AUC of 0.638, which was below 0.7 and did not indicate diagnostic significance, whereas RP3-508I15.21, RP11-295G20.2, and LDLRAD4-AS1 had AUC values greater than 0.7. In dataset GSE95233, all four sepsis-associated LncRNAs had AUC values greater than 0.7, indicating diagnostic significance.
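The Venn-diagram step amounts to set operations on two gene lists. The sketch below is a minimal illustration; the function name and the region labels are assumptions for this example, and any gene identifiers passed in would come from the upstream DEG and WGCNA steps.

```python
def venn_regions(deg_genes, module_genes):
    """Split two gene lists into the three regions of a two-set Venn diagram."""
    deg, module = set(deg_genes), set(module_genes)
    return {
        "deg_only": sorted(deg - module),     # differential, not in a key module
        "shared": sorted(deg & module),       # candidates carried forward
        "module_only": sorted(module - deg),  # in a key module, not differential
    }
```

Only the `shared` region would be passed on to the machine learning step; sorting keeps the output reproducible between runs.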
LncRNAs RP3_508I15.21, RP11_295G20.2, and LDLRAD4_AS1 have some utility in the diagnosis and treatment of adult sepsis patients and may serve as a reference for guiding the clinical diagnosis and treatment of sepsis.
Bai Y, Gao J, Yan Y, Zhao X, ...
《Frontiers in Immunology》
Identification of immune-related mitochondrial metabolic disorder genes in septic shock using bioinformatics and machine learning.
Mitochondria are involved in septic shock and inflammatory response syndrome, which severely threaten patients' lives. It is necessary to identify and explore the immune-mitochondrial genes in septic shock.
The GSE57065 dataset was acquired from the Gene Expression Omnibus (GEO) database and filtered by limma and weighted correlation network analysis (WGCNA) to identify mitochondrial-related differentially expressed genes (MitoDEGs) in septic shock. The function of MitoDEGs was analyzed using Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, and Gene Set Enrichment Analysis (GSEA). The Protein-Protein Interaction (PPI) network composed of MitoDEGs was established using Cytoscape. Support Vector Machine Recursive Feature Elimination (SVM-RFE), Random Forest (RF), and Least Absolute Shrinkage and Selection Operator (LASSO) were used to identify diagnostic MitoDEGs, which were validated using receiver operating characteristic (ROC) analysis and Quantitative Real-time Reverse Transcription Polymerase Chain Reaction (qRT-PCR). Furthermore, the infiltration of immunocytes was analyzed using CIBERSORT, and the correlation between diagnostic MitoDEGs and immunocytes was explored using Spearman correlation.
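The Spearman step relates a gene's expression to an immune cell fraction across samples. A minimal sketch is given below: it ranks both variables, averaging ranks for ties, and takes the Pearson correlation of the ranks. The function names are assumptions for this example, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (undefined if either is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Here `x` would hold one gene's expression per sample and `y` the estimated fraction of one cell type in the same samples. Because only ranks are used, any monotone relationship gives a coefficient of 1 or -1, regardless of scale.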
A total of 44 MitoDEGs were filtered, and functional enrichment analysis showed that they were associated with mitochondrial function. The PPI network had 457 nodes and 547 edges. Four diagnostic MitoDEGs (PGS1, C6orf136, THEM4, and EPHX2) were identified by three machine learning algorithms, and qRT-PCR showed expression levels consistent with the bioinformatics analysis. Furthermore, the diagnostic model constructed from the diagnostic genes had good diagnostic efficacy. Immunocyte infiltration analysis showed that activated immunocytes were abundant and correlated with hub genes, with neutrophils accounting for the largest proportion in septic shock.
In this study, we identified four immune-mitochondrial key genes (PGS1, C6orf136, THEM4, and EPHX2) in septic shock and designed a novel gene diagnosis model that provides a new and meaningful way to diagnose septic shock.
Cui YH, Wu CR, Huang LO, Xu D, Tang JG, ...
《HEREDITAS》
Identification of immune-related genes in diagnosing atherosclerosis with rheumatoid arthritis through bioinformatics analysis and machine learning.
Increasing evidence shows that rheumatoid arthritis (RA) can aggravate atherosclerosis (AS), and we aimed to explore potential diagnostic genes for patients with AS and RA.
We obtained the data from public databases, including Gene Expression Omnibus (GEO) and STRING, and obtained the differentially expressed genes (DEGs) and module genes with Limma and weighted gene co-expression network analysis (WGCNA). Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analysis, the protein-protein interaction (PPI) network, and machine learning algorithms [least absolute shrinkage and selection operator (LASSO) regression and random forest] were used to explore the immune-related hub genes. We used a nomogram and receiver operating characteristic (ROC) curves to assess diagnostic efficacy, which was validated with GSE55235 and GSE73754. Finally, immune infiltration was analyzed in AS.
The AS dataset included 5,322 DEGs, while there were 1,439 DEGs and 206 module genes in RA. The intersection of DEGs for AS and crucial genes for RA comprised 53 genes, which were involved in immunity. After PPI network construction and machine learning, six hub genes were used to build a nomogram and assess diagnostic efficacy, which showed good diagnostic value (area under the curve from 0.723 to 1). Immune infiltration analysis also revealed dysregulation of immunocytes.
Six immune-related hub genes (NFIL3, EED, GRK2, MAP3K11, RMI1, and TPST1) were identified, and the nomogram was developed for the diagnosis of AS with RA.
Liu F, Huang Y, Liu F, Wang H, ...
《Frontiers in Immunology》