A 3-Gene Random Forest Model to Diagnose Non-obstructive Azoospermia Based on Transcription Factor-Related Genes.
Non-obstructive azoospermia (NOA) is one of the most severe forms of male infertility, but diagnostic biomarkers with high sensitivity and specificity remain largely unknown. Transcription factors (TFs) play essential roles in many pathological processes across different diseases. Herein, we aimed to identify TFs with high diagnostic ability for NOA using machine learning algorithms. Transcriptome data of testicular tissue from 11 control and 47 NOA subjects were used as the training dataset; meanwhile, 1665 TFs were retrieved from the HumanTFDB. Through feature extraction methods, including genomic difference analysis, Lasso, Boruta, SVM-RFE, and logistic regression, ETV2, TBX2, and ZNF689 were ultimately selected and included in the random forest (RF) diagnostic model. The RF model displayed high predictive power in the training cohort (F-measure = 1) and in two external validation cohorts (n = 31, F-measure = 0.902; n = 20, F-measure = 0.941). Seminal plasma and testicular biopsy samples from 20 control and 20 NOA patients were collected from the local hospital, and the expression levels of ETV2, TBX2, and ZNF689 were measured via RT-qPCR and immunohistochemistry. The RF model could also distinguish the NOA samples in the local cohort (F-measure = 0.741). Single-cell RNA sequencing analysis, based on 432 testicular cell samples from an NOA patient, showed that ETV2, TBX2, and ZNF689 were all significantly associated with spermatogenesis. In all, a 3-TF random forest diagnostic model was successfully established, providing novel insights into the latent mechanisms of NOA.
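The abstract reports model performance as an F-measure. As a reminder of what that number summarizes, the following is a minimal sketch of the standard F-measure (harmonic mean of precision and recall). The function name `f_measure` and the toy labels are hypothetical and are not taken from the study; the authors' actual evaluation code is not described in the abstract.

```python
def f_measure(y_true, y_pred, positive=1):
    """Return the F-measure: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = NOA, 0 = control); NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]
score = f_measure(y_true, y_pred)
```

An F-measure of 1, as reported for the training cohort, means every sample in that cohort was classified correctly, so the external validation scores are the more informative indication of how the model generalizes.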
Zhou R
,Liang J
,Chen Q
,Tian H
,Yang C
,Liu C
... -
《-》
Identification and validation of SHC1 and FGFR1 as novel immune-related oxidative stress biomarkers of non-obstructive azoospermia.
Non-obstructive azoospermia (NOA) is a major contributor to male infertility. Herein, we used existing datasets to identify novel biomarkers for the diagnosis and prognosis of NOA, which could have great significance in the field of male infertility.
NOA datasets were obtained from the Gene Expression Omnibus (GEO) database. CIBERSORT was utilized to analyze the distributions of 22 immune cell populations. Hub genes were identified by applying weighted gene co-expression network analysis (WGCNA), machine learning methods, and protein-protein interaction (PPI) network analysis. The expression of hub genes was verified in external datasets and was assessed by receiver operating characteristic (ROC) curve analysis. Gene set enrichment analysis (GSEA) was applied to explore the important functions and pathways of hub genes. The mRNA-microRNA (miRNA)-transcription factors (TFs) regulatory network and potential drugs were predicted based on hub genes. Single-cell RNA sequencing data from the testes of patients with NOA were applied for analyzing the distribution of hub genes in single-cell clusters. Furthermore, testis tissue samples were obtained from patients with NOA and obstructive azoospermia (OA) who underwent testicular biopsy. RT-PCR and Western blot were used to validate hub gene expression.
Two immune-related oxidative stress hub genes (SHC1 and FGFR1) were identified. Both hub genes were highly expressed in NOA samples compared to control samples. ROC curve analysis showed a remarkable prediction ability (AUCs > 0.8). GSEA revealed that hub genes were predominantly enriched in toll-like receptor and Wnt signaling pathways. A total of 24 TFs, 82 miRNAs, and 111 potential drugs were predicted based on the two hub genes. Single-cell RNA sequencing data in NOA patients indicated that SHC1 and FGFR1 were highly expressed in endothelial cells and Leydig cells, respectively. RT-PCR and Western blot results showed that mRNA and protein levels of both hub genes were significantly upregulated in NOA testis tissue samples, which agrees with the findings from analysis of the microarray data.
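To illustrate what an AUC above 0.8 means for a single-gene marker, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count as one half). The function name `roc_auc` and the expression values are hypothetical; this is not the authors' pipeline, which the abstract does not specify.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical expression values (1 = NOA, 0 = control); NOT data from the study.
labels = [1, 1, 1, 0, 0]
expression = [8.2, 7.5, 6.1, 6.4, 5.0]
auc = roc_auc(labels, expression)
```

In this toy example five of the six NOA/control pairs are ordered correctly, so the AUC is 5/6. The pairwise form is quadratic in sample count, which is acceptable for cohort sizes typical of microarray studies.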
It appears that SHC1 and FGFR1 could be significant immune-related oxidative stress biomarkers for detecting and managing patients with NOA. Our findings provide a novel viewpoint for illustrating potential pathogenesis in men suffering from infertility.
Pan Y
,Chen X
,Zhou H
,Xu M
,Li Y
,Wang Q
,Xu Z
,Ren C
,Liu L
,Liu X
... -
《Frontiers in Endocrinology》
Identifying potential biomarkers for non-obstructive azoospermia using WGCNA and machine learning algorithms.
The cause and mechanism of non-obstructive azoospermia (NOA) are complicated; therefore, an effective therapeutic strategy is yet to be developed. This study aimed to analyse the pathogenesis of NOA at the molecular biological level and to identify the core regulatory genes, which could be utilised as potential biomarkers.
Three NOA microarray datasets (GSE45885, GSE108886, and GSE145467) were collected from the GEO database and merged into training sets; a further dataset (GSE45887) was then defined as the validation set. Differential gene analysis, consensus cluster analysis, and WGCNA were used to identify preliminary signature genes; then, enrichment analysis was applied to these previously screened signature genes. Next, 4 machine learning algorithms (RF, SVM, GLM, and XGB) were used to detect potential biomarkers that are most closely associated with NOA. Finally, a diagnostic model was constructed from these potential biomarkers and visualised as a nomogram. The differential expression and predictive reliability of the biomarkers were confirmed using the validation set. Furthermore, the competing endogenous RNA network was constructed to identify the regulatory mechanisms of potential biomarkers; further, the CIBERSORT algorithm was used to calculate immune infiltration status among the samples.
A total of 215 differentially expressed genes (DEGs) were identified between NOA and control groups (27 upregulated and 188 downregulated genes). The WGCNA results identified 1123 genes in the MEblue module as target genes that are highly correlated with NOA positivity. The NOA samples were divided into 2 clusters using consensus clustering; further, 1027 genes in the MEblue module, which were screened by WGCNA, were considered to be target genes that are highly correlated with NOA classification. The 129 overlapping genes were then established as signature genes. The XGB algorithm, which had the maximum AUC value (AUC = 0.946) and the minimum residual value, was used to further screen the signature genes. IL20RB, C9orf117, HILS1, PAOX, and DZIP1 were identified as potential NOA biomarkers. This five-biomarker model had the highest AUC value, of up to 0.982, compared to the single-biomarker models; additionally, the results of this biomarker model were verified in the validation set.
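The signature genes above are described as the overlap of several gene lists (the DEGs and the two WGCNA-derived module gene sets). A minimal sketch of that overlap step, assuming it is a plain set intersection, is shown below. The function name `overlapping_genes` and the placeholder gene identifiers are hypothetical; the abstract does not state exactly which lists were intersected or how.

```python
def overlapping_genes(*gene_lists):
    """Return the sorted list of genes present in every input list."""
    sets = [set(genes) for genes in gene_lists]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Placeholder identifiers only; NOT gene lists from the study.
degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
module_noa_status = ["GENE_B", "GENE_C", "GENE_E"]
module_noa_cluster = ["GENE_C", "GENE_B", "GENE_F"]

signature = overlapping_genes(degs, module_noa_status, module_noa_cluster)
```

Sorting the result makes the output deterministic, which helps when the overlapping list is passed on to downstream model fitting.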
As IL20RB, C9orf117, HILS1, PAOX, and DZIP1 have been determined to possess the strongest association with NOA, these five genes could be used as potential therapeutic targets for NOA patients. Furthermore, the model constructed using these five genes, which possessed the highest diagnostic accuracy, may be an effective biomarker model that warrants further experimental validation.
Tang Q
,Su Q
,Wei L
,Wang K
,Jiang T
... -
《Frontiers in Endocrinology》