-
A clinical prediction model for predicting the risk of liver metastasis from renal cell carcinoma based on machine learning.
Renal cell carcinoma (RCC) is a highly metastatic urological cancer. RCC with liver metastasis (LM) carries a dismal prognosis. The objective of this study was to develop a machine learning (ML) model that predicts the risk of LM in patients with RCC and can be used to assist clinical treatment.
Retrospective data on 42,547 patients with RCC were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. ML comprises a family of algorithmic methods and is a fast-growing field that has been widely used in biomedicine. Logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGB), random forest (RF), decision tree (DT), and naive Bayes classifier (NBC) were applied to develop models that predict the risk of RCC with LM. The six models were evaluated by 10-fold cross-validation, and the best-performing model was selected based on the area under the curve (AUC) value. An online web calculator was constructed based on the best ML model.
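The model-selection step described above (10-fold cross-validation, then choosing the model with the highest mean AUC) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data are synthetic stand-ins generated with scikit-learn rather than SEER records, the hyperparameters are arbitrary, and XGB is left out to avoid an extra dependency, so only five of the six algorithm families appear.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for six clinical predictors (assumption, NOT SEER data).
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)

# Candidate models; hyperparameters are illustrative guesses, not the paper's settings.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "GBM": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "DT": DecisionTreeClassifier(max_depth=4, random_state=0),
    "NBC": GaussianNB(),
}

# Stratified folds keep the rare positive class represented in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Mean AUC over the 10 folds for each model.
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

# Pick the model with the highest mean cross-validated AUC.
best_model = max(mean_auc, key=mean_auc.get)
print(best_model, round(mean_auc[best_model], 3))
```

Which model wins on synthetic data says nothing about the ranking reported in the abstract; the sketch only shows the selection procedure.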
Multivariate regression analysis identified bone metastasis, lung metastasis, grade, T stage, N stage, and tumor size as independent risk factors for the development of RCC with LM. In addition, the correlations among the six clinical variables were shown by a heat map. Among the six ML algorithms used to predict RCC with LM, the XGB model had a mean AUC of 0.947. Based on the XGB model, the web calculator (https://share.streamlit.io/liuwencai4/renal_liver/main/renal_liver.py) was developed to evaluate the risk of RCC with LM.
The XGB model had the best predictive performance for RCC with LM. The web calculator constructed from the XGB model has great potential to help clinicians make clinical decisions and improve the prognosis of RCC patients with LM.
Wang Z, Xu C, Liu W, Zhang M, Zou J, Shao M, Feng X, Yang Q, Li W, Shi X, Zang G, Yin C ... -
《Frontiers in Endocrinology》
-
Development and validation of a machine learning model to predict the risk of lymph node metastasis in renal carcinoma.
Studies have shown that about 30% of kidney cancer patients will develop metastasis, and lymph node metastasis (LNM) may be related to a poor prognosis. Our retrospective study aims to provide a reliable machine learning-based model to predict the occurrence of LNM in kidney cancer. Using a training group (n=39,016) drawn from the SEER database and a validation group (n=771) from the medical center, we identified pathological grade, liver metastasis, M stage, primary site, T stage, and tumor size as independent predictors of LNM. Among prediction models built with six different algorithms, the XGB model performed significantly better than every other machine learning model in both the training group and the validation group. The results show that prediction tools based on machine learning can accurately predict the probability of LNM in patients with kidney cancer and have satisfactory clinical application prospects.
Lymph node metastasis (LNM) is associated with the prognosis of patients with kidney cancer. This study aimed to provide reliable machine learning-based (ML-based) models to predict the probability of LNM in kidney cancer.
Data on patients diagnosed with kidney cancer from 2010 to 2017 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database, and variables were filtered by least absolute shrinkage and selection operator (LASSO), univariate, and multivariate logistic regression analyses. Statistically significant risk factors were used to build predictive models. The models were validated with 10-fold cross-validation, and the area under the receiver operating characteristic curve (AUC) was used to assess model performance. Correlation heat maps were used to investigate the correlation of features, and permutation analysis was used to assess the importance of predictors. Probability density functions (PDFs) and clinical utility curves (CUCs) were used to determine clinical utility thresholds.
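As a reminder of what the AUC measures, it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function below is a minimal pure-Python sketch of that definition; the name `auc_by_pairs` and the toy scores are illustrative assumptions, and library implementations use faster rank-based methods.

```python
def auc_by_pairs(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc_by_pairs([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.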
The training cohort of this study included 39,016 patients, and the validation cohort included 771 patients. In the two cohorts, 2,544 (6.5%) and 66 (8.1%) patients had LNM, respectively. Pathological grade, liver metastasis, M stage, primary site, T stage, and tumor size were independent predictive factors of LNM. In both validations, the XGB model significantly outperformed all of the other machine learning models, with an AUC value of 0.916. A web calculator (https://share.streamlit.io/liuwencai4/renal_lnm/main/renal_lnm.py) was built based on the XGB model. Based on the PDF and CUC, we suggested 54.6% as a threshold probability for guiding the diagnosis of LNM, which could distinguish about 89% of LNM patients.
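To make the idea of a threshold probability concrete, the sketch below computes sensitivity and specificity when every predicted probability at or above a cutoff is called positive. It does not reproduce the PDF/CUC procedure, whose details are not given in the abstract; the function name `sensitivity_specificity` and the ten toy patients are assumptions for illustration only.

```python
def sensitivity_specificity(y_true, y_prob, threshold):
    """Classify prob >= threshold as positive and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels and predicted probabilities (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.5, 0.4, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(y_true, y_prob, threshold=0.546)
print(sens, spec)
```

Lowering the threshold raises sensitivity at the cost of specificity, which is the trade-off a clinical utility threshold is meant to balance.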
The predictive tool based on machine learning can precisely indicate the probability of LNM in kidney cancer patients and has a satisfying application prospect in clinical practice.
Feng X, Hong T, Liu W, Xu C, Li W, Yang B, Song Y, Li T, Li W, Zhou H, Yin C ... -
《Frontiers in Endocrinology》
-
Establishment and Validation of a Machine Learning Prediction Model Based on Big Data for Predicting the Risk of Bone Metastasis in Renal Cell Carcinoma Patients.
Since the prognosis of renal cell carcinoma (RCC) patients with bone metastasis (BM) is poor, this study aimed to use big data to build a machine learning (ML) model to predict the risk of BM in RCC patients.
A retrospective study was conducted on 40,355 RCC patients in the SEER database from 2010 to 2017. LASSO regression and multivariate logistic regression analyses were performed to determine independent risk factors of RCC-BM. Six ML algorithm models, including LR, GBM, XGB, RF, DT, and NBC, were used to establish risk models for predicting RCC-BM. The prediction performance of the ML models was assessed by 10-fold cross-validation.
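For a rare outcome such as BM, 10-fold cross-validation is usually run with stratified folds so that each held-out fold contains a similar share of positive cases. Whether the authors stratified is not stated in the abstract; the pure-Python generator below is only a sketch of the partitioning idea, with a hypothetical function name and toy labels at roughly the reported 4.5% event rate.

```python
import random


def stratified_kfold_indices(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for k folds, stratified by label."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        # Shuffle the indices of this class, then deal them round-robin into folds.
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(i for f, fold in enumerate(folds) if f != held_out for i in fold)
        yield train, test


# Toy labels: 45 positives among 1,000 cases (hypothetical, not SEER data).
labels = [1] * 45 + [0] * 955
splits = list(stratified_kfold_indices(labels, k=10, seed=0))

for train, test in splits:
    positives = sum(labels[i] for i in test)
    print(len(train), len(test), positives)
```

Each case appears in exactly one test fold, so every prediction used for the averaged AUC comes from a model that never saw that case during training.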
The study investigated 40,355 patients diagnosed with RCC in the SEER database, of whom 1,811 (4.5%) had BM. Independent risk factors for BM were tumor grade, T stage, N stage, liver metastasis, lung metastasis, and brain metastasis. Among the RCC-BM risk prediction models established by six ML algorithms, the XGB model showed the best prediction performance (AUC = 0.891). Therefore, a web calculator based on the XGB model was established to individually assess the risk of BM in patients with RCC.
The XGB risk prediction model based on the ML algorithm showed good predictive performance for BM in RCC patients.
Xu C, Liu W, Yin C, Li W, Liu J, Sheng W, Tang H, Li W, Zhang Q ... -
《-》
-
An External-Validated Prediction Model to Predict Lung Metastasis among Osteosarcoma: A Multicenter Analysis Based on Machine Learning.
Lung metastasis greatly affects medical therapeutic strategies in osteosarcoma. This study aimed to develop and validate a clinical prediction model to predict the risk of lung metastasis among osteosarcoma patients based on machine learning (ML) algorithms.
We retrospectively collected osteosarcoma patients from the Surveillance, Epidemiology, and End Results (SEER) database and from four hospitals in China. Six ML algorithms, including logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), random forest (RF), decision tree (DT), and multilayer perceptron (MLP), were applied to build models for predicting lung metastasis using patient demographics, clinical characteristics, and therapeutic variables from the SEER database. Each model was internally validated using 10-fold cross-validation to calculate the mean area under the curve (AUC) and externally validated using the Chinese multicenter osteosarcoma data. The relative importance ranking of predictors was plotted to understand the importance of each predictor in the different ML algorithms, and a correlation heat map was plotted to show the correlations among predictors. The model with the highest AUC value in 10-fold cross-validation and in the external validation ROC curve was selected to build a web calculator.
Of all enrolled patients from the SEER database, 17.73% (194/1094) developed lung metastasis. Multiple logistic regression analysis showed that sex, N stage, T stage, surgery, and bone metastasis were all independent risk factors for lung metastasis. In predicting lung metastasis, the mean AUCs of the six ML algorithms ranged from 0.711 to 0.738 in internal validation and from 0.697 to 0.729 in external validation. Among the six ML algorithms, the extreme gradient boosting (XGBoost) model had the highest AUC value, with an average internal AUC of 0.738 and an external AUC of 0.729. The best-performing ML model was used to build a web calculator that helps clinicians calculate the risk of lung metastasis for each patient.
The XGBoost model may have the best prediction effect and the online calculator based on this model can help doctors to determine the lung metastasis risk of osteosarcoma patients and help to make individualized medical strategies.
Li W, Liu W, Hussain Memon F, Wang B, Xu C, Dong S, Wang H, Hu Z, Quan X, Deng Y, Liu Q, Su S, Yin C ... -
《-》
-
A Machine Learning-Based Predictive Model for Predicting Lymph Node Metastasis in Patients With Ewing's Sarcoma.
To provide a reference for clinicians and bring convenience to clinical work, we sought to develop and validate a risk prediction model for lymph node metastasis (LNM) of Ewing's sarcoma (ES) based on machine learning (ML) algorithms.
Clinicopathological data of 923 ES patients from the Surveillance, Epidemiology, and End Results (SEER) database and 51 ES patients from a multi-center external validation set were retrospectively collected. We applied ML algorithms to establish a risk prediction model. Model performance was checked using 10-fold cross-validation in the training set and receiver operating characteristic (ROC) curve analysis in the external validation set. After determining the best model, a web-based calculator was built to promote clinical application.
LNM was confirmed or could not be evaluated in 13.86% (135 of 974) of ES patients. In multivariate logistic regression, race, T stage, M stage, and lung metastases were independent predictors of LNM in ES. Six prediction models were established using random forest (RF), naive Bayes classifier (NBC), decision tree (DT), extreme gradient boosting (XGB), gradient boosting machine (GBM), and logistic regression (LR). In 10-fold cross-validation, the average area under the curve (AUC) ranged from 0.705 to 0.764. In ROC curve analysis, the AUC ranged from 0.612 to 0.727. The RF model performed best. Accordingly, a web-based calculator was developed (https://share.streamlit.io/liuwencai2/es_lnm/main/es_lnm.py).
With the help of clinicopathological data, clinicians can better identify LNM in ES patients. Risk prediction models established in this study performed well, especially the RF model.
Li W, Zhou Q, Liu W, Xu C, Tang ZR, Dong S, Wang H, Li W, Zhang K, Li R, Zhang W, Hu Z, Shibin S, Liu Q, Kuang S, Yin C ... -
《Frontiers in Medicine》