Radiomics-based machine learning in the differentiation of benign and malignant bowel wall thickening.
To distinguish malignant from benign bowel wall thickening (BWT) using machine learning (ML) models built on computed tomography (CT) texture features, and to compare their performance with that of a clinical model and a combined model.
One hundred twenty-two patients with BWT identified on contrast-enhanced abdominal CT who underwent colonoscopy were included in this retrospective study. Texture features were extracted from CT images using LifeX software. Feature selection and reduction were performed using the Least Absolute Shrinkage and Selection Operator (LASSO), which selected six radiomic features. The clinical model included six features (age, gender, thickness, fat stranding, symmetry, and lymph node). The combined model used the six radiomic and six clinical features together. Classification was performed with two machine learning algorithms: Support Vector Machine (SVM) and Logistic Regression (LR). Each data set was divided into an 80% training set and a 20% test set. Both classifiers were then trained on all three data sets (radiomic, clinical, and combined), and each model was evaluated on the test set, which consisted of data not used during training.
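The workflow described above (LASSO-based selection of six features, an 80/20 split, then SVM and LR classifiers scored by AUC) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, scikit-learn is assumed to be available, the hyperparameters are arbitrary, and fitting the selection step on the training split only is a choice made here, not something the abstract specifies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a radiomics table (assumption): 122 cases, 40 features.
X, y = make_classification(
    n_samples=122, n_features=40, n_informative=6, random_state=0
)

# 80% training / 20% test, stratified so both splits keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def build(classifier):
    """Scale, keep the six features with the largest LASSO coefficients, classify."""
    return make_pipeline(
        StandardScaler(),
        # threshold=-inf means selection is governed by max_features alone.
        SelectFromModel(
            LassoCV(cv=5, random_state=0), threshold=-np.inf, max_features=6
        ),
        classifier,
    )


models = {
    "SVM": build(SVC(kernel="rbf")),
    "LR": build(LogisticRegression(max_iter=1000)),
}

test_auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)  # selection and classifier see training data only
    scores = model.decision_function(X_test)
    test_auc[name] = roc_auc_score(y_test, scores)
    print(name, "test AUC:", round(test_auc[name], 3))
```

Wrapping selection and classification in one pipeline keeps the test cases out of every fitted step, which is one common way to avoid optimistic test-set estimates.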
In the training set, the combined model performed best, with an area under the curve (AUC) of 0.99 for SVM and 0.95 for LR. The radiomic-derived model reached an AUC of 0.87 with SVM and 0.79 with LR, and the clinical model reached 0.95 with SVM and 0.92 with LR. In the test set, the most successful classifier for distinguishing malignant wall thickening was SVM in the radiomic-derived model, with an AUC of 0.90. For the other models, AUC values ranged from 0.75 to 0.86 and accuracy values from 0.72 to 0.84.
In conclusion, radiomic-based machine learning has shown high success in distinguishing malignant and benign BWT and may improve diagnostic accuracy compared to clinical features only. The results of our study may help ensure early diagnosis and treatment of colorectal cancers by facilitating the recognition of malignant BWT.
Bülbül HM
,Burakgazi G
,Kesimal U
,Kaba E
... -
《-》
An arterial spin labeling-based radiomics signature and machine learning for the prediction and detection of various stages of kidney damage due to diabetes.
The aim of this study was to assess the predictive capabilities of a radiomics signature obtained from arterial spin labeling (ASL) imaging in forecasting and detecting stages of kidney damage in patients with diabetes mellitus (DM), as well as to analyze the correlation between texture feature parameters and biological clinical indicators. Additionally, this study seeks to identify the imaging risk factors associated with early renal injury in diabetic patients, with the ultimate goal of offering novel insights for predicting and diagnosing early renal injury and its progression in patients with DM.
In total, 42 healthy volunteers (Group A); 68 individuals with diabetes (Group B) who exhibited microalbuminuria, defined by a urinary albumin-to-creatinine ratio (ACR) < 30 mg/g and an estimated glomerular filtration rate (eGFR) within the range of 60-120 mL/min/1.73m²; and 53 patients with diabetic nephropathy (Group C) were included in the study. ASL was performed using magnetic resonance imaging (MRI) at 3.0T. A radiologist manually delineated regions of interest (ROIs) on the ASL maps of both the right and left kidney cortex. Texture features were extracted from the ROIs using MaZda software. Feature selection was performed using several methods: the Fisher coefficient, mutual information (MI), and probability of classification error combined with average correlation coefficient (POE + ACC). A radiomics model was developed to detect early diabetic renal injury, extract imaging risk factors associated with early diabetic renal injury, and examine the relationship between significant texture feature parameters and biological clinical indicators. Patients with DM and kidney injury were followed prospectively. Seven machine learning algorithms were used to develop a radiomics detection model and a comprehensive predictive model for assessing the progression of kidney damage in patients with DM. The diagnostic efficacy of the models in detecting variations in diabetic kidney damage over time was evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Empower (R) was used to establish correlations between clinical biological indicators and texture feature metrics. Statistical analysis was conducted using R, Python, MedCalc 15.8, and GraphPad Prism 8.
A total of 367 texture features were extracted from the ROIs in the kidneys and refined based on selection criteria using MaZda software across groups A, B, and C. The renal blood flow (RBF) values of the renal cortex in groups A, B, and C exhibited a decreasing trend, with values of 256.458 ± 54.256 mL/100g/min, 213.846 ± 52.109 mL/100g/min, and 170.204 ± 34.992 mL/100g/min, respectively. There was a positive correlation between kidney RBF and eGFR (r = 0.439, P < 0.001). The negative correlation between RBF and various clinical parameters including urinary albumin-to-creatinine ratio (UACR), body mass index (BMI), diastolic blood pressure (DBP), blood urea nitrogen (BUN), and serum creatinine (SCr) was investigated. Using a least absolute shrinkage and selection operator (LASSO) regression model, the study identified the eight most significant texture features and biological indicators, namely GeoY, GeoRf, GeoRff, GeoRh, GeoW8, GeoW12, S (0, 4) Entropy, and S (5, -5) Entropy. Spearman correlation analysis revealed associations between imaging markers in early diabetic patients with kidney damage and factors such as age, systolic blood pressure (SBP), alanine transaminase (ALT), aspartate aminotransferase (AST), albumin, uric acid (UA), microalbuminuria (UMA), UACR, 24h urinary protein, fasting blood glucose (FBG), two-hour postprandial blood glucose (P2BG), and HbA1c. The ASL-based detection model identified renal injury in patients with DM across different stages with a sensitivity of 85.1%, specificity of 65.5%, and an AUC of 0.865. Additionally, a comprehensive prediction model combining imaging labels and biological indicators, with the naive Bayes machine learning algorithm as the best model, demonstrated an AUC of 0.734, accuracy of 0.74, and precision of 0.43.
ASL imaging sequences demonstrated the ability to accurately detect alterations in kidney function and blood flow in patients with DM. Strong associations were observed between renal blood flow values in ASL imaging and established clinical biomarkers. These values show promise in detecting early microstructural changes in the kidneys of diabetic patients. Utilizing image markers in conjunction with clinical indicators was effective in identifying early renal dysfunction and its progression in individuals with DM. Furthermore, the integration of imaging texture feature parameters with clinical biomarkers holds significant potential for predicting early renal damage and its progression in patients with diabetes.
Ma F
,Shao X
,Zhang Y
,Li J
,Li Q
,Sun H
,Wang T
,Liu H
,Zhao F
,Chen L
,Chen J
,Zhou S
,Ji Q
,Yu P
... -
《Frontiers in Endocrinology》
Machine learning constructs a diagnostic prediction model for calculous pyonephrosis.
To provide decision-making support for the auxiliary diagnosis and individualized treatment of calculous pyonephrosis, this study aimed to analyze the clinical features of the condition, investigate its risk factors, and develop a prediction model using machine learning techniques. A retrospective analysis was conducted on the clinical data of 268 patients with calculous renal pelvic effusion who underwent ultrasonography-guided percutaneous renal puncture and drainage in our hospital between January 2018 and December 2022. The patients were divided into two groups, one with pyonephrosis and the other with hydronephrosis. The research cohort was randomly split into training and testing data sets at a ratio of 7:3. Single-factor analysis was used to examine the 43 characteristics of the hydronephrosis group and the pyonephrosis group using the t test, Spearman rank correlation test, and chi-square test. Disparities in the characteristic distributions between the two groups in the training and test sets were noted. Features were filtered on the training set using the least absolute shrinkage and selection operator (LASSO). Auxiliary diagnostic prediction models were established using the following five machine learning (ML) algorithms: random forest (RF), extreme gradient boosting (XGBoost), support vector machines (SVM), gradient boosting decision trees (GBDT), and logistic regression (LR). The area under the curve (AUC) was used to compare performance and to choose the best model. Decision curve analysis was used to evaluate the clinical practicability of the models. In the training dataset, RF had the greatest AUC (1.000), followed by XGBoost (0.999), GBDT (0.977), and SVM (0.971); LR had the lowest AUC (0.938). In the test dataset, GBDT had the greatest AUC (0.967), followed by LR (0.957), XGBoost (0.950), SVM (0.939), and RF (0.924).
The LR, GBDT, and RF models had the highest accuracy (0.873), followed by SVM; XGBoost had the lowest. Of the five models, the LR model had the best sensitivity and specificity (0.923 and 0.887, respectively). The GBDT model had the highest AUC among the five ML models of calculous pyonephrosis, followed by the LR model. Taking clinical operability into account, the LR model was considered the best prediction model. For diagnosing pyonephrosis, the LR model was more credible and had better prediction accuracy than common analysis approaches. Its nomogram can be used as an additional noninvasive diagnostic technique.
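For readers less familiar with the reported metrics, the short sketch below shows how accuracy, sensitivity, and specificity follow from a confusion matrix. The labels and predictions are invented toy values (1 = pyonephrosis, 0 = hydronephrosis is an assumed coding), and the function name is hypothetical; none of it reproduces the study's data.

```python
def confusion_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of true positives that were detected
        "specificity": tn / (tn + fp),  # share of true negatives correctly cleared
    }


# Toy example (assumption): 4 positive and 6 negative cases.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = confusion_metrics(toy_true, toy_pred)
print(metrics)
```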
Yang B
,Zhong J
,Yang Y
,Xu J
,Liu H
,Liu J
... -
《-》
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Survival estimation for patients with symptomatic skeletal metastases ideally should be made before a type of local treatment has already been determined. Currently available survival prediction tools, however, were generated using data from patients treated either operatively or with local radiation alone, raising concerns about whether they would generalize well to all patients presenting for assessment. The Skeletal Oncology Research Group machine-learning algorithm (SORG-MLA), trained with institution-based data of surgically treated patients, and the Metastases location, Elderly, Tumor primary, Sex, Sickness/comorbidity, and Site of radiotherapy model (METSSS), trained with registry-based data of patients treated with radiotherapy alone, are two of the most recently developed survival prediction models, but they have not been tested on patients whose local treatment strategy is not yet decided.
(1) Which of these two survival prediction models performed better in a mixed cohort made up both of patients who received local treatment with surgery followed by radiotherapy and who had radiation alone for symptomatic bone metastases? (2) Which model performed better among patients whose local treatment consisted of only palliative radiotherapy? (3) Are laboratory values used by SORG-MLA, which are not included in METSSS, independently associated with survival after controlling for predictions made by METSSS?
Between 2010 and 2018, we provided local treatment for 2113 adult patients with skeletal metastases in the extremities at an urban tertiary referral academic medical center using one of two strategies: (1) surgery followed by postoperative radiotherapy or (2) palliative radiotherapy alone. Every patient's survivorship status was ascertained either by their medical records or the national death registry from the Taiwanese National Health Insurance Administration. After applying a priori designated exclusion criteria, 91% (1920) were analyzed here. Among them, 48% (920) of the patients were female, and the median (IQR) age was 62 years (53 to 70 years). Lung was the most common primary tumor site (41% [782]), and 59% (1128) of patients had other skeletal metastases in addition to the treated lesion(s). In general, the indications for surgery were the presence of a complete pathologic fracture or an impending pathologic fracture, defined as having a Mirels score of ≥ 9, in patients with an American Society of Anesthesiologists (ASA) classification of less than or equal to IV and who were considered fit for surgery. The indications for radiotherapy were relief of pain, local tumor control, prevention of skeletal-related events, and any combination of the above. In all, 84% (1610) of the patients received palliative radiotherapy alone as local treatment for the target lesion(s), and 16% (310) underwent surgery followed by postoperative radiotherapy. Neither METSSS nor SORG-MLA was used at the point of care to aid clinical decision-making during the treatment period. Survival was retrospectively estimated by these two models to test their potential for providing survival probabilities. We first compared SORG-MLA to METSSS in the entire population. Then, we repeated the comparison in patients who received local treatment with palliative radiation alone.
We assessed model performance by area under the receiver operating characteristic curve (AUROC), calibration analysis, Brier score, and decision curve analysis (DCA). The AUROC measures discrimination, which is the ability to distinguish patients with the event of interest (such as death at a particular time point) from those without. AUROC typically ranges from 0.5 to 1.0, with 0.5 indicating random guessing and 1.0 a perfect prediction, and in general, an AUROC of ≥ 0.7 indicates adequate discrimination for clinical use. Calibration refers to the agreement between the predicted outcomes (in this case, survival probabilities) and the actual outcomes, with a perfect calibration curve having an intercept of 0 and a slope of 1. A positive intercept indicates that the actual survival is generally underestimated by the prediction model, and a negative intercept suggests the opposite (overestimation). When comparing models, an intercept closer to 0 typically indicates better calibration. Calibration can also be summarized as log(O:E), the logarithm scale of the ratio of observed (O) to expected (E) survivors. A log(O:E) > 0 signals an underestimation (the observed survival is greater than the predicted survival); and a log(O:E) < 0 indicates the opposite (the observed survival is lower than the predicted survival). A model with a log(O:E) closer to 0 is generally considered better calibrated. The Brier score is the mean squared difference between the model predictions and the observed outcomes, and it ranges from 0 (best prediction) to 1 (worst prediction). The Brier score captures both discrimination and calibration, and it is considered a measure of overall model performance. In Brier score analysis, the "null model" assigns a predicted probability equal to the prevalence of the outcome and represents a model that adds no new information. A prediction model should achieve a Brier score at least lower than the null-model Brier score to be considered as useful. 
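As a rough illustration of the metrics defined above, the following plain-Python sketch computes the AUROC, the Brier score, the null-model Brier score, and log(O:E). The eight outcomes and predicted survival probabilities are made-up toy values (1 = alive at the time point), not study data, and the function names are hypothetical. Calibration intercept and slope are left out because they require fitting a logistic recalibration model.

```python
import math


def auroc(y_true, y_prob):
    """Chance that a random survivor is ranked above a random non-survivor (ties = 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_prob):
    """Mean squared difference between predicted probabilities and observed outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def null_brier(y_true):
    """Brier score of a model that predicts the outcome prevalence for everyone."""
    prevalence = sum(y_true) / len(y_true)
    return brier(y_true, [prevalence] * len(y_true))


def log_oe(y_true, y_prob):
    """Log of observed survivors over expected survivors; > 0 means survival was underestimated."""
    return math.log(sum(y_true) / sum(y_prob))


# Toy data (assumption): observed survival status and predicted survival probability.
survived = [1, 1, 1, 0, 1, 0, 0, 0]
predicted = [0.9, 0.8, 0.6, 0.7, 0.4, 0.3, 0.2, 0.1]

print("AUROC:", auroc(survived, predicted))
print("Brier:", round(brier(survived, predicted), 3))
print("Null-model Brier:", null_brier(survived))
print("log(O:E):", round(log_oe(survived, predicted), 3))
```

In this toy case the model's Brier score falls below the null-model score, which is the minimum bar for usefulness described above.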
The DCA was developed as a method to determine whether using a model to inform treatment decisions would do more good than harm. It plots the net benefit of making decisions based on the model's predictions across all possible risk thresholds (or cost-to-benefit ratios) in relation to the two default strategies of treating all or no patients. The care provider can decide on an acceptable risk threshold for the proposed treatment in an individual and assess the corresponding net benefit to determine whether consulting with the model is superior to adopting the default strategies. Finally, we examined whether laboratory data, which were not included in the METSSS model, would have been independently associated with survival after controlling for the METSSS model's predictions by using the multivariable logistic and Cox proportional hazards regression analyses.
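The net-benefit quantity that a decision curve plots can be sketched in a few lines. The version below uses the standard formulation (true positives minus false positives weighted by the odds of the risk threshold, per patient) and compares a model with the treat-all strategy; treat-none has a net benefit of zero by definition. The toy outcomes (1 = event occurred) and predicted risks are invented for illustration, and the function names are hypothetical.

```python
def net_benefit(y_true, y_risk, threshold):
    """Net benefit of acting on every patient whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, y_risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, y_risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds implied by the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of acting on everyone, regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (assumption): observed events and model-predicted risks.
events = [1, 1, 1, 0, 1, 0, 0, 0]
risks = [0.9, 0.8, 0.6, 0.7, 0.4, 0.3, 0.2, 0.1]

for threshold in (0.2, 0.3, 0.5, 0.7, 0.9):
    model_nb = net_benefit(events, risks, threshold)
    all_nb = net_benefit_treat_all(events, threshold)
    print(f"threshold={threshold:.1f} model={model_nb:.3f} treat_all={all_nb:.3f} treat_none=0.000")
```

A model is worth consulting at a given threshold only if its net benefit exceeds both default strategies there, which is how the threshold range reported for a model is read off the curve.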
Between the two models, only SORG-MLA achieved adequate discrimination (an AUROC of > 0.7) in the entire cohort (of patients treated operatively or with radiation alone) and in the subgroup of patients treated with palliative radiotherapy alone. SORG-MLA outperformed METSSS by a wide margin on discrimination, calibration, and Brier score analyses in not only the entire cohort but also the subgroup of patients whose local treatment consisted of radiotherapy alone. In both the entire cohort and the subgroup, DCA demonstrated that SORG-MLA provided more net benefit compared with the two default strategies (of treating all or no patients) and compared with METSSS when risk thresholds ranged from 0.2 to 0.9 at both 90 days and 1 year, indicating that using SORG-MLA as a decision-making aid was beneficial when a patient's individualized risk threshold for opting for treatment was 0.2 to 0.9. Higher albumin, lower alkaline phosphatase, lower calcium, higher hemoglobin, lower international normalized ratio, higher lymphocytes, lower neutrophils, lower neutrophil-to-lymphocyte ratio, lower platelet-to-lymphocyte ratio, higher sodium, and lower white blood cells were independently associated with better 1-year and overall survival after adjusting for the predictions made by METSSS.
Based on these findings, clinicians might choose to consult SORG-MLA instead of METSSS for survival estimation in patients with long-bone metastases presenting for evaluation of local treatment. Basing a treatment decision on the predictions of SORG-MLA could be beneficial when a patient's individualized risk threshold for opting to undergo a particular treatment strategy ranges from 0.2 to 0.9. Future studies might investigate relevant laboratory items when constructing or refining a survival estimation model because these data demonstrated prognostic value independent of the predictions of the METSSS model. Future studies might also seek to keep these models up to date using data from diverse, contemporary patients undergoing both modern operative and nonoperative treatments.
Level III, diagnostic study.
Lee CC
,Chen CW
,Yen HK
,Lin YP
,Lai CY
,Wang JL
,Groot OQ
,Janssen SJ
,Schwab JH
,Hsu FM
,Lin WH
... -
《-》