-
Prediction of patient-specific quality assurance for volumetric modulated arc therapy using radiomics-based machine learning with dose distribution.
We sought to develop machine learning models to predict the results of patient-specific quality assurance (QA) for volumetric modulated arc therapy (VMAT), which were represented by several dose-evaluation metrics (including the gamma passing rates, GPRs) and criteria, based on the radiomic features of the 3D dose distribution in a phantom.
A total of 4,250 radiomic features of the 3D dose distribution in a cylindrical dummy phantom were extracted for 140 arcs from 106 clinical VMAT plans. We obtained the following dose-evaluation metrics: GPRs with global and local normalization, the dose difference (DD) passing rates at 1% and 2% (DD1% and DD2%) for 10% and 50% dose thresholds, and the distance-to-agreement passing rates at 1 mm and 2 mm (DTA1 mm and DTA2 mm) for 0.5%/mm and 1.0%/mm dose gradient thresholds, as determined by measurement using a diode array in patient-specific QA. Machine learning regression models for predicting the values of the dose-evaluation metrics from the radiomic features were developed based on the elastic net (EN) and extra trees (ET) models. Feature selection and hyperparameter tuning were performed with nested cross-validation, in which four-fold cross-validation was used within the inner loop, and the performance of each model was evaluated in terms of the root mean square error (RMSE), the mean absolute error (MAE), and Spearman's rank correlation coefficient.
The RMSE and MAE of the developed machine learning models ranged from <1% to nearly 10%, depending on the dose-evaluation metric, the criteria, and the dose and dose gradient thresholds, for both machine learning models. It was advantageous to focus on the high-dose region when predicting the global GPR, DDs, and DTAs. For certain metrics and criteria, it was possible to create models applicable to patients' heterogeneity by training only with dose distributions in the phantom.
The developed machine learning models showed high performance in predicting dose-evaluation metrics, especially for the high-dose region, depending on the metric and criteria. Our results demonstrate that the radiomic features of the dose distribution can be considered good indicators of plan complexity and are useful in predicting measured dose-evaluation metrics.
Ishizaka N, Kinoshita T, Sakai M, Tanabe S, Nakano H, Tanabe S, Nakamura S, Mayumi K, Akamatsu S, Nishikata T, Takizawa T, Yamada T, Sakai H, Kaidu M, Sasamoto R, Ishikawa H, Utsunomiya S
... -
《Journal of Applied Clinical Medical Physics》
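The three evaluation measures named in the abstract above (RMSE, MAE, and Spearman's rank correlation coefficient) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names and the sample values are hypothetical, and ties are handled by average ranking, which is an assumption about the convention used.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical measured and predicted passing rates (%), for illustration only.
measured = [99.1, 97.4, 95.0, 98.2, 92.3]
predicted = [98.5, 97.9, 94.1, 98.0, 93.5]
print(rmse(measured, predicted), mae(measured, predicted), spearman(measured, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, while Spearman's coefficient only checks whether the predictions order the arcs the same way as the measurements do.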
-
Treatment plan complexity quantification for predicting gamma passing rates in patient-specific quality assurance for stereotactic volumetric modulated arc therapy.
To investigate the beam complexity of stereotactic volumetric modulated arc therapy (VMAT) plans quantitatively and predict gamma passing rates (GPRs) using machine learning.
The dataset consists exclusively of stereotactic VMAT plans (301 plans with 594 beams) from a Varian Edge LINAC. The GPRs were analyzed using Varian's portal dosimetry with 2%/2 mm criteria. A total of 27 metrics were calculated to investigate the correlation between metrics and GPRs. Random forest and gradient boosting models were developed and trained to predict the GPRs based on the extracted complexity features. Threshold values of the complexity metrics for predicting whether a given beam would pass or fail were obtained from ROC curve analysis.
The three metrics with moderately significant Spearman's rank correlations to GPRs were the proposed metric LAAM (0.508, p < 0.001), the ratio of the average aperture area over jaw area (AAJA; 0.445, p < 0.001), and the index of modulation (-0.416, p < 0.001). The random forest method achieved 98.74% prediction accuracy with a mean absolute error of 1.23% using five-fold cross-validation, and the gradient boosting regressor achieved 98.71% with 1.25%. LAAM, leaf travelling distance (LT), AAJA, LT modulation complexity score (LTMCS), and index of modulation were the top five most important complexity features. The LAAM metric showed the best performance, with an AUC value of 0.801 and a threshold value of 0.365.
The calculated metrics were effective in quantifying the complexity of stereotactic VMAT plans. We have demonstrated that the GPRs could be accurately predicted using machine learning methods based on extracted complexity metrics. The quantification of complexity and machine learning methods have the potential to improve stereotactic treatment planning and identify the failure of QA results promptly.
Xue X, Luan S, Ding Y, Li X, Li D, Wang J, Ma C, Jiang M, Wei W, Wang X
... -
《Journal of Applied Clinical Medical Physics》
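The abstract above derives a pass/fail threshold for a complexity metric from ROC curve analysis but does not state the selection criterion. The sketch below assumes Youden's J statistic (sensitivity + specificity - 1), one common choice, and assumes that higher metric values indicate a passing beam. The function names and the data are hypothetical.

```python
def roc_points(scores, passed):
    """Return (threshold, TPR, FPR) for the rule 'predict pass if score >= threshold'."""
    n_pos = sum(1 for p in passed if p)
    n_neg = len(passed) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one passing and one failing beam")
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, p in zip(scores, passed) if s >= thr and p)
        fp = sum(1 for s, p in zip(scores, passed) if s >= thr and not p)
        points.append((thr, tp / n_pos, fp / n_neg))
    return points


def youden_threshold(scores, passed):
    """Pick the threshold maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(scores, passed), key=lambda pt: pt[1] - pt[2])[0]


# Synthetic complexity-metric values and QA outcomes, for illustration only.
metric = [0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
passed = [False, False, True, False, True, True, True, True]
print(youden_threshold(metric, passed))
```

Other criteria, such as fixing a minimum sensitivity for failing beams, would give a different threshold from the same ROC curve.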
-
Machine learning-generated decision boundaries for prediction and exploration of patient-specific quality assurance failures in stereotactic radiosurgery plans.
Stereotactic radiosurgery (SRS) is a form of radiotherapy treatment during which a high radiation dose is delivered in a single or few fractions. These treatments require highly conformal plans with steep dose gradients, which can result in an increase in plan complexity, prompting the need for stringent pretreatment patient-specific quality assurance (QA) measurements to ensure the planned and measured dose distributions agree within clinical standards. Complexity scores and machine learning (ML) techniques may help with prediction of QA outcomes; however, the interpretability and usability of those results continue to be an area of study. This study investigates the use of plan complexity metrics as input for an ML model to allow for prediction of QA outcomes for SRS plans as measured via three-dimensional (3D) phantom dose verification. Explorations into interpretability and predictive ability, as well as a prospective in-clinic implementation using the resulting model, were performed.
Four hundred ninety-eight plans (1571 volumetric modulated arc therapy arcs) were processed via an in-house script to generate several complexity scores. 3D phantom dose verification measurement results were extracted and classified as pass or failure (with failures defined as below 95% voxel agreement passing 3%/1-mm gamma criteria with a 10% threshold), and 1472 of the arcs were split into training and testing sets, with 99 arcs as a sequential holdout set. A z-score scaler was trained on the training set and used to scale all other sets. Variations of multi-leaf collimator (MLC) leaf movement variability, aperture complexity, and leaf size, and monitor unit (MU) at control point weighted target area scores were used as input to a support vector classifier to generate a series of 1D, 2D, and 5D decision boundaries. The best-performing 5D model was then used within a prospective in-clinic study providing predictions to physicists prior to ordering 3D phantom dose verification measurements for 38 patient plans (112 arcs). The decision to order 3D phantom dose verification measurements was recorded before and after prediction.
The best-performing 1D threshold and 2D prediction models produced QA failure recall and QA passing recall of 1.00 and 0.55, and 0.82 and 0.82, respectively. The best-performing 5D prediction model produced a QA failure recall (sensitivity) of 1.00 and a QA passing recall (specificity) of 0.72. This model was then used within a prospective in-clinic study providing predictions to physicists prior to ordering 3D phantom dose verification measurements and achieved a QA failure recall of 1.00 and a QA passing recall of 0.58. The decision to order 3D phantom dose verification measurements was recorded before and after measurement. A single initially unidentified failing plan of the prospective cohort was successfully predicted to fail by the model.
Implementation of complexity score-based prediction models for SRS would allow for support of a clinician's decision to reduce time spent performing QA measurements and avoid patient treatment delays (i.e., in case of QA failure).
Braun J, Quirk S, Tchistiakova E
《-》
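Two steps from the abstract above lend themselves to a short sketch: fitting a z-score scaler on the training set only and reusing it for every other set, and reporting QA failure recall and QA passing recall separately. The class and function names and all values below are hypothetical, the support vector classifier itself is omitted, and the use of the population standard deviation is an assumption.

```python
import math


class ZScoreScaler:
    """Standardize each feature with the mean and std learned from training data."""

    def fit(self, rows):
        n = len(rows)
        cols = list(zip(*rows))
        self.means = [sum(c) / n for c in cols]
        # Population standard deviation; constant features fall back to 1.0
        # so that transform() never divides by zero.
        self.stds = [
            math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, self.means)
        ]
        return self

    def transform(self, rows):
        return [
            [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]
            for row in rows
        ]


def recalls(actual_fail, predicted_fail):
    """Return (failure recall, passing recall), i.e. sensitivity and specificity."""
    pairs = list(zip(actual_fail, predicted_fail))
    n_fail = sum(1 for a, _ in pairs if a)
    n_pass = len(pairs) - n_fail
    caught_fail = sum(1 for a, p in pairs if a and p)
    cleared_pass = sum(1 for a, p in pairs if not a and not p)
    return caught_fail / n_fail, cleared_pass / n_pass


# Hypothetical two-feature complexity scores.
train = [[1.0, 10.0], [3.0, 30.0]]
holdout = [[5.0, 20.0]]
scaler = ZScoreScaler().fit(train)      # statistics come from the training set only
print(scaler.transform(holdout))        # holdout is scaled with training statistics

# Hypothetical outcomes: True means the arc failed (or was predicted to fail) QA.
print(recalls([True, True, False, False, False], [True, True, True, False, False]))
```

Scaling the testing and holdout sets with training statistics keeps information from those sets out of the model. A failure recall of 1.00 with a lower passing recall corresponds to a model that misses no failing arc at the cost of flagging some passing arcs for measurement.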
-
Pretreatment patient-specific quality assurance prediction based on 1D complexity metrics and 3D planning dose: classification, gamma passing rates, and DVH metrics.
Highly modulated radiotherapy plans aim to achieve target conformality and spare organs at risk, but the high complexity of the plan may increase the uncertainty of treatment. Thus, patient-specific quality assurance (PSQA) plays a crucial role in ensuring treatment accuracy and providing clinical guidance. This study aims to propose a prediction model based on complexity metrics and patient planning dose for PSQA results.
The planning dose, measurement-based reconstructed dose, and plan complexity metrics of 687 radiotherapy plans of patients treated in our institution were collected for model development. A global gamma passing rate (GPR; 3%/2 mm, 10% threshold) of 90% was used as the QA criterion. Neural architecture models based on the Swin transformer were adapted to process the 3D dose and incorporate 1D metrics to predict QA results. The dataset was divided into training (447), validation (90), and testing (150) sets. Predictions were evaluated using the mean absolute error (MAE) for GPR, planning target volume (PTV) HI, and PTV CI; the mean absolute percentage error (MAPE) for PTV D95, PTV D2, and PTV Dmean; and the area under the receiver operating characteristic (ROC) curve (AUC) for classification. Furthermore, we compared the prediction results with those of other models based on only 1D or only 3D inputs.
In this dataset, 72.8% (500/687) of plans passed the pretreatment QA under the criterion. On the testing set, our combined model achieved the highest performance, with the 1D model slightly surpassing the 3D model. The performance results are as follows (combined, 1D, and 3D transformer): The AUCs are 0.92, 0.88, and 0.86 for QA classification. The MAEs of prediction are 0.039, 0.046, and 0.040 for 3D GPR; 0.018, 0.021, and 0.019 for PTV HI; and 0.075, 0.078, and 0.084 for PTV CI. Specifically, for cases with 3D GPRs greater than 90%, the MAE could reach 0.020 (combined). The MAPE of prediction is 1.23%, 1.52%, and 1.66% for PTV D95; 2.36%, 2.67%, and 2.45% for PTV D2; and 1.46%, 1.70%, and 1.71% for PTV Dmean.
The model based on 1D complexity metrics and 3D planning dose could predict pretreatment PSQA results with high accuracy and the complexity metrics play a leading role in the model. Furthermore, dose-volume metric deviations of PTV could be predicted and more clinically valuable information could be provided.
Chen L
,Luo H
,Li S
,Tan X
,Feng B
,Yang X
,Wang Y
,Jin F
... -
《Radiation Oncology》
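The abstract above reports MAPE for the dose-volume metrics and AUC for pass/fail classification against a 90% GPR criterion. A minimal sketch of both follows, with AUC computed as the fraction of (pass, fail) pairs that the score ranks correctly. The function names and data are hypothetical, and treating a GPR of exactly 90% as a pass is an assumption.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical measured GPRs (as fractions) and predicted pass scores.
measured_gpr = [0.97, 0.88, 0.93, 0.85]
pass_score = [0.90, 0.40, 0.35, 0.20]
labels = [g >= 0.90 for g in measured_gpr]   # assumed: GPR >= 90% counts as a pass
print(auc(pass_score, labels))

# Hypothetical measured and predicted PTV D95 values (Gy).
print(mape([50.0, 60.0], [49.0, 63.0]))
```

The pairwise form of AUC is quadratic in the number of plans, which is fine for a testing set of 150 plans; a rank-based formula gives the same value for larger sets.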
-
Using machine learning to predict gamma passing rate in volumetric-modulated arc therapy treatment plans.
This study aims to develop an algorithm to predict gamma passing rate (GPR) in the volumetric-modulated arc therapy (VMAT) technique.
A total of 118 clinical VMAT plans, including 28 mediastinum, 25 head and neck, 40 brain intensity-modulated radiosurgery, and 25 prostate cases, were created in the RayStation treatment planning system for Edge and TrueBeam linacs. In-house scripts were developed to compute modulation indices such as plan-averaged beam area (PA), plan-averaged beam irregularity (PI), total monitor unit (MU), leaf travel/arc length, mean dose rate variation, and mean gantry speed variation. Pretreatment verifications were performed on an ArcCHECK phantom with SNC software. GPR was calculated with 3%/2 mm criteria and a 10% threshold. The dataset was randomly split into a training (70%) and a test (30%) dataset. A random forest regression (RFR) model and a support vector regression (SVR) model with a linear kernel were trained to predict GPR using the complexity metrics as input. The prediction performance was evaluated by calculating the mean absolute error (MAE), R², and root mean square error (RMSE).
RMSEs at γ 3%/2 mm for RFR and SVR were 1.407 ± 0.103 and 1.447 ± 0.121, respectively. MAE was 1.14 ± 0.084 for RFR and 1.101 ± 0.09 for SVR. R² was equal to 0.703 ± 0.027 and 0.689 ± 0.053 for RFR and SVR, respectively. GPR of 3%/2 mm with a 10% threshold can be predicted with an error smaller than 3% for 94% of plans using the RFR and SVR models. The metrics with the greatest impact on GPR prediction accuracy were determined to be the PA, PI, and total MU.
In terms of its prediction values and errors, SVR (linear) appeared to be comparable with RFR for this dataset. Based on our results, the PA, PI, and total MU calculations may be useful in guiding VMAT plan evaluation and ultimately reducing uncertainties in planning and radiation delivery.
Salari E
,Shuai Xu K
,Sperling NN
,Parsai EI
... -
《Journal of Applied Clinical Medical Physics》