-
Using machine learning to predict gamma passing rate in volumetric-modulated arc therapy treatment plans.
This study aims to develop an algorithm to predict gamma passing rate (GPR) in the volumetric-modulated arc therapy (VMAT) technique.
A total of 118 clinical VMAT plans, including 28 mediastinum, 25 head and neck, 40 brain intensity-modulated radiosurgery, and 25 prostate cases, were created in the RayStation treatment planning system for Edge and TrueBeam linacs. In-house scripts were developed to compute modulation indices such as plan-averaged beam area (PA), plan-averaged beam irregularity (PI), total monitor units (MU), leaf travel/arc length, mean dose rate variation, and mean gantry speed variation. Pretreatment verifications were performed on an ArcCHECK phantom with SNC software. GPR was calculated with a 3%/2 mm criterion and a 10% dose threshold. The dataset was randomly split into a training (70%) and a test (30%) dataset. A random forest regression (RFR) model and a support vector regression (SVR) model with a linear kernel were trained to predict GPR using the complexity metrics as input. The prediction performance was evaluated by calculating the mean absolute error (MAE), R², and root mean square error (RMSE).
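As a rough illustration of what a gamma passing rate with a 3%/2 mm criterion and a 10% threshold measures, the sketch below evaluates a 1D dose profile by brute-force search. It is not the SNC software's algorithm: global normalization to the maximum measured dose, the absence of interpolation between grid points, and the function name `gamma_passing_rate` are all assumptions made for this example.

```python
import math

def gamma_passing_rate(positions_mm, measured, calculated,
                       dose_tol=0.03, dta_mm=2.0, threshold=0.10):
    """Return the 1D gamma passing rate in percent.

    Assumptions (not taken from the abstract): global normalization to the
    maximum measured dose, and a brute-force search over the calculated
    grid points without interpolation.
    """
    d_max = max(measured)
    dose_crit = dose_tol * d_max   # e.g. 3% of the maximum measured dose
    cutoff = threshold * d_max     # points below 10% of max are ignored
    passed = 0
    total = 0
    for x_m, d_m in zip(positions_mm, measured):
        if d_m < cutoff:
            continue
        total += 1
        # Gamma is the minimum combined distance/dose metric over all
        # calculated points.
        gamma = min(
            math.sqrt(((x_c - x_m) / dta_mm) ** 2
                      + ((d_c - d_m) / dose_crit) ** 2)
            for x_c, d_c in zip(positions_mm, calculated)
        )
        if gamma <= 1.0:
            passed += 1
    return 100.0 * passed / total if total else float("nan")
```

In this sketch a 1 mm spatial shift of the calculated profile still passes everywhere, while a uniform 10% dose scaling fails at every point above the threshold, which is the trade-off the combined dose/distance criterion is designed to express.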
RMSEs at γ 3%/2 mm for RFR and SVR were 1.407 ± 0.103 and 1.447 ± 0.121, respectively. MAE was 1.14 ± 0.084 for RFR and 1.101 ± 0.09 for SVR. R² was 0.703 ± 0.027 and 0.689 ± 0.053 for RFR and SVR, respectively. GPR at 3%/2 mm with a 10% threshold can be predicted with an error smaller than 3% for 94% of plans using the RFR and SVR models. The metrics with the greatest impact on GPR prediction accuracy were PA, PI, and total MU.
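The reported quantities (MAE, RMSE, R², and the share of plans predicted within a given error) follow standard definitions. A minimal sketch of how they could be computed from paired measured and predicted GPR values is shown here; the function name `regression_metrics` and the sample values are hypothetical and are not the study's data.

```python
import math

def regression_metrics(measured, predicted, tolerance=3.0):
    """Compute MAE, RMSE, R^2, and the percentage of predictions whose
    absolute error is below `tolerance` (same units as the inputs).

    R^2 is undefined when all measured values are identical; this
    sketch does not guard against that case.
    """
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_m = sum(measured) / n
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    pct_within = 100.0 * sum(abs(e) < tolerance for e in errors) / n
    return {"mae": mae, "rmse": rmse, "r2": r2, "pct_within": pct_within}

# Hypothetical GPR values (%), for illustration only.
metrics = regression_metrics([98.0, 95.0, 92.0, 99.0],
                             [97.0, 96.0, 96.0, 99.0])
```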
In terms of its prediction values and errors, SVR (linear) appeared to be comparable with RFR for this dataset. Based on our results, the PA, PI, and total MU calculations may be useful in guiding VMAT plan evaluation and ultimately reducing uncertainties in planning and radiation delivery.
Salari E, Shuai Xu K, Sperling NN, Parsai EI
... -
《Journal of Applied Clinical Medical Physics》
-
Machine learning models to predict the delivered positions of Elekta multileaf collimator leaves for volumetric modulated arc therapy.
Accurate positioning of multileaf collimator (MLC) leaves during volumetric modulated arc therapy (VMAT) is essential for accurate treatment delivery. We developed linear regression, support vector machine, random forest, extreme gradient boosting (XGBoost), and artificial neural network (ANN) models to predict the delivered leaf positions for VMAT plans.
For this study, 160 MLC log files were obtained from 80 VMAT plans treated on three Elekta Versa HD linear accelerators at a single institution. The gravity vector, X1 and X2 jaw positions, leaf gap, leaf position, leaf velocity, and leaf acceleration were extracted and used as model inputs. The models were trained using 70% of the log files and tested on the remaining 30%. Mean absolute error (MAE), root mean square error (RMSE), the coefficient of determination R², and fitted line plots showing the relationship between delivered and predicted leaf positions were used to evaluate model performance.
The models achieved the following errors: linear regression (MAE = 0.158 mm, RMSE = 0.225 mm), support vector machine (MAE = 0.141 mm, RMSE = 0.199 mm), random forest (MAE = 0.161 mm, RMSE = 0.229 mm), XGBoost (MAE = 0.185 mm, RMSE = 0.273 mm), and ANN (MAE = 0.361 mm, RMSE = 0.521 mm). A significant correlation was observed between a plan's gamma passing rate (GPR) and the prediction errors of linear regression, support vector machine, and random forest (p < 0.045).
We examined various models to predict the delivered MLC positions for VMAT plans treated with Elekta linacs. Linear regression, support vector machine, random forest, and XGBoost achieved lower errors than ANN. Models that can accurately predict the individual leaf positions during treatment can help identify leaves that are deviating from the planned position, which can improve a plan's GPR.
Sivabhaskar S, Li R, Roy A, Kirby N, Fakhreddine M, Papanikolaou N
... -
《Journal of Applied Clinical Medical Physics》
-
Modulation indices and plan delivery accuracy of volumetric modulated arc therapy.
We evaluated the performance of various modulation indices (MI) for volumetric modulated arc therapy (VMAT) to predict plan delivery accuracy.
The specific indices evaluated were MI quantifying the mechanical uncertainty (MIt), MI quantifying the mechanical and dose calculation uncertainties (MIc), MI for station parameter optimized radiation therapy (MISPORT), modulation complexity score for VMAT (MCSv), leaf travel modulation complexity score (LTMCS), plan averaged beam area (PA), plan averaged beam irregularity (PI), plan averaged beam modulation (PM), and plan normalized monitor unit (PMU). Using 240 VMAT plans generated for the Trilogy and TrueBeam STx systems, Spearman's rank correlation coefficients (r) were calculated between the MIs and the measures obtained with conventional verification methods.
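Spearman's rank correlation coefficient is the Pearson correlation of the ranked values, with tied values sharing their average rank. The sketch below shows that calculation for a modulation index paired with a delivery-accuracy measure; the function names and the sample numbers are hypothetical and unrelated to the study's data, and no P value is computed.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_r(x, y):
    """Spearman's r: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical modulation index values and GPRs (%), for illustration only.
r = spearman_r([1.0, 2.0, 3.0, 4.0, 5.0], [99.0, 98.0, 97.0, 95.0, 96.0])
```

A negative r in this setting means that plans with a higher modulation index tend to have lower passing rates.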
For the Trilogy system, MIc showed the highest r values with gamma passing rates (GPRs) (r = -0.624 with P < 0.001 for MapCHECK2 and r = -0.655 with P < 0.001 for ArcCHECK). For TrueBeam STx, MIc also showed the highest r values with GPRs (r = -0.625 with P < 0.001 for MapCHECK2 and r = -0.561 with P < 0.001 for ArcCHECK). The MIt and MIc showed the highest r values with the MLC position errors for the Trilogy and TrueBeam STx systems (r = 0.770 with P < 0.001 and r = 0.712 with P < 0.001, respectively). The PA showed the highest percentage of significant r values (P < 0.05) with differences in the dose-volume parameters between original VMAT plans and actual deliveries for the Trilogy system (30.9%). Both the MIt and MIc showed the highest percentage of significant r values (P < 0.05) with differences in the dose-volume parameters between original VMAT plans and actual deliveries for the TrueBeam STx system (31.8%).
Overall, the MIc performed best in predicting VMAT delivery accuracy.
Park JM, Kim JI, Park SY
《Journal of Applied Clinical Medical Physics》
-
Evaluation of prediction and classification performances in different machine learning models for patient-specific quality assurance of head-and-neck VMAT plans.
The purpose of this study is to evaluate the prediction and classification performances of the gamma passing rate (GPR) for different machine learning models and to select the best model for achieving machine learning-based patient-specific quality assurance (PSQA).
Measurement-based verification of 356 head-and-neck volumetric modulated arc therapy plans was performed using a diode array phantom (Delta4 Phantom), and GPR values at 2%/2 mm with global normalization and 3%/2 mm with local normalization were calculated. Machine learning models, including ridge regression (RIDGE), random forest (RF), support vector regression (SVR), and stacked generalization (STACKING), were used to predict the GPR. Each machine learning model was trained using 260 plans, and the prediction accuracy was evaluated using the remaining 96 plans. The prediction error between the measured and predicted GPR was evaluated. For the classification evaluation, the lower control limit for the measured GPR and the lower control limit for the predicted GPR (LCLp) were defined to identify whether the GPR values represent a "pass" or a "fail." LCLp values with 99% and 99.9% confidence levels were calculated as the upper prediction limits for the GPR estimated from the linear regression between the measured and predicted GPR.
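The pass/fail classification can be pictured with a short sketch. It assumes, as an interpretation not spelled out in the abstract, that a "fail" is the positive class: a plan truly fails when its measured GPR is below the lower control limit, and it is flagged when its predicted GPR is below LCLp. The control limits are taken as given inputs (the prediction-limit calculation is not reproduced), and the function name and numbers are hypothetical.

```python
def sensitivity_specificity(measured, predicted, lcl, lcl_p):
    """Return (sensitivity, specificity) for detecting failing plans.

    Assumption: "fail" is the positive class. A plan truly fails when
    measured GPR < lcl and is flagged when predicted GPR < lcl_p.
    """
    tp = fn = tn = fp = 0
    for m, p in zip(measured, predicted):
        truly_fail = m < lcl
        flagged = p < lcl_p
        if truly_fail and flagged:
            tp += 1
        elif truly_fail:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

# Hypothetical GPR values (%), for illustration only.
measured = [88.0, 92.0, 96.0, 97.0, 99.0]
predicted = [91.0, 93.0, 94.0, 97.0, 98.0]
strict = sensitivity_specificity(measured, predicted, lcl=93.0, lcl_p=95.0)
loose = sensitivity_specificity(measured, predicted, lcl=93.0, lcl_p=93.0)
```

With these made-up values, the higher LCLp catches every failing plan at the cost of one false alarm, while the lower LCLp raises no false alarms but misses a failing plan; this is the sensitivity/specificity balance that the choice of confidence level controls.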
Low measured GPR values tended to be overestimated. The maximum prediction errors for RIDGE, RF, SVR, and STACKING were 3.2%, 2.9%, 2.3%, and 2.2% at the global 2%/2 mm and 6.3%, 6.6%, 6.1%, and 5.5% at the local 3%/2 mm, respectively. In the global 2%/2 mm, the sensitivity was 100% for all the machine learning models except RIDGE when using 99% LCLp. The specificity was 76.1% for RIDGE, RF, and SVR and 66.3% for STACKING; however, the specificity decreased dramatically when 99.9% LCLp was used. In the local 3%/2 mm, however, only STACKING showed 100% sensitivity when using 99% LCLp. The decrease in the specificity using 99.9% LCLp was smaller than that in the global 2%/2 mm, and the specificity for RIDGE, RF, SVR, and STACKING was 61.3%, 61.3%, 72.0%, and 66.8%, respectively.
STACKING had better prediction accuracy for low GPR values than other machine learning models. Applying LCLp to a regression model enabled the consistent evaluation of quantitative and qualitative GPR predictions. Adjusting the confidence level of the LCLp helped improve the balance between the sensitivity and specificity. We suggest that STACKING can assist the safe and efficient operation of PSQA.
Kusunoki T, Hatanaka S, Hariu M, Kusano Y, Yoshida D, Katoh H, Shimbo M, Takahashi T
... -
《-》
-
Prediction of patient-specific quality assurance for volumetric modulated arc therapy using radiomics-based machine learning with dose distribution.
We sought to develop machine learning models to predict the results of patient-specific quality assurance (QA) for volumetric modulated arc therapy (VMAT), which were represented by several dose-evaluation metrics, including gamma passing rates (GPRs), and criteria, based on the radiomic features of the 3D dose distribution in a phantom.
A total of 4,250 radiomic features of the 3D dose distribution in a cylindrical dummy phantom were extracted for 140 arcs from 106 clinical VMAT plans. We obtained the following dose-evaluation metrics, determined by measurement using a diode array in patient-specific QA: GPRs with global and local normalization, dose difference (DD) passing rates at 1% and 2% (DD1% and DD2%) for 10% and 50% dose thresholds, and distance-to-agreement (DTA) passing rates at 1 mm and 2 mm (DTA1 mm and DTA2 mm) for 0.5%/mm and 1.0%/mm dose gradient thresholds. The machine learning regression models for predicting the values of the dose-evaluation metrics using the radiomic features were developed based on the elastic net (EN) and extra trees (ET) models. Feature selection and hyperparameter tuning were performed with nested cross-validation in which four-fold cross-validation was used within the inner loop, and the performance of each model was evaluated in terms of the root mean square error (RMSE), the mean absolute error (MAE), and Spearman's rank correlation coefficient.
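The structure of nested cross-validation can be sketched as index bookkeeping: each outer fold holds out a test set, and the remaining samples are split again into inner folds used only for feature selection and hyperparameter tuning. The abstract states four inner folds; the five outer folds, the shuffling seed, and the function names below are assumptions made for this example, and the sketch does not group arcs that belong to the same plan.

```python
import random

def k_fold_indices(indices, k, seed=0):
    """Yield (train, held_out) index lists for k shuffled folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        rest = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield rest, folds[i]

def nested_cv_splits(n_samples, outer_k=5, inner_k=4, seed=0):
    """Yield (outer_train, outer_test, inner_splits) per outer fold.

    inner_splits is a list of (inner_train, inner_val) pairs drawn only
    from outer_train, so the outer test set never influences tuning.
    outer_k=5 is an assumption; the source specifies only inner_k=4.
    """
    for outer_train, outer_test in k_fold_indices(range(n_samples),
                                                  outer_k, seed):
        inner_splits = list(k_fold_indices(outer_train, inner_k, seed + 1))
        yield outer_train, outer_test, inner_splits

# 140 arcs, as in the abstract; the fold sizes follow from the assumed
# outer_k and the stated inner_k.
splits = list(nested_cv_splits(140))
```

Hyperparameters would be chosen by averaging the validation error over the inner splits, and the selected model would then be scored once on the corresponding outer test set.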
The RMSE and MAE for the developed machine learning models ranged from less than 1% to just under 10%, depending on the dose-evaluation metric, the criteria, and the dose and dose gradient thresholds used for both machine learning models. It was advantageous to focus on the high-dose region for predicting global GPR, DDs, and DTAs. For certain metrics and criteria, it was possible to create models applicable to patient heterogeneity by training only with dose distributions in a phantom.
The developed machine learning models showed high performance for predicting dose-evaluation metrics, especially for the high-dose region, depending on the metric and criteria. Our results demonstrate that the radiomic features of the dose distribution can be considered good indicators of plan complexity and are useful in predicting measured dose-evaluation metrics.
Ishizaka N, Kinoshita T, Sakai M, Tanabe S, Nakano H, Tanabe S, Nakamura S, Mayumi K, Akamatsu S, Nishikata T, Takizawa T, Yamada T, Sakai H, Kaidu M, Sasamoto R, Ishikawa H, Utsunomiya S
... -
《Journal of Applied Clinical Medical Physics》