-
Toward Foundation Models in Radiology? Quantitative Assessment of GPT-4V's Multimodal and Multianatomic Region Capabilities.
Strotzer QD, Nieberle F, Kupke LS, Napodano G, Muertz AK, Meiler S, Einspieler I, Rennert J, Strotzer M, Wiesinger I, Wendl C, Stroszczynski C, Hamer OW, Schicho A
... -
《-》
-
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Survival estimation for patients with symptomatic skeletal metastases ideally should be made before a type of local treatment has already been determined. Currently available survival prediction tools, however, were generated using data from patients treated either operatively or with local radiation alone, raising concerns about whether they would generalize well to all patients presenting for assessment. The Skeletal Oncology Research Group machine-learning algorithm (SORG-MLA), trained with institution-based data of surgically treated patients, and the Metastases location, Elderly, Tumor primary, Sex, Sickness/comorbidity, and Site of radiotherapy model (METSSS), trained with registry-based data of patients treated with radiotherapy alone, are two of the most recently developed survival prediction models, but they have not been tested on patients whose local treatment strategy is not yet decided.
(1) Which of these two survival prediction models performed better in a mixed cohort made up both of patients who received local treatment with surgery followed by radiotherapy and who had radiation alone for symptomatic bone metastases? (2) Which model performed better among patients whose local treatment consisted of only palliative radiotherapy? (3) Are laboratory values used by SORG-MLA, which are not included in METSSS, independently associated with survival after controlling for predictions made by METSSS?
Between 2010 and 2018, we provided local treatment for 2113 adult patients with skeletal metastases in the extremities at an urban tertiary referral academic medical center using one of two strategies: (1) surgery followed by postoperative radiotherapy or (2) palliative radiotherapy alone. Every patient's survivorship status was ascertained either by their medical records or the national death registry from the Taiwanese National Health Insurance Administration. After applying a priori designated exclusion criteria, 91% (1920) were analyzed here. Among them, 48% (920) of the patients were female, and the median (IQR) age was 62 years (53 to 70 years). Lung was the most common primary tumor site (41% [782]), and 59% (1128) of patients had other skeletal metastases in addition to the treated lesion(s). In general, the indications for surgery were the presence of a complete pathologic fracture or an impending pathologic fracture, defined as having a Mirels score of ≥ 9, in patients with an American Society of Anesthesiologists (ASA) classification of less than or equal to IV and who were considered fit for surgery. The indications for radiotherapy were relief of pain, local tumor control, prevention of skeletal-related events, and any combination of the above. In all, 84% (1610) of the patients received palliative radiotherapy alone as local treatment for the target lesion(s), and 16% (310) underwent surgery followed by postoperative radiotherapy. Neither METSSS nor SORG-MLA was used at the point of care to aid clinical decision-making during the treatment period. Survival was retrospectively estimated by these two models to test their potential for providing survival probabilities. We first compared SORG-MLA with METSSS in the entire population. Then, we repeated the comparison in patients who received local treatment with palliative radiation alone.
We assessed model performance by area under the receiver operating characteristic curve (AUROC), calibration analysis, Brier score, and decision curve analysis (DCA). The AUROC measures discrimination, which is the ability to distinguish patients with the event of interest (such as death at a particular time point) from those without. AUROC typically ranges from 0.5 to 1.0, with 0.5 indicating random guessing and 1.0 a perfect prediction, and in general, an AUROC of ≥ 0.7 indicates adequate discrimination for clinical use. Calibration refers to the agreement between the predicted outcomes (in this case, survival probabilities) and the actual outcomes, with a perfect calibration curve having an intercept of 0 and a slope of 1. A positive intercept indicates that the actual survival is generally underestimated by the prediction model, and a negative intercept suggests the opposite (overestimation). When comparing models, an intercept closer to 0 typically indicates better calibration. Calibration can also be summarized as log(O:E), the logarithm of the ratio of observed (O) to expected (E) survivors. A log(O:E) > 0 signals an underestimation (the observed survival is greater than the predicted survival), and a log(O:E) < 0 indicates the opposite (the observed survival is lower than the predicted survival). A model with a log(O:E) closer to 0 is generally considered better calibrated. The Brier score is the mean squared difference between the model predictions and the observed outcomes, and it ranges from 0 (best prediction) to 1 (worst prediction). The Brier score captures both discrimination and calibration, and it is considered a measure of overall model performance. In Brier score analysis, the "null model" assigns a predicted probability equal to the prevalence of the outcome and represents a model that adds no new information. A prediction model should achieve a Brier score lower than the null-model Brier score to be considered useful.
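The metrics defined above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the five-patient toy data are hypothetical, the outcome is coded as 1 for a survivor and 0 otherwise, and the AUROC is computed by simple pairwise comparison rather than by any particular statistics package.

```python
import math

def auroc(predicted, observed):
    """Share of (survivor, non-survivor) pairs ranked correctly; ties count 0.5."""
    pos = [p for p, o in zip(predicted, observed) if o == 1]
    neg = [p for p, o in zip(predicted, observed) if o == 0]
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return score / (len(pos) * len(neg))

def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def null_brier_score(observed):
    """Brier score of a model that predicts the outcome prevalence for everyone."""
    prevalence = sum(observed) / len(observed)
    return brier_score([prevalence] * len(observed), observed)

def log_observed_to_expected(predicted, observed):
    """log(O:E): above 0 means survival was underestimated, below 0 overestimated."""
    return math.log(sum(observed) / sum(predicted))

# Hypothetical predicted survival probabilities and observed survival (1 = alive).
predicted = [0.9, 0.8, 0.3, 0.6, 0.7]
observed = [1, 1, 0, 1, 0]

print(auroc(predicted, observed))                     # 5 of 6 pairs ranked correctly
print(brier_score(predicted, observed))               # about 0.158
print(null_brier_score(observed))                     # about 0.24
print(log_observed_to_expected(predicted, observed))  # negative: overestimation
```

In this toy example the model's Brier score falls below the null-model value, which is the minimum requirement described above, while the negative log(O:E) indicates that predicted survival exceeds observed survival.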
The DCA was developed as a method to determine whether using a model to inform treatment decisions would do more good than harm. It plots the net benefit of making decisions based on the model's predictions across all possible risk thresholds (or cost-to-benefit ratios) in relation to the two default strategies of treating all or no patients. The care provider can decide on an acceptable risk threshold for the proposed treatment in an individual and assess the corresponding net benefit to determine whether consulting with the model is superior to adopting the default strategies. Finally, we examined whether laboratory data, which were not included in the METSSS model, would have been independently associated with survival after controlling for the METSSS model's predictions, using multivariable logistic and Cox proportional hazards regression analyses.
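As a rough illustration of how a decision curve is built, the sketch below computes net benefit at a single risk threshold using the standard definition (true positives per patient minus false positives per patient weighted by the odds of the threshold). The function names and the toy data are hypothetical and are not taken from the study; a full DCA would repeat this calculation over a range of thresholds.

```python
def net_benefit(risk, event, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    n = len(event)
    true_pos = sum(1 for r, e in zip(risk, event) if r >= threshold and e == 1)
    false_pos = sum(1 for r, e in zip(risk, event) if r >= threshold and e == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)

def net_benefit_treat_all(event, threshold):
    """Default strategy of treating everyone, regardless of predicted risk."""
    prevalence = sum(event) / len(event)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# The other default strategy, treating no one, has a net benefit of 0 by definition.

# Hypothetical predicted risks of the event and observed events (1 = event occurred).
risk = [0.9, 0.7, 0.4, 0.2, 0.1]
event = [1, 1, 0, 1, 0]

for threshold in (0.3, 0.5):
    print(threshold,
          net_benefit(risk, event, threshold),
          net_benefit_treat_all(event, threshold))
```

A model is worth consulting at a given threshold only if its net benefit exceeds both the treat-all value and 0, which is the comparison the decision curve displays.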
Between the two models, only SORG-MLA achieved adequate discrimination (an AUROC of > 0.7) in the entire cohort (of patients treated operatively or with radiation alone) and in the subgroup of patients treated with palliative radiotherapy alone. SORG-MLA outperformed METSSS by a wide margin on discrimination, calibration, and Brier score analyses in not only the entire cohort but also the subgroup of patients whose local treatment consisted of radiotherapy alone. In both the entire cohort and the subgroup, DCA demonstrated that SORG-MLA provided more net benefit compared with the two default strategies (of treating all or no patients) and compared with METSSS when risk thresholds ranged from 0.2 to 0.9 at both 90 days and 1 year, indicating that using SORG-MLA as a decision-making aid was beneficial when a patient's individualized risk threshold for opting for treatment was 0.2 to 0.9. Higher albumin, lower alkaline phosphatase, lower calcium, higher hemoglobin, lower international normalized ratio, higher lymphocytes, lower neutrophils, lower neutrophil-to-lymphocyte ratio, lower platelet-to-lymphocyte ratio, higher sodium, and lower white blood cells were independently associated with better 1-year and overall survival after adjusting for the predictions made by METSSS.
Based on these findings, clinicians might choose to consult SORG-MLA instead of METSSS for survival estimation in patients with long-bone metastases presenting for evaluation of local treatment. Basing a treatment decision on the predictions of SORG-MLA could be beneficial when a patient's individualized risk threshold for opting to undergo a particular treatment strategy ranged from 0.2 to 0.9. Future studies might investigate relevant laboratory items when constructing or refining a survival estimation model because these data demonstrated prognostic value independent of the predictions of the METSSS model, and future studies might also seek to keep these models up to date using data from diverse, contemporary patients undergoing both modern operative and nonoperative treatments.
Level III, diagnostic study.
Lee CC, Chen CW, Yen HK, Lin YP, Lai CY, Wang JL, Groot OQ, Janssen SJ, Schwab JH, Hsu FM, Lin WH
... -
《-》
-
Ceftazidime with avibactam for treating severe aerobic Gram-negative bacterial infections: technology evaluation to inform a novel subscription-style payment model.
Harnan S, Kearns B, Scope A, Schmitt L, Jankovic D, Hamilton J, Srivastava T, Hill H, Ku CC, Ren S, Rothery C, Bojke L, Sculpher M, Woods B
... -
《-》
-
Diagnostic accuracy of vision-language models on Japanese diagnostic radiology, nuclear medicine, and interventional radiology specialty board examinations.
The performance of vision-language models (VLMs) with image interpretation capabilities, such as GPT-4 omni (GPT-4o), GPT-4 vision (GPT-4V), and Claude-3, has not been compared and remains unexplored in specialized radiological fields, including nuclear medicine and interventional radiology. This study aimed to evaluate and compare the diagnostic accuracy of various VLMs, including GPT-4 + GPT-4V, GPT-4o, Claude-3 Sonnet, and Claude-3 Opus, using Japanese diagnostic radiology, nuclear medicine, and interventional radiology (JDR, JNM, and JIR, respectively) board certification tests.
In total, 383 questions from the JDR test (358 images), 300 from the JNM test (92 images), and 322 from the JIR test (96 images) from 2019 to 2023 were consecutively collected. The accuracy rates of GPT-4 + GPT-4V, GPT-4o, Claude-3 Sonnet, and Claude-3 Opus were calculated for all questions and for questions with images. The accuracy rates of the VLMs were compared using McNemar's test.
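McNemar's test compares two models on the same questions by looking only at the questions where exactly one model was correct. The sketch below uses the exact (binomial) two-sided form; the abstract does not say whether the exact or the chi-square version was used, so that choice, the function name, and the toy answer lists are all assumptions made for illustration.

```python
from math import comb

def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test on paired correct/incorrect answers.

    Returns (b, c, p_value), where b counts questions only model A got right
    and c counts questions only model B got right.
    """
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, so no evidence of a difference
    # Under the null hypothesis each discordant pair favors either model with p = 0.5.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Hypothetical per-question results for two models on the same 20 questions.
correct_a = [True] * 10 + [False] * 2 + [True] * 5 + [False] * 3
correct_b = [False] * 10 + [True] * 2 + [True] * 5 + [False] * 3

b, c, p_value = mcnemar_exact(correct_a, correct_b)
print(b, c, p_value)
```

Questions that both models answered correctly, or both answered incorrectly, do not affect the result, which is why the test suits paired comparisons on a shared question set.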
GPT-4o demonstrated the highest accuracy rates across all evaluations with the JDR (all questions, 49%; questions with images, 48%), JNM (all questions, 64%; questions with images, 59%), and JIR tests (all questions, 43%; questions with images, 34%), followed by Claude-3 Opus with the JDR (all questions, 40%; questions with images, 38%), JNM (all questions, 42%; questions with images, 43%), and JIR tests (all questions, 40%; questions with images, 30%). For all questions, McNemar's test showed that GPT-4o significantly outperformed the other VLMs (all P < 0.007), except for Claude-3 Opus in the JIR test. For questions with images, GPT-4o outperformed the other VLMs in the JDR and JNM tests (all P < 0.001), except Claude-3 Opus in the JNM test.
GPT-4o had the highest success rates for questions with images and for all questions from the JDR, JNM, and JIR board certification tests.
Oura T, Tatekawa H, Horiuchi D, Matsushita S, Takita H, Atsukawa N, Mitsuyama Y, Yoshida A, Murai K, Tanaka R, Shimono T, Yamamoto A, Miki Y, Ueda D
... -
《-》
-
Trends in Surgical and Nonsurgical Aesthetic Procedures: A 14-Year Analysis of the International Society of Aesthetic Plastic Surgery-ISAPS.
As part of the International Society of Aesthetic Plastic Surgery, we present an analysis of our global aesthetic statistics, fulfilling the role of a worldwide organization of plastic surgeons with a clear mission to disseminate aesthetic education worldwide, promote patient safety, protect high ethical standards, and communicate.
A retrospective analysis of the annual ISAPS Global Aesthetic Statistics from 2010 to 2023 was conducted. The design and analysis of each survey was carefully developed and validated by Industry Insights, Inc. prior to distribution. Participants were recruited using an anonymous online questionnaire that focused primarily on the number of surgical and nonsurgical procedures performed in the previous year, as well as questions related to surgeon demographics and the prevalence of medical tourism. ISAPS invited all physicians in its database who were board-certified plastic surgeons or equivalent and asked national societies to encourage their members to participate.
The latest survey reported a global increase of 3.4%, with 34.9 million surgical and nonsurgical aesthetic procedures performed by plastic surgeons in 2023. More than 15.8 million surgical procedures and more than 19.1 million nonsurgical procedures were performed worldwide. During the past decade, a steady increase in aesthetic procedures has been observed, which has been more pronounced since 2021. In the last 4 years, the overall increase in procedures was 40%.
The top five surgical procedures were liposuction, breast augmentation, eyelid surgery, abdominoplasty, and rhinoplasty. This trend has been stable for 14 years, with the exception of 2022, when breast lift surgery temporarily replaced rhinoplasty.
These procedures continue to be the most popular. This group included brow lift, ear surgery, eyelid surgery, facelift, facial bone contouring, facial fat grafting, lip augmentation or frontal surgery, neck lift, and rhinoplasty.
This group included abdominoplasty, buttock augmentation, buttock lift, liposuction, lower body lift, thigh lift, arm lift, upper body lift, labiaplasty, and vaginal rejuvenation. Over the past 14 years, body and extremity procedures have increased, with more than 5.1 million procedures in 2023 compared to 2.6 million in 2009.
The five most popular nonsurgical procedures are botulinum toxin, hyaluronic acid, hair removal, chemical peels, and nonsurgical fat reduction. In 2022, chemical peels replaced nonsurgical skin tightening in the top five.
Procedures performed on men continue to grow, with minimally invasive procedures dominating. The most recent survey reported that they represented 14.5% of the total. The top five surgical procedures were eyelid surgery, gynecomastia, liposuction, rhinoplasty, and facial fat grafting. The most popular nonsurgical procedures for men were botulinum toxin, hyaluronic acid, hair removal, nonsurgical skin tightening, and nonsurgical fat reduction. This trend has held steady for more than a decade.
This study analyzes the most recent data and experience of board-certified aesthetic plastic surgeons in surgical and nonsurgical procedures worldwide over 14 years and provides insight into future trends. More than 60 years have passed since the introduction of liposuction, which has been one of the most performed aesthetic procedures worldwide over the past 14 years and is currently the number one procedure performed by plastic surgeons. New trends and technologies have evolved over the years; however, plastic surgeons must be cautious, as history has shown that risks increase when new technologies are introduced. With the popularity of liposuction, other body contouring procedures began to gain interest, and in 2015, gluteal lipoinjections were added to the ISAPS global aesthetic statistics, and with them came complications. In 2018 and 2019, the major patient safety societies, ISAPS, ASERF, ASPS, and ASAPS, began a systematic educational campaign to inform their members about the inherent risks of performing gluteal fat transfer surgery and what techniques or equipment can be used to minimize risks. Another procedure added to the ISAPS statistics in 2010 was vaginal aesthetic surgery. With the new trend of vaginal aesthetics, many believed that these procedures were just changing the appearance of the area, but today it is clear that they are here for much more, to truly empower women with their sexuality. Breast augmentation showed a decline for the first time last year. However, breast augmentation and liposuction have been the most performed procedures by plastic surgeons worldwide for more than a decade. On the other hand, implant removal has been the fastest growing procedure since 2015, with an overall increase of 46.3% over the past 5 years. In relation to male aesthetic surgery, the number of men undergoing aesthetic procedures has remained stable in recent years at around 14%. Male aesthetics is certainly a growing trend, and our practices should be more inclusive.
Another prominent field is regenerative medicine. In relation to plastic surgery, regenerative surgery strategies often involve adipose tissue with stem cells and preadipocytes, alone or in combination with scaffolds. In terms of prevention, regenerative medicine aims to improve the quality of the skin, thereby improving our outcomes, and could make it possible to avoid the need for facelifts in the future. Finally, given the increasing popularity of medical procedures abroad ("medical tourism") and the fact that safety regulations and guidelines vary widely from place to place, we encourage patients to choose a board-certified, specialized, trained, and experienced plastic surgeon for their procedure and an accredited surgical facility to ensure the procedure is done under the highest patient safety standards.
Despite the obvious cultural and social differences from country to country that make certain procedures more desirable in some geographic areas and less so in other parts of the world, the results of this study show a significant overall increase in all surgical and nonsurgical procedures aimed at improving the aesthetic appearance of the body over the 14 years. As plastic surgeons, we are open to new possibilities in aesthetic procedures and are responsible for patient safety protocols and procedures.
This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266 .
Triana L, Palacios Huatuco RM, Campiglio G, Liscano E
... -
《-》