ChatGPT for improving postoperative instructions in multiple fields of plastic surgery.
Clear discharge instructions are vital for patients and caregivers to manage postoperative care at home. However, they often exceed the sixth-grade reading level recommended by national associations. It was hypothesized that ChatGPT could help rewrite instructions to this level for increased accessibility and comprehension. This study aimed to assess the readability, understandability, actionability, and safety of ChatGPT-rewritten postoperative instructions in four plastic surgery subspecialties: breast, craniofacial, hand, and aesthetic surgery.
Postoperative instructions from four index procedures in plastic surgery were obtained. ChatGPT was used to rewrite at the sixth- and fourth-grade reading levels. Readability was determined by seven readability indexes, understandability and actionability by the Patient Education Materials Assessment Tool for printable materials questionnaire, and safety by the primary surgeons.
Overall, the average readability of the original postoperative instructions ranged between the seventh- and eighth-grade levels. Only one of the sixth-grade ChatGPT instructions was lowered to the sixth-grade level. Of the fourth-grade ChatGPT instructions, all were reduced to the sixth-grade level or below, but none achieved the fourth-grade level. Understandability scores increased as reading levels decreased, whereas actionability scores decreased for fourth-grade rewrites. Safety was not compromised in any of the rewrites.
ChatGPT can adapt postoperative instructions to a more readable sixth-grade level without compromising safety. This study suggests prompting ChatGPT to write one to two grade levels lower than the desired reading level. While understandability increased for all ChatGPT rewrites, actionability decreased for fourth-grade-level instructions. Sixth-grade remains the optimal reading level for postoperative instructions. This study demonstrates that ChatGPT can help improve patient care by improving the readability of postoperative instructions.
Zhang A, Li CXR, Piper M, Rose J, Chen K, Lin AY, ...
《-》
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Survival estimation for patients with symptomatic skeletal metastases ideally should be made before a type of local treatment has been determined. Currently available survival prediction tools, however, were generated using data from patients treated either operatively or with local radiation alone, raising concerns about whether they would generalize well to all patients presenting for assessment. The Skeletal Oncology Research Group machine-learning algorithm (SORG-MLA), trained with institution-based data of surgically treated patients, and the Metastases location, Elderly, Tumor primary, Sex, Sickness/comorbidity, and Site of radiotherapy model (METSSS), trained with registry-based data of patients treated with radiotherapy alone, are two of the most recently developed survival prediction models, but they have not been tested on patients whose local treatment strategy is not yet decided.
(1) Which of these two survival prediction models performed better in a mixed cohort made up of both patients who received local treatment with surgery followed by radiotherapy and patients who had radiation alone for symptomatic bone metastases? (2) Which model performed better among patients whose local treatment consisted of only palliative radiotherapy? (3) Are laboratory values used by SORG-MLA, which are not included in METSSS, independently associated with survival after controlling for predictions made by METSSS?
Between 2010 and 2018, we provided local treatment for 2113 adult patients with skeletal metastases in the extremities at an urban tertiary referral academic medical center using one of two strategies: (1) surgery followed by postoperative radiotherapy or (2) palliative radiotherapy alone. Every patient's survivorship status was ascertained either by their medical records or the national death registry from the Taiwanese National Health Insurance Administration. After applying a priori designated exclusion criteria, 91% (1920) were analyzed here. Among them, 48% (920) of the patients were female, and the median (IQR) age was 62 years (53 to 70 years). Lung was the most common primary tumor site (41% [782]), and 59% (1128) of patients had other skeletal metastases in addition to the treated lesion(s). In general, the indications for surgery were the presence of a complete pathologic fracture or an impending pathologic fracture, defined as having a Mirels score of ≥ 9, in patients with an American Society of Anesthesiologists (ASA) classification of less than or equal to IV and who were considered fit for surgery. The indications for radiotherapy were relief of pain, local tumor control, prevention of skeletal-related events, and any combination of the above. In all, 84% (1610) of the patients received palliative radiotherapy alone as local treatment for the target lesion(s), and 16% (310) underwent surgery followed by postoperative radiotherapy. Neither METSSS nor SORG-MLA was used at the point of care to aid clinical decision-making during the treatment period. Survival was retrospectively estimated by these two models to test their potential for providing survival probabilities. We first compared SORG-MLA to METSSS in the entire population. Then, we repeated the comparison in patients who received local treatment with palliative radiation alone.
We assessed model performance by area under the receiver operating characteristic curve (AUROC), calibration analysis, Brier score, and decision curve analysis (DCA). The AUROC measures discrimination, which is the ability to distinguish patients with the event of interest (such as death at a particular time point) from those without. AUROC typically ranges from 0.5 to 1.0, with 0.5 indicating random guessing and 1.0 a perfect prediction, and in general, an AUROC of ≥ 0.7 indicates adequate discrimination for clinical use. Calibration refers to the agreement between the predicted outcomes (in this case, survival probabilities) and the actual outcomes, with a perfect calibration curve having an intercept of 0 and a slope of 1. A positive intercept indicates that the actual survival is generally underestimated by the prediction model, and a negative intercept suggests the opposite (overestimation). When comparing models, an intercept closer to 0 typically indicates better calibration. Calibration can also be summarized as log(O:E), the logarithm of the ratio of observed (O) to expected (E) survivors. A log(O:E) > 0 signals an underestimation (the observed survival is greater than the predicted survival), and a log(O:E) < 0 indicates the opposite (the observed survival is lower than the predicted survival). A model with a log(O:E) closer to 0 is generally considered better calibrated. The Brier score is the mean squared difference between the model predictions and the observed outcomes, and it ranges from 0 (best prediction) to 1 (worst prediction). The Brier score captures both discrimination and calibration, and it is considered a measure of overall model performance. In Brier score analysis, the "null model" assigns a predicted probability equal to the prevalence of the outcome and represents a model that adds no new information. A prediction model should achieve a Brier score at least lower than the null-model Brier score to be considered useful.
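The metrics described above can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names and the toy data are hypothetical, and it assumes the outcome is coded 1 for a patient alive at the time point of interest and that each prediction is a survival probability.

```python
import math


def auroc(y_true, y_prob):
    """Discrimination: chance that a randomly chosen survivor received a
    higher predicted survival probability than a randomly chosen
    non-survivor (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predictions and observed outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def null_brier_score(y_true):
    """Brier score of the null model, which predicts the outcome prevalence
    for every patient."""
    prevalence = sum(y_true) / len(y_true)
    return brier_score(y_true, [prevalence] * len(y_true))


def log_observed_to_expected(y_true, y_prob):
    """log(O:E): observed survivors over expected survivors, where the
    expected count is the sum of predicted survival probabilities."""
    return math.log(sum(y_true) / sum(y_prob))


# Hypothetical toy data (1 = alive at the time point), not study data.
alive = [1, 1, 1, 0, 0, 1, 0, 0]
predicted_survival = [0.9, 0.8, 0.6, 0.7, 0.3, 0.4, 0.2, 0.1]

print("AUROC:", auroc(alive, predicted_survival))
print("Brier:", brier_score(alive, predicted_survival))
print("Null-model Brier:", null_brier_score(alive))
print("log(O:E):", log_observed_to_expected(alive, predicted_survival))
```

In this sketch a model would count as adding information only if its Brier score falls below the null-model value, and a positive log(O:E) would indicate that survival was underestimated.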
The DCA was developed as a method to determine whether using a model to inform treatment decisions would do more good than harm. It plots the net benefit of making decisions based on the model's predictions across all possible risk thresholds (or cost-to-benefit ratios) in relation to the two default strategies of treating all or no patients. The care provider can decide on an acceptable risk threshold for the proposed treatment in an individual and assess the corresponding net benefit to determine whether consulting with the model is superior to adopting the default strategies. Finally, we examined whether laboratory data, which were not included in the METSSS model, were independently associated with survival after controlling for the METSSS model's predictions, using multivariable logistic and Cox proportional hazards regression analyses.
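The net-benefit comparison behind a decision curve can be sketched as follows. This uses the standard net-benefit formula, true positives per patient minus false positives per patient weighted by the odds of the risk threshold. The function names and toy data are hypothetical, and the sketch assumes the outcome is coded 1 when the event of interest occurred and that a patient is "treated" when the predicted probability is at or above the threshold.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability is at
    or above the risk threshold: TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Default strategy 1: treat every patient regardless of prediction."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


def decision_curve(y_true, y_prob, thresholds):
    """Rows of (threshold, model, treat-all, treat-none) net benefit.
    Treating no one has a net benefit of 0 by definition."""
    return [
        (t, net_benefit(y_true, y_prob, t), net_benefit_treat_all(y_true, t), 0.0)
        for t in thresholds
    ]


# Hypothetical toy data, not study data.
event = [1, 1, 1, 0, 0, 1, 0, 0]
predicted_risk = [0.9, 0.8, 0.6, 0.7, 0.3, 0.4, 0.2, 0.1]

for row in decision_curve(event, predicted_risk, [0.2, 0.5, 0.8]):
    print("threshold=%.1f model=%.3f treat_all=%.3f treat_none=%.3f" % row)
```

At a given threshold, consulting the model is preferable only when its net benefit exceeds both the treat-all and the treat-none values.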
Between the two models, only SORG-MLA achieved adequate discrimination (an AUROC of > 0.7) in the entire cohort (of patients treated operatively or with radiation alone) and in the subgroup of patients treated with palliative radiotherapy alone. SORG-MLA outperformed METSSS by a wide margin on discrimination, calibration, and Brier score analyses in not only the entire cohort but also the subgroup of patients whose local treatment consisted of radiotherapy alone. In both the entire cohort and the subgroup, DCA demonstrated that SORG-MLA provided more net benefit compared with the two default strategies (of treating all or no patients) and compared with METSSS when risk thresholds ranged from 0.2 to 0.9 at both 90 days and 1 year, indicating that using SORG-MLA as a decision-making aid was beneficial when a patient's individualized risk threshold for opting for treatment was 0.2 to 0.9. Higher albumin, lower alkaline phosphatase, lower calcium, higher hemoglobin, lower international normalized ratio, higher lymphocytes, lower neutrophils, lower neutrophil-to-lymphocyte ratio, lower platelet-to-lymphocyte ratio, higher sodium, and lower white blood cells were independently associated with better 1-year and overall survival after adjusting for the predictions made by METSSS.
Based on these discoveries, clinicians might choose to consult SORG-MLA instead of METSSS for survival estimation in patients with long-bone metastases presenting for evaluation of local treatment. Basing a treatment decision on the predictions of SORG-MLA could be beneficial when a patient's individualized risk threshold for opting to undergo a particular treatment strategy ranged from 0.2 to 0.9. Future studies might investigate relevant laboratory items when constructing or refining a survival estimation model because these data demonstrated prognostic value independent of the predictions of the METSSS model, and future studies might also seek to keep these models up to date using data from diverse, contemporary patients undergoing both modern operative and nonoperative treatments.
Level III, diagnostic study.
Lee CC, Chen CW, Yen HK, Lin YP, Lai CY, Wang JL, Groot OQ, Janssen SJ, Schwab JH, Hsu FM, Lin WH, ...
《-》
Improving health literacy and stakeholder-directed knowledge of One Health through analysis of readability: a cross sectional infodemiology study.
The One Health approach involves integrated collaboration across several sectors, including the public health, veterinary and environmental sectors. These sectors may be disparate and unrelated; to succeed, however, all stakeholders need to understand what the other stakeholders are communicating. Likewise, it is important that there is public acceptance and support of One Health approaches, which requires effective communication between professional and institutional organisations and the public. To facilitate such communication, written materials need to be readable by all stakeholders. There has been an exponential increase in the publication of papers involving One Health, from <5 per year in the 2000s to nearly 500 published in 2023. To date, readability of One Health information has not been scrutinised, nor has it been considered as an integral intervention of One Health policy communication. The aim of this study was therefore to examine readability of public-facing One Health information prepared by 24 global organisations.
Readability was calculated using Readable software, to obtain four readability scores [(ⅰ) Flesch Reading Ease (FRE), (ⅱ) Flesch-Kincaid Grade Level (FKGL), (ⅲ) Gunning Fog Index and (ⅳ) SMOG Index] and two text metrics [words/sentence, syllables/word] for 100 sources of One Health information, from four categories [One Health public information; PubMed abstracts; Science in One Health (SOH) abstracts (articles); SOH abstracts (reviews)].
Readability of One Health information for the public is poor, not reaching readability reference standards. No information was found that had a readability of less than 9th grade (around 14 years old). Mean values for the FRE and FKGL were (19.4 ± 1.4) (target >60) and (15.6 ± 0.3) (target <8), respectively, with mean words per sentence and syllables per word of 20.5 and 2.0, respectively. Abstracts with "One Health" in the title were more difficult to read than those without "One Health" in the title (FRE: P = 0.0337; FKGL: P = 0.0087). Comparison of FRE and FKGL readability scores for the four categories of One Health information [One Health public information; PubMed abstracts; SOH abstracts (articles); SOH abstracts (reviews)] showed that SOH abstracts from articles were easier to read than those from SOH reviews. No One Health public-facing information from the 100 sources examined met the FKGL target of ≤8. The most easily read One Health information required a Grade Level of 9th grade (14-15 years old), with a mean Grade Level of 15.5 (university/college level).
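The two Flesch scores reported here are linear functions of the two text metrics (words per sentence and syllables per word), which a short Python sketch makes concrete. It uses the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The function names are hypothetical, the study itself used Readable software, and the target check simply encodes the thresholds quoted in the abstract (FRE >60, FKGL ≤8).

```python
def flesch_reading_ease(words_per_sentence, syllables_per_word):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_kincaid_grade(words_per_sentence, syllables_per_word):
    """Flesch-Kincaid Grade Level: approximate US school grade needed."""
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


def meets_targets(fre, fkgl, fre_min=60.0, fkgl_max=8.0):
    """Check a text against the targets quoted in the abstract
    (assumed here to mean FRE above 60 and FKGL of 8 or lower)."""
    return fre > fre_min and fkgl <= fkgl_max


# Rounded mean text metrics reported in the abstract.
fre = flesch_reading_ease(20.5, 2.0)
fkgl = flesch_kincaid_grade(20.5, 2.0)
print("FRE: %.1f, FKGL: %.1f, meets targets: %s" % (fre, fkgl, meets_targets(fre, fkgl)))
```

With the rounded means of 20.5 words per sentence and 2.0 syllables per word, the formulas land in the same range as the reported mean scores. Exact agreement is not expected because the inputs are rounded to one decimal place. The formulas also show why shorter sentences and shorter words are the two levers available to authors.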
Considerable work is required in making One Health written materials more readable, particularly for children and adolescents (<14 years of age). It is important that any interventions or mitigations taken to support better public understanding of the One Health approach are not ephemeral, but have longer lasting and legacy value. Authors of One Health information should consider using readability calculators when preparing One Health information for their stakeholders, to check the readability of their work, so that the final material is within recommended readability reference parameters, to support the health literacy and stakeholder-directed knowledge of their readers.
Moore JE, Millar BC
《-》
Assessing the Readability of English and Spanish Online Patient Educational Materials for Deep Venous Thrombosis.
Online patient educational materials (OPEMs) help patients engage in their health care. The American Medical Association (AMA) recommends OPEM be written at or below the 6th grade reading level. This study assessed the readability of deep venous thrombosis OPEM in English and Spanish.
Google searches were conducted in English and Spanish using "deep venous thrombosis" and "trombosis venosa profunda," respectively. The top 25 patient-facing results were recorded for each, and categorized into source type (hospital, professional society, other). Readability of English OPEM was measured using several scales including the Flesch Reading Ease Readability Formula and Flesch-Kincaid Grade Level. Readability of Spanish OPEM was measured using the Fernández-Huerta Index and INFLESZ Scale. Readability was compared to the AMA recommendation, between languages, and across source types.
Only one (4%) Spanish OPEM was written at an easy level, compared with 7 (28%) English OPEM (P = 0.04), and the breakdown of reading difficulty differed significantly between languages (P = 0.04). The average readability scores for English and Spanish OPEM across all scales were significantly greater than the recommended level (P < 0.01). Only four total articles (8%) met the AMA recommendation, with no significant difference between English and Spanish OPEM (P = 0.61).
Nearly all English and Spanish deep venous thrombosis OPEM analyzed were above the recommended reading level. English resources were overall easier to read than Spanish resources, a disparity that may represent a barrier to care. To limit health disparities, information should be presented at accessible reading levels.
Wang KM, Ramirez JL, Iannuzzi JC, Ulloa JG, ...
《-》