Who Benefits From Hip Arthroplasty or Knee Arthroplasty? Preoperative Patient-reported Outcome Thresholds Predict Meaningful Improvement.
Hip arthroplasty (HA) and knee arthroplasty (KA) are high-volume procedures. However, there is a debate about the quality of indication; that is, whether surgery is truly indicated in all patients. Patient-reported outcome measures (PROMs) may be used to determine preoperative thresholds to differentiate patients who will likely benefit from surgery from those who will not.
(1) What were the minimum clinically important differences (MCIDs) for three commonly used PROMs in a large population of patients undergoing HA or KA treated in a general orthopaedic practice? (2) Do patients who reach the MCID differ in important ways from those who do not? (3) What preoperative PROM score thresholds best distinguish patients who achieve a meaningful improvement 12 months postsurgery from those who do not? (4) Do patients with preoperative PROM scores below thresholds still experience gains after surgery?
Between October 1, 2019, and December 31, 2020, 4182 patients undergoing HA and 3645 patients undergoing KA agreed to be part of the PROMoting Quality study and were hence included by study nurses in one of nine participating German hospitals. From a selected group of 1843 patients with HA and 1546 with KA, we derived MCIDs using the anchor-based change difference method to determine meaningful improvements. Second, we estimated which preoperative PROM score thresholds best distinguish patients who achieve an MCID from those who do not, using the preoperative PROM scores that maximized the Youden index. PROMs were Hip Disability and Osteoarthritis Outcome Score-Physical Function short form (HOOS-PS) (scored 0 to 100 points; lower indicates better health), Knee Injury and Osteoarthritis Outcome Score-Physical Function short form (KOOS-PS) (scored 0 to 100 points; lower indicates better health), EuroQol 5-Dimension 5-level (EQ-5D-5L) (scored -0.661 to 1 points; higher indicates better health), and a 10-point VAS for pain (perceived pain in the joint under consideration for surgery within the past 7 days) (scored 0 to 10 points; lower indicates better health). The performance of derived thresholds is reported using the Youden index, sensitivity, specificity, F1 score, geometric mean as a measure of central tendency, and area under the receiver operating characteristic curve.
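The threshold search described above can be illustrated with a minimal sketch. It assumes a simple exhaustive scan over observed preoperative scores and picks the cutoff that maximizes the Youden index (sensitivity + specificity - 1). The function name, the `lower_predicts_benefit` flag, and the toy data are hypothetical and are not taken from the study; the authors' actual implementation is not described in the abstract.

```python
from typing import Sequence, Tuple


def youden_threshold(
    scores: Sequence[float],
    reached_mcid: Sequence[bool],
    lower_predicts_benefit: bool = True,
) -> Tuple[float, float, float, float]:
    """Return (cutoff, Youden J, sensitivity, specificity) for the best cutoff.

    lower_predicts_benefit=True suits scales where higher means better health
    (e.g. EQ-5D-5L): a low preoperative score predicts reaching the MCID.
    Set it to False for scales where lower means better health.
    """
    n_pos = sum(reached_mcid)
    n_neg = len(reached_mcid) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need patients both with and without an MCID")

    best = None
    for cut in sorted(set(scores)):
        if lower_predicts_benefit:
            predicted = [s <= cut for s in scores]
        else:
            predicted = [s >= cut for s in scores]
        tp = sum(p and y for p, y in zip(predicted, reached_mcid))
        tn = sum((not p) and (not y) for p, y in zip(predicted, reached_mcid))
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1.0
        # Strict ">" keeps the first (lowest) cutoff when several tie.
        if best is None or j > best[1]:
            best = (cut, j, sensitivity, specificity)
    return best


# Made-up preoperative EQ-5D-5L index values and MCID attainment flags.
toy_scores = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
toy_reached = [True, True, True, False, True, False, False, False]
result = youden_threshold(toy_scores, toy_reached)
print(result)
```

On real data, the same scan would be run separately per PROM and per procedure, with the direction flag set according to how each instrument is scored.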
MCIDs for the EQ-5D-5L were 0.2 for HA and 0.2 for KA, with a maximum of 1 point, where higher values represented better health-related quality of life. For the pain scale, they were -0.9 for HA and -0.7 for KA, of 10 points (maximum), where lower scores represent lower pain. For the HOOS-PS, the MCID was -10, and for the KOOS-PS it was -5 of 100 points, where lower scores represent better functioning. Patients who reached the MCID differed from patients who did not reach the MCID with respect to baseline PROM scores across the evaluated PROMs and for both HA and KA. Patients who reached an MCID versus those who did not also differed regarding other aspects including education and comorbidities, but this was not consistent across PROMs and arthroplasty type. Preoperative PROM score thresholds for HA were 0.7 for EQ-5D-5L (Youden index: 0.55), 42 for HOOS-PS (Youden index: 0.27), and 3.5 for the pain scale (Youden index: 0.47). For KA, the thresholds were 0.6 for EQ-5D-5L (Youden index: 0.57), 39 for KOOS-PS (Youden index: 0.25), and 6.5 for the pain scale (Youden index: 0.40). A higher Youden index for EQ-5D-5L than for the other PROMs indicates that the thresholds for EQ-5D-5L were better for distinguishing patients who reached a meaningful improvement from those who did not. Patients who did not reach the thresholds could still achieve MCIDs, especially for functionality and the pain scale.
We found that patients who experienced meaningful improvements (MCIDs) mainly differed from those who did not regarding their preoperative PROM scores. We further identified that patients undergoing HA or KA with a score above 0.7 or 0.6, respectively, on the EQ-5D-5L, below 42 or 39 on the HOOS-PS or KOOS-PS, or below 3.5 or 6.5 on a 10-point joint-specific pain scale presurgery had no meaningful benefit from surgery. The thresholds can support clinical decision-making. For example, when thresholds indicate that a meaningful improvement is not likely to be achieved after surgery, other treatment options may be prioritized. Although the thresholds can be used as support, patient preferences and medical expertise must supplement the decision. Future studies might evaluate the utility of using these thresholds in practice, examine how different thresholds can be combined as a multidimensional decision tool, and derive presurgery thresholds based on additional PROMs used in practice.
Preoperative PROM score thresholds in this study will support clinicians in decision-making through objective measures that can improve the quality of the recommendation for surgery.
Langenberger B, Steinbeck V, Busse R

Discordance Abounds in Minimum Clinically Important Differences in THA: A Systematic Review.
The minimum clinically important difference (MCID) is intended to detect a change in a patient-reported outcome measure (PROM) large enough for a patient to appreciate. The growing use of MCIDs in orthopaedic research stems from the need for a metric, other than the p value, to better assess the effect size of an outcome. Yet, given that MCIDs are population-specific and that there are multiple calculation methods, there is concern about inconsistencies. Given the increasing use of MCIDs in total hip arthroplasty (THA) research, a systematic review of calculated MCID values and their respective ranges, as well as an assessment of their applications, is important to guide and encourage their use as a critical measure of effect size in THA outcomes research.
We systematically reviewed MCID calculations and reporting in current THA research to answer the following: (1) What are the most-reported PROM MCIDs in THA, and what is their range of values? (2) What proportion of studies report anchor-based versus distribution-based MCID values? (3) What are the most common methods by which anchor-based MCID values are derived? (4) What are the most common derivation methods for distribution-based MCID values? (5) How do the reported medians and corresponding ranges compare between calculation methods for each PROM?
The EMBASE, MEDLINE, and PubMed databases were systematically reviewed from inception through March 2022 for THA studies reporting an MCID value for any PROM. Two independent authors reviewed articles for inclusion. All articles calculating new PROM MCID scores after primary THA were included for data extraction and analysis. MCID values for each PROM, MCID calculation method, number of patients, and study demographics were extracted from each article. In total, 30 articles were included. There were 45 unique PROMs for which 242 MCIDs were reported. These studies had a total of 1,000,874 patients with a median age of 64 years and median BMI of 28.7 kg/m². Women made up 55% of patients in the total study population, and the median follow-up period was 12 months (range 0 to 77 months). The overall risk of bias was assessed as moderate using the modified Methodological Index for Nonrandomized Studies criteria for comparative studies (the mean score for comparative papers in this review was 18 of 24, with higher scores representing better study quality) and noncomparative studies (for these, the mean score was 10 of a possible 16 points, with higher scores representing higher study quality). Calculated values were classified as anchor-based, distribution-based, or not reported. Anchor-based and distribution-based MCIDs were compared for each unique PROM using a Wilcoxon rank sum test, given the non-normal distribution of values.
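As a rough illustration of the rank-based comparison mentioned above, the following sketch computes a Wilcoxon rank sum (Mann-Whitney U) statistic with a two-sided p value from the normal approximation. It is a simplified version that assumes no tie correction and no continuity correction, so a statistics package would be preferable for real analyses. The function name and both sample lists are hypothetical and do not come from the review.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U for x, z, two-sided p) using the normal approximation."""
    # Pool both groups, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(pooled)

    # Assign 1-based ranks, giving tied values the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0

    mean_u = n1 * n2 / 2.0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction applied
    z = (u_x - mean_u) / sd_u
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u_x, z, p


# Hypothetical MCID values for one PROM, grouped by calculation method.
anchor_based = [8, 9, 10, 11, 12]
distribution_based = [4, 5, 6, 6, 7]
u, z, p = rank_sum_test(anchor_based, distribution_based)
print(u, round(z, 2), round(p, 4))
```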
The Oxford Hip Score (OHS) and the Hip Disability and Osteoarthritis Outcome Score (HOOS) Pain and Quality of Life subscore MCIDs were the most frequently reported, comprising 12% (29 of 242), 8% (20 of 242), and 8% (20 of 242), respectively. The EuroQol VAS (EQ-VAS) was the next-most frequently reported (7% [17 of 242]) followed by the EuroQol 5D (EQ-5D) (7% [16 of 242]). The median anchor-based value for the OHS was 9 (IQR 8 to 11), while the median distribution-based value was 6 (IQR 5 to 6). The median anchor-based MCID values for HOOS Pain and Quality of Life were 33 (IQR 28 to 35) and 25 (IQR 14 to 27), respectively; the median distribution-based values were 10 (IQR 9 to 10) and 13 (IQR 10 to 14), respectively. Thirty percent (nine of 30) of studies used an anchor-based method to calculate a new MCID, while 40% (12 of 30) used a distribution-based technique. Thirty percent of studies (nine of 30) calculated MCID values using both methods. For studies reporting an anchor-based calculation method, a question assessing pain relief, satisfaction, or quality of life on a five-point Likert scale was the most commonly used anchor (30% [eight of 27]), followed by a receiver operating characteristic curve estimation (22% [six of 27]). For studies using distribution-based calculations, the most common method was one-half the standard deviation of the difference between preoperative and postoperative PROM scores (46% [12 of 26]). Most reported median MCID values (nine of 14) did not differ by calculation method for each unique PROM (p > 0.05). The OHS, HOOS JR, and HOOS Function, Symptoms, and Activities of Daily Living subscores all varied by calculation method, because each anchor-based value was larger than its respective distribution-based value.
We found that MCIDs do not vary very much by calculation method across most outcome measurement tools. Additionally, there are consistencies in MCID calculation methods, because most authors used an anchor question with a Likert scale for the anchor-based approach or used one-half the standard deviation of preoperative and postoperative PROM score differences for the distribution-based approach. For some of the most frequently reported MCIDs, however, anchor-based values tend to be larger than distribution-based values for their respective PROMs.
We recommend using a 9-point increase as the MCID for the OHS, consistent with the median reported anchor-based value derived from several high-quality studies with large patient groups that used anchor-based approaches for MCID calculations, which we believe are most appropriate for most applications in clinical research. Likewise, we recommend using the anchor-based 33-point and 25-point MCIDs for the HOOS Pain and Quality of Life subscores, respectively. We encourage using anchor-based MCID values of WOMAC Pain, Function, and Stiffness subscores, which were 29, 26, and 30, respectively.
Deckey DG, Verhey JT, Christopher ZK, Gerhart CRB, Clarke HD, Spangehl MJ, Bingham JS, ...

How do Patient-reported Outcome Scores in International Hip and Knee Arthroplasty Registries Compare?
Patient-reported outcome measures (PROMs) are the only systematic approach through which the patient's perspective can be considered by surgeons (in determining a procedure's efficacy or appropriateness) or healthcare systems (in the context of value-based healthcare). PROMs in registries enable international comparison of patient-centered outcomes after total joint arthroplasty, but the extent to which those scores may vary between different registry populations has not been clearly defined.
(1) To what degree do mean change in general and joint-specific PROM scores vary across arthroplasty registries, and to what degree is the proportion of missing PROM scores in an individual registry associated with differences in the mean reported change scores? (2) Do PROM scores vary with patient BMI across registries? (3) Are comorbidity levels comparable across registries, and are they associated with differences in PROM scores?
Thirteen national, regional, or institutional registries from nine countries reported aggregate PROM scores for patients who had completed PROMs preoperatively and 6 and/or 12 months postoperatively. The requested aggregate PROM scores were the EuroQol-5 Dimension Questionnaire (EQ-5D) index values, on which score 1 reflects "full health" and 0 reflects "as bad as death." Joint-specific PROMs were the Oxford Knee Score (OKS) and the Oxford Hip Score (OHS), with total scores ranging from 0 to 48 (worst-best), and the Hip Disability and Osteoarthritis Outcome Score-Physical Function short form (HOOS-PS) and the Knee Injury and Osteoarthritis Outcome Score-Physical Function short form (KOOS-PS) values, scored 0 to 100 (worst-best). Eligible patients underwent primary unilateral THA or TKA for osteoarthritis between 2016 and 2019. Registries were asked to exclude patients with subsequent revisions within their PROM collection period. Raw aggregated PROM scores and scores adjusted for age, gender, and baseline values were inspected descriptively. Across all registries and PROMs, the reported percentage of missing PROM data varied from 9% (119 of 1354) to 97% (5305 of 5445). We therefore graphically explored whether PROM scores were associated with the level of data completeness. For each PROM cohort, chi-square tests were performed for BMI distributions across registries and 12 predefined PROM strata (men versus women; age 20 to 64 years, 65 to 74 years, and older than 75 years; and high or low preoperative PROM scores). Comorbidity distributions were evaluated descriptively by comparing proportions with American Society of Anesthesiologists (ASA) physical status classification of 3 or higher across registries for each PROM cohort.
The mean improvement in EQ-5D index values (10 registries) ranged from 0.16 to 0.33 for hip registries and 0.12 to 0.25 for knee registries. The mean improvement in the OHS (seven registries) ranged from 18 to 24, and for the HOOS-PS (three registries) it ranged from 29 to 35. The mean improvement in the OKS (six registries) ranged from 15 to 20, and for the KOOS-PS (four registries) it ranged from 19 to 23. For all PROMs, variation was smaller when adjusting the scores for differences in age, gender, and baseline values. After we compared the registries, there did not seem to be any association between the level of missing PROM data and the mean change in PROM scores. The proportions of patients with BMI 30 kg/m² or higher ranged from 16% to 43% (11 hip registries) and from 35% to 62% (10 knee registries). Distributions of patients across six BMI categories differed across hip and knee registries. Further, for all PROMs, distributions also differed across 12 predefined PROM strata. For the EQ-5D, patients in the younger age groups (20 to 64 years and 65 to 74 years) had higher proportions of BMI measurements greater than 30 kg/m² than older patients, and patients with the lowest baseline scores had higher proportions of BMI measurements more than 30 kg/m² compared with patients with higher baseline scores. These associations were similar for the OHS and OKS cohorts. The proportions of patients with ASA Class at least 3 ranged across registries from 6% to 35% (eight hip registries) and from 9% to 42% (nine knee registries).
Improvements in PROM scores varied among international registries, which may be partially explained by differences in age, gender, and preoperative scores. Higher BMI tended to be associated with lower preoperative PROM scores across registries. Large variation in BMI and comorbidity distributions across registries suggests that future international studies should consider the effect of adjusting for these factors. Although we were not able to evaluate its effect specifically, missing PROM data is a recurring challenge for registries. Demonstrating generalizability of results and evaluating the degree of response bias is crucial in using registry-based PROMs data to evaluate differences in outcome. Comparability between registries in terms of specific PROMs collection, postoperative timepoints, and demographic factors to enable confounder adjustment is necessary to use comparison between registries to inform and improve arthroplasty care internationally.
Level III, therapeutic study.
Ingelsrud LH, Wilkinson JM, Overgaard S, Rolfson O, Hallstrom B, Navarro RA, Terner M, Karmakar-Hore S, Webster G, Slawomirski L, Sayers A, Kendir C, de Bienassis K, Klazinga N, Dahl AW, Bohm E, ...

What Is the Clinical Benefit of Common Orthopaedic Procedures as Assessed by the PROMIS Versus Other Validated Outcomes Tools?
Patient-reported outcome measures (PROMs), including the Patient-reported Outcomes Measurement Information System (PROMIS), are increasingly used to measure healthcare value. The minimum clinically important difference (MCID) is a metric that helps clinicians determine whether a statistically detectable improvement in a PROM after surgical care is likely to be large enough to be important to a patient or to justify an intervention that carries risk and cost. There are two major categories of MCID calculation methods, anchor-based and distribution-based. This variability, coupled with heterogeneous surgical cohorts used for existing MCID values, limits their application to clinical care.
In our study, we sought (1) to determine MCID thresholds and attainment percentages for PROMIS after common orthopaedic procedures using distribution-based methods, (2) to use anchor-based MCID values from published studies as a comparison, and (3) to compare MCID attainment percentages using PROMIS scores to other validated outcomes tools such as the Hip Disability and Osteoarthritis Outcome Score (HOOS) and Knee Injury and Osteoarthritis Outcome Score (KOOS).
This was a retrospective study at two academic medical centers and three community hospitals. The inclusion criteria for this study were patients who were age 18 years or older and who underwent elective THA for osteoarthritis, TKA for osteoarthritis, one-level posterior lumbar fusion for lumbar spinal stenosis or spondylolisthesis, anatomic total shoulder arthroplasty or reverse total shoulder arthroplasty for glenohumeral arthritis or rotator cuff arthropathy, arthroscopic anterior cruciate ligament reconstruction, arthroscopic partial meniscectomy, or arthroscopic rotator cuff repair. This yielded 14,003 patients. Patients undergoing revision operations or surgery for nondegenerative pathologies and patients without preoperative PROMs assessments were excluded, leaving 9925 patients who completed preoperative PROMIS assessments and 9478 who completed other preoperative validated outcomes tools (HOOS, KOOS, numerical rating scale for leg pain, numerical rating scale for back pain, and QuickDASH). Approximately 66% (6529 of 9925) of patients had postoperative PROMIS scores (Physical Function, Mental Health, Pain Intensity, Pain Interference, and Upper Extremity) and were included for analysis. PROMIS scores are population normalized with a mean score of 50 ± 10, with most scores falling between 30 and 70. Approximately 74% (7007 of 9478) of patients had postoperative historical assessment scores and were included for analysis. The proportion who reached the MCID was calculated for each procedure cohort at 6 months of follow-up using distribution-based MCID methods, which included a fraction of the SD (1/2 or 1/3 SD) and minimum detectable change (MDC) using statistical significance (such as the MDC 90 from p < 0.1). Previously published anchor-based MCID thresholds from similar procedure cohorts and analogous PROMs were used to calculate the proportion reaching MCID.
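The distribution-based calculations named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the abstract does not say which SD was used (the SD of the supplied scores is used here), and the MDC is computed with the conventional formula z × √2 × SEM, where SEM = SD × √(1 − reliability). The reliability coefficient, the function names, and the toy change scores are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, stdev
from typing import Sequence


def sd_fraction_mcid(values: Sequence[float], fraction: float = 0.5) -> float:
    """MCID as a fraction (e.g. 1/2 or 1/3) of the sample SD of the values."""
    return fraction * stdev(values)


def minimum_detectable_change(
    values: Sequence[float], reliability: float, confidence: float = 0.90
) -> float:
    """Conventional MDC: z * sqrt(2) * SEM, with SEM = SD * sqrt(1 - reliability).

    confidence=0.90 gives the MDC 90 (two-sided p < 0.1); 0.95 gives the MDC 95.
    The reliability coefficient is an assumed input, not reported in the abstract.
    """
    sem = stdev(values) * sqrt(1.0 - reliability)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * sqrt(2.0) * sem


def attainment(changes: Sequence[float], threshold: float) -> float:
    """Proportion of patients whose improvement meets or exceeds the threshold."""
    return sum(c >= threshold for c in changes) / len(changes)


# Made-up pre-to-post change scores (positive = improvement).
toy_changes = [-2, 0, 2, 4, 6, 8, 10, 12]
half_sd = sd_fraction_mcid(toy_changes, 0.5)
third_sd = sd_fraction_mcid(toy_changes, 1.0 / 3.0)
mdc90 = minimum_detectable_change(toy_changes, reliability=0.75, confidence=0.90)
mdc95 = minimum_detectable_change(toy_changes, reliability=0.75, confidence=0.95)
share_half_sd = attainment(toy_changes, half_sd)
print(round(half_sd, 2), round(third_sd, 2), round(mdc90, 2), round(mdc95, 2), share_half_sd)
```

A lower threshold can never reduce the share of patients counted as reaching the MCID, which is the pattern the abstract reports when it compares the 1/2 SD, MDC, and anchor-based thresholds.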
Within a given distribution-based method, MCID thresholds for PROMIS assessments were similar across multiple procedures. The MCID threshold ranged between 3.4 and 4.5 points across all procedures using the 1/2 SD method. Except for meniscectomy (3.5 points), the anchor-based PROMIS MCID thresholds (range 4.5 to 8.1 points) were higher than the SD distribution-based MCID values (2.3 to 4.5 points). The difference in MCID thresholds based on the calculation method led to a similar trend in MCID attainment. Using THA as an example, MCID attainment using PROMIS was achieved by 76% of patients using an anchor-based threshold of 7.9 points. However, 82% of THA patients attained MCID using the MDC 95 method (6.1 points), and 88% reached MCID using the 1/2 SD method (3.9 points). Using the HOOS metric (scaled from 0 to 100), 86% of THA patients reached the anchor-based MCID threshold (17.5 points). However, 91% of THA patients attained the MCID using the MDC 90 method (12.5 points), and 93% reached MCID using the 1/2 SD method (8.4 points). In general, the proportion of patients reaching MCID was lower for PROMIS than for other validated outcomes tools; for example, with the 1/2 SD method, 72% of patients who underwent arthroscopic partial meniscectomy reached the MCID on PROMIS Physical Function compared with 86% on KOOS.
MCID calculations can provide clinical correlation for PROM scores interpretation. The PROMIS form is increasingly used because of its generalizability across diagnoses. However, we found lower proportions of MCID attainment using PROMIS scores compared with historical PROMs. By using historical proportions of attainment on common orthopaedic procedures and a spectrum of MCID calculation techniques, the PROMIS MCID benchmarks are realizable for common orthopaedic procedures. For clinical practices that routinely collect PROMIS scores in the clinical setting, these results can be used by individual surgeons to evaluate personal practice trends and by healthcare systems to quantify whether clinical care initiatives result in meaningful differences. Furthermore, these MCID thresholds can be used by researchers conducting retrospective outcomes research with PROMIS.
Level III, therapeutic study.
Karhade AV, Bernstein DN, Desai V, Bedair HS, O'Donnell EA, Tanaka MJ, Bono CM, Harris MB, Schwab JH, Tobert DG, ...

Poor Knee-specific and Generic Patient-reported Outcome Measure Scores at 6 Months Are Associated With Early Revision Knee Arthroplasty: A Study From the Australian Orthopaedic Association National Joint Replacement Registry.
The ability to identify which patients are at a greater risk of early revision knee arthroplasty has important practical and resource implications. Many international arthroplasty registries administer patient-reported outcome measures (PROMs) to provide a holistic assessment of pain, function, and quality of life. However, few PROM scores have been evaluated as potential indicators of early revision knee arthroplasty, and earlier studies have largely focused on knee-specific measures.
This national registry-based study asked: (1) Which 6-month postoperative knee-specific and generic PROM scores are associated with early revision knee arthroplasty (defined as revision surgery performed 6 to 24 months after the primary procedure)? (2) Is a clinically important improvement in PROM scores (based on thresholds for the minimal important change) after primary knee arthroplasty associated with a lower risk of early revision?
Preoperative and 6-month postoperative PROM scores for patients undergoing primary knee arthroplasty were sourced from the Australian Orthopaedic Association National Joint Replacement Registry (AOANJRR) and Arthroplasty Clinical Outcomes Registry National. Between January 2013 and December 2020, PROM data were available for 19,402 primary total knee arthroplasties; these data were linked to AOANJRR data on revision knee arthroplasty. Of these, 3448 procedures were excluded because they did not have 6-month PROM data, they had not reached the 6-month postoperative point, they had died before 24 months, or they had received revision knee arthroplasty before the 6-month PROMs assessment. After these exclusions, data were analyzed for 15,954 primary knee arthroplasties. Associations between knee-specific (knee pain, Oxford Knee Score, and 12-item Knee injury and Osteoarthritis Outcome Score [KOOS-12]) or generic PROM scores (5-level EuroQol quality of life instrument [EQ-5D], EQ VAS, perceived change, and satisfaction) and revision surgery were explored using t-tests, chi-square tests, and regression models. Ninety-four revision procedures were performed at 6 to 24 months, most commonly for infection (39% [37 procedures]). The early revision group was younger than the unrevised group (mean age 64 years versus 68 years) and a between-group difference in American Society of Anesthesiologists (ASA) grade was noted. Apart from a small difference in preoperative low back pain for the early revision group (mean low back pain VAS 4.2 points for the early revision group versus 3.3 points for the unrevised group), there were no between-group differences in preoperative knee-specific or generic PROM scores on univariate analysis. As the inclusion of ASA grade or low back pain score did not alter the model results, the final multivariable model included only the most clinically plausible confounders (age and gender) as covariates. 
Multivariable models (adjusting for age and gender) were also used to examine the association between a clinically important improvement in PROM scores (based on published thresholds for minimal important change) and the likelihood of early revision.
After adjusting for age and gender, poor postoperative knee pain, Oxford, KOOS-12, EQ-5D, and EQ VAS scores were all associated with early revision. A one-unit increase (worsening) in knee pain at 6 months was associated with a 31% increase in the likelihood of revision (RR 1.31 [95% confidence interval (CI) 1.19 to 1.43]; p < 0.001). Reflecting the reversed scoring direction, a one-unit increase (improvement) in Oxford or KOOS-12 score was associated with a 9% and 5% reduction in revision risk, respectively (RR for Oxford: 0.91 [95% CI 0.90 to 0.93]; p < 0.001; RR for KOOS-12 summary: 0.95 [95% CI 0.94 to 0.97]; p < 0.001). Patient dissatisfaction (RR 6.8 [95% CI 3.7 to 12.3]) and patient-perceived worsening (RR 11.7 [95% CI 7.4 to 18.5]) at 6 months were also associated with an increased likelihood of early revision. After adjusting for age and gender, patients who did not achieve a clinically important improvement in PROM scores had a higher risk of early revision (RR 2.9 for the knee pain VAS, RR 4.2 for the Oxford Knee Score, RR 6.3 to 8.6 for KOOS-12, and RR 2.3 for EQ-5D) compared with those who did (reference group).
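The percentages quoted above follow directly from the risk ratios: the percent change in risk per one-unit increase is (RR − 1) × 100. A short sketch of that arithmetic, using the RR point estimates reported in the text (the function name is hypothetical):

```python
def rr_to_percent_change(rr: float) -> float:
    """Percent change in risk implied by a risk ratio (negative = reduction)."""
    return (rr - 1.0) * 100.0


# Point estimates reported in the abstract, per one-unit increase in score.
reported_rr = {"knee pain": 1.31, "Oxford Knee Score": 0.91, "KOOS-12 summary": 0.95}
percent_changes = {name: round(rr_to_percent_change(rr)) for name, rr in reported_rr.items()}
print(percent_changes)
```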
Knee-specific and generic PROM scores offer an efficient approach to identifying patients at greater risk of early revision surgery, using either the 6-month score or the magnitude of improvement. These data indicate that surgeons can use single- and multi-item measures to detect a patient-perceived unsuccessful surgical outcome at 6 months after primary knee arthroplasty. Surgeons should be alert to poor PROM scores at 6 months or small improvements in scores (for example, less than 2 points for knee pain VAS or less than 10.5 points for Oxford Knee Score), which signal a need for direct patient follow-up or expedited clinical review.
Level III, therapeutic study.
Ackerman IN, Harris IA, Cashman K, Rowden N, Lorimer M, Graves SE, ...