The Role of Large Language Models in Transforming Emergency Medicine: Scoping Review.
Artificial intelligence (AI), more specifically large language models (LLMs), holds significant potential in revolutionizing emergency care delivery by optimizing clinical workflows and enhancing the quality of decision-making. Although enthusiasm for integrating LLMs into emergency medicine (EM) is growing, the existing literature is characterized by a disparate collection of individual studies, conceptual analyses, and preliminary implementations. Given these complexities and gaps in understanding, a cohesive framework is needed to comprehend the existing body of knowledge on the application of LLMs in EM.
Given the absence of a comprehensive framework for exploring the roles of LLMs in EM, this scoping review aims to systematically map the existing literature on LLMs' potential applications within EM and identify directions for future research. Addressing this gap will allow for informed advancements in the field.
Using PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) criteria, we searched Ovid MEDLINE, Embase, Web of Science, and Google Scholar for papers published between January 2018 and August 2023 that discussed LLMs' use in EM. We excluded other forms of AI. A total of 1994 unique titles and abstracts were screened, and each full-text paper was independently reviewed by 2 authors. Data were abstracted independently, and 5 authors performed a collaborative quantitative and qualitative synthesis of the data.
A total of 43 papers were included. Studies were predominantly from 2022 to 2023 and conducted in the United States and China. We uncovered four major themes: (1) clinical decision-making and support was highlighted as a pivotal area, with LLMs playing a substantial role in enhancing patient care, notably through their application in real-time triage, allowing early recognition of patient urgency; (2) efficiency, workflow, and information management demonstrated the capacity of LLMs to significantly boost operational efficiency, particularly through the automation of patient record synthesis, which could reduce administrative burden and enhance patient-centric care; (3) risks, ethics, and transparency were identified as areas of concern, especially regarding the reliability of LLMs' outputs, and specific studies highlighted the challenges of ensuring unbiased decision-making amidst potentially flawed training data sets, stressing the importance of thorough validation and ethical oversight; and (4) education and communication possibilities included LLMs' capacity to enrich medical training, such as through using simulated patient interactions that enhance communication skills.
LLMs have the potential to fundamentally transform EM, enhancing clinical decision-making, optimizing workflows, and improving patient outcomes. This review sets the stage for future advancements by identifying key research areas: prospective validation of LLM applications, establishing standards for responsible use, understanding provider and patient perceptions, and improving physicians' AI literacy. Effective integration of LLMs into EM will require collaborative efforts and thorough evaluation to ensure these technologies can be safely and effectively applied.
Preiksaitis C, Ashenburg N, Bunney G, Chu A, Kabeer R, Riley F, Ribeira R, Rose C, ...
《JMIR Medical Informatics》
Assessing the Alignment of Large Language Models With Human Values for Mental Health Integration: Cross-Sectional Study Using Schwartz's Theory of Basic Values.
Large language models (LLMs) hold potential for mental health applications. However, their opaque alignment processes may embed biases that shape problematic perspectives. Evaluating the values embedded within LLMs that guide their decision-making has ethical importance. Schwartz's theory of basic values (STBV) provides a framework for quantifying cultural value orientations and has shown utility for examining values in mental health contexts, including cultural, diagnostic, and therapist-client dynamics.
This study aimed to (1) evaluate whether the STBV can measure value-like constructs within leading LLMs and (2) determine whether LLMs exhibit distinct value-like patterns from humans and each other.
In total, 4 LLMs (Bard, Claude 2, Generative Pretrained Transformer [GPT]-3.5, GPT-4) were anthropomorphized and instructed to complete the Portrait Values Questionnaire-Revised (PVQ-RR) to assess value-like constructs. Their responses over 10 trials were analyzed for reliability and validity. To benchmark the LLMs' value profiles, their results were compared to published data from a diverse sample of 53,472 individuals across 49 nations who had completed the PVQ-RR. This allowed us to assess whether the LLMs diverged from established human value patterns across cultural groups. Value profiles were also compared between models via statistical tests.
The PVQ-RR showed good reliability and validity for quantifying value-like infrastructure within the LLMs. However, substantial divergence emerged between the LLMs' value profiles and population data. The models lacked consensus and exhibited distinct motivational biases, reflecting opaque alignment processes. For example, all models prioritized universalism and self-direction, while de-emphasizing achievement, power, and security relative to humans. Successful discriminant analysis differentiated the 4 LLMs' distinct value profiles. Further examination found the biased value profiles strongly predicted the LLMs' responses when presented with mental health dilemmas requiring choosing between opposing values. This provided further validation for the models embedding distinct motivational value-like constructs that shape their decision-making.
This study leveraged the STBV to map the motivational value-like infrastructure underpinning leading LLMs. Although the study demonstrated the STBV can effectively characterize value-like infrastructure within LLMs, substantial divergence from human values raises ethical concerns about aligning these models with mental health applications. The biases toward certain cultural value sets pose risks if integrated without proper safeguards. For example, prioritizing universalism could promote unconditional acceptance even when clinically unwise. Furthermore, the differences between the LLMs underscore the need to standardize alignment processes to capture true cultural diversity. Thus, any responsible integration of LLMs into mental health care must account for their embedded biases and motivation mismatches to ensure equitable delivery across diverse populations. Achieving this will require transparency and refinement of alignment techniques to instill comprehensive human values.
Hadar-Shoval D, Asraf K, Mizrachi Y, Haber Y, Elyoseph Z, ...
《JMIR Mental Health》
Large Language Models and User Trust: Consequence of Self-Referential Learning Loop and the Deskilling of Health Care Professionals.
As the health care industry increasingly embraces large language models (LLMs), understanding the consequence of this integration becomes crucial for maximizing benefits while mitigating potential pitfalls. This paper explores the evolving relationship among clinician trust in LLMs, the transition of data sources from predominantly human-generated to artificial intelligence (AI)-generated content, and the subsequent impact on the performance of LLMs and clinician competence. One of the primary concerns identified in this paper is the LLMs' self-referential learning loops, where AI-generated content feeds into the learning algorithms, threatening the diversity of the data pool, potentially entrenching biases, and reducing the efficacy of LLMs. While theoretical at this stage, this feedback loop poses a significant challenge as the integration of LLMs in health care deepens, emphasizing the need for proactive dialogue and strategic measures to ensure the safe and effective use of LLM technology. Another key takeaway from our investigation is the role of user expertise and the necessity for a discerning approach to trusting and validating LLM outputs. The paper highlights how expert users, particularly clinicians, can leverage LLMs to enhance productivity by off-loading routine tasks while maintaining a critical oversight to identify and correct potential inaccuracies in AI-generated content. This balance of trust and skepticism is vital for ensuring that LLMs augment rather than undermine the quality of patient care. We also discuss the risks associated with the deskilling of health care professionals. Frequent reliance on LLMs for critical tasks could result in a decline in health care providers' diagnostic and thinking skills, particularly affecting the training and development of future professionals. The legal and ethical considerations surrounding the deployment of LLMs in health care are also examined. 
We discuss the medicolegal challenges, including liability in cases of erroneous diagnoses or treatment advice generated by LLMs. The paper references recent legislative efforts, such as The Algorithmic Accountability Act of 2023, as crucial steps toward establishing a framework for the ethical and responsible use of AI-based technologies in health care. In conclusion, this paper advocates for a strategic approach to integrating LLMs into health care. By emphasizing the importance of maintaining clinician expertise, fostering critical engagement with LLM outputs, and navigating the legal and ethical landscape, we can ensure that LLMs serve as valuable tools in enhancing patient care and supporting health care professionals. This approach addresses the immediate challenges posed by integrating LLMs and sets a foundation for their maintainable and responsible use in the future.
Choudhury A, Chaudhry Z
《Journal of Medical Internet Research》
Assessing the research landscape and clinical utility of large language models: a scoping review.
Large language models (LLMs) like OpenAI's ChatGPT are powerful generative systems that rapidly synthesize natural language responses. Research on LLMs has revealed their potential and pitfalls, especially in clinical settings. However, the evolving landscape of LLM research in medicine has left several gaps regarding their evaluation, application, and evidence base.
This scoping review aims to (1) summarize current research evidence on the accuracy and efficacy of LLMs in medical applications, (2) discuss the ethical, legal, logistical, and socioeconomic implications of LLM use in clinical settings, (3) explore barriers and facilitators to LLM implementation in healthcare, (4) propose a standardized evaluation framework for assessing LLMs' clinical utility, and (5) identify evidence gaps and propose future research directions for LLMs in clinical applications.
We screened 4,036 records from MEDLINE, EMBASE, CINAHL, medRxiv, bioRxiv, and arXiv from January 2023 (inception of the search) to June 26, 2023 for English-language papers and analyzed findings from 55 worldwide studies. Quality of evidence was reported based on the Oxford Centre for Evidence-based Medicine recommendations.
Our results demonstrate that LLMs show promise in compiling patient notes, assisting patients in navigating the healthcare system, and to some extent, supporting clinical decision-making when combined with human oversight. However, their utilization is limited by biases in training data that may harm patients, the generation of inaccurate but convincing information, and ethical, legal, socioeconomic, and privacy concerns. We also identified a lack of standardized methods for evaluating LLMs' effectiveness and feasibility.
This review thus highlights potential future directions and questions to address these limitations and to further explore LLMs' potential in enhancing healthcare delivery.
Park YJ, Pillai A, Deng J, Guo E, Gupta M, Paget M, Naugler C, ...
《BMC Medical Informatics and Decision Making》
Beyond the black stump: rapid reviews of health research issues affecting regional, rural and remote Australia.
CHAPTER 1: RETAIL INITIATIVES TO IMPROVE THE HEALTHINESS OF FOOD ENVIRONMENTS IN RURAL, REGIONAL AND REMOTE COMMUNITIES
Objective: To synthesise the evidence for effectiveness of initiatives aimed at improving food retail environments and consumer dietary behaviour in rural, regional and remote populations in Australia and comparable countries, and to discuss the implications for future food environment initiatives for rural, regional and remote areas of Australia.
Rapid review of articles published between January 2000 and May 2020.
We searched MEDLINE (EBSCOhost), Health and Society Database (Informit) and Rural and Remote Health Database (Informit), and included studies undertaken in rural food environment settings in Australia and other countries.
Twenty-one articles met the inclusion criteria, including five conducted in Australia. Four of the Australian studies were conducted in very remote populations and in grocery stores, and one was conducted in regional Australia. All of the overseas studies were conducted in rural North America. All of them revealed a positive influence on food environment or consumer behaviour, and all were conducted in disadvantaged, rural communities. Positive outcomes were consistently revealed by studies of initiatives that focused on promotion and awareness of healthy foods and included co-design to generate community ownership and branding.
Initiatives aimed at improving rural food retail environments were effective and, when implemented in different rural settings, may encourage improvements in population diets. The paucity of studies over the past 20 years in Australia shows a need for more research into effective food retail environment initiatives, modelled on examples from overseas, with studies needed across all levels of remoteness in Australia. Several retail initiatives that were undertaken in rural North America could be replicated in rural Australia and could underpin future research.
CHAPTER 2: WHICH INTERVENTIONS BEST SUPPORT THE HEALTH AND WELLBEING NEEDS OF RURAL POPULATIONS EXPERIENCING NATURAL DISASTERS?
Objective: To explore and evaluate health and social care interventions delivered to rural and remote communities experiencing natural disasters in Australia and other high income countries.
We used systematic rapid review methods. First we identified a test set of citations and generated a frequency table of Medical Subject Headings (MeSH) to index articles. Then we used combinations of MeSH terms and keywords to search the MEDLINE (Ovid) database, and screened the titles and abstracts of the retrieved references.
We identified 1438 articles via database searches, and a further 62 articles via hand searching of key journals and reference lists. We also found four relevant grey literature resources. After removing duplicates and undertaking two stages of screening, we included 28 studies in a synthesis of qualitative evidence.
Four of us read and assessed the full text articles. We then conducted a thematic analysis using the three phases of the natural disaster response cycle.
There is a lack of robust evaluation of programs and interventions supporting the health and wellbeing of people in rural communities affected by natural disasters. To address the cumulative and long term impacts, evidence suggests that continuous support of people's health and wellbeing is needed. By using a lens of rural adversity, the complexity of the lived experience of natural disasters by rural residents can be better understood and can inform development of new models of community-based and integrated care services.
CHAPTER 3: THE IMPACT OF BUSHFIRE ON THE WELLBEING OF CHILDREN LIVING IN RURAL AND REMOTE AUSTRALIA
Objective: To investigate the impact of bushfire events on the wellbeing of children living in rural and remote Australia.
Literature review completed using rapid realist review methods, and taking into consideration the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement for systematic reviews.
We sourced data from six databases: EBSCOhost (Education), EBSCOhost (Health), EBSCOhost (Psychology), Informit, MEDLINE and PsycINFO. We developed search terms to identify articles that could address the research question based on the inclusion criteria of peer reviewed full text journal articles published in English between 1983 and 2020. We initially identified 60 studies and, following closer review, extracted data from eight studies that met the inclusion criteria.
Children exposed to bushfires may be at increased risk of poorer wellbeing outcomes. Findings suggest that the impact of bushfire exposure may not be apparent in the short term but may become more pronounced later in life. Children particularly at risk are those from more vulnerable backgrounds who may have compounding factors that limit their ability to overcome bushfire trauma.
We identified the short, medium and long term impacts of bushfire exposure on the wellbeing of children in Australia. We did not identify any evidence-based interventions for supporting outcomes for this population. Given the likely increase in bushfire events in Australia, research into effective interventions should be a priority.
CHAPTER 4: THE ROLE OF NATIONAL POLICIES TO ADDRESS RURAL ALLIED HEALTH, NURSING AND DENTISTRY WORKFORCE MALDISTRIBUTION
Objective: Maldistribution of the health workforce between rural, remote and metropolitan communities contributes to longstanding health inequalities. Many developed countries have implemented policies to encourage health care professionals to work in rural and remote communities. This scoping review is an international synthesis of those policies, examining their effectiveness at recruiting and retaining nursing, dental and allied health professionals in rural communities.
Using scoping review methods, we included primary research, published between 1 September 2009 and 30 June 2020, that reported an evaluation of existing policy initiatives to address workforce maldistribution in high income countries with a land mass greater than 100 000 km².
We searched MEDLINE, Ovid Embase, Ovid Emcare, Informit, Scopus, and Web of Science. We screened 5169 articles for inclusion by title and abstract, of which we included 297 for full text screening. We then extracted data on 51 studies that had been conducted in Australia, the United States, Canada, United Kingdom and Norway.
We grouped the studies based on World Health Organization recommendations on recruitment and retention of health care workers: education strategies (n = 27), regulatory change (n = 11), financial incentives (n = 6), personal and professional support (n = 4), and approaches with multiple components (n = 3).
Considerable work has occurred to address workforce maldistribution at a local level, underpinned by good practice guidelines, but rarely at scale or with explicit links to coherent overarching policy. To achieve policy aspirations, multiple synergistic evidence-based initiatives are needed, and implementation must be accompanied by well designed longitudinal evaluations that assess the effectiveness of policy objectives.
CHAPTER 5: AVAILABILITY AND CHARACTERISTICS OF PUBLICLY AVAILABLE HEALTH WORKFORCE DATA SOURCES IN AUSTRALIA
Objective: Many data sources are used in Australia to inform health workforce planning, but their characteristics in terms of relevance, accessibility and accuracy are uncertain. We aimed to identify and appraise publicly available data sources used to describe the Australian health workforce.
We conducted a scoping review in which we searched bibliographic databases, websites and grey literature. Two reviewers independently undertook title and abstract screening and full text screening using Covidence software. We then assessed the relevance, accessibility and accuracy of data sources using a customised appraisal tool.
We searched for potential workforce data sources in nine databases (MEDLINE, Embase, Ovid Emcare, Scopus, Web of Science, Informit, the JBI Evidence-based Practice Database, PsycINFO and the Cochrane Library) and the grey literature, and examined several pre-defined websites.
During the screening process we identified 6955 abstracts and examined 48 websites, from which we identified 12 publicly available data sources: eight primary and four secondary data sources. The primary data sources were generally of modest quality, with low scores in terms of reference period, accessibility and missing data. No single primary data source scored well across all domains of the appraisal tool.
We identified several limitations of data sources used to describe the Australian health workforce. Establishment of a high quality, longitudinal, linked database that can inform all aspects of health workforce development is urgently needed, particularly for rural health workforce and services planning.
CHAPTER 6: RAPID REALIST REVIEW OF OPIOID TAPERING IN THE CONTEXT OF LONG TERM OPIOID USE FOR NON-CANCER PAIN IN RURAL AREAS
Objective: To describe interventions, barriers and enablers associated with opioid tapering for patients with chronic non-cancer pain in rural primary care settings.
Rapid realist review registered on the international register of systematic reviews (PROSPERO) and conducted in accordance with RAMESES standards.
English language, peer-reviewed articles reporting qualitative, quantitative and mixed method studies, published between January 2016 and July 2020, and accessed via MEDLINE, Embase, CINAHL Complete, PsycINFO, Informit or the Cochrane Library during June and July 2020. Grey literature relating to prescribing, deprescribing or tapering of opioids in chronic non-cancer pain, published between January 2016 and July 2020, was identified by searching national and international government, health service and peak organisation websites using Google Scholar.
Our analysis of reported approaches to tapering conducted across rural and non-rural contexts showed that tapering opioids is complex and challenging, and identified several barriers and enablers. Successful outcomes in rural areas appear likely through therapeutic relationships, coordination and support, by using modalities and models of care that are appropriate in rural settings and by paying attention to harm minimisation.
Rural primary care providers do not have access to resources available in metropolitan centres for dealing with patients who have chronic non-cancer pain and are taking opioid medications. They often operate alone or in small group practices, without peer support and access to multidisciplinary and specialist teams. Opioid tapering approaches described in the literature include regulation, multimodal and multidisciplinary approaches, primary care provider support, guidelines, and patient-centred strategies. There is little research to inform tapering in rural contexts. Our review provides a synthesis of the current evidence in the form of a conceptual model. This preliminary model could inform the development of a model of care for use in implementation research, which could test a variety of mechanisms for supporting decision making, reducing primary care providers' concerns about potential harms arising from opioid tapering, and improving patient outcomes.
Osborne SR, Alston LV, Bolton KA, Whelan J, Reeve E, Wong Shee A, Browne J, Walker T, Versace VL, Allender S, Nichols M, Backholer K, Goodwin N, Lewis S, Dalton H, Prael G, Curtin M, Brooks R, Verdon S, Crockett J, Hodgins G, Walsh S, Lyle DM, Thompson SC, Browne LJ, Knight S, Pit SW, Jones M, Gillam MH, Leach MJ, Gonzalez-Chica DA, Muyambi K, Eshetie T, Tran K, May E, Lieschke G, Parker V, Smith A, Hayes C, Dunlop AJ, Rajappa H, White R, Oakley P, Holliday S, ...