Study Reveals ChatGPT Health Under-Triages Medical Emergencies

Mar 4, 2026, 2:42 AM


A recent study published in the journal Nature Medicine found that OpenAI's ChatGPT Health often underestimates the seriousness of medical emergencies, failing to recommend immediate care in more than half of evaluated cases. Researchers fed the AI tool 60 clinical scenarios to assess its ability to triage appropriately, comparing its recommendations with those of three independent physicians.
The study's lead author, Dr Ashwin Ramaswamy, noted that the AI under-triaged 51.6% of cases requiring an emergency room visit, advising patients to wait 24 to 48 hours instead. These included critical conditions such as diabetic ketoacidosis and respiratory failure, where timely intervention is crucial. "Any doctor would say that these patients need to go to the emergency department," Ramaswamy stated.
Interestingly, the AI performed well with "textbook emergencies" like strokes, triaging such cases accurately 100% of the time. It struggled, however, with more nuanced situations that demand clinical judgment: in one example, the AI correctly identified early warning signs of respiratory failure but still recommended delaying treatment.
In addition to under-triaging emergencies, the study highlighted concerns regarding ChatGPT Health's handling of suicidal ideation. The tool was inconsistent, sometimes failing to refer users to the 988 Suicide and Crisis Lifeline when warranted. Ramaswamy described the AI's performance in these scenarios as "paradoxical," indicating a troubling lack of reliability in recognizing high-risk situations.
OpenAI has acknowledged the study's findings but argued that they do not represent typical use or functionality of ChatGPT Health. The company emphasized that the model is designed for users to ask follow-up questions for further context, rather than providing a single response. Despite this, experts warn that reliance on AI for critical medical advice could lead to significant risks, particularly in high-stakes situations.
Dr John Mafi, a primary care physician at UCLA Health, stressed the necessity for rigorous testing of AI tools before they are widely implemented for life-altering decisions. "We need to ensure that the benefits outweigh the harms," he stated. Ramaswamy echoed this sentiment, urging caution against using AI as a substitute for professional medical advice, especially in emergencies.
The implications of this study are significant, given that ChatGPT Health is used by over 40 million people globally for health-related inquiries. Many individuals rely on AI tools for immediate answers outside of regular healthcare hours, which can lead to misguided trust in their recommendations.
Experts suggest that while AI can play a role in healthcare, it should complement, not replace, human judgment. Dr Ethan Goh, executive director of an AI research network, noted that AI can be helpful but has severe limitations that users must understand.
The findings underscore the urgent need for comprehensive safety standards and independent audits of AI healthcare tools. Ramaswamy indicated that ongoing assessments of ChatGPT Health and similar platforms are vital to ensure that improvements translate into safer patient care. As AI continues to evolve, its integration into healthcare must be approached with caution to prevent unnecessary harm.

Related articles

AI Tools Like DeepSeek Revolutionizing Youth Mental Health in China

Artificial intelligence tools, particularly DeepSeek, are transforming mental health care for Chinese youth by providing personalized support and therapy options. These innovations are reshaping how young people engage with mental health resources, though concerns about emotional impact and overreliance on technology persist.

Apple's AI Health Coach Project Faces Challenges Amid Leadership Changes

Apple's ambitious AI health coach project, known as Project Mulberry, is reportedly being scaled back due to leadership changes and increasing competition. While the company aims to integrate AI-driven wellness features into its Health app, a more fragmented launch may be on the horizon as Apple reassesses its approach.

EMA and FDA Establish Joint Principles for AI in Drug Development

The European Medicines Agency (EMA) and the US Food and Drug Administration (FDA) have announced ten guiding principles for the use of artificial intelligence (AI) in drug development. This initiative aims to harmonize regulations across the EU and US, ensuring patient safety while fostering innovation in the pharmaceutical sector.

Exploring the New AI Health App: ChatGPT Health

The newly launched AI health app, ChatGPT Health, aims to personalize health information for users by analyzing their medical files. While it offers potential benefits in patient engagement and understanding, experts caution against relying solely on AI for medical advice due to concerns about accuracy and privacy.