AI chatbots fail to diagnose patients by talking with them

January 2, 2025

Don’t call your favorite AI “doctor” just yet

Just_Super/Getty Images

Advanced artificial intelligence models score well on professional medical examinations, but they still fail at one of the most important tasks of a doctor: talking to patients to gather relevant medical information and provide an accurate diagnosis.

“While large language models show impressive results in multiple-choice tests, their accuracy drops significantly in dynamic conversations,” says Pranav Rajpurkar at Harvard University. “The models particularly struggle with open-ended diagnostic reasoning.”

This became evident when researchers developed a method to assess the reasoning abilities of a clinical AI model based on doctor-patient conversations. The “patients” were based on 2000 medical cases, mainly drawn from US medical board examinations.

“Simulating patient interactions allows assessment of medical history-taking skills, a critical component of clinical practice that cannot be assessed using case vignettes,” says Shreya Johri, also at Harvard University. The new benchmark, called CRAFT-MD, “mirrors real-life scenarios where patients are unsure of which details are important to share and may disclose relevant information only when prompted by specific questions.”

The CRAFT-MD benchmark itself relies on AI. OpenAI’s GPT-4 model played the role of a “patient AI” in conversation with the “clinical AI” being tested. GPT-4 also helped to grade the results by comparing the clinical AI’s diagnosis with the correct answer for each case. Human medical experts double-checked these assessments. They also reviewed the conversations to check the accuracy of the patient AI and to see whether the clinical AI succeeded in gathering relevant medical information.
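The setup described above (a patient AI, a clinical AI under test, and a grader) can be sketched as a simple loop. This is only an illustrative toy, not the researchers’ code: the names `Case`, `patient_ai`, `clinical_ai`, `grader_ai` and `run_interview` are hypothetical, the case is invented, and small rule-based functions stand in for the GPT-4 patient and grader used in the study.

```python
from dataclasses import dataclass


@dataclass
class Case:
    opening: str    # what the patient volunteers unprompted
    facts: dict     # topic keyword -> detail revealed only when asked about it
    diagnosis: str  # reference answer for grading


def patient_ai(case, question):
    """Toy patient (assumption): reveals a fact only if the question names its topic."""
    q = question.lower()
    for topic, detail in case.facts.items():
        if topic in q:
            return detail
    return "I'm not sure."


def clinical_ai(transcript, questions):
    """Toy clinician (assumption): asks scripted questions, then guesses from keywords."""
    asked = sum(1 for speaker, _ in transcript if speaker == "clinician")
    if asked < len(questions):
        return questions[asked]
    heard = " ".join(text.lower() for speaker, text in transcript if speaker == "patient")
    if "rash" in heard and "tick" in heard:
        return "DIAGNOSIS: lyme disease"
    return "DIAGNOSIS: unknown"


def grader_ai(predicted, correct):
    """Toy grader (assumption): exact match after normalising case and whitespace."""
    return predicted.strip().lower() == correct.strip().lower()


def run_interview(case, questions, max_turns=10):
    """Alternate clinician and patient turns until a diagnosis is given, then grade it."""
    transcript = [("patient", case.opening)]
    for _ in range(max_turns):
        move = clinical_ai(transcript, questions)
        transcript.append(("clinician", move))
        if move.startswith("DIAGNOSIS:"):
            predicted = move.split(":", 1)[1]
            return grader_ai(predicted, case.diagnosis), transcript
        transcript.append(("patient", patient_ai(case, move)))
    return False, transcript  # no diagnosis within the turn limit


# Invented example case, for illustration only.
case = Case(
    opening="I have felt tired and achy for a week.",
    facts={
        "skin": "I have a round rash on my leg.",
        "outdoors": "I pulled a tick off after hiking.",
    },
    diagnosis="Lyme disease",
)

targeted = ["Any changes to your skin?", "Have you spent time outdoors recently?"]
vague = ["How are you sleeping?"]

correct_targeted, _ = run_interview(case, targeted)
correct_vague, _ = run_interview(case, vague)
print(correct_targeted, correct_vague)
```

In this toy, the same clinician reaches the reference diagnosis only when its questions draw out the hidden details, which is the history-taking skill the conversational benchmark is meant to probe.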

Multiple experiments showed that four leading large language models (OpenAI’s GPT-3.5 and GPT-4 models, Meta’s Llama-2-7b model, and Mistral AI’s Mistral-v2-7b model) performed significantly worse on the conversation-based benchmark than they did when making diagnoses based on written case summaries. OpenAI, Meta and Mistral AI did not respond to requests for comment.

For example, GPT-4’s diagnostic accuracy was an impressive 82 percent when it was presented with structured case summaries and allowed to select a diagnosis from a multiple-choice list, falling to less than 49 percent when it was not given multiple-choice options. When it had to make diagnoses from simulated patient interviews, however, its accuracy dropped to 26 percent.

And GPT-4 was the best-performing AI model tested in the study. GPT-3.5 often came in second, the Mistral AI model sometimes ranked second or third, and Meta’s Llama model generally scored the lowest.

The AI models also failed to collect complete medical histories much of the time: even the leading GPT-4 model did so in only 71 percent of simulated patient interviews. Even when the AI models did collect a patient’s relevant medical history, they did not always produce correct diagnoses.

Such simulated patient interviews are a “much more useful” way to assess AI clinical reasoning skills than medical exams, says Eric Topol at the Scripps Research Translational Institute in California.

Even if an AI model eventually passes this benchmark by consistently making accurate diagnoses based on simulated patient interviews, that would not necessarily make it better than human doctors, says Rajpurkar. He notes that medical practice in the real world is “messier” than in simulations. It involves managing multiple patients, coordinating with health care teams, performing physical examinations, and understanding the “complex social and systemic factors” in local health situations.

“Strong performance on our benchmark would suggest that AI can be a powerful tool to support clinical work, but it is not necessarily a substitute for the comprehensive judgment of experienced clinicians,” says Rajpurkar.
