Science

AI chatbots fail to diagnose patients by talking with them

January 2, 2025


Don’t call your favorite AI “doctor” just yet

Just_Super/Getty Images

Advanced artificial intelligence models score well on professional medical examinations, but they still fail at one of a doctor’s most important tasks: talking to patients to gather relevant medical information and deliver an accurate diagnosis.

“While large language models show impressive results on multiple-choice tests, their accuracy drops significantly in dynamic conversations,” says Pranav Rajpurkar at Harvard University. “The models particularly struggle with open-ended diagnostic reasoning.”

This became evident when researchers developed a method to assess a clinical AI model’s reasoning abilities based on doctor-patient conversations. The simulated “patients” were based on 2000 medical cases, mainly drawn from US medical board examinations.

“Simulating patient interactions enables the evaluation of medical history-taking skills, a critical component of clinical practice that cannot be assessed using case vignettes,” says Shreya Johri, also at Harvard University. The new benchmark, called CRAFT-MD, “mirrors real-life scenarios, where patients may not know which details are important to share and may only disclose relevant information when prompted by specific questions”.

The CRAFT-MD benchmark itself relies on AI. OpenAI’s GPT-4 model played the role of a “patient AI” in conversation with the “clinical AI” being tested. GPT-4 also helped grade the results by comparing each clinical AI’s diagnosis with the correct answer for the case. Human medical experts double-checked these assessments and reviewed the conversations, both to verify the patient AI’s accuracy and to see whether the clinical AI had managed to gather the relevant medical information.
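The evaluation loop described above can be sketched in a few lines of Python. This is a hedged illustration only: the function names (`patient_ai`, `clinical_ai`, `grade`, `run_interview`) and the keyword-matching patient are hypothetical stand-ins, not CRAFT-MD’s actual implementation, which uses GPT-4 for the patient and grader roles via API calls.

```python
# Hypothetical sketch of a CRAFT-MD-style evaluation loop. Both agents are
# deterministic stubs so the structure is runnable without any API access;
# the real benchmark substitutes GPT-4 for the patient and grader roles.

def patient_ai(case: dict, question: str) -> str:
    """Stub patient: reveals a detail only when a matching question is asked,
    mirroring patients who disclose information only when prompted."""
    for keyword, detail in case["details"].items():
        if keyword in question.lower():
            return detail
    return "I'm not sure what else is relevant."

def clinical_ai(history: list) -> str:
    """Stub clinician: asks scripted history-taking questions, then commits
    to a diagnosis once its questions are exhausted."""
    questions = ["Do you have any pain?", "Any fever or chills?"]
    if len(history) < len(questions):
        return questions[len(history)]
    return "DIAGNOSIS: influenza"

def run_interview(case: dict, max_turns: int = 5) -> str:
    """Alternate clinician questions and patient answers until a diagnosis."""
    transcript = []
    for _ in range(max_turns):
        message = clinical_ai(transcript)
        if message.startswith("DIAGNOSIS:"):
            return message.removeprefix("DIAGNOSIS:").strip()
        transcript.append(patient_ai(case, message))
    return "no diagnosis"

def grade(predicted: str, reference: str) -> bool:
    """Stub grader: exact match. The study instead used GPT-4 to compare
    diagnoses, with human experts double-checking the grades."""
    return predicted.lower() == reference.lower()

case = {
    "details": {"pain": "My joints ache.", "fever": "Yes, 39C since yesterday."},
    "answer": "influenza",
}
print(grade(run_interview(case), case["answer"]))  # → True
```

The key design point the sketch captures is that diagnostic accuracy now depends on asking the right questions: if the clinician never asks about fever, the patient never volunteers it, which is exactly the history-taking skill that static case vignettes cannot test.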

Multiple experiments showed that the four leading large language models tested (OpenAI’s GPT-3.5 and GPT-4 models, Meta’s Llama-2-7b model and Mistral AI’s Mistral-v2-7b model) performed significantly worse on the conversation-based benchmark than when making diagnoses from written case summaries. OpenAI, Meta and Mistral AI did not respond to requests for comment.

For example, GPT-4’s diagnostic accuracy was an impressive 82 percent when it was presented with structured case summaries and allowed to select the diagnosis from a multiple-choice list, falling to just below 49 percent without the multiple-choice options. When it had to make diagnoses from simulated patient conversations, however, its accuracy dropped to just 26 percent.

Still, GPT-4 was the best AI model tested in the study, with GPT-3.5 often coming in second, the Mistral AI model sometimes ranking second or third and Meta’s Llama model generally scoring lowest.

The AI models also frequently failed to collect complete medical histories: the leading model, GPT-4, did so in only 71 percent of simulated patient conversations. And even when the AI models did gather a patient’s relevant medical history, they did not always produce the correct diagnoses.

Such simulated patient conversations are a “far more useful” way to evaluate AI clinical reasoning skills than medical exams, says Eric Topol at the Scripps Research Translational Institute in California.

Even if an AI model eventually beats this benchmark by consistently making accurate diagnoses from simulated patient conversations, it wouldn’t necessarily outperform human doctors, says Rajpurkar. He points out that medical practice in the real world is “messier” than in simulations: it involves managing multiple patients, coordinating with healthcare teams, performing physical exams and understanding “complex social and systemic factors” in local healthcare situations.

“Strong performance on our benchmark would suggest AI could be a powerful tool for supporting clinical work, but not necessarily a replacement for the holistic judgement of experienced physicians,” says Rajpurkar.
