AI chatbots fail to diagnose patients by talking with them

January 2, 2025


Don’t call your favorite AI “doctor” just yet

Just_Super/Getty Images

Advanced artificial intelligence models score well on professional medical examinations, but they still fail at one of a doctor's most important tasks: talking with patients to gather relevant medical information and deliver an accurate diagnosis.

“While large language models show impressive results on multiple-choice tests, their accuracy drops significantly in dynamic conversations,” says Pranav Rajpurkar at Harvard University. “The models particularly struggle with open-ended diagnostic reasoning.”

This became evident when researchers developed a method for assessing a clinical AI model's reasoning abilities based on simulated doctor-patient conversations. The “patients” were based on 2000 medical cases, mainly drawn from US medical board examinations.

“Simulating patient interactions allows assessment of medical history-taking skills, a critical component of clinical practice that cannot be assessed using case vignettes,” says Shreya Johri, also at Harvard University. The new evaluation benchmark, called CRAFT-MD, “mirrors real-life scenarios where patients are unsure of which details are important to share and may disclose relevant information only when prompted by specific questions.”

The CRAFT-MD benchmark itself relies on AI. OpenAI’s GPT-4 model played the role of a “patient AI” in conversation with the “clinical AI” being tested. GPT-4 also helped grade the results by comparing the clinical AI’s diagnosis with the correct answer for each case. These assessments were double-checked by human medical experts. The conversations were also reviewed to check the accuracy of the patient AI and to see whether the clinical AI succeeded in gathering the relevant medical information.
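The setup described above (a patient agent that discloses details only when asked, a clinical agent under test, and a grader that compares the diagnosis with the reference answer) can be illustrated with a toy sketch. This is not the CRAFT-MD code: every class, function, case and diagnosis label below is hypothetical, and simple rule-based stand-ins replace the language models so that the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class Case:
    facts: dict      # topic keyword -> detail revealed only when asked about it
    diagnosis: str   # reference answer used by the grader


class PatientAgent:
    """Toy stand-in for the 'patient AI': volunteers nothing unprompted."""

    def __init__(self, case):
        self.case = case

    def answer(self, question):
        q = question.lower()
        for topic, detail in self.case.facts.items():
            if topic in q:
                return detail
        return "I'm not sure."


class ClinicalAgent:
    """Toy stand-in for the 'clinical AI': fixed questions, keyword rules."""

    def __init__(self, questions, rules):
        self.questions = questions   # fixed interview script
        self.rules = rules           # (phrase in an answer, diagnosis) pairs

    def diagnose(self, transcript):
        answers = " ".join(a.lower() for _, a in transcript)
        for phrase, diagnosis in self.rules:
            if phrase in answers:
                return diagnosis
        return "unknown"


def run_conversation(patient, clinician):
    """Ask each scripted question and record (question, answer) pairs."""
    return [(q, patient.answer(q)) for q in clinician.questions]


def grade(predicted, correct):
    """Toy grader: exact match after normalising case and whitespace."""
    return predicted.strip().lower() == correct.strip().lower()


def evaluate(cases, clinician):
    """Fraction of cases where the diagnosis matches the reference answer."""
    hits = 0
    for case in cases:
        transcript = run_conversation(PatientAgent(case), clinician)
        hits += grade(clinician.diagnose(transcript), case.diagnosis)
    return hits / len(cases)


# Made-up cases with placeholder diagnosis labels, for illustration only.
cases = [
    Case({"pain": "I get a sharp pain in my side."}, "condition-a"),
    Case({"sleep": "I keep waking up at night."}, "condition-b"),
]
clinician = ClinicalAgent(
    questions=["Do you have any pain?", "Have you had a fever?"],
    rules=[("sharp pain", "condition-a"), ("waking up", "condition-b")],
)
accuracy = evaluate(cases, clinician)
print(f"Conversational accuracy: {accuracy:.0%}")
```

In this toy run the clinician never asks about sleep, so the second patient never mentions it and the diagnosis is missed. That is the kind of history-taking gap a conversation-based benchmark can expose and a written case summary cannot.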

Multiple experiments showed that four leading large language models (OpenAI’s GPT-3.5 and GPT-4 models, Meta’s Llama-2-7b model, and Mistral AI’s Mistral-v2-7b model) performed significantly worse on the conversation-based benchmark than they did when making diagnoses based on written case summaries. OpenAI, Meta and Mistral AI did not respond to requests for comment.

For example, GPT-4’s diagnostic accuracy was an impressive 82 percent when it was presented with structured case summaries and allowed to select a diagnosis from a multiple-choice list, compared to less than 49 percent when it did not have the multiple-choice options. When it had to make diagnoses from simulated patient conversations, however, its accuracy dropped to 26 percent.

GPT-4 was nonetheless the best-performing AI model tested in the study. GPT-3.5 often came in second, the Mistral AI model sometimes ranked second or third, and Meta’s Llama model generally scored the lowest.

The AI models also failed to collect complete medical histories much of the time, with the leading GPT-4 model doing so in only 71 percent of simulated patient conversations. Even when the AI models did collect a patient’s relevant medical history, they did not always produce correct diagnoses.

Such simulated patient conversations are a “much more useful” way to assess AI clinical reasoning skills than medical exams, says Eric Topol at the Scripps Research Translational Institute in California.

Even if an AI model eventually passes this benchmark by consistently making accurate diagnoses based on simulated patient conversations, it would not necessarily outperform human doctors, says Rajpurkar. He notes that medical practice in the real world is “more messy” than in simulations. It involves managing multiple patients, coordinating with health care teams, performing physical examinations, and understanding the “complex social and systemic factors” in local health situations.

“Strong performance on our benchmark would suggest that AI can be a powerful tool to support clinical work, but it is not necessarily a substitute for the comprehensive judgment of experienced clinicians,” says Rajpurkar.
