ChatGPT bombs test on diagnosing kids’ medical cases with 83% error rate | It was bad at recognizing relationships and needs selective training, researchers say.

L4sBot@lemmy.world · 11 months ago

ChatGPT bombs test on diagnosing kids’ medical cases with 83% error rate | It was bad at recognizing relationships and needs selective training, researchers say.

AutoTL;DR@lemmings.world · 11 months ago

This is the best summary I could come up with:

While the chatty AI bot has previously underwhelmed with its attempts to diagnose challenging medical cases—with an accuracy rate of 39 percent in an analysis last year—a study out this week in JAMA Pediatrics suggests the fourth version of the large language model is especially bad with kids.

The medical field has generally been an early adopter of AI-powered technologies, resulting in some notable failures, such as creating algorithmic racial bias, as well as successes, such as automating administrative tasks and helping to interpret chest scans and retinal images.

But AI’s potential for problem-solving has raised considerable interest in developing it into a helpful tool for complex diagnostics—no eccentric, prickly, pill-popping medical genius required.

For ChatGPT’s test, the researchers pasted the relevant text of the medical cases into the prompt, and then two qualified physician-researchers scored the AI-generated answers as correct, incorrect, or “did not fully capture the diagnosis.”

Though the chatbot struggled in this test, the researchers suggest it could improve by being specifically and selectively trained on accurate and trustworthy medical literature—not stuff on the Internet, which can include inaccurate information and misinformation.

“This presents an opportunity for researchers to investigate if specific medical data training and tuning can improve the diagnostic accuracy of LLM-based chatbots,” the authors conclude.

The original article contains 721 words, the summary contains 211 words. Saved 71%. I’m a bot and I’m open source!