ChatGPT outperforms trainee doctors in assessing respiratory illness in children

Research suggests that LLMs could be used to support trainee doctors, nurses and GPs

The chatbot ChatGPT performed better than trainee doctors in assessing complex cases of respiratory disease, including cystic fibrosis, asthma and chest infections, according to a study presented at the European Respiratory Society (ERS) Congress in Vienna, Austria.

The study also showed that Google’s chatbot Bard performed better than trainees in some respects, and Microsoft’s Bing chatbot performed as well as trainees. The research suggests that these large language models (LLMs) could be used to support trainee doctors, nurses and general practitioners to triage patients more quickly and ease pressure on health services.

The study was presented by Dr Manjith Narayanan, a consultant in paediatric pulmonology at the Royal Hospital for Children and Young People, Edinburgh, and honorary senior clinical lecturer at the University of Edinburgh, UK.

He said: “Large language models, like ChatGPT, have come into prominence in the last year and a half with their ability to seemingly understand natural language and provide responses that can adequately simulate a human-like conversation. These tools have several potential applications in medicine. My motivation to carry out this research was to assess how well LLMs are able to assist clinicians in real life.”

To investigate this, Dr Narayanan used clinical scenarios that occur frequently in paediatric respiratory medicine. Ten trainee doctors with less than four months of clinical experience in paediatrics were given an hour to solve each scenario using the internet, but not chatbots. Each scenario was also presented to the three chatbots.

Solutions provided by ChatGPT version 3.5 scored an average of seven out of nine overall and were believed to be more human-like than responses from the other chatbots. Bard scored an average of six out of nine, while Bing scored an average of four out of nine – the same as trainee doctors overall.

Dr Narayanan concluded: “Our study is the first, to our knowledge, to test LLMs against trainee doctors in situations that reflect real-life clinical practice. This study shows us another way we could be using LLMs and how close we are to regular day-to-day clinical application.”
