You may have heard the buzz about ChatGPT, a type of chatbot that uses artificial intelligence (AI) to write essays, turn computer novices into programmers, and help people communicate.
ChatGPT can also play a role in helping people understand medical information.
While ChatGPT won’t replace talking to your doctor any time soon, our new research shows its potential to answer common cancer questions.
Here’s what we found out when we asked ChatGPT and Google the same questions. You may be surprised by the results.
What does ChatGPT have to do with health?
ChatGPT is trained on massive amounts of text data to generate conversational responses to text-based questions.
ChatGPT represents a new era of AI technology that will accompany search engines, including Google and Bing, and change the way we navigate information online. This also applies to the way we search for health information.
For example, you can ask ChatGPT questions like “Which cancers are most common?” or “Can you give me a plain-English summary of common cancer symptoms that should not be ignored?”. It produces fluid and coherent responses. But are they correct?
We compared ChatGPT with Google
Our newly published research compared how ChatGPT and Google responded to common questions about cancer.
These included simple fact-based questions such as “What exactly is cancer?” and “What are the most common types of cancer?”. There were also more complex questions about cancer symptoms, prognosis (how a condition is likely to progress), and treatment side effects.
On simple, fact-based questions, ChatGPT delivered concise answers that were comparable in quality to the featured snippet from Google. The featured snippet is “the answer” Google’s algorithm highlights at the top of the results page.
While there were similarities, there were also major differences between the answers from ChatGPT and Google. Google provided highly visible references (links to other websites) with its answers. ChatGPT, by contrast, gave different answers when the same question was asked multiple times.
We also evaluated the slightly more complex question, “Is coughing a sign of lung cancer?”.
Google’s featured snippet indicated that a cough that doesn’t go away after three weeks is a major symptom of lung cancer.
ChatGPT provided a more nuanced answer. It indicated that a prolonged cough is a symptom of lung cancer, but it also clarified that coughing is a symptom of many conditions and that a doctor would be needed to make an accurate diagnosis.
Our clinical team considered these clarifications important. Not only do they minimize the chance of alarm, they also give users clear instructions on what to do next – consult a doctor.
How about even more complex questions?
Then we asked a question about side effects of a specific cancer drug: “Does pembrolizumab cause a fever and do I need to go to the hospital?”.
We asked ChatGPT this question five times and got five different responses. This is due to randomness built into ChatGPT, which helps it communicate in an almost human way but also means it yields multiple answers to the same question.
All five responses recommended talking to a healthcare professional. But not all of them said it was urgent or clearly conveyed how serious this side effect could potentially be. One response said fever was not a common side effect, but did not explicitly say it could still occur.
Overall, we rated the quality of ChatGPT’s answers to this question as poor.
This is in contrast to Google, which did not generate a featured snippet, probably due to the complexity of the query.
Instead, Google relied on users to find the information they needed. The first link led them to the manufacturer’s product website. This source clearly stated that people should seek immediate medical attention if there was a fever with pembrolizumab.
We have shown that ChatGPT does not always provide clearly visible references for its answers. It gives varying answers to a single question and is not updated in real time. It can also produce incorrect answers in a confident way.
Bing’s new chatbot, which differs from ChatGPT and was released after our research was conducted, has a much clearer and more reliable process for outlining reference sources and aims to stay as up to date as possible. This shows how fast this type of AI technology is evolving, and the availability of increasingly sophisticated AI chatbots is likely to grow significantly.
In the future, however, any AI used as a virtual assistant in healthcare should be able to communicate any uncertainty about its answers, rather than making up an incorrect answer, and produce consistently reliable answers.
We need to develop minimum quality standards for AI interventions in healthcare. This includes ensuring the information they generate is based on evidence.
Finally, healthcare providers should be aware of such AI innovations so they can discuss their limitations with patients.