Document detail
ID

doi:10.1007/s00345-024-05146-3...

Author
Pompili, David; Richa, Yasmina; Collins, Patrick; Richards, Helen; Hennessey, Derek B
Language
en
Publisher

Springer

Category

Urology

Year

2024

Listing date

7/31/2024

Keywords
artificial intelligence (ai), large language model (llm), patient information leaflet, chatgpt, google bard, patient education, topics, content, llama, level, reading, generated, quality, generate, palm, llms, average, pils
Abstract

Purpose Large language models (LLMs) are a form of artificial intelligence (AI) that uses deep learning techniques to understand, summarize and generate content.

The potential benefits of LLMs in healthcare are predicted to be immense.

The objective of this study was to examine the quality of patient information leaflets (PILs) produced by 3 LLMs on urological topics.

Methods Prompts were created to generate PILs from 3 LLMs: ChatGPT-4, PaLM 2 (Google Bard) and Llama 2 (Meta) across four urology topics (circumcision, nephrectomy, overactive bladder syndrome, and transurethral resection of the prostate).

PILs were evaluated using a quality assessment checklist.

PIL readability was assessed by the Average Reading Level Consensus Calculator.

Results PILs generated by PaLM 2 had the highest overall average quality score (3.58), followed by Llama 2 (3.34) and ChatGPT-4 (3.08).

PaLM 2-generated PILs were of the highest quality in all topics except transurethral resection of the prostate (TURP), and PaLM 2 was the only LLM to include images.

Medical inaccuracies were present in all generated content including instances of significant error.

Readability analysis identified PaLM 2-generated PILs as the simplest (age 14–15 average reading level).

Llama 2 PILs were the most difficult (age 16–17 average).

Conclusion While LLMs can generate PILs that may help reduce healthcare professional workload, generated content requires clinician input for accuracy and inclusion of health literacy aids, such as images.

LLM-generated PILs were above the average reading level for adults, necessitating improvement in LLM algorithms and/or prompt design.

How satisfied patients are with LLM-generated PILs remains to be evaluated.

Pompili, David; Richa, Yasmina; Collins, Patrick; Richards, Helen; Hennessey, Derek B, 2024, Using artificial intelligence to generate medical literature for urology patients: a comparison of three different large language models, Springer