Document detail
ID card

doi:10.1007/s00345-024-05146-3...

Author
Pompili, David; Richa, Yasmina; Collins, Patrick; Richards, Helen; Hennessey, Derek B
Language
en
Publisher

Springer

Category

Urology

Year

2024

Listing date

31-07-2024

Keywords
artificial intelligence (ai); large language model (llm); patient information leaflet; chatgpt; google bard; patient education; topics; content; llama; level; reading; generated; quality; generate; palm; llms; average; pils

Description

Purpose: Large language models (LLMs) are a form of artificial intelligence (AI) that uses deep learning techniques to understand, summarize and generate content.

The potential benefits of LLMs in healthcare are predicted to be immense.

The objective of this study was to examine the quality of patient information leaflets (PILs) produced by 3 LLMs on urological topics.

Methods: Prompts were created to generate PILs from 3 LLMs: ChatGPT-4, PaLM 2 (Google Bard) and Llama 2 (Meta) across four urology topics (circumcision, nephrectomy, overactive bladder syndrome, and transurethral resection of the prostate).

PILs were evaluated using a quality assessment checklist.

PIL readability was assessed by the Average Reading Level Consensus Calculator.

Results: PILs generated by PaLM 2 had the highest overall average quality score (3.58), followed by Llama 2 (3.34) and ChatGPT-4 (3.08).

PaLM 2-generated PILs were of the highest quality in all topics except transurethral resection of the prostate (TURP), and PaLM 2 was the only LLM to include images.

Medical inaccuracies were present in all generated content including instances of significant error.

Readability analysis identified PaLM 2-generated PILs as the simplest (age 14–15 average reading level).

Llama 2 PILs were the most difficult (age 16–17 average).

Conclusion: While LLMs can generate PILs that may help reduce healthcare professional workload, generated content requires clinician input for accuracy and inclusion of health literacy aids, such as images.

LLM-generated PILs were above the average reading level for adults, necessitating improvement in LLM algorithms and/or prompt design.

How satisfied patients are with LLM-generated PILs remains to be evaluated.

Pompili, David; Richa, Yasmina; Collins, Patrick; Richards, Helen; Hennessey, Derek B, 2024, Using artificial intelligence to generate medical literature for urology patients: a comparison of three different large language models, Springer
