oai:arXiv.org:2405.19519
Computer Science
2024
6/5/2024
Retrieval-augmented generation (RAG) constrains generative model outputs and mitigates hallucination by providing relevant in-context text.
The number of tokens a generative large language model (LLM) can incorporate as context is finite, thus limiting the volume of knowledge from which to generate an answer.
We propose a two-layer RAG framework for query-focused answer generation and evaluate a proof-of-concept for this framework in the context of query-focused summary generation from social media forums, focusing on emerging drug-related information.
The evaluations demonstrate the effectiveness of the two-layer framework in resource-constrained settings, enabling researchers to obtain near-real-time information from forum users.
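The abstract does not specify how the two layers are implemented. As a minimal illustrative sketch only, the Python code below assumes a first layer that ranks whole forum posts against the query and a second layer that packs the most query-relevant sentences from those posts into a fixed token budget before they are handed to a generative LLM as context; the function names and parameters (two_layer_retrieve, k_posts, token_budget) are hypothetical and the overlap scorer is a stand-in for whatever retriever the paper actually uses.

```python
# Illustrative two-layer retrieval sketch (assumptions, not the paper's exact method):
# layer 1 retrieves candidate posts, layer 2 selects sentences within a token budget.
from collections import Counter
import math
import re

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def score(query_tokens, doc_tokens):
    # Simple term-overlap score with length normalization
    # (a stand-in for a proper sparse or dense retriever).
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    overlap = sum(counts[t] for t in set(query_tokens))
    return overlap / math.sqrt(len(doc_tokens))

def two_layer_retrieve(query, posts, k_posts=3, token_budget=60):
    q = tokenize(query)
    # Layer 1: rank whole posts and keep the top-k candidates.
    ranked_posts = sorted(posts, key=lambda p: score(q, tokenize(p)), reverse=True)
    candidates = ranked_posts[:k_posts]
    # Layer 2: rank individual sentences from the candidates and pack them
    # greedily into the LLM's finite context window.
    sentences = [s.strip() for p in candidates
                 for s in re.split(r"(?<=[.!?])\s+", p) if s.strip()]
    sentences.sort(key=lambda s: score(q, tokenize(s)), reverse=True)
    context, used = [], 0
    for s in sentences:
        n = len(tokenize(s))
        if used + n <= token_budget:
            context.append(s)
            used += n
    return " ".join(context)

if __name__ == "__main__":
    posts = [
        "Has anyone tried xylazine? I felt prolonged sedation and skin wounds.",
        "Unrelated post about a concert last weekend.",
        "Xylazine mixed with fentanyl seems to be showing up more often; "
        "naloxone alone did not fully reverse it.",
    ]
    ctx = two_layer_retrieve("adverse effects of xylazine", posts)
    print(ctx)  # context string to prepend to the LLM prompt
```

The returned context string would then be prepended to the generation prompt, keeping the knowledge passed to the LLM within its finite context window.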
Das, Sudeshna; Ge, Yao; Guo, Yuting; Rajwal, Swati; Hairston, JaMor; Powell, Jeanne; Walker, Drew; Peddireddy, Snigdha; Lakamana, Sahithi; Bozkurt, Selen; Reyna, Matthew; Sameni, Reza; Xiao, Yunyu; Kim, Sangmi; Chandler, Rasheeta; Hernandez, Natalie; Mowery, Danielle; Wightman, Rachel; Love, Jennifer; Spadaro, Anthony; Perrone, Jeanmarie; Sarker, Abeed. 2024. Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data