Document details
ID

oai:arXiv.org:2410.11076

Subject
Computer Science - Computation and... Computer Science - Artificial Inte...
Authors
Dong, Mingwen; Kumar, Nischal Ashok; Hu, Yiqun; Chauhan, Anuj; Hang, Chung-Wei; Chang, Shuaichen; Pan, Lin; Lan, Wuwei; Zhu, Henghui; Jiang, Jiarong; Ng, Patrick; Wang, Zhiguo
Category

Computer Science

Year

2024

Listing date

23.10.2024

Keywords
language sql text-to-sql clarification unanswerable user
Abstract

Previous text-to-SQL datasets and systems have primarily focused on user questions with clear intentions that can be answered.

However, real user questions can often be ambiguous with multiple interpretations or unanswerable due to a lack of relevant data.

In this work, we construct a practical conversational text-to-SQL dataset called PRACTIQ, consisting of ambiguous and unanswerable questions inspired by real-world user questions.

We first identified four categories of ambiguous questions and four categories of unanswerable questions by studying existing text-to-SQL datasets.

Then, we generate conversations with four turns: the initial user question, an assistant response seeking clarification, the user's clarification, and the assistant's clarified SQL response with the natural language explanation of the execution results.
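
The four-turn structure described above can be pictured as a small data structure. The following is only a minimal sketch: the names `Turn` and `build_conversation` and the field layout are assumptions made for illustration, not the format of the released dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    role: str                  # "user" or "assistant"
    text: str                  # natural-language content of the turn
    sql: Optional[str] = None  # set only on the final assistant turn (assumed layout)


def build_conversation(question: str, clarification_request: str,
                       user_clarification: str, sql: str,
                       explanation: str) -> List[Turn]:
    """Assemble the four turns in the order the abstract lists them."""
    return [
        Turn("user", question),                   # 1. initial user question
        Turn("assistant", clarification_request), # 2. assistant asks for clarification
        Turn("user", user_clarification),         # 3. user clarifies
        Turn("assistant", explanation, sql=sql),  # 4. clarified SQL plus explanation
    ]
```

Keeping the SQL in a separate field from the explanation is a design choice of this sketch; it makes it easy to execute or compare the query independently of the accompanying text.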

For some ambiguous queries, we also directly generate helpful SQL responses that consider multiple aspects of ambiguity, instead of requesting user clarification.

To benchmark the performance on ambiguous, unanswerable, and answerable questions, we implemented large language model (LLM)-based baselines using various LLMs.

Our approach involves two steps: question category classification and clarification SQL prediction.
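
The two-step approach can be sketched as a classifier followed by a predictor. This is a hedged illustration rather than the authors' implementation: `answer_question`, `classify`, and `predict` are hypothetical names, the two callables stand in for LLM calls, and only the three coarse labels named in the abstract are used because the eight fine-grained categories are not listed here.

```python
from typing import Callable, Dict

# Coarse labels taken from the abstract; the fine-grained categories are not named there.
COARSE_CATEGORIES = ("answerable", "ambiguous", "unanswerable")


def answer_question(
    question: str,
    schema: str,
    classify: Callable[[str, str], str],      # hypothetical step 1: returns a category label
    predict: Callable[[str, str, str], str],  # hypothetical step 2: returns clarification or SQL
) -> Dict[str, str]:
    """Run category classification first, then condition the response on the category."""
    category = classify(question, schema)
    if category not in COARSE_CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    response = predict(question, schema, category)
    return {"category": category, "response": response}
```

Passing the two steps in as callables keeps the sketch independent of any particular LLM client; in practice each would wrap a prompt to whichever model is being benchmarked.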

Our experiments reveal that state-of-the-art systems struggle to handle ambiguous and unanswerable questions effectively.

We will release our code for data generation and experiments on GitHub.

Dong, Mingwen; Kumar, Nischal Ashok; Hu, Yiqun; Chauhan, Anuj; Hang, Chung-Wei; Chang, Shuaichen; Pan, Lin; Lan, Wuwei; Zhu, Henghui; Jiang, Jiarong; Ng, Patrick; Wang, Zhiguo (2024). PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries.
