oai:arXiv.org:2402.17355
Computer Science
2024
06/03/2024
In the current landscape of large language models (LLMs), the process of instruction tuning serves as an essential step.
Considering the high computing power overhead, data-efficient instruction tuning was proposed to reduce the training data size in this process, aiming at selecting high-quality instructional data.
Nevertheless, we argue that most current data-efficient instruction-tuning methods are highly dependent on the quality of the original instruction-tuning dataset.
When it comes to datasets synthesized by LLMs, a common scenario in this field, dirty samples will even be selected with a higher probability than other samples.
To address these challenges, we utilized external knowledge (relevant examples or paragraphs) to evaluate those samples synthesized by LLMs with an in-context-based relative predictive entropy.
Based on the new metric, we proposed a framework, dubbed as \textbf{RECOST}, which integrates external-knowledge-base re-ranking and diversity-consistent sampling into a single pipeline.
Through extensive experiments on several synthetic datasets (Alpaca and Alpaca-gpt4), we demonstrate the effectiveness of our method and achieve even better results with only \textbf{1\%} of the full dataset.
Zhang, Qi,Zhang, Yiming,Wang, Haobo,Zhao, Junbo, 2024, RECOST: External Knowledge Guided Data-efficient Instruction Tuning