Logic Distillation: Learning from Code Function by Function for Planning and Decision-making

Document detail

ID

oai:arXiv.org:2407.19405

Topic

Computer Science - Artificial Inte...

Author

Chen, Dong; Zhang, Shilin; Gao, Fei; Zhuang, Yueting; Tang, Siliang; Liu, Qidong; Xu, Mingliang

Year

2024

listing date

7/31/2024

Keywords

logical reasoning logic distillation ld s-llms planning decision-making capabilities

Abstract

Large language models (LLMs) have garnered increasing attention owing to their powerful logical reasoning capabilities.

Generally, larger LLMs (L-LLMs) that require paid interfaces exhibit significantly superior performance compared to smaller LLMs (S-LLMs) that can be deployed on a variety of devices.

Knowledge distillation (KD) aims to give S-LLMs the capabilities of L-LLMs, but in practice S-LLMs merely mimic the outputs of L-LLMs and fail to acquire their logical reasoning capabilities.

Consequently, S-LLMs perform poorly on planning and decision-making tasks that require logical reasoning.

To tackle the identified challenges, we propose a novel framework called Logic Distillation (LD).

First, LD uses L-LLMs to instantiate complex instructions as discrete functions and to illustrate their usage, which together establish a function base.

Subsequently, based on the function base, LD fine-tunes S-LLMs to learn the logic employed by L-LLMs in planning and decision-making.

During testing, LD uses a retriever to identify the top-$K$ relevant functions based on the instructions and the current state; the S-LLM then selects among these functions and invokes the chosen one.

Ultimately, S-LLMs yield planning and decision-making outcomes, function by function.

Experiments demonstrate that, with the assistance of LD, S-LLMs achieve results in planning and decision-making tasks that are comparable to, or even surpass, those of L-LLMs.
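
The test-time loop described in the abstract (retrieve the top-$K$ functions, let the S-LLM select one, invoke it, repeat) can be sketched in Python. This is a toy illustration under stated assumptions, not the authors' implementation: the grid-navigation task, the names `FunctionEntry`, `FUNCTION_BASE`, `retrieve_top_k`, `describe`, `stub_selector`, and `plan` are all hypothetical, the retriever is a plain bag-of-words cosine similarity chosen for self-containment, and `stub_selector` is a hand-written rule standing in for the fine-tuned S-LLM.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[int, int]  # toy state: (x, y) position on a grid


@dataclass
class FunctionEntry:
    name: str
    usage: str                    # usage text, used as the retrieval key
    fn: Callable[[State], State]  # the executable function itself


# Hypothetical function base. In LD this would be produced by an L-LLM;
# here it is written by hand for illustration.
FUNCTION_BASE = [
    FunctionEntry("move_right", "move right: increase x by one", lambda s: (s[0] + 1, s[1])),
    FunctionEntry("move_left", "move left: decrease x by one", lambda s: (s[0] - 1, s[1])),
    FunctionEntry("move_up", "move up: increase y by one", lambda s: (s[0], s[1] + 1)),
    FunctionEntry("move_down", "move down: decrease y by one", lambda s: (s[0], s[1] - 1)),
]


def _bow(text: str) -> Counter:
    """Bag-of-words term counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_top_k(query: str, base: List[FunctionEntry], k: int) -> List[FunctionEntry]:
    """Assumed retriever: rank functions by similarity between query and usage text."""
    q = _bow(query)
    return sorted(base, key=lambda e: _cosine(q, _bow(e.usage)), reverse=True)[:k]


def describe(state: State, goal: State) -> str:
    """Turn the instruction (reach the goal) and current state into a text query."""
    parts = []
    if goal[0] > state[0]:
        parts.append("move right increase x")
    if goal[0] < state[0]:
        parts.append("move left decrease x")
    if goal[1] > state[1]:
        parts.append("move up increase y")
    if goal[1] < state[1]:
        parts.append("move down decrease y")
    return " ".join(parts)


def stub_selector(state: State, goal: State, candidates: List[FunctionEntry]) -> FunctionEntry:
    """Placeholder for the fine-tuned S-LLM: pick the candidate that ends nearest the goal."""
    def dist(s: State) -> int:
        return abs(s[0] - goal[0]) + abs(s[1] - goal[1])
    return min(candidates, key=lambda e: dist(e.fn(state)))


def plan(start: State, goal: State, k: int = 2, max_steps: int = 20) -> List[str]:
    """Produce a plan function by function: retrieve, select, invoke, repeat."""
    state, trace = start, []
    for _ in range(max_steps):
        if state == goal:
            break
        candidates = retrieve_top_k(describe(state, goal), FUNCTION_BASE, k)
        chosen = stub_selector(state, goal, candidates)
        state = chosen.fn(state)  # invoke the selected function
        trace.append(chosen.name)
    return trace
```

Calling `plan((0, 0), (2, 1))` yields a three-step trace of function names that moves the toy agent to the goal. In a real system the selector would be a model call, and the usage texts would come from the L-LLM-generated function base rather than from hand-written strings.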

Comment: 9 pages, 7 figures

Chen, Dong; Zhang, Shilin; Gao, Fei; Zhuang, Yueting; Tang, Siliang; Liu, Qidong; Xu, Mingliang. 2024. Logic Distillation: Learning from Code Function by Function for Planning and Decision-making.
