oai:arXiv.org:2408.05285
Computer Science
2024
8/14/2024
One-shot Imitation Learning (OSIL) aims to imbue AI agents with the ability to learn a new task from a single demonstration.
To supervise the learning, OSIL typically requires a prohibitively large number of paired expert demonstrations, i.e., trajectories corresponding to different variations of the same semantic task.
To overcome this limitation, we introduce the semi-supervised OSIL problem setting, where the learning agent is presented with a large dataset of trajectories with no task labels (i.e. an unpaired dataset), along with a small dataset of multiple demonstrations per semantic task (i.e. a paired dataset).
This presents a more realistic and practical embodiment of few-shot learning and requires the agent to effectively leverage weak supervision from a large dataset of trajectories.
We then develop an algorithm tailored to this semi-supervised OSIL setting.
Our approach first learns an embedding space where different tasks cluster uniquely.
We utilize this embedding space and the clustering it supports to self-generate pairings between trajectories in the large unpaired dataset.
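The self-pairing step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes trajectories have already been encoded into fixed-size embeddings, and uses simple nearest-neighbor matching in that space as a proxy for "same semantic task"; the function name, the optional `max_dist` threshold, and the use of Euclidean distance are all assumptions for illustration.

```python
import numpy as np

def self_generate_pairs(embeddings, max_dist=None):
    """Pair each unlabeled trajectory with its nearest neighbor in the
    learned embedding space, treating proximity as a proxy for sharing
    the same semantic task. (Hypothetical helper for illustration.)"""
    emb = np.asarray(embeddings, dtype=float)
    # Pairwise Euclidean distances; mask the diagonal so a trajectory
    # cannot be paired with itself.
    dists = np.linalg.norm(emb[:, None] - emb[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    nearest = dists.argmin(axis=1)
    pairs = []
    for i, j in enumerate(nearest):
        # Optionally reject pairings whose embeddings are too far apart,
        # since distant neighbors likely belong to different tasks.
        if max_dist is None or dists[i, j] <= max_dist:
            pairs.append((i, int(j)))
    return pairs
```

The resulting index pairs can then be fed to a standard OSIL trainer in place of ground-truth task pairings; the quality of the pairs depends entirely on how well the embedding space clusters trajectories by task.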
Through empirical results on simulated control tasks, we demonstrate that OSIL models trained on such self-generated pairings are competitive with OSIL models trained with ground-truth labels, presenting a major advancement in the label-efficiency of OSIL.
Wu, Philipp; Hakhamaneshi, Kourosh; Du, Yuqing; Mordatch, Igor; Rajeswaran, Aravind; Abbeel, Pieter, 2024, Semi-Supervised One-Shot Imitation Learning