Document detail
ID

oai:arXiv.org:2403.11461

Topic
Computer Science - Robotics
Author
Wang, Weiyao; Lei, Yutian; Jin, Shiyu; Hager, Gregory D.; Zhang, Liangjun
Category

Computer Science

Year

2024

Listing date

3/27/2024

Keywords
in-hand tasks virtual vihe
Abstract

In this work, we introduce the Virtual In-Hand Eye Transformer (VIHE), a novel method designed to enhance 3D manipulation capabilities through action-aware view rendering.

VIHE autoregressively refines actions in multiple stages by conditioning on rendered views posed from action predictions in the earlier stages.

These virtual in-hand views provide a strong inductive bias for effectively recognizing the correct pose for the hand, especially for challenging high-precision tasks such as peg insertion.
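
The staged refinement described above can be pictured with a toy sketch. This is not the authors' implementation: the crop-based "rendering", the centroid-based predictor, the radii, and every function name below are assumptions made purely for illustration. It tracks position only and ignores rotation and gripper state. Each stage re-centers a local view on the previous stage's estimate and predicts a correction from that view.

```python
import math
import random


def render_local_view(points, center, radius):
    """Assumed stand-in for a rendered in-hand view: express the points
    relative to the current estimate and keep those within `radius`."""
    local = [tuple(p[i] - center[i] for i in range(3)) for p in points]
    return [q for q in local if math.sqrt(sum(c * c for c in q)) <= radius]


def predict_offset(view):
    """Hypothetical placeholder for a learned predictor: the centroid of
    the local view is used as the correction to the current estimate."""
    if not view:
        return (0.0, 0.0, 0.0)
    n = len(view)
    return tuple(sum(q[i] for q in view) / n for i in range(3))


def refine_position(points, initial_guess, radii=(1.0, 0.5, 0.25)):
    """Refine a position estimate over several stages, each conditioned
    on a view centered at the estimate from the previous stage."""
    estimate = tuple(float(c) for c in initial_guess)
    for radius in radii:
        view = render_local_view(points, estimate, radius)
        offset = predict_offset(view)
        estimate = tuple(e + o for e, o in zip(estimate, offset))
    return estimate


# Toy scene: a small cluster of points around an assumed target position.
rng = random.Random(0)
target = (0.3, -0.2, 0.1)
points = [tuple(t + rng.gauss(0.0, 0.02) for t in target) for _ in range(200)]

estimate = refine_position(points, (0.0, 0.0, 0.0))
error = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, target)))
print("final position error:", error)
```

In this toy setting the first stage does most of the work and later stages only confirm the estimate. In the method the abstract describes, the later stages are where the action-conditioned views are said to help with high-precision poses.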

On 18 manipulation tasks in RLBench simulated environments, VIHE achieves a new state of the art: a 12-percentage-point absolute improvement, from 65% to 77%, over the previous state-of-the-art model, using 100 demonstrations per task.

In real-world scenarios, VIHE can learn manipulation tasks with just a handful of demonstrations, highlighting its practical utility.

Videos and code are available at our project site: https://vihe-3d.github.io.

Wang, Weiyao; Lei, Yutian; Jin, Shiyu; Hager, Gregory D.; Zhang, Liangjun, 2024, VIHE: Virtual In-Hand Eye Transformer for 3D Robotic Manipulation
