Document detail
IDENTIFICATION

oai:arXiv.org:2406.01867

Topic
Computer Science - Computer Vision...
Author
Uchida, Kengo; Shibuya, Takashi; Takida, Yuhta; Murata, Naoki; Tanke, Julian; Takahashi, Shusuke; Mitsufuji, Yuki
Category

Computer Science

Year

2024

Indexing date

19/2/2025

Keywords
framework adversarial generation
Metric

Abstract

In text-to-motion generation, controllability, as well as generation quality and speed, has become increasingly critical.

The controllability challenges include generating a motion whose length matches the given textual description and editing the generated motions according to control signals, such as start-end positions and the pelvis trajectory.

In this paper, we propose MoLA, which provides fast, high-quality, variable-length motion generation and can also deal with multiple editing tasks in a single framework.

Our approach revisits the motion representation used as inputs and outputs in the model, incorporating an activation variable to enable variable-length motion generation.
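
To make the activation-variable idea concrete, the following is a minimal sketch, not the authors' code: a binary activation channel is appended to a fixed-length motion tensor and later thresholded to recover the generated length. All names, dimensions, and the 0.5 threshold are assumptions.

import numpy as np

MAX_FRAMES = 196  # assumed maximum sequence length
FEAT_DIM = 263    # assumed per-frame motion feature dimension

def add_activation_channel(motion: np.ndarray) -> np.ndarray:
    """Pad a (T, FEAT_DIM) motion to (MAX_FRAMES, FEAT_DIM + 1); the extra
    channel is 1.0 on real frames and 0.0 on padding, so the model can
    learn where a sequence ends."""
    t = motion.shape[0]
    out = np.zeros((MAX_FRAMES, FEAT_DIM + 1), dtype=motion.dtype)
    out[:t, :FEAT_DIM] = motion
    out[:t, FEAT_DIM] = 1.0  # activation: frame is "alive"
    return out

def recover_length(generated: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Trim a generated (MAX_FRAMES, FEAT_DIM + 1) array at the first
    frame whose activation drops below the threshold."""
    active = generated[:, FEAT_DIM] >= threshold
    t = MAX_FRAMES if active.all() else int(active.argmin())
    return generated[:t, :FEAT_DIM]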

Additionally, we integrate a variational autoencoder and a latent diffusion model, further enhanced through adversarial training, to achieve high-quality and fast generation.
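
A minimal PyTorch sketch of the pipeline shape described above, assuming a motion VAE whose reconstruction objective is augmented with an adversarial term from a discriminator (the latent diffusion model would then denoise in the VAE's latent space, conditioned on text). Module sizes, layer choices, and loss weights are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

FEAT_DIM, LATENT_DIM = 264, 64  # assumed sizes (e.g. 263 features + activation)

class MotionVAE(nn.Module):
    """Toy stand-in for a motion VAE: encode to a Gaussian latent, decode back."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(FEAT_DIM, 2 * LATENT_DIM)  # mean and log-variance
        self.dec = nn.Linear(LATENT_DIM, FEAT_DIM)

    def forward(self, x):  # x: (B, T, FEAT_DIM)
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(z), mu, logvar

# Discriminator used only to sharpen reconstructions during VAE training.
disc = nn.Sequential(nn.Linear(FEAT_DIM, 128), nn.ReLU(), nn.Linear(128, 1))

def generator_loss(x, recon, mu, logvar, kl_w=1e-4, adv_w=0.1):
    rec = (x - recon).pow(2).mean()                               # reconstruction
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()    # KL regularizer
    adv = -disc(recon).mean()                # non-saturating adversarial term
    return rec + kl_w * kl + adv_w * adv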

Moreover, we apply a training-free guided generation framework to achieve various editing tasks with motion control inputs.
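
One common way to realize training-free guided generation, sketched below under stated assumptions: at each reverse-diffusion step, the intermediate sample is nudged by the gradient of a differentiable control loss (here, matching target start and end pelvis positions), with no retraining of the diffusion model. The `denoise_step` callable, the pelvis occupying feature dims 0:3, and the guidance scale are all hypothetical.

import torch

def control_loss(x0_pred, target_start, target_end):
    """Penalty for missing the desired start/end pelvis positions
    (root position assumed to occupy feature dims 0:3)."""
    return ((x0_pred[:, 0, :3] - target_start) ** 2).sum() + \
           ((x0_pred[:, -1, :3] - target_end) ** 2).sum()

def guided_step(x_t, t, denoise_step, target_start, target_end, scale=1.0):
    """One reverse-diffusion step steered by the control-loss gradient."""
    x_t = x_t.detach().requires_grad_(True)
    x0_pred, x_prev = denoise_step(x_t, t)  # clean-sample estimate and next state
    grad = torch.autograd.grad(
        control_loss(x0_pred, target_start, target_end), x_t)[0]
    return (x_prev - scale * grad).detach()  # steer toward the constraint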

We quantitatively show the effectiveness of adversarial learning in text-to-motion generation, and demonstrate the applicability of our editing framework to multiple editing tasks in the motion domain.

Comment: 13 pages, 8 figures

Uchida, Kengo; Shibuya, Takashi; Takida, Yuhta; Murata, Naoki; Tanke, Julian; Takahashi, Shusuke; Mitsufuji, Yuki, 2024, MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training

