Document details
Identifier

oai:arXiv.org:2411.02299

Subject
Computer Science - Computer Vision...; Computer Science - Machine Learnin...
Authors
Zhao, Rongzhen; Wang, Vivienne; Kannala, Juho; Pajarinen, Joni
Category

Computer Science

Year

2024

Date indexed

6 November 2024

Keywords
attributes, object, computer, gdr, discrete, learning, representation, features
Abstract

Object-Centric Learning (OCL) can discover objects in images or videos by simply reconstructing the input.

For better object discovery, representative OCL methods reconstruct the input as its Variational Autoencoder (VAE) intermediate representation, which suppresses pixel noise and promotes object separability by discretizing continuous super-pixels with template features.

However, treating features as units overlooks their composing attributes, thus impeding model generalization; indexing features with scalar numbers loses attribute-level similarities and differences, thus hindering model convergence.

We propose Grouped Discrete Representation (GDR) for OCL.

We decompose features into combinatorial attributes via organized channel grouping, and compose these attributes into a discrete representation via tuple indexes.
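
The grouping idea in the sentence above can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-neighbour lookup, the toy codebook values, and the names `nearest`, `grouped_quantize`, and `tuple_to_scalar` are all assumptions made for illustration. Each feature is split into channel groups, each group is matched against its own small attribute codebook, and the resulting per-group indexes form a tuple.

```python
def nearest(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, code)) for code in codebook]
    return dists.index(min(dists))


def grouped_quantize(feature, codebooks):
    """Split `feature` into len(codebooks) equal channel groups and quantize
    each group against its own attribute codebook (hypothetical helper).

    Returns (tuple_index, quantized_feature).
    Assumes len(feature) is divisible by the number of groups.
    """
    groups = len(codebooks)
    width = len(feature) // groups  # channels per group
    idx = tuple(
        nearest(feature[g * width:(g + 1) * width], codebooks[g])
        for g in range(groups)
    )
    quantized = [x for g, k in enumerate(idx) for x in codebooks[g][k]]
    return idx, quantized


def tuple_to_scalar(idx, sizes):
    """Flatten a tuple index into one scalar code (mixed-radix encoding)."""
    scalar = 0
    for k, n in zip(idx, sizes):
        scalar = scalar * n + k
    return scalar


# Toy, made-up codebooks: 2 groups of 2 channels, 3 attribute values per group.
codebooks = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],
]
feat_a = [0.9, 0.1, 0.8, 1.1]
feat_b = [0.9, 0.1, -0.9, 1.2]

idx_a, q_a = grouped_quantize(feat_a, codebooks)
idx_b, q_b = grouped_quantize(feat_b, codebooks)

# Number of attribute positions on which the two tuple indexes agree.
shared = sum(a == b for a, b in zip(idx_a, idx_b))
print(idx_a, idx_b, shared)
```

In this toy setting, six attribute vectors (three per group) can express nine distinct combined features. The two tuple indexes agree in their first position, which shows that the features share one attribute; flattening them with `tuple_to_scalar` would yield two unrelated scalar codes that no longer expose this overlap.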

Experiments show that our GDR improves both Transformer- and Diffusion-based OCL methods consistently on various datasets.

Visualizations show that our GDR captures better object separability.

Zhao, Rongzhen; Wang, Vivienne; Kannala, Juho; Pajarinen, Joni, 2024, Grouped Discrete Representation for Object-Centric Learning
