Document details
Identifier

oai:arXiv.org:2410.11473

Subject
Computer Science - Computer Vision...
Authors
Lin, Jiayi; Huang, Jiabo; Hu, Jian; Gong, Shaogang
Category

Computer Science

Year

2024

Date indexed

08/01/2025

Keywords
prompt class text semantic segmentation
Abstract

Visual-textual correlations in the attention maps derived from text-to-image diffusion models are proven beneficial to dense visual prediction tasks, e.g., semantic segmentation.

However, a significant challenge arises due to the input distributional discrepancy between the context-rich sentences used for image generation and the isolated class names typically used in semantic segmentation.

This discrepancy hinders diffusion models from capturing accurate visual-textual correlations.

To solve this, we propose InvSeg, a test-time prompt inversion method that tackles open-vocabulary semantic segmentation by inverting image-specific visual context into text prompt embedding space, leveraging structure information derived from the diffusion model's reconstruction process to enrich text prompts so as to associate each class with a structure-consistent mask.

Specifically, we introduce Contrastive Soft Clustering (CSC) to align derived masks with the image's structure information, softly selecting anchors for each class and calculating weighted distances to push inner-class pixels closer while separating inter-class pixels, thereby ensuring mask distinction and internal consistency.

By incorporating sample-specific context, InvSeg learns context-rich text prompts in embedding space and achieves accurate semantic alignment across modalities.

Experiments show that InvSeg achieves state-of-the-art performance on the PASCAL VOC, PASCAL Context and COCO Object datasets.

Comment: AAAI 2025

Lin, Jiayi; Huang, Jiabo; Hu, Jian; Gong, Shaogang, 2024, InvSeg: Test-Time Prompt Inversion for Semantic Segmentation
