Document detail
ID

oai:arXiv.org:2405.18706

Topic
Computer Science - Computer Vision...
Author
Huang, You; Lan, Zongyu; Cao, Liujuan; Lin, Xianming; Zhang, Shengchuan; Jiang, Guannan; Ji, Rongrong
Category

Computer Science

Year

2024

Listing date

6/5/2024

Keywords
target pipeline propose embeddings object focsam segmentation image
Abstract

The Segment Anything Model (SAM) marks a notable milestone in segmentation models, highlighted by its robust zero-shot capabilities and ability to handle diverse prompts.

SAM follows a pipeline that separates interactive segmentation into image preprocessing through a large encoder and interactive inference via a lightweight decoder, ensuring efficient real-time performance.
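
The split described above (one expensive encoding pass per image, then many cheap decoding passes per interaction) can be sketched as follows. This is a minimal illustration, not SAM's actual API: `InteractiveSegmenter`, `toy_encoder`, and `toy_decoder` are hypothetical names, and the toy functions are trivial stand-ins for the real networks.

```python
class InteractiveSegmenter:
    """Toy two-stage pipeline: heavy encoder once per image, light decoder per click."""

    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder
        self.embedding = None
        self.encoder_calls = 0  # counts how often the expensive step runs

    def set_image(self, image):
        # Expensive step: executed once per image and cached.
        self.embedding = self.encoder(image)
        self.encoder_calls += 1

    def click(self, clicks):
        # Cheap step: executed on every interaction, reusing the cached embedding.
        if self.embedding is None:
            raise RuntimeError("call set_image() before click()")
        return self.decoder(self.embedding, clicks)


def toy_encoder(image):
    # Hypothetical stand-in for a large image encoder: converts pixels to floats.
    return [[float(v) for v in row] for row in image]


def toy_decoder(embedding, clicks):
    # Hypothetical stand-in for a lightweight decoder: selects every pixel
    # whose embedding value equals the value under one of the clicks.
    targets = {embedding[r][c] for (r, c) in clicks}
    return [[1 if v in targets else 0 for v in row] for row in embedding]


segmenter = InteractiveSegmenter(toy_encoder, toy_decoder)
segmenter.set_image([[0, 0, 1], [0, 1, 1]])
first_mask = segmenter.click([(0, 2)])
second_mask = segmenter.click([(0, 0)])
```

Both calls to `click` reuse the same cached embedding, which is what keeps interaction fast. It is also why the encoder cannot be rerun on a zoomed-in crop without giving up that speed.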

However, under this pipeline SAM becomes unstable on challenging samples.

These issues arise from two main factors.

Firstly, because the image is preprocessed once up front, SAM cannot dynamically apply image-level zoom-in strategies to refocus on the target object during interaction.

Secondly, the lightweight decoder struggles to sufficiently integrate interactive information with image embeddings.

To address these two limitations, we propose FocSAM with a pipeline redesigned on two pivotal aspects.

First, we propose Dynamic Window Multi-head Self-Attention (Dwin-MSA) to dynamically refocus SAM's image embeddings on the target object.

Dwin-MSA localizes attention computations around the target object, enhancing object-related embeddings with minimal computational overhead.
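
The idea of restricting attention to a window around the target can be illustrated with a small sketch. It is not the paper's Dwin-MSA: it assumes a single head, identity query/key/value projections, and a hand-supplied box, and the function names are hypothetical. It only shows that tokens inside the window attend to one another while tokens outside are left untouched.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def window_self_attention(emb, box):
    """Apply self-attention only to tokens inside `box`.

    emb: H x W grid of feature vectors (lists of floats).
    box: (r0, c0, r1, c1), half-open, assumed to enclose the target object.
    Returns a new grid; tokens outside the box are copied unchanged.
    """
    r0, c0, r1, c1 = box
    coords = [(r, c) for r in range(r0, r1) for c in range(c0, c1)]
    tokens = [emb[r][c] for r, c in coords]
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)

    out = [[list(vec) for vec in row] for row in emb]  # copy, do not mutate input
    for (r, c), query in zip(coords, tokens):
        # Scaled dot-product scores against window tokens only.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        out[r][c] = [
            sum(w * value[d] for w, value in zip(weights, tokens))
            for d in range(dim)
        ]
    return out


grid = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
refined = window_self_attention(grid, (1, 1, 3, 3))
```

In this example the window holds 4 of the 16 tokens, so the sketch computes 4 x 4 = 16 pairwise scores rather than the 16 x 16 = 256 that full self-attention over the grid would need.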

Second, we propose Pixel-wise Dynamic ReLU (P-DyReLU) to enable sufficient integration of interactive information from a few initial clicks that have significant impacts on the overall segmentation results.
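
The abstract does not say how P-DyReLU derives its parameters, so the sketch below only illustrates the general Dynamic ReLU form, y = max_k(a_k * x + b_k), applied with a separate coefficient set per pixel. The coefficient rule in `coeffs_from_click` is a hand-written, hypothetical stand-in for whatever learned, click-conditioned predictor the method actually uses.

```python
def dynamic_relu(x, coeffs):
    # Piecewise-linear activation: maximum over affine branches (a, b).
    # With coeffs [(1, 0), (0, 0)] this reduces to the ordinary ReLU.
    return max(a * x + b for a, b in coeffs)


def pixelwise_dynamic_relu(feature_map, coeff_map):
    # Each pixel gets its own branch coefficients.
    return [
        [dynamic_relu(x, coeffs) for x, coeffs in zip(feature_row, coeff_row)]
        for feature_row, coeff_row in zip(feature_map, coeff_map)
    ]


def coeffs_from_click(height, width, click, radius=1):
    # Hypothetical rule (an assumption, not from the paper): pixels near the
    # click use a leaky branch so negative responses pass through attenuated;
    # all other pixels use a plain ReLU.
    plain = [(1.0, 0.0), (0.0, 0.0)]
    near_click = [(1.0, 0.0), (0.5, 0.0)]
    click_row, click_col = click
    return [
        [
            near_click
            if abs(r - click_row) <= radius and abs(c - click_col) <= radius
            else plain
            for c in range(width)
        ]
        for r in range(height)
    ]


features = [[-2.0, -2.0, -2.0]]
coeff_map = coeffs_from_click(1, 3, click=(0, 0), radius=0)
activated = pixelwise_dynamic_relu(features, coeff_map)
```

The point of the sketch is that the activation function itself varies across pixels depending on the click, so click information changes how features pass through the network rather than only being added to them.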

Experimentally, FocSAM augments SAM's interactive segmentation performance to match the existing state-of-the-art method in segmentation quality, requiring only about 5.6% of this method's inference time on CPUs.

Comment: Accepted to CVPR 2024

Huang, You; Lan, Zongyu; Cao, Liujuan; Lin, Xianming; Zhang, Shengchuan; Jiang, Guannan; Ji, Rongrong. 2024. FocSAM: Delving Deeply into Focused Objects in Segmenting Anything.
