Document detail
ID

oai:arXiv.org:2210.01069

Topic
Computer Science - Computer Vision and Pattern Recognition
Author
Chen, Sixiang; Ye, Tian; Liu, Yun; Chen, Erkang
Category

Computer Science

Year

2022

Listing date

10/5/2022

Keywords
image

Abstract

Recently, image restoration transformers have achieved performance comparable to previous state-of-the-art CNNs.

However, how to efficiently leverage such architectures remains an open problem.

In this work, we present Dual-former, whose key insight is to combine the powerful global modeling ability of self-attention modules with the local modeling ability of convolutions in a single overall architecture.

With convolution-based Local Feature Extraction modules in the encoder and the decoder, we adopt a novel Hybrid Transformer Block only in the latent layer to model long-distance dependencies in the spatial dimensions and handle the uneven distribution across channels.

Such a design eliminates the substantial computational complexity of previous image restoration transformers and achieves superior performance on multiple image restoration tasks.
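
The efficiency argument can be made concrete with a minimal PyTorch sketch: cheap convolutional blocks at full resolution, and a self-attention block only at the lowest-resolution (latent) feature map, where the token count is small. All module names (LocalFeatureExtraction, LatentAttentionBlock, DualFormerSketch), channel sizes, and the use of standard multi-head attention are illustrative assumptions; the paper's actual Hybrid Transformer Block and architecture details differ.

    # Minimal sketch, assuming a simple U-shaped layout: conv blocks in the
    # encoder/decoder, attention only at the latent resolution.
    import torch
    import torch.nn as nn

    class LocalFeatureExtraction(nn.Module):
        """Convolution-only block for cheap local modeling (assumed design)."""
        def __init__(self, dim):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(dim, dim, 3, padding=1, groups=dim),  # depthwise conv
                nn.GELU(),
                nn.Conv2d(dim, dim, 1),                         # pointwise conv
            )

        def forward(self, x):
            return x + self.body(x)

    class LatentAttentionBlock(nn.Module):
        """Self-attention applied only at the smallest spatial resolution,
        so its quadratic cost in the number of tokens stays affordable."""
        def __init__(self, dim, heads=4):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, x):
            b, c, h, w = x.shape
            tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C) token sequence
            t = self.norm(tokens)
            attn_out, _ = self.attn(t, t, t)
            tokens = tokens + attn_out              # residual connection
            return tokens.transpose(1, 2).reshape(b, c, h, w)

    class DualFormerSketch(nn.Module):
        def __init__(self, dim=32):
            super().__init__()
            self.stem = nn.Conv2d(3, dim, 3, padding=1)
            self.enc = LocalFeatureExtraction(dim)
            self.down = nn.Conv2d(dim, dim * 2, 4, stride=2, padding=1)
            self.latent = LatentAttentionBlock(dim * 2)
            self.up = nn.ConvTranspose2d(dim * 2, dim, 4, stride=2, padding=1)
            self.dec = LocalFeatureExtraction(dim)
            self.out = nn.Conv2d(dim, 3, 3, padding=1)

        def forward(self, x):
            f = self.enc(self.stem(x))              # local features, full res
            latent = self.latent(self.down(f))      # global modeling, half res
            return self.out(self.dec(self.up(latent) + f)) + x  # residual restore

    model = DualFormerSketch()
    restored = model(torch.randn(1, 3, 64, 64))     # output: (1, 3, 64, 64)

Because attention runs only on the downsampled latent map, halving the spatial resolution cuts the attention token count by 4x and its pairwise cost by roughly 16x, which is the kind of saving the GFLOPs comparisons below reflect.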

Experiments demonstrate that Dual-former achieves a 1.91 dB gain over the state-of-the-art MAXIM method on the Indoor dataset for single image dehazing while consuming only 4.2% of MAXIM's GFLOPs.

For single image deraining, it exceeds the SOTA method by 0.1 dB PSNR averaged over five datasets while using only 21.5% of the GFLOPs.

Dual-former also substantially surpasses the latest desnowing method on various datasets, with fewer parameters.

Chen, Sixiang; Ye, Tian; Liu, Yun; Chen, Erkang, 2022, Dual-former: Hybrid Self-attention Transformer for Efficient Image Restoration
