Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering

Document details

Identifier

oai:arXiv.org:2409.12784

Subject

Computer Science - Computer Vision... Computer Science - Artificial Inte...

Author

Lim, Youngsun; Choi, Hojun; Shim, Hyunjung

Category

Computer Science

Year

2024

Listing date

12-02-2025

Keywords

evaluation computer dataset images tti generation i-halla

Description

Despite the impressive success of text-to-image (TTI) generation models, existing studies overlook the issue of whether these models accurately convey factual information.

In this paper, we focus on the problem of image hallucination, where images created by generation models fail to faithfully depict factual content.

To address this, we introduce I-HallA (Image Hallucination evaluation with Question Answering), a novel automated evaluation metric that measures the factuality of generated images through visual question answering (VQA).

We also introduce I-HallA v1.0, a curated benchmark dataset for this purpose.

As part of this process, we develop a pipeline that generates high-quality question-answer pairs using multiple GPT-4 Omni-based agents, with human judgments to ensure accuracy.

Our evaluation protocols measure image hallucination by testing whether these questions can be answered correctly from images generated by existing TTI models.
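As a rough illustration of this protocol, the sketch below scores one generated image by the fraction of question-answer pairs that a VQA model answers correctly. The abstract does not give the exact scoring formula, so the accuracy-style score, the exact-match comparison, and the names `qa_accuracy`, `normalize`, and `vqa_answer` are all assumptions made for this example rather than the authors' implementation.

```python
from typing import Any, Callable, Sequence, Tuple

# (question, expected answer); a hypothetical representation of one QA pair.
QAPair = Tuple[str, str]


def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and trim surrounding spaces and periods."""
    return " ".join(answer.lower().split()).strip(" .")


def qa_accuracy(
    image: Any,
    qa_pairs: Sequence[QAPair],
    vqa_answer: Callable[[Any, str], str],
) -> float:
    """Return the fraction of questions answered correctly for one image.

    `vqa_answer` stands in for any VQA model: it takes an image and a
    question and returns an answer string. Exact match after normalization
    is an assumption of this sketch.
    """
    if not qa_pairs:
        raise ValueError("qa_pairs must not be empty")
    correct = sum(
        normalize(vqa_answer(image, question)) == normalize(expected)
        for question, expected in qa_pairs
    )
    return correct / len(qa_pairs)
```

Under this reading, a lower score means more of the factual questions could not be answered correctly from the image, that is, more hallucination. Averaging the per-image scores over a model's outputs would give one number per TTI model.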

The I-HallA v1.0 dataset comprises 1.2K diverse image-text pairs across nine categories with 1,000 rigorously curated questions covering various compositional challenges.

We evaluate five TTI models using I-HallA and reveal that these state-of-the-art models often fail to accurately convey factual information.

Moreover, we validate the reliability of our metric by demonstrating a strong Spearman correlation ($\rho$=0.95) with human judgments.
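For readers unfamiliar with the statistic, Spearman's $\rho$ is the Pearson correlation computed on the ranks of two score lists, so it measures whether the metric orders items the same way human raters do. The sketch below is a dependency-free version with average ranks for ties. The function names are hypothetical, and nothing here reproduces the paper's data or its exact validation procedure.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_rho(metric_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Spearman rank correlation between metric scores and human scores."""
    if len(metric_scores) != len(human_scores) or len(metric_scores) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(metric_scores), average_ranks(human_scores))
```

A value near 1 means the two score lists rank the items in almost the same order. In practice `scipy.stats.spearmanr` computes the same quantity and also reports a p-value.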

We believe our benchmark dataset and metric can serve as a foundation for developing factually accurate TTI generation models.

Additional resources can be found on our project page: https://sgt-lim.github.io/I-HallA/.

Comment: 20 pages

Lim, Youngsun; Choi, Hojun; Shim, Hyunjung, 2024, Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering
