Document details
IDENTIFICATION

oai:arXiv.org:2408.03907

Subject
Computer Science - Computation and... Computer Science - Artificial Inte...
Author
Kumar, Shachi H; Sahay, Saurav; Mazumder, Sahisnu; Okur, Eda; Manuvinakurike, Ramesh; Beckage, Nicole; Su, Hsuan; Lee, Hung-yi; Nachman, Lama
Category

Computer Science

Year

2024

Citation date

14/8/2024

Keywords
human language metrics evaluation

Abstract

Large Language Models (LLMs) have excelled at language understanding and generating human-level text.

However, even with supervised training and human alignment, these LLMs are susceptible to adversarial attacks where malicious users can prompt the model to generate undesirable text.

LLMs also inherently encode potential biases that can cause various harmful effects during interactions.

Bias evaluation metrics lack standards as well as consensus, and existing methods often rely on human-generated templates and annotations, which are expensive and labor-intensive.

In this work, we train models to automatically create adversarial prompts to elicit biased responses from target LLMs.

We present LLM-based bias evaluation metrics and also analyze several existing automatic evaluation methods and metrics.

We analyze the various nuances of model responses, identify the strengths and weaknesses of model families, and assess where evaluation methods fall short.

We compare these metrics to human evaluation and validate that the LLM-as-a-Judge metric aligns with human judgement on bias in response generation.

Comment: 6 pages of paper content, 17 pages of appendix

Kumar, Shachi H; Sahay, Saurav; Mazumder, Sahisnu; Okur, Eda; Manuvinakurike, Ramesh; Beckage, Nicole; Su, Hsuan; Lee, Hung-yi; Nachman, Lama (2024). Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models.
