Document details
IDENTIFICATION

oai:arXiv.org:2410.13779

Subject
Computer Science - Computation and... Computer Science - Machine Learnin...
Author
Frydenlund, Arvid
Category

Computer Science

Year

2024

Citation date

23/10/2024

Keywords
path-star models node task

Abstract

The recently introduced path-star task is a minimal task designed to exemplify limitations to the abilities of language models (Bachmann and Nagarajan, 2024).

It involves a path-star graph where multiple arms radiate from a single starting node and each node is unique.

Given the start node and a specified target node that ends an arm, the task is to generate the arm containing that target node.
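As a minimal illustration of the task just described, the sketch below builds a small path-star graph and recovers the arm that ends at a given target. It is not the paper's code: the function names `make_path_star` and `solve`, the integer node ids, and the shuffled edge-list encoding are all assumptions made for this example.

```python
import random

def make_path_star(num_arms, arm_len, seed=0):
    """Hypothetical helper: build a path-star graph with `num_arms` arms of
    `arm_len` nodes each, all radiating from one start node. Ids are unique."""
    rng = random.Random(seed)
    # Draw unique node ids: one for the start plus arm_len per arm.
    ids = rng.sample(range(1000), 1 + num_arms * arm_len)
    start, rest = ids[0], ids[1:]
    arms = [rest[i * arm_len:(i + 1) * arm_len] for i in range(num_arms)]
    # Directed edges: start -> first node of each arm, then along the arm.
    edges = []
    for arm in arms:
        path = [start] + arm
        edges.extend(zip(path, path[1:]))
    rng.shuffle(edges)  # edge order carries no information about the arms
    return start, arms, edges

def solve(start, target, edges):
    """Recover the arm ending at `target` by following parent links back to
    `start`. Each non-start node has exactly one parent, so this is unique."""
    parent = {child: par for par, child in edges}
    path = [target]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]

start, arms, edges = make_path_star(num_arms=3, arm_len=4)
target = arms[1][-1]  # the leaf that ends the second arm
print(solve(start, target, edges))
```

Walking backward from the target is trivial because every node has a single parent. Generating the arm forward from the start node, one node at a time, requires choosing the correct first step among all arms.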

This is straightforward for a human but surprisingly difficult for language models, which did not outperform the random baseline.

The authors hypothesized this is due to a deficiency in teacher-forcing and the next-token prediction paradigm.

We demonstrate the task is learnable using teacher-forcing in alternative settings and that the issue is partially due to representation.

We introduce a regularization method using structured samples of the same graph but with differing target nodes, improving results across a variety of model types.
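The abstract does not say how these structured samples enter training, so the following sketch shows only the data grouping it mentions: one sample per arm of the same graph, each with a different target node. The function name `samples_for_all_targets` and the tuple layout are hypothetical.

```python
def samples_for_all_targets(start, arms):
    """Hypothetical helper: for a single graph, return one
    (start, target, answer_path) tuple per arm, so the samples
    share the graph and differ only in the target node."""
    return [(start, arm[-1], [start] + arm) for arm in arms]

# Toy graph: start node 0 with two arms, 0 -> 1 -> 2 and 0 -> 3 -> 4.
for sample in samples_for_all_targets(0, [[1, 2], [3, 4]]):
    print(sample)
```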

We provide RASP proofs showing the task is theoretically solvable.

Finally, we find settings where an encoder-only model can consistently solve the task.

Comment: EMNLP 2024 Main

Frydenlund, Arvid, 2024, The Mystery of the Pathological Path-star Task for Language Models
