Document detail
ID card

oai:arXiv.org:2410.12318

Subject
Computer Science - Cryptography an... Computer Science - Artificial Inte...
Authors
Cai, Jiacheng; Yu, Jiahao; Shao, Yangguang; Wu, Yuhang; Xing, Xinyu
Category

Computer Science

Year

2024

Listing date

23-10-2024

Keywords
utf fingerprinting model tokens
Metrics

Description

Fingerprinting large language models (LLMs) is essential for verifying model ownership, ensuring authenticity, and preventing misuse.

Traditional fingerprinting methods often require significant computational overhead or white-box verification access.

In this paper, we introduce UTF, a novel and efficient approach to fingerprinting LLMs by leveraging under-trained tokens.

Under-trained tokens are tokens that the model has not fully learned during its training phase.

By utilizing these tokens, we perform supervised fine-tuning to embed specific input-output pairs into the model.

This process allows the LLM to produce predetermined outputs when presented with certain inputs, effectively embedding a unique fingerprint.

Our method incurs minimal overhead, has little impact on the model's performance, and does not require white-box access to the target model for ownership identification.

Compared with existing fingerprinting methods, UTF is also more effective and more robust to fine-tuning and random guessing.
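The description above outlines two steps: selecting under-trained tokens and pairing them into fingerprint inputs and outputs for supervised fine-tuning. A minimal sketch of that pipeline, assuming (for illustration only) that under-trained tokens can be detected by unusually small embedding norms; the paper's actual selection criterion and fine-tuning setup may differ:

```python
import numpy as np

def find_undertrained_tokens(embeddings, k=10):
    """Return the ids of the k tokens with the smallest embedding norms.

    A low embedding norm is used here as a simple proxy for a token the
    model barely updated during pre-training (an assumption for this
    sketch, not the paper's exact criterion).
    """
    norms = np.linalg.norm(embeddings, axis=1)
    return np.argsort(norms)[:k].tolist()

def make_fingerprint_pairs(token_ids, pair_len=3):
    """Group under-trained token ids into (input, output) fingerprint
    pairs, which would then be embedded into the model via supervised
    fine-tuning on those exact sequences."""
    pairs = []
    step = 2 * pair_len
    for i in range(0, len(token_ids) - step + 1, step):
        x = token_ids[i:i + pair_len]
        y = token_ids[i + pair_len:i + step]
        pairs.append((x, y))
    return pairs

# Toy demo: a random "embedding matrix" for a 100-token vocabulary,
# with six rows shrunk to simulate under-trained tokens.
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
emb[[3, 7, 42, 55, 61, 80]] *= 0.01

ids = find_undertrained_tokens(emb, k=6)
pairs = make_fingerprint_pairs(ids)
print(pairs)  # one (input, output) pair of three token ids each
```

Verifying ownership then amounts to prompting the (possibly fine-tuned) model with a fingerprint input and checking whether it emits the predetermined output.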

Cai, Jiacheng; Yu, Jiahao; Shao, Yangguang; Wu, Yuhang; Xing, Xinyu, 2024, UTF: Undertrained Tokens as Fingerprints A Novel Approach to LLM Identification
