Document details
Identifier

oai:arXiv.org:2407.13218

Subject
Computer Science - Machine Learnin...; Computer Science - Artificial Inte...
Authors
Borisyuk, Fedor; Song, Qingquan; Zhou, Mingzhou; Parameswaran, Ganesh; Arun, Madhu; Popuri, Siva; Bingol, Tugrul; Pei, Zhuotao; Lee, Kuang-Hsuan; Zheng, Lu; Shao, Qizhan; Naqvi, Ali; Zhou, Sen; Gupta, Aman
Category

Computer Science

Year

2024

Date indexed

14/08/2024

Keywords
linkedin gpu system linr indexes retrieval model
Abstract

This paper introduces LiNR, LinkedIn's large-scale, GPU-based retrieval system.

LiNR supports a billion-sized index on GPU models.

We discuss our experiences and challenges in creating scalable, differentiable search indexes using TensorFlow and PyTorch at production scale.

In LiNR, both items and model weights are integrated into the model binary.

Viewing index construction as a form of model training, we describe scaling our system for large indexes, incorporating full scans and efficient filtering.

A key focus is on enabling attribute-based pre-filtering for exhaustive GPU searches, addressing the common challenge of post-filtering in KNN searches that often reduces system quality.

We further provide multi-embedding retrieval algorithms and strategies for tackling cold start issues in retrieval.

Our advancements in supporting larger indexes through quantization are also discussed.

We believe LiNR represents one of the industry's first live-updated, model-based retrieval indexes.

Applied to out-of-network post recommendations on LinkedIn Feed, LiNR has contributed to a 3% relative increase in professional daily active users.

We envisage LiNR as a step towards integrating retrieval and ranking into a single GPU model, simplifying complex infrastructures and enabling end-to-end optimization of the entire differentiable infrastructure through gradient descent.
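
The abstract's contrast between attribute-based pre-filtering and post-filtering in KNN search can be illustrated with a small sketch. This is not LiNR's implementation: the function names, the toy data, and the pure-Python full scan are assumptions made for illustration only, standing in for an exhaustive GPU search.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot-product similarity between two embeddings."""
    return sum(x * y for x, y in zip(a, b))


def prefilter_topk(query, items, attrs, wanted, k) -> List[int]:
    """Apply the attribute filter first, then rank only the eligible items."""
    eligible = [i for i, a in enumerate(attrs) if a == wanted]
    eligible.sort(key=lambda i: dot(items[i], query), reverse=True)
    return eligible[:k]


def postfilter_topk(query, items, attrs, wanted, k) -> List[int]:
    """Rank all items, keep the top k, then drop those failing the filter."""
    ranked = sorted(range(len(items)),
                    key=lambda i: dot(items[i], query), reverse=True)
    return [i for i in ranked[:k] if attrs[i] == wanted]


# Hypothetical toy index: five 2-d item embeddings with one attribute each.
ITEMS = [[1.0, 0.0], [0.9, 0.0], [0.8, 0.0], [0.7, 0.0], [0.1, 0.0]]
ATTRS = [0, 0, 0, 1, 1]
QUERY = [1.0, 0.0]

pre = prefilter_topk(QUERY, ITEMS, ATTRS, wanted=1, k=2)
post = postfilter_topk(QUERY, ITEMS, ATTRS, wanted=1, k=2)
print(pre, post)
```

In this toy case the two highest-scoring items both fail the filter, so post-filtering returns nothing while pre-filtering still returns the two best eligible items. That is the quality loss the abstract attributes to post-filtering.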

Borisyuk, Fedor; Song, Qingquan; Zhou, Mingzhou; Parameswaran, Ganesh; Arun, Madhu; Popuri, Siva; Bingol, Tugrul; Pei, Zhuotao; Lee, Kuang-Hsuan; Zheng, Lu; Shao, Qizhan; Naqvi, Ali; Zhou, Sen; Gupta, Aman. 2024. LiNR: Model Based Neural Retrieval on GPUs at LinkedIn.
