Document detail
ID

oai:arXiv.org:2407.13218

Topic
Computer Science - Machine Learnin... Computer Science - Artificial Inte...
Author
Borisyuk, Fedor; Song, Qingquan; Zhou, Mingzhou; Parameswaran, Ganesh; Arun, Madhu; Popuri, Siva; Bingol, Tugrul; Pei, Zhuotao; Lee, Kuang-Hsuan; Zheng, Lu; Shao, Qizhan; Naqvi, Ali; Zhou, Sen; Gupta, Aman
Category

Computer Science

Year

2024

Listing date

8/14/2024

Keywords
linkedin gpu system linr indexes retrieval model
Abstract

This paper introduces LiNR, LinkedIn's large-scale, GPU-based retrieval system.

LiNR supports a billion-sized index on GPU models.

We discuss our experiences and challenges in creating scalable, differentiable search indexes using TensorFlow and PyTorch at production scale.

In LiNR, both items and model weights are integrated into the model binary.

Viewing index construction as a form of model training, we describe scaling our system for large indexes, incorporating full scans and efficient filtering.

A key focus is on enabling attribute-based pre-filtering for exhaustive GPU searches, addressing the common challenge of post-filtering in KNN searches that often reduces system quality.
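The difference between post-filtering and pre-filtering can be seen in a toy exhaustive search. The following is a minimal sketch in plain Python, not LiNR's implementation: the function names, the single string attribute per item, and the dot-product scoring are all assumptions made for illustration. With post-filtering, the top-k is taken first and disallowed items are dropped afterwards, so fewer than k results (possibly none) may survive. With pre-filtering, the attribute constraint is applied before ranking, so the top-k is drawn only from eligible items.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot-product similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def post_filter_search(query: Vector, items: List[Vector], attrs: List[str],
                       allowed: str, k: int) -> List[int]:
    """Rank all items, keep the top-k, then drop items failing the filter."""
    ranked = sorted(range(len(items)),
                    key=lambda i: dot(query, items[i]), reverse=True)
    return [i for i in ranked[:k] if attrs[i] == allowed]


def pre_filter_search(query: Vector, items: List[Vector], attrs: List[str],
                      allowed: str, k: int) -> List[int]:
    """Keep only items passing the filter, then rank them and take the top-k."""
    candidates = [i for i in range(len(items)) if attrs[i] == allowed]
    ranked = sorted(candidates,
                    key=lambda i: dot(query, items[i]), reverse=True)
    return ranked[:k]


# Hypothetical toy index: five 2-d item embeddings, each with a language tag.
items = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
attrs = ["en", "en", "de", "de", "de"]
query = [1.0, 0.0]

post_hits = post_filter_search(query, items, attrs, allowed="de", k=2)
pre_hits = pre_filter_search(query, items, attrs, allowed="de", k=2)
print(post_hits)  # both top-2 items are "en", so nothing survives the filter
print(pre_hits)   # the two best "de" items
```

In this toy case the two highest-scoring items both fail the filter, so post-filtering returns nothing while pre-filtering still returns two eligible items. On a GPU, the same idea would typically be expressed as a mask applied to the score vector before the top-k operation.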

We further provide multi-embedding retrieval algorithms and strategies for tackling cold start issues in retrieval.

Our advancements in supporting larger indexes through quantization are also discussed.
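The abstract does not say which quantization scheme LiNR uses. As a generic illustration of why quantization lets a larger index fit in the same memory, the sketch below applies symmetric per-vector int8 quantization, which stores each 32-bit float as an 8-bit code plus one scale per vector. The scheme and the names `quantize_int8` and `dequantize` are assumptions chosen for illustration, not the paper's method.

```python
from typing import List, Tuple


def quantize_int8(vec: List[float]) -> Tuple[List[int], float]:
    """Map floats to integer codes in [-127, 127] with one scale per vector."""
    max_abs = max(abs(x) for x in vec)
    # Avoid division by zero for an all-zero vector.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [round(x / scale) for x in vec]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical embedding used only to exercise the functions.
embedding = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(embedding)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(embedding, restored))
print(codes, scale, max_error)
```

Each reconstructed value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for roughly a 4x reduction in embedding storage relative to float32.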

We believe LiNR represents one of the industry's first live-updated model-based retrieval indexes.

Applied to out-of-network post recommendations on LinkedIn Feed, LiNR has contributed to a 3% relative increase in professional daily active users.

We envisage LiNR as a step towards integrating retrieval and ranking into a single GPU model, simplifying complex infrastructures and enabling end-to-end optimization of the entire differentiable infrastructure through gradient descent.

Borisyuk, Fedor; Song, Qingquan; Zhou, Mingzhou; Parameswaran, Ganesh; Arun, Madhu; Popuri, Siva; Bingol, Tugrul; Pei, Zhuotao; Lee, Kuang-Hsuan; Zheng, Lu; Shao, Qizhan; Naqvi, Ali; Zhou, Sen; Gupta, Aman. 2024. LiNR: Model Based Neural Retrieval on GPUs at LinkedIn
