Explicit Word Density Estimation for Language Modelling

Document detail

ID

oai:arXiv.org:2406.10256

Topic

Computer Science - Computation and...; Computer Science - Artificial Inte...; Computer Science - Machine Learnin...

Author

Andonov, Jovan; Ganea, Octavian; Grnarova, Paulina; Bécigneul, Gary; Hofmann, Thomas

Year

2024

Listing date

6/19/2024

Keywords

modelling

Abstract

Language modelling has long been a central part of Natural Language Processing, and in the past few years LSTM-based language models have been the go-to method for commercial language modelling.

Recently, it has been shown that, when language modelling is viewed as a matrix factorization problem, the final Softmax layer limits the expressiveness of the model by putting an upper bound on the rank of the resulting log-probability matrix.
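
The rank bound mentioned above can be checked numerically. The following is a minimal sketch, not code from the thesis: the names `H`, `W`, `num_contexts`, `vocab_size` and `hidden_dim` are assumptions chosen for illustration. The logits are a product of a context matrix and a word-embedding matrix, so their rank is at most the hidden dimension, and the row-wise log-softmax only subtracts one normalising constant per row, which can raise the rank by at most one.

```python
import numpy as np


def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


rng = np.random.default_rng(0)

# Hypothetical sizes: many contexts, a small vocabulary, a smaller hidden size.
num_contexts, vocab_size, hidden_dim = 200, 50, 8

H = rng.standard_normal((num_contexts, hidden_dim))  # one context vector per row
W = rng.standard_normal((vocab_size, hidden_dim))    # one word embedding per row

logits = H @ W.T                  # shape (num_contexts, vocab_size)
log_probs = log_softmax(logits)   # log-probability matrix of the model

logit_rank = np.linalg.matrix_rank(logits)
log_prob_rank = np.linalg.matrix_rank(log_probs)

print("rank of logits:", logit_rank, "(bound:", hidden_dim, ")")
print("rank of log-probabilities:", log_prob_rank, "(bound:", hidden_dim + 1, ")")
```

If the true log-probability matrix of the language has a rank higher than `hidden_dim + 1`, a model of this form cannot represent it exactly, whatever the values of `H` and `W`.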

Additionally, a new family of neural networks, called NeuralODEs, has been introduced as a continuous alternative to Residual Networks.
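
The sense in which an ODE is a continuous alternative to a residual network can be illustrated with a toy example. This is a sketch under stated assumptions: the dynamics `f` below are a hypothetical stand-in for a learned layer, and plain Euler integration is used for simplicity rather than any particular ODE solver. A residual update h + f(h) is one Euler step of size 1 for dh/dt = f(h); shrinking the step size moves the result towards the exact ODE solution.

```python
import numpy as np


def f(h):
    # Hypothetical dynamics standing in for a learned layer: dh/dt = -h.
    return -h


def residual_block(h):
    # A residual update: the input plus a transformation of the input.
    return h + f(h)


def euler_integrate(h0, t_end, num_steps):
    # Integrate dh/dt = f(h) from t = 0 to t = t_end with fixed-step Euler.
    h = np.asarray(h0, dtype=float)
    dt = t_end / num_steps
    for _ in range(num_steps):
        h = h + dt * f(h)
    return h


h0 = np.array([1.0, -2.0])
exact = h0 * np.exp(-1.0)  # closed-form solution of dh/dt = -h at t = 1

one_step = euler_integrate(h0, t_end=1.0, num_steps=1)      # same as one residual block
many_steps = euler_integrate(h0, t_end=1.0, num_steps=1000)  # close to the ODE solution

print("one residual block:", residual_block(h0))
print("Euler, 1 step:     ", one_step)
print("Euler, 1000 steps: ", many_steps)
print("exact solution:    ", exact)
```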

Moreover, it has been shown that there is a connection between these models and Normalizing Flows.
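
One way to state this connection is through the change-of-variables formula; the notation below (z, g, f, p) is assumed here for illustration and is not taken from the thesis. A discrete normalizing flow step z_1 = g(z_0) changes the log-density by a log-determinant, while for continuous dynamics dz/dt = f(z(t), t) the log-density changes according to the trace of the Jacobian:

```latex
% Discrete flow step z_1 = g(z_0):
\log p(z_1) = \log p(z_0) - \log \left| \det \frac{\partial g}{\partial z_0} \right|

% Continuous dynamics dz/dt = f(z(t), t):
\frac{\partial \log p(z(t))}{\partial t} = -\operatorname{tr}\!\left( \frac{\partial f}{\partial z(t)} \right)

% Integrated from t_0 to t_1:
\log p(z(t_1)) = \log p(z(t_0)) - \int_{t_0}^{t_1} \operatorname{tr}\!\left( \frac{\partial f}{\partial z(t)} \right) dt
```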

In this work we propose a new family of language models based on NeuralODEs and the continuous analogue of Normalizing Flows and manage to improve on some of the baselines.

Comment: Master's thesis

Andonov, Jovan; Ganea, Octavian; Grnarova, Paulina; Bécigneul, Gary; Hofmann, Thomas, 2024, Explicit Word Density Estimation for Language Modelling
