Language Modeling Without Neural Networks
Generating Shakespeare has become the “Hello World” of language models.[1] Recently, I’ve been experimenting with alternative language models and came across unbounded n-gram models. These models are purely statistical: there are no weights to optimize and no training step. A year ago, I read the Infini-gram paper, which scaled an unbounded n-gram model to trillions of tokens. The authors showed their model could help guide neural LLMs during generation, but they did not explore standalone language generation. In this post, I’ll explain how unbounded n-gram models work and how I improved their language generation capabilities.