Languages

Pre-trained NLP models for 358+ Wikipedia languages. Each language includes tokenizers, n-gram models, Markov chains, vocabularies, and embeddings.

318 languages with models
4.37x average compression

Language Coverage

Explore languages by geographic region. Hover over countries to see available models.

