Languages

NLP resources for 358+ Wikipedia languages. Each modeled language includes tokenizers, n-gram models, Markov chains, vocabularies, and embeddings.

31 languages with pre-trained models
4.23× average compression
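
The repository's actual loading API isn't shown here, so as an illustrative sketch only, here is what the Markov-chain component conceptually does: map each fixed-length token context to a distribution over next tokens, then sample from it. The function names (`build_markov_chain`, `generate`) and the toy corpus are hypothetical, not part of this project.

```python
# Toy word-level Markov chain, standard library only.
# Hypothetical sketch; not the repository's API.
import random
from collections import defaultdict, Counter

def build_markov_chain(tokens, order=2):
    """Map each (order)-token context to a Counter of next tokens."""
    chain = defaultdict(Counter)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        chain[context][tokens[i + order]] += 1
    return chain

def generate(chain, seed, length=20):
    """Sample a token sequence by walking the chain from a seed context."""
    out = list(seed)
    for _ in range(length):
        counts = chain.get(tuple(out[-len(seed):]))
        if not counts:
            break  # unseen context: stop generating
        tokens, weights = zip(*counts.items())
        out.append(random.choices(tokens, weights=weights)[0])
    return out

corpus = "the cat sat on the mat the cat ran on the mat".split()
chain = build_markov_chain(corpus, order=2)
print(" ".join(generate(chain, seed=("the", "cat"))))
```

In practice the pre-trained models would be built from each language's Wikipedia text rather than a toy corpus, and stored in a compressed form (which is what the average compression figure above refers to).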