eml

Emilian-Romagnol emiliàn e rumagnòl

14,744 Words in vocabulary

3.37x Best compression

0.3584 Best isotropy

Sample text

Excerpts from Emilian-Romagnol Wikipedia articles.

'l è 'l nòm 'd un domìni genèric. Al funsiòuna da 'l zógn dal ed domìni tachê a ...

'l è 'l nòm 'd un domìni genèric. Al funsiòuna da 'l setèmber dal ed domìni tach...

Al 294 'l è 'n an edl III sécol dal Calendàri gregoriàn. Avenimèint Nê Mort III

The 20 most frequently used words in Emilian-Romagnol Wikipedia.


Key metrics for all model types at a glance.

# Load the 'eml' tokenizer with a 32,000-entry vocabulary (the 32k variant) and tokenize a string
from wikilangs import tokenizer
tok = tokenizer('latest', 'eml', 32000)
tokens = tok.tokenize("Your text here")

# Load the 3-gram language model for 'eml' and score a string
from wikilangs import ngram
ng = ngram('latest', 'eml', gram_size=3)
score = ng.score("Your text here")

# Load the depth-3 Markov model for 'eml' and generate text with length=50
from wikilangs import markov
mc = markov('latest', 'eml', depth=3)
text = mc.generate(length=50)
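For readers unfamiliar with the idea behind a depth-limited Markov generator, here is a minimal standalone sketch of a word-level chain. It does not use or reproduce the wikilangs implementation. The helper names `build_chain` and `generate` and the English toy corpus are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

def build_chain(words, depth):
    """Map each run of `depth` consecutive words to the words seen right after it."""
    chain = defaultdict(list)
    for i in range(len(words) - depth):
        state = tuple(words[i:i + depth])
        chain[state].append(words[i + depth])
    return chain

def generate(chain, length, seed=0):
    """Random-walk the chain; stop early if the current state has no successor."""
    rng = random.Random(seed)            # seeded so runs are repeatable
    state = rng.choice(sorted(chain))    # pick a starting state
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)     # .get avoids creating empty entries
        if not followers:
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return out

# Hypothetical toy corpus (English placeholder text, not Wikipedia data).
corpus = "the cat eats the fish and the dog eats the bread".split()
chain = build_chain(corpus, depth=2)
sample = generate(chain, length=8)
print(" ".join(sample))
```

A larger depth makes the output follow the training text more closely, while a smaller depth gives looser, more varied sequences.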

# Load the 'eml' vocabulary and look up the entry for a single word
from wikilangs import vocabulary
vocab = vocabulary('latest', 'eml')
info = vocab.lookup("word")

# Load the 64-dimensional 'eml' embeddings and embed a single word
from wikilangs import embeddings
emb = embeddings('latest', 'eml', dimension=64)
vec = emb.embed_word("word")

Model Type	Variants	Description
Tokenizers	8k, 16k, 32k, 64k	BPE tokenizers with different vocabulary sizes
N-gram (Word)	2, 3, 4, 5-gram	Word-level language models
N-gram (Subword)	2, 3, 4, 5-gram	Subword-level language models
Markov (Word)	Depth 1–5	Word-level text generation
Markov (Subword)	Depth 1–5	Subword-level text generation
Vocabulary	—	Word dictionary with frequency and IDF
Embeddings	32d, 64d, 128d	Position-aware word embeddings
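
The Vocabulary row mentions frequency and IDF. As a rough illustration of what those two numbers usually mean, the sketch below computes them for a tiny corpus using the common definition idf = log(N / df). Whether wikilangs uses this exact formula is an assumption. The function name `vocabulary_stats` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def vocabulary_stats(documents):
    """Return {word: {"frequency": total count, "idf": log(N / document frequency)}}."""
    n_docs = len(documents)
    term_freq = Counter()   # total occurrences across all documents
    doc_freq = Counter()    # number of documents containing the word
    for doc in documents:
        words = doc.lower().split()
        term_freq.update(words)
        doc_freq.update(set(words))
    return {
        w: {"frequency": term_freq[w], "idf": math.log(n_docs / doc_freq[w])}
        for w in term_freq
    }

# Hypothetical toy documents (English placeholder text, not Wikipedia data).
docs = ["the cat sat", "the dog sat", "the bird flew"]
stats = vocabulary_stats(docs)
print(stats["the"])   # appears in every document, so its IDF is 0.0
print(stats["bird"])  # appears in one document, so its IDF is log(3)
```

Words that occur in every document get an IDF of zero, which is why IDF is often used to down-weight very common words.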