Mili 1.0

LDC-IL Transliterator

Mili is a transliterator application developed at LDC-IL to convert text among various Indian scripts and the Roman script.

Indian Script ⇄ Roman Script Conversion:

Mili transliterates the prevalent Indian scripts to Roman and vice versa using a predefined schema.

Indian Script ⇄ Indian Script Conversion:

The Unicode standard organizes the characters of prevalent Indian scripts (Malayalam, Kannada, Telugu, Tamil, Odia, Gujarati, Gurmukhi, Bengali and Devanagari) such that equivalent characters are spaced by multiples of 128 code points. Each of these nine scripts occupies a block of 128 code points within the range U+0900 to U+0D7F. Mili leverages this arrangement for transliteration among these scripts.

For instance, the Malayalam character ‘അ’ (/a/) is U+0D05, and its Kannada equivalent ‘ಅ’ is U+0C85, exactly 128 code points lower.
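
As a small illustration of this layout (not part of Mili itself), the Python sketch below looks up the letter A in each of the nine scripts by its Unicode character name and prints its code point. The list name SCRIPTS is chosen here for illustration only. Unicode character names use ORIYA for Odia.

```python
import unicodedata

# Unicode character-name prefixes for the nine scripts, in code-point order.
# Unicode names Odia characters "ORIYA".
SCRIPTS = ["DEVANAGARI", "BENGALI", "GURMUKHI", "GUJARATI", "ORIYA",
           "TAMIL", "TELUGU", "KANNADA", "MALAYALAM"]

# The letter A of each script, found by name rather than by hard-coded value.
letters_a = [unicodedata.lookup(f"{script} LETTER A") for script in SCRIPTS]
code_points = [ord(ch) for ch in letters_a]

for script, ch in zip(SCRIPTS, letters_a):
    # ord(ch) % 128 is the position inside the script's 128-code-point block.
    print(f"{script:<11} {ch}  U+{ord(ch):04X}  offset in block: {ord(ch) % 128}")
```

Every line reports offset 5, and each code point is 128 higher than the one before it, from U+0905 (Devanagari) to U+0D05 (Malayalam).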

Transliteration Formula:
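Mili uses a mathematical formula to map characters from the source script to the target script. Here intUCode gives the code point of a character, CharFromUnicode gives the character at a code point, and the two ScriptBlock values are the positions of the scripts' blocks within the range above: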

charTarget = CharFromUnicode(intUCode(charSource) + ((TargetScriptBlock - SourceScriptBlock) * 128))

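Example: To transliterate the Malayalam character ‘അ’ (/a/, U+0D05) to Kannada, note that the Kannada block immediately precedes the Malayalam block, so TargetScriptBlock - SourceScriptBlock = -1. The formula gives U+0D05 - 128 = U+0C85, which is the Kannada character ‘ಅ’.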
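
The formula can be sketched in Python as follows. This is a minimal illustration, not Mili's actual source: the block numbering (0 for Devanagari up to 8 for Malayalam) and the names SCRIPT_BLOCK, transliterate_char and transliterate are assumptions made here. Only the difference between block numbers matters to the formula.

```python
BLOCK_SIZE = 128
FIRST_BLOCK_START = 0x0900  # start of the Devanagari block

# Assumed numbering: blocks counted in code-point order from U+0900.
SCRIPT_BLOCK = {
    "Devanagari": 0, "Bengali": 1, "Gurmukhi": 2, "Gujarati": 3, "Odia": 4,
    "Tamil": 5, "Telugu": 6, "Kannada": 7, "Malayalam": 8,
}


def transliterate_char(ch: str, source: str, target: str) -> str:
    """Shift one character from the source block to the target block."""
    code = ord(ch)
    source_start = FIRST_BLOCK_START + SCRIPT_BLOCK[source] * BLOCK_SIZE
    # Characters outside the source block (spaces, digits, Latin) pass through.
    if not source_start <= code < source_start + BLOCK_SIZE:
        return ch
    shift = (SCRIPT_BLOCK[target] - SCRIPT_BLOCK[source]) * BLOCK_SIZE
    return chr(code + shift)


def transliterate(text: str, source: str, target: str) -> str:
    """Apply the per-character shift to a whole string."""
    return "".join(transliterate_char(ch, source, target) for ch in text)


print(transliterate("അ", "Malayalam", "Kannada"))  # prints ಅ
```

The sketch applies the shift blindly, so it can land on an unassigned code point when the target script has no character at that position.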

Handling Aberrations

For characters that do not align perfectly, such as Kannada ‘ೞ’ (/ɻa/) and its Malayalam and Tamil counterparts, Mili applies manual checks and re-mappings. If the target script lacks a corresponding character, a nearby phoneme is used. Cultural symbols like Gurmukhi's ‘ੴ’ (Ek Onkar) are not transliterated.
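
One way such exceptions might be layered on top of the block shift is sketched below. The table names (LLLA, KEEP_AS_IS, NEAREST) and the function are hypothetical, and the single "nearby phoneme" entry (aspirated KHA falling back to KA in Tamil) is an illustrative assumption, not Mili's documented mapping. The code points themselves are standard Unicode: Kannada ೞ is U+0CDE, while Malayalam ഴ and Tamil ழ sit at U+0D34 and U+0BB4.

```python
import unicodedata

BLOCK_SIZE = 128
BLOCK_START = {"Devanagari": 0x0900, "Gurmukhi": 0x0A00, "Tamil": 0x0B80,
               "Kannada": 0x0C80, "Malayalam": 0x0D00}

# Manual re-mapping: the same sound sits at different offsets in these blocks.
LLLA = {"Kannada": "\u0CDE", "Malayalam": "\u0D34", "Tamil": "\u0BB4"}

# Symbols that are passed through untouched (Gurmukhi Ek Onkar).
KEEP_AS_IS = {"\u0A74"}

# Hypothetical fallback table: target script -> {missing offset: nearby offset}.
NEAREST = {"Tamil": {0x16: 0x15}}  # assumed example: KHA falls back to KA


def transliterate_char(ch: str, source: str, target: str) -> str:
    if ch in KEEP_AS_IS:
        return ch
    if ch == LLLA.get(source) and target in LLLA:
        return LLLA[target]
    offset = ord(ch) - BLOCK_START[source]
    if not 0 <= offset < BLOCK_SIZE:
        return ch  # not a character of the source script
    offset = NEAREST.get(target, {}).get(offset, offset)
    candidate = chr(BLOCK_START[target] + offset)
    # unicodedata.name returns the default (None) for unassigned code points;
    # in that case the source character is left unchanged.
    return candidate if unicodedata.name(candidate, None) else ch


print(transliterate_char("ೞ", "Kannada", "Malayalam"))  # prints ഴ
```

Checking the overrides before the arithmetic keeps the common path a plain block shift and confines the exceptions to small lookup tables.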

Additional Script Support

Mili also supports transliteration for scripts outside the U+0900 to U+0D7F range, such as Brahmi, Chakma, Kaithi, Kharoshthi, Lepcha, Limbu, Saurashtra, Sharada, Syloti Nagri, Takri, and Tirhuta. For these scripts, characters are first mapped to Kannada, which serves as an intermediary, and the transliteration formula is then applied. For Meetei Mayek, the Bengali (Bangla) script is used as the intermediary.
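
The two-step route through an intermediary can be sketched as below for Brahmi. The two-entry table BRAHMI_TO_KANNADA is a hypothetical fragment written for this illustration; Mili's real mapping tables are not given here. Characters are looked up by their Unicode names to avoid hard-coding code points.

```python
import unicodedata

KANNADA_START = 0x0C80
BLOCK_START = {"Devanagari": 0x0900, "Kannada": 0x0C80, "Malayalam": 0x0D00}


def named(name: str) -> str:
    """Return the character with the given Unicode name."""
    return unicodedata.lookup(name)


# Hypothetical fragment of a Brahmi -> Kannada table (illustration only).
BRAHMI_TO_KANNADA = {
    named("BRAHMI LETTER A"): named("KANNADA LETTER A"),
    named("BRAHMI LETTER KA"): named("KANNADA LETTER KA"),
}


def from_brahmi(text: str, target: str) -> str:
    """Map Brahmi to Kannada by table, then shift Kannada to the target block."""
    out = []
    for ch in text:
        pivot = BRAHMI_TO_KANNADA.get(ch)
        if pivot is None:
            out.append(ch)  # unmapped characters pass through unchanged
            continue
        # Difference of block starts equals (block difference) * 128.
        out.append(chr(ord(pivot) + BLOCK_START[target] - KANNADA_START))
    return "".join(out)


print(from_brahmi(named("BRAHMI LETTER KA"), "Devanagari"))  # prints क
```

With this design, each additional script needs only one table into the intermediary rather than a separate table for every target script.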

Mili is an essential tool for linguists, researchers, and anyone working with multiple Indian scripts, enabling transliteration across different scripts.

Credits: Rajesha N and Manasa G, Linguistic Data Consortium for Indian Languages (LDC-IL), Central Institute of Indian Languages, Mysore.

Mili 1.0 Interface :