Improving mkgmap's Unicode transliteration

12 Aug 2009

I'm going to Greece tomorrow, and I noticed to my dismay that mkgmap's
transliteration tables for the Greek alphabet were totally missing.

So I hacked up a small Perl script which uses the Unicode::UCD and
Text::Unidecode modules to fill in the blanks:

    avar@aoeu:~/src/mkgmap/resources/chars/ascii$ perl re-transliterate.pl < row03.trans > row03.trans.tmp && mv row03.trans.tmp row03.trans

The script and a patch to row03.trans, which Works For Me, are attached.
The tool can of course also be run on the rest of the files to fill in
more blanks.
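
For anyone reading without the attachment, the fill-in step can be
sketched roughly as below. This is not the attached Perl script. It is a
minimal Python sketch that assumes each row*.trans line looks like
"U+XXXX value # comment", with "?" marking a missing transliteration.
The GREEK table and the function names are hypothetical, and Unicode
decomposition is only a crude stand-in for what Text::Unidecode does.

```python
import re
import unicodedata

# Hypothetical stand-in for Text::Unidecode: a handful of Greek letters only.
GREEK = {
    "\u0391": "A", "\u0392": "B", "\u0393": "G", "\u0394": "D",
    "\u03b1": "a", "\u03b2": "b", "\u03b3": "g", "\u03b4": "d",
}

# Assumed line format: "U+0391 ?   # optional comment"
LINE = re.compile(r"^(U\+([0-9A-Fa-f]{4,6}))(\s+)(\S+)(.*)$")


def transliterate(ch):
    """Return an ASCII replacement for ch, or None if none is known."""
    if ch in GREEK:
        return GREEK[ch]
    # Strip accents: decompose, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    if base in GREEK:
        return GREEK[base]
    if base and base.isascii() and not base.isspace():
        return base
    return None


def fill_blanks(lines):
    """Replace '?' entries where a transliteration is known.

    Lines are expected without trailing newlines. Entries that already
    have a value, and lines that do not match, are passed through as-is.
    """
    out = []
    for line in lines:
        m = LINE.match(line)
        if m and m.group(4) == "?":
            repl = transliterate(chr(int(m.group(2), 16)))
            if repl is not None:
                line = f"{m.group(1)}{m.group(3)}{repl}{m.group(5)}"
        out.append(line)
    return out


sample = ["U+0391 ?   # GREEK CAPITAL LETTER ALPHA", "U+03AC ?", "U+0392 B"]
print("\n".join(fill_blanks(sample)))
```

Existing entries are never overwritten, so hand-tuned values in the
row* files survive a re-run.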

And my script can of course be modified a bit further to spit out
row* files for the Unicode rows that mkgmap doesn't cover yet.

I don't know what the row* files were originally based on, but there's
a lot of prior art for transliterating Unicode, and there's no need to
redo all this work for mkgmap. The Unicode Consortium has published
transliteration tables (which Text::Unidecode is largely based on).
It's much easier to use something like that than to do all the work
yourselves.

Anyway, off to pack for my flight.

Ævar Arnfjörð Bjarmason
