Transliteration of Persian and Arabic script
Currently, if the name tag is written in Persian or Arabic script, it ends up transliterated into Latin script on a Garmin device. I am not sure whether Garmin does this internally or mkgmap does it. Can any developer clear this up?

I am asking because the transliteration contains errors: it seems some Persian characters (at least one, the "alef") are not mapped to Latin ones, so they end up as question marks.

Some examples:

source:  پمپ بنزین نامجو
current: pmp bnzyn n?mjw
correct: pmp bnzyn namjw

source:  میدان سعدآباد
current: myd?n s'dab?d
correct: mydan s'dabad

source:  مجتمع پزشکی ثامن الائمه
current: mjtm` pzshkhy th?mn ?l?y'Mh
correct: mjtm` pzshkhy thamn alay'Mh

I've found script/make-transliteration-table.pl and, as far as I understand, the mapping error seems to be in Perl's Text::Unidecode module. Maybe the script's author, Ævar, can take a look :)

Thanks in advance,
Claudius
The transliteration is definitely done in mkgmap and not in the Garmin unit. But don't ask me for details on where and how it happens.

Claudius Henrichs wrote:
> Currently, if the name tag is written in Persian or Arabic script, it ends up transliterated into Latin script on a Garmin device. I am not sure whether Garmin does this internally or mkgmap does it. Can any developer clear this up?
> I am asking because the transliteration contains errors: it seems some Persian characters (at least one, the "alef") are not mapped to Latin ones, so they end up as question marks.
> Some examples: source: پمپ بنزین نامجو current: pmp bnzyn n?mjw correct: pmp bnzyn namjw
> source: میدان سعدآباد current: myd?n s'dab?d correct: mydan s'dabad
> source: مجتمع پزشکی ثامن الائمه current: mjtm` pzshkhy th?mn ?l?y'Mh correct: mjtm` pzshkhy thamn alay'Mh
> I've found script/make-transliteration-table.pl and, as far as I understand, the mapping error seems to be in Perl's Text::Unidecode module. Maybe the script's author, Ævar, can take a look :)
> Thanks in advance, Claudius
_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
On Tue, May 18, 2010 at 07:39:54PM +0200, Johann Gail wrote:
> The transliteration is definitely done in mkgmap and not in the Garmin unit. But don't ask me for details on where and how it happens.
Right. If someone could supply some details, I could submit some patches. It would be nice to translate curly quotes to ' or ", because Garmin doesn't seem to support those cp1252 code points.

Also, if it is not already the case, we should translate Roman numerals and other non-ASCII Unicode numerals into ASCII. At least around here, there are road signs that say "Ring Ⅲ", but the way currently carries the name "Ring III" and Garmin displays it as "Ring Iii". (I think I read somewhere that there is a control code that prevents the Garmin lower-casing.) How hard would it be to translate strings into strings, say, "ⅯⅯⅨ" into "2009"?

There is also something strange about the Russian encoding. The letter "я" is translated into "â", while I believe the correct transliteration would be "ya" or "ja", at least at the end of a word. But I can live with the "â"; at least it looks like "a" and is easy to recognize.

Marko
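
To get a feel for how hard the two translations suggested above would be, here is a minimal Python sketch. It is not mkgmap code; the names QUOTES, to_ascii_quotes, expand_roman_characters and roman_to_int are made up for illustration. It assumes that the dedicated Unicode Roman numeral characters (U+2160 to U+217F) should be expanded through their compatibility decomposition, and that curly quotes should map to the plain ASCII quotes. Whether a name should go all the way to Arabic digits is a separate question; the last function only shows that it is mechanically simple.

```python
import unicodedata

# Hypothetical replacement table for the curly quotes (assumption, not mkgmap's table).
QUOTES = {
    "\u2018": "'", "\u2019": "'",   # left/right single quotation mark
    "\u201c": '"', "\u201d": '"',   # left/right double quotation mark
}

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def to_ascii_quotes(text):
    """Replace curly quotes with their plain ASCII counterparts."""
    return text.translate(str.maketrans(QUOTES))


def expand_roman_characters(text):
    """Expand Roman numeral characters (U+2160..U+217F) into ASCII letters."""
    out = []
    for ch in text:
        if "\u2160" <= ch <= "\u217f":
            # These characters carry a compatibility decomposition, e.g. U+2162 -> "III".
            out.append(unicodedata.normalize("NFKD", ch))
        else:
            out.append(ch)
    return "".join(out)


def roman_to_int(roman):
    """Convert an ASCII Roman numeral such as "MMIX" to an integer."""
    values = [ROMAN_VALUES[ch] for ch in roman.upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value in front of a larger one is subtracted (IX = 9).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total
```

With this sketch, "Ring Ⅲ" becomes "Ring III", and "ⅯⅯⅨ" first becomes "MMIX" and can then be turned into the integer 2009 by roman_to_int.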
On 18/05/10 20:23, Marko Mäkelä wrote:
> On Tue, May 18, 2010 at 07:39:54PM +0200, Johann Gail wrote:
>> The transliteration is definitely done in mkgmap and not in the Garmin unit. But don't ask me for details on where and how it happens.
> Right. If someone could supply some details, I could submit some patches. It would be nice to translate curly quotes to ' or ", because Garmin doesn't seem to support those cp1252 code points.
> Also, if it is not already the case, we should translate Roman numerals and other non-ASCII Unicode numerals into ASCII. At least around here, there are road signs that say "Ring Ⅲ", but the way currently carries the name "Ring III" and Garmin displays it as "Ring Iii". (I think I read somewhere that there is a control code that prevents the Garmin lower-casing.) How hard would it be to translate strings into strings, say, "ⅯⅯⅨ" into "2009"?
I think Roman numerals should be displayed as upper-case Roman numerals, at least when they are part of a ref or name. If you have always seen a street name as, say, "Juan XXIII", it may be difficult to recognize it as "Juan 23". For dates it could be better to translate into Arabic numerals, as you propose.
> I think Roman numerals should be displayed as upper-case Roman numerals, at least when they are part of a ref or name. If you have always seen a street name as, say, "Juan XXIII", it may be difficult to recognize it as "Juan 23". For dates it could be better to translate into Arabic numerals, as you propose.
Yes, that's my opinion too. Don't try to transliterate Roman numerals. If Roman numerals have been entered in the database, someone did it intentionally. If this is really wrong, then correct the database.
> There is also something strange about the Russian encoding. The letter "я" is translated into "â", while I believe the correct transliteration would be "ya" or "ja", at least at the end of a word. But I can live with the "â", at least it looks like "a" and is easy to recognize.
You can transliterate into ASCII or Latin-1 (i.e. depending on whether you give --latin1 or not). The ASCII version is "ia" and the Latin-1 version is "â". The Latin-1 versions were generated by icu4j. If the Latin-1 versions are not useful, then they can be removed.

..Steve
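
The difference between the two targets can be sketched in a few lines of Python. This is only an illustration, not mkgmap's implementation: the tables ASCII_TABLE and LATIN1_TABLE and the function transliterate are hypothetical, and the only mapping taken from this thread is the one for "я" (U+044F).

```python
# Hypothetical one-entry tables; only the mapping for U+044F comes from the thread.
ASCII_TABLE = {"\u044f": "ia"}
LATIN1_TABLE = {"\u044f": "\u00e2"}   # U+00E2 is "â", which exists in Latin-1


def transliterate(text, latin1=False):
    """Transliterate text to ASCII, or to Latin-1 when latin1 is True."""
    table = LATIN1_TABLE if latin1 else ASCII_TABLE
    target = "latin-1" if latin1 else "ascii"
    out = []
    for ch in text:
        try:
            # Characters the target charset can already represent pass through.
            ch.encode(target)
            out.append(ch)
        except UnicodeEncodeError:
            # Everything else is looked up; unmapped characters become "?".
            out.append(table.get(ch, "?"))
    return "".join(out)
```

Calling transliterate("я") yields "ia", while transliterate("я", latin1=True) yields "â", mirroring the two variants described above.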
On 17/05/10 11:50, Claudius Henrichs wrote:
> Currently, if the name tag is written in Persian or Arabic script, it ends up transliterated into Latin script on a Garmin device. I am not sure whether Garmin does this internally or mkgmap does it. Can any developer clear this up?
> I am asking because the transliteration contains errors: it seems some Persian characters (at least one, the "alef") are not mapped to Latin ones, so they end up as question marks.
> Some examples: source: پمپ بنزین نامجو current: pmp bnzyn n?mjw correct: pmp bnzyn namjw
I think that these characters are in the file resources/chars/ascii/row06.trans. The line that needs changing in this case would be:

    U+0627 ? # Character ا

..Steve
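
To illustrate the effect of that one line, here is a small Python sketch. It assumes that every line of the .trans file has the shape quoted above (code point, replacement, optional "#" comment). Only the U+0627 line is taken from this thread; the other lines, and the names parse_trans_line, load_table and transliterate, are hypothetical and chosen only so that the first example from the original report can be reproduced.

```python
# Only the U+0627 line is quoted in the thread; the rest are assumed for illustration.
CURRENT_LINES = [
    "U+0627 ? # Character \u0627",
    "U+0628 b",
    "U+062C j",
    "U+0632 z",
    "U+0645 m",
    "U+0646 n",
    "U+0648 w",
    "U+067E p",
    "U+06CC y",
]

# The proposed fix: map alef to "a" instead of "?".
FIXED_LINES = [line.replace("U+0627 ?", "U+0627 a") for line in CURRENT_LINES]


def parse_trans_line(line):
    """Parse one 'U+XXXX replacement # comment' line into (char, replacement)."""
    fields = line.split("#", 1)[0].split()
    if len(fields) != 2 or not fields[0].startswith("U+"):
        return None
    return chr(int(fields[0][2:], 16)), fields[1]


def load_table(lines):
    """Build a character-to-replacement dict from the parsable lines."""
    parsed = (parse_trans_line(line) for line in lines)
    return dict(entry for entry in parsed if entry is not None)


def transliterate(text, table):
    """Keep ASCII characters, look up the rest, and use '?' when unmapped."""
    return "".join(ch if ord(ch) < 128 else table.get(ch, "?") for ch in text)


# First example from the report, written with escapes to avoid ambiguity.
SAMPLE = "\u067e\u0645\u067e \u0628\u0646\u0632\u06cc\u0646 \u0646\u0627\u0645\u062c\u0648"
```

With the current line, transliterate(SAMPLE, load_table(CURRENT_LINES)) gives "pmp bnzyn n?mjw"; with the changed line it gives "pmp bnzyn namjw", matching the "current" and "correct" forms in the report.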
participants (5)
- Carlos Dávila
- Claudius Henrichs
- Johann Gail
- Marko Mäkelä
- Steve Ratcliffe