Commit r4809: fix java.lang.AssertionError while building index from unicode tiles

Version mkgmap-r4809 was committed by gerd on Fri, 22 Oct 2021

fix java.lang.AssertionError while building index from unicode tiles
mdrUnicode_v2.patch by Ticker Berkin

http://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap&rev=4809

Hi devs.

With this new version I get a new crash, but now with --code-page=936, not with unicode:

Exception in thread "main" java.lang.AssertionError: mdr20 value changed f=5174 t=5180 count=2995
    at uk.me.parabola.imgfmt.app.mdr.Mdr5Record.setMdr20(Mdr5Record.java:134)
    at uk.me.parabola.imgfmt.app.mdr.Mdr20.buildFromStreets(Mdr20.java:84)
    at uk.me.parabola.imgfmt.app.mdr.MDRFile.writeSections(MDRFile.java:335)
    at uk.me.parabola.imgfmt.app.mdr.MDRFile.write(MDRFile.java:270)
    at uk.me.parabola.mkgmap.combiners.MdrBuilder.onFinish(MdrBuilder.java:331)
    at uk.me.parabola.mkgmap.main.Main.endOptions(Main.java:690)
    at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:126)
    at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:147)
    at uk.me.parabola.mkgmap.main.Main.main(Main.java:118)

mkgmap command:
    java -ea -jar mkgmap-r4809.jar --index --bounds=bounds.zip --housenumbers --code-page=936 31177013.o5m

https://files.mkgmap.org.uk/download/524/31177013.o5m

On 22/10/21 at 9:42, svn commit wrote:
> Version mkgmap-r4809 was committed by gerd on Fri, 22 Oct 2021
> fix java.lang.AssertionError while building index from unicode tiles

_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi Carlos

mkgmap doesn't have a resources/sort for code-page 936 (Microsoft's character encoding for simplified Chinese). I was surprised it doesn't give any warning about this. I'll look more closely tomorrow to see what happens when it doesn't find the resource file.

I presume this didn't crash before, but did the index work?

I suspect this will have many of the same problems as unicode sort had for unspecified characters.

I'll also investigate the other change relating to collation strength.

Ticker

On Sat, 2021-10-23 at 22:26 +0200, Carlos Dávila wrote:
> With this new version I get a new crash, but now with --code-page=936, not with unicode:
> Exception in thread "main" java.lang.AssertionError: mdr20 value changed f=5174 t=5180 count=2995

As you thought, it didn't crash. I can't type Chinese characters, but using copy from JOSM/paste into BaseCamp, I could test address searches and they seem to work.

On 23/10/21 at 23:50, Ticker Berkin wrote:
> I presume this didn't crash before, but did the index work?

Hi Carlos

When mkgmap doesn't have a resources/sort for the given code page, it defaults the sort to cp1252 (Western European).

As part of building the various indexes, it sorts counties, regions, cities, streets etc using this sort, but any characters that don't have a defined sort order are ignored in the ordering. The result of this is that, using cp1252 on Chinese, all names seem the same.

I suspect that indexes are mostly empty and find is ignoring them.

There is some logic that is differentiating the names in these structures on exact naming, and this inconsistency causes the assertion crash.

The actual output in the map image is cp836, which Basecamp and Mapsource appear to handle. I don't know how well it is supported by Garmin devices.

Is there a reason for using cp836 rather than cp65001/unicode?

Ticker

On Sun, 2021-10-24 at 16:22 +0200, Carlos Dávila wrote:
> As you thought, it didn't crash. I can't type Chinese characters, but using copy from JOSM/paste into BaseCamp, I could test address searches and they seem to work.
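The effect Ticker describes above (a cp1252 fallback sort that ignores characters without a defined sort position) can be illustrated with a minimal Java sketch. This is not mkgmap code: the class IgnoredCharsSortDemo, its methods, and the rule used to decide which characters "have a sort position" are all assumptions made for illustration. It only shows how two different Chinese street names collapse to the same sort key while an exact comparison still tells them apart, which is the kind of inconsistency the assertion reports.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class IgnoredCharsSortDemo {

    // Hypothetical stand-in for a cp1252 sort table: only characters below
    // U+0100 that are letters, digits or a space get a sort position.
    static boolean hasSortPosition(char c) {
        return c < 0x100 && (Character.isLetterOrDigit(c) || c == ' ');
    }

    // Builds a sort key, silently dropping every character without a position.
    static String sortKey(String name) {
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (hasSortPosition(c)) {
                key.append(Character.toLowerCase(c));
            }
        }
        return key.toString();
    }

    // Number of distinct names when uniqueness is decided by sort key.
    static int distinctBySortKey(List<String> names) {
        Set<String> keys = new HashSet<>();
        for (String n : names) {
            keys.add(sortKey(n));
        }
        return keys.size();
    }

    // Number of distinct names when uniqueness is decided by exact equality.
    static int distinctExact(List<String> names) {
        return new HashSet<>(names).size();
    }

    public static void main(String[] args) {
        // "Beijing Road" and "Nanjing Road" in Chinese, written as escapes,
        // plus one Latin name for comparison.
        List<String> names = List.of("\u5317\u4eac\u8def", "\u5357\u4eac\u8def", "Main St");
        System.out.println("distinct by sort key: " + distinctBySortKey(names));
        System.out.println("distinct by exact name: " + distinctExact(names));
    }
}
```

In this sketch both Chinese names produce an empty key, so a structure deduplicated by sort key holds fewer entries than one deduplicated by exact name.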

The reason for using code-pages other than 65001 is that many Garmin devices fail to load non-original unicode maps. See Felix's explanation here: https://openmtbmap.org/download/odbl/#Compatibility_-_Unicode_vs_Non_Unicode...

On 24/10/21 at 18:14, Ticker Berkin wrote:
> Is there a reason for using cp836 rather than cp65001/unicode?

Hi Carlos & Gerd

When no resources/sort/cp... file matching the code-page exists, changing the default sort from cp1252 to cp65001/unicode stops this crash. It probably gives a much better index.

Except in the cases where the requested codepage uses transliterations from resources/chars/ascii or latin1, I think the default sort should be unicode. I haven't yet investigated how and when these transliterations occur.

Tomorrow I'll look at reasons why the exception happens, even when the sort is discarding significant characters.

Ticker

On Sun, 2021-10-24 at 18:31 +0200, Carlos Dávila wrote:
> The reason for using code-pages other than 65001 is that many Garmin devices fail to load non original unicode maps.
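The fallback rule proposed in the message above can be written down as a small decision function. This is a sketch under stated assumptions, not the actual mkgmap change: SortFallbackDemo, chooseSortCodePage, the availableSorts set and the transliterated flag are hypothetical names, and keeping cp1252 for transliterated output is one reading of the "except" clause in the message.

```java
import java.util.Set;

final class SortFallbackDemo {

    static final int CP1252 = 1252;
    static final int UNICODE = 65001;

    // availableSorts: code pages for which a sort resource is assumed to exist.
    // transliterated: true when names for the requested code page are assumed
    // to be transliterated to ascii/latin1 before being written.
    static int chooseSortCodePage(int requested, Set<Integer> availableSorts, boolean transliterated) {
        if (availableSorts.contains(requested)) {
            return requested;       // a matching sort resource exists, use it
        }
        if (transliterated) {
            return CP1252;          // assumption: keep the Western European sort here
        }
        return UNICODE;             // proposed default when nothing matches
    }

    public static void main(String[] args) {
        Set<Integer> available = Set.of(CP1252, UNICODE);
        System.out.println(chooseSortCodePage(936, available, false));
        System.out.println(chooseSortCodePage(1252, available, false));
    }
}
```

With the sets above, a request for code page 936 falls through to the unicode sort, while a request for 1252 keeps its own sort.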

Hi

There are a lot of problems relating to --index with all multi-byte character sets except Unicode:

1/ Unicode sets misc flags here and there in the MDR logic. It is unclear if these are relating to fixed >1 byte, variable length, or unicode explicitly.

2/ There is some logic that works out various positions of name components in the final output encoding and this only handles single-byte or unicode.

3/ The sort/collation tables are set for cp1252, so all but western-european characters will be ignored, resulting in what should be different names existing only once in the indexing structures.

The fix I'm working on for "r4809 crashes by buildung mdr" will stop this crash, but won't change any of the above.

I don't know if there has ever been an attempt to make mkgmap indexing work for character sets like cp836.

Ticker

On Sun, 2021-10-24 at 18:04 +0100, Ticker Berkin wrote:
> Except in the cases where the requested codepage uses transliterations from resources/chars/ascii or latin1, I think the default sort should be unicode.
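Point 2/ in the message above concerns positions of name components in the final output encoding. A minimal Java sketch of why this depends on the encoding: the byte offset of a component is the encoded length of everything before it, which differs between single-byte, double-byte and UTF-8 output. EncodedOffsetDemo and byteOffset are hypothetical names, not mkgmap code, and the Java charset name "GBK" is used here only as an assumed close match for code page 936.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodedOffsetDemo {

    // Byte offset, in the encoded form, of the component starting at charIndex.
    static int byteOffset(String name, int charIndex, Charset cs) {
        return name.substring(0, charIndex).getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Two Chinese characters, a space, then a Latin component.
        String name = "\u5317\u4eac Road";
        int charIndex = name.indexOf("Road");

        System.out.println("char index:   " + charIndex);
        System.out.println("UTF-8 offset: " + byteOffset(name, charIndex, StandardCharsets.UTF_8));

        // GBK may not be present in every Java runtime, so check first.
        if (Charset.isSupported("GBK")) {
            System.out.println("GBK offset:   " + byteOffset(name, charIndex, Charset.forName("GBK")));
        }
    }
}
```

Logic that assumes one byte per character, or a fixed Unicode width, would compute the wrong offset for the "Road" component in a variable-length multi-byte encoding.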

Hi Ticker,

I can't say much about the code but I wonder why you talk about cp836 while Carlos reported to use --code-page=936. Probably just a typo in your posts?

Gerd

________________________________________
From: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> on behalf of Ticker Berkin <rwb-mkgmap@jagit.co.uk>
Sent: Wednesday, 27 October 2021 11:11
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] Commit r4809: fix java.lang.AssertionError while building index from unicode tiles

> I don't know if there has ever been an attempt to make mkgmap indexing work for character sets like cp836.

participants (4)
- Carlos Dávila
- Gerd Petermann
- svn commit
- Ticker Berkin