
I have a file that is the result of translating a shapefile of parcel boundaries obtained from MassGIS using a modified polyshp2osm. The file is in .osm format and has several thousand closed ways, each with a bunch of tags: boundary=parcel and some metadata from the shapefile. Long ago I used to just mix this in to a cloudmade massachusetts extract, split, and then compile with mkgmap. My mkgmap style file has an extra rule to map boundary=parcel to a minor depth contour, so I get thin blue lines for the parcel boundaries.
At some point, splitter started having trouble with this file. I tried again, with just "stow-lots.osm", and got the following, with splitter just sitting there consuming CPU. I can provide the file to anyone who wants to debug this, but I'm guessing the backtrace might be enough to spot the issue.
Thanks,
Greg
----------------------------------------
cache=
description=
geonames-file=
legacy-mode=false
mapid=63240001
max-areas=255
max-nodes=1000000
max-threads=8 (auto)
mixed=false
no-trim=false
output=pbf
output-dir=
overlap=2000
resolution=13
split-file=
status-freq=120
write-kml=
Elapsed time: 0s Memory: Current 81MB (2MB used, 79MB free) Max 3055MB
Time started: Tue Nov 22 19:31:23 EST 2011
Map is being split for resolution 13:
 - area boundaries are aligned to 0x800 map units
 - areas are multiples of 0x1000 map units wide and high
Processing stow-lots.osm in 1 file
Time: Tue Nov 22 19:31:23 EST 2011
Exact map coverage is (42.39023208618164,-71.55908346176147) to (42.46644973754883,-71.46432638168335)
Trimmed and rounded map coverage is (42.4072265625,-71.5869140625) to (42.4951171875,-71.4990234375)
Splitting nodes into areas containing a maximum of 1,000,000 nodes each...
Area (42.4072265625,-71.5869140625) to (42.4951171875,-71.4990234375) contains 12,291 nodes. DONE!
1 areas:
Area 63240001 covers (0x1e2800,0xffcd1800) to (0x1e3800,0xffcd2800)
Writing out split osm files Tue Nov 22 19:31:23 EST 2011
Processing 1 areas in a single pass
(42.4072265625,-71.5869140625) to (42.4951171875,-71.4990234375)
Starting pass 1 of 1, processing 1 areas (63240001 to 63240001)
Making SparseMultiMap
Making SparseMultiMap
Processing stow-lots.osm
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -3473
    at it.unimi.dsi.fastutil.longs.LongArrayList.getLong(LongArrayList.java:231)
    at it.unimi.dsi.fastutil.longs.AbstractLongList.get(AbstractLongList.java:403)
    at uk.me.parabola.splitter.SparseInt2ShortMapInline.containsKey(SparseInt2ShortMapInline.java:112)
    at uk.me.parabola.splitter.SparseInt2ShortMultiMap$Inner.put(SparseInt2ShortMultiMap.java:78)
    at uk.me.parabola.splitter.SparseInt2ShortMultiMap.put(SparseInt2ShortMultiMap.java:31)
    at uk.me.parabola.splitter.SplitProcessor.writeNode(SplitProcessor.java:209)
    at uk.me.parabola.splitter.SplitProcessor.processNode(SplitProcessor.java:118)
    at uk.me.parabola.splitter.OSMParser.endElement(OSMParser.java:243)
    at uk.me.parabola.splitter.AbstractXppParser.parse(AbstractXppParser.java:57)
    at uk.me.parabola.splitter.Main.processMap(Main.java:412)
    at uk.me.parabola.splitter.Main.writeAreas(Main.java:368)
    at uk.me.parabola.splitter.Main.split(Main.java:190)
    at uk.me.parabola.splitter.Main.start(Main.java:118)
    at uk.me.parabola.splitter.Main.main(Main.java:107)
Elapsed time: 1m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
Elapsed time: 3m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
Elapsed time: 5m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
Elapsed time: 7m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
Elapsed time: 9m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
Elapsed time: 11m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
Elapsed time: 13m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
Elapsed time: 15m 59s Memory: Current 81MB (3MB used, 78MB free) Max 3055MB
***** Full GC *****
Elapsed time: 18m 0s Memory: Current 81MB (1MB used, 80MB free) Max 3055MB
Elapsed time: 20m 0s Memory: Current 81MB (2MB used, 79MB free) Max 3055MB
Elapsed time: 22m 0s Memory: Current 81MB (2MB used, 79MB free) Max 3055MB
Elapsed time: 24m 0s Memory: Current 81MB (2MB used, 79MB free) Max 3055MB
Elapsed time: 26m 0s Memory: Current 81MB (2MB used, 79MB free) Max 3055MB

Hello Greg, the program seems to find only 12,291 nodes. Maybe you need parameter --mixed for this. Anyway, I'd like to analyse it to find out if the patched version r190 can handle it, so please send it to me. ciao, Gerd -- View this message in context: http://gis.638310.n2.nabble.com/failure-of-splitter-on-parcel-data-tp7022818... Sent from the Mkgmap Development mailing list archive at Nabble.com.

GerdP <gpetermann_muenchen@hotmail.com> writes:
the program seems to find only 12,291 nodes. Maybe you need parameter --mixed for this. Anyway, I'd like to analyse it to find out if the patched version r190 can handle it, so please send it to me.
(I've sent it in private mail.) Is --mixed only on the branch that contains r190? I had tried with the latest trunk. The file has 113743 nodes and ways, numbered -1 to -113743, of which 2603 are ways and 111137 are nodes. This seems to be the convention for converting to osm before adding into josm to upload. Does the splitter code perhaps assume non-negative ids?

Hello Greg, the parameter mixed is listed in your log, so your version supports it. Anyhow, you are right: splitter r185 assumes IDs to be in a range between 1 and 2.000.000. I think r190 will not work with negative numbers either, so that's something to be done. Gerd
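To see how an id range assumption can surface as an ArrayIndexOutOfBoundsException with a negative index, here is a minimal Java sketch. It is not splitter's code: the class name NegativeIdDemo, the chunk size and the storage layout are assumptions made purely for illustration. Values live in fixed-size chunks, and both the chunk number and the offset are derived from the id, which only works while ids are non-negative.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a chunked store that expects ids to be non-negative.
// All names and sizes here are hypothetical, not taken from splitter.
final class NegativeIdDemo {
    static final int CHUNK_SIZE = 64;

    /** Chunk number for an id; Java division truncates toward zero, so this goes negative once id <= -CHUNK_SIZE. */
    static int chunkIndex(long id) {
        return (int) (id / CHUNK_SIZE);
    }

    /** Offset inside the chunk; Java's % keeps the sign of the dividend, so negative ids can give negative offsets. */
    static int offset(long id) {
        return (int) (id % CHUNK_SIZE);
    }

    /** Returns true if reading the slot for this id throws an index exception. */
    static boolean lookupFails(long id, int chunkCount) {
        List<short[]> chunks = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            chunks.add(new short[CHUNK_SIZE]);
        }
        try {
            short unused = chunks.get(chunkIndex(id))[offset(id)];
            return false;
        } catch (IndexOutOfBoundsException e) {
            // Also covers ArrayIndexOutOfBoundsException, which is a subclass.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("id 100 fails: " + lookupFails(100, 4));
        System.out.println("id -113743 fails: " + lookupFails(-113743, 4));
    }
}
```

With ids numbered downward from -1, as in the parcel file, every lookup in a structure like this lands on a negative chunk number or a negative offset.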

Greg, it seems that splitter r123 was the last one that was able to handle this data. All later releases assume non-negative IDs. I'll try to find a fix for that in the memory_optimization branch. Gerd
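For readers wondering what accepting the full id range might involve, the following is a minimal sketch under stated assumptions; it is not the fix that went into the memory_optimization branch. The class SignedKeyChunkMap, its chunk size and the reserved UNASSIGNED value are all hypothetical. The idea shown is to key the chunks by Math.floorDiv and Math.floorMod instead of / and %, so that the chunk id may be negative while the offset always stays within 0..CHUNK_SIZE-1.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a sparse long -> short map that accepts any long key,
// including negative ones. Not splitter's SparseLong2ShortMapInline.
final class SignedKeyChunkMap {
    private static final int CHUNK_SIZE = 64;
    /** Reserved marker for "no value stored"; callers may not store it. */
    static final short UNASSIGNED = Short.MIN_VALUE;

    private final Map<Long, short[]> chunks = new HashMap<>();

    private static long chunkId(long key) {
        return Math.floorDiv(key, (long) CHUNK_SIZE); // rounds toward negative infinity
    }

    private static int offset(long key) {
        return (int) Math.floorMod(key, (long) CHUNK_SIZE); // always in 0..CHUNK_SIZE-1
    }

    void put(long key, short value) {
        if (value == UNASSIGNED) {
            throw new IllegalArgumentException("value is reserved as the unassigned marker");
        }
        short[] chunk = chunks.get(chunkId(key));
        if (chunk == null) {
            chunk = new short[CHUNK_SIZE];
            Arrays.fill(chunk, UNASSIGNED);
            chunks.put(chunkId(key), chunk);
        }
        chunk[offset(key)] = value;
    }

    /** Returns the stored value, or UNASSIGNED if the key is absent. */
    short get(long key) {
        short[] chunk = chunks.get(chunkId(key));
        return chunk == null ? UNASSIGNED : chunk[offset(key)];
    }

    boolean containsKey(long key) {
        return get(key) != UNASSIGNED;
    }
}
```

A hash map of chunks trades some memory overhead for simplicity; a memory-tuned implementation would presumably pack the chunks far more tightly.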

GerdP <gpetermann_muenchen@hotmail.com> writes:
it seems that splitter r123 was the last one that was able to handle this data. All later releases assume non-negative IDs.
I'll try to find a fix for that in the memory_optimization branch.
Thanks very much (and to WanMil for committing). I can confirm that after switching to the memory_optimization branch, I can split the parcel data file and the resulting map looks fine in RoadTrip. (This is using Charlie's TYP and style, plus a rule to map boundary=parcel to a minor depth contour.)

Hello Greg, thanks for the feedback. I wasn't able to test it with the small sample because it covered too small an area. Gerd

GerdP <gpetermann_muenchen@hotmail.com> writes:
thanks for the feedback. I wasn't able to test it with the small sample because it covered too small an area.
Actually that file is all I use for parcel data (just my town) so far. But I did a splitter run with the town parcel file plus the 6 new england states as pbf (from geofabrik), and from that built a map that seems to work fine, in RoadTrip and on an Oregon 450.

Hello,
http://gis.638310.n2.nabble.com/file/n7029263/memory_v7.patch
This patch for the memory_optimization branch solves the problem with negative IDs in OSM data.
Changes:
- SparseLong2ShortMapInline now allows all possible values in a long
- SplitProcessor uses a grid to reduce the number of nodeBelongsToThisArea() calls (this was discussed earlier by Steve and Scott Crosby). This improves performance when splitting europe or other large files.
Gerd
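The grid idea mentioned in the patch notes can be sketched roughly as follows. This is an illustration under assumptions, not the code in the patch: the classes AreaGridDemo, Area and Grid, the square n-by-n layout and the inclusive integer bounds are all hypothetical. Each grid cell precomputes which areas overlap it, so a node only has to be tested against the few areas listed for its cell rather than against every area.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a lookup grid over split areas; not splitter's code.
final class AreaGridDemo {
    /** Axis-aligned rectangle in integer map units, bounds inclusive. */
    static final class Area {
        final int minLat, minLon, maxLat, maxLon;

        Area(int minLat, int minLon, int maxLat, int maxLon) {
            this.minLat = minLat; this.minLon = minLon;
            this.maxLat = maxLat; this.maxLon = maxLon;
        }

        boolean contains(int lat, int lon) {
            return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
        }

        boolean intersects(int lat0, int lon0, int lat1, int lon1) {
            return minLat <= lat1 && maxLat >= lat0 && minLon <= lon1 && maxLon >= lon0;
        }
    }

    static final class Grid {
        private final List<Area> areas;
        private final int minLat, minLon, maxLat, maxLon, n, cellH, cellW;
        private final List<List<Integer>> cells = new ArrayList<>();

        /** Builds an n x n grid over the given bounds (assumed small enough not to overflow int). */
        Grid(List<Area> areas, int minLat, int minLon, int maxLat, int maxLon, int n) {
            this.areas = areas;
            this.minLat = minLat; this.minLon = minLon;
            this.maxLat = maxLat; this.maxLon = maxLon;
            this.n = n;
            this.cellH = (maxLat - minLat + n) / n; // ceil((range + 1) / n), at least 1
            this.cellW = (maxLon - minLon + n) / n;
            for (int r = 0; r < n; r++) {
                for (int c = 0; c < n; c++) {
                    int lat0 = minLat + r * cellH;
                    int lon0 = minLon + c * cellW;
                    List<Integer> hits = new ArrayList<>();
                    for (int i = 0; i < areas.size(); i++) {
                        if (areas.get(i).intersects(lat0, lon0, lat0 + cellH - 1, lon0 + cellW - 1)) {
                            hits.add(i);
                        }
                    }
                    cells.add(hits);
                }
            }
        }

        /** Indices of areas that overlap the cell holding this point; empty outside the grid. */
        List<Integer> candidates(int lat, int lon) {
            if (lat < minLat || lat > maxLat || lon < minLon || lon > maxLon) {
                return Collections.emptyList();
            }
            return cells.get(((lat - minLat) / cellH) * n + (lon - minLon) / cellW);
        }

        /** Exact containment test, run only against the cell's candidates. */
        List<Integer> areasContaining(int lat, int lon) {
            List<Integer> result = new ArrayList<>();
            for (int i : candidates(lat, lon)) {
                if (areas.get(i).contains(lat, lon)) {
                    result.add(i);
                }
            }
            return result;
        }
    }
}
```

The saving grows with the number of areas: with a few hundred areas, most cells would list only a handful of candidates, so the per-node cost no longer scales with the total area count.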

Thanks Gerd! Committed to r192. WanMil

Hello WanMil,
during the last days I tried a lot of different techniques to further improve memory usage, but nothing really improved the throughput of splitter. The default settings are now so well tuned that the only situation where --optimize-mem=true really helped was when splitting rather small files (pbf with 100MB or less). But these small files are never a problem regarding memory, and I guess that most users who want to split large files (europe or so) have machines with 8GB or more.
So, the attached patch removes all code handling the --optimize-mem parameter to get rid of needlessly complex code, keeping only those changes that really improve something. I've also added code to the unit tests and a small correction to the now unused SparseInt2ShortMap implementations.
Ciao, Gerd

Hi, I've committed it. Do you think it's ready to merge back to trunk if no one complains within the next week? WanMil
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi, yes, it's ready and I do not plan further changes. ciao, Gerd
Date: Wed, 30 Nov 2011 20:46:26 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [Patch v8] splitter optimization
Hi,
I've committed it. Do you think it's ready to merge back to trunk if no one complains within the next week?
WanMil

Hi, On Wed, Nov 30, Gerd Petermann wrote:
Hi,
yes, it's ready and I do not plan further changes.
Somewhere there is now a bug in splitter. With r191 and r193 I get this one if I try to split the SRTM data for the DACH area:
120.000.000 nodes processed... id=240833204
130.000.000 nodes processed... id=359663733
1.000.000 ways processed... id=369464864
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
    at java.util.ArrayList.get(ArrayList.java:324)
    at java.util.Collections$UnmodifiableList.get(Collections.java:1154)
    at uk.me.parabola.splitter.BinaryMapParser.parseWays(BinaryMapParser.java:95)
    at crosby.binary.BinaryParser.parse(BinaryParser.java:121)
    at crosby.binary.BinaryParser.handleBlock(BinaryParser.java:68)
    at crosby.binary.file.FileBlock.process(FileBlock.java:135)
    at crosby.binary.file.BlockInputStream.process(BlockInputStream.java:34)
    at uk.me.parabola.splitter.Main.processMap(Main.java:404)
    at uk.me.parabola.splitter.Main.calculateAreas(Main.java:288)
    at uk.me.parabola.splitter.Main.split(Main.java:164)
    at uk.me.parabola.splitter.Main.start(Main.java:119)
    at uk.me.parabola.splitter.Main.main(Main.java:108)
This worked fine with splitter r181 and r185.
Thorsten
--
Thorsten Kukuk, Project Manager/Release Manager SLES
SUSE LINUX Products GmbH, Maxfeldstr. 5, D-90409 Nuernberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)

Hello Thorsten, I think this patch should fix it: http://gis.638310.n2.nabble.com/file/n7050751/memory_v9.patch I think the chance of hitting this error was 1:1000000. Gerd

Additional remark: The error occurs in the routine that is counting the ways while parsing the input file to calculate the size of the areas. I'd prefer to stop reading the file when the first way is found and --mixed is not specified. Instead, splitter counts all the ways (and relations) just to print some log messages about how many ways were processed. I did not find a simple way to stop the pbf reader. Maybe Scott Crosby knows one? Gerd
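One generic way to stop a callback-driven parser early is to have the handler throw a dedicated unchecked exception that the caller catches. The sketch below shows only that general pattern; the ElementHandler interface, the StopParsing exception and the toy parse() method are hypothetical stand-ins, not the crosby.binary API, and whether such an exception would propagate cleanly through the real pbf reader is not something this thread establishes.

```java
// Hypothetical sketch of aborting a push-style parse at the first way.
final class EarlyStopDemo {
    /** Stand-in callback interface; not the crosby.binary API. */
    interface ElementHandler {
        void onNode(long id);
        void onWay(long id);
    }

    /** Thrown by a handler to abort parsing; stack trace disabled to keep it cheap. */
    static final class StopParsing extends RuntimeException {
        StopParsing() {
            super("stop parsing", null, false, false);
        }
    }

    /** Toy parser: each 'N' in the input is a node, any other character is a way. */
    static void parse(String kinds, ElementHandler handler) {
        for (int i = 0; i < kinds.length(); i++) {
            if (kinds.charAt(i) == 'N') {
                handler.onNode(i);
            } else {
                handler.onWay(i);
            }
        }
    }

    /** Counts nodes and gives up as soon as the first way is seen. */
    static int countNodesUntilFirstWay(String kinds) {
        final int[] nodes = {0};
        try {
            parse(kinds, new ElementHandler() {
                @Override public void onNode(long id) { nodes[0]++; }
                @Override public void onWay(long id) { throw new StopParsing(); }
            });
        } catch (StopParsing e) {
            // Expected: the remaining input is deliberately left unread.
        }
        return nodes[0];
    }
}
```

This relies on the input listing all nodes before any way, which is why it would only be safe when --mixed is not specified.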

Hello list, I've tried to use the gmap-mdr-branch (http://www.mkgmap.org.uk/snapshots/mkgmap-gmap-mdr-r2130.jar). With --index and --gmapsupp option I get a working gmapsupp.img-file, but no mdx-file. Without --gmapsupp-option I get the mdx-file. Cheers, Martin

Hi
I've tried to use the gmap-mdr-branch (http://www.mkgmap.org.uk/snapshots/mkgmap-gmap-mdr-r2130.jar). With --index and --gmapsupp option I get a working gmapsupp.img-file, but no mdx-file. Without --gmapsupp-option I get the mdx-file.
This is related to my previous post to the list. The _mdr.img and .mdx files are related and are created together. With the --gmapsupp option, the index is created inside the gmapsupp file instead of externally. I'll think about how both could be created. ..Steve

On 2011-12-01 21:21, Steve Ratcliffe wrote:
The _mdr.img and .mdx files are related and are created together. With the --gmapsupp option, the index is created inside the gmapsupp file instead of externally.
I'll think about how both could be created.
..Steve
This is a first cut of getting mkgmap to generate a searchable street index without having to rely on MapSource, is it? Steve

On 02/12/11 11:07, Steve Hosgood wrote:
This is a first cut of getting mkgmap to generate a searchable street index without having to rely on MapSource, is it?
Yes, it creates the index directly in the gmapsupp.img without having to use MapSource. ..Steve

Thanks! Committed: r194 WanMil

On Thu, Dec 01, GerdP wrote:
Hello Thorsten,
I think this patch should fix it.
http://gis.638310.n2.nabble.com/file/n7050751/memory_v9.patch memory_v9.patch
yes, it seems to work now. Thanks, Thorsten

In order to not get the merge lost ... I've been using the patched splitter all along. Would be nice to see the trunk merged with the memory optimization branch...

Hi, I've merged back the memory_optimization branch. Please check if the merge is correct because I got some conflicts (and I am not a merge expert...). Have fun! WanMil

Hello WanMil, no, the merged version is missing a correction. I'll look at it more closely tomorrow; today I have no more time. Gerd

Hello WanMil, forget my previous post. It was my working copy on the netbook that was out of date. Splitter r198 looks good. ciao, Gerd
participants (9)
- Felix Hartmann
- Gerd Petermann
- GerdP
- Greg Troxel
- Martin
- Steve Hosgood
- Steve Ratcliffe
- Thorsten Kukuk
- WanMil