
I ran this splitter against my America extract on a Centrino 2 laptop with 4 GB RAM; the results are below. Splitter was working on ~243 million nodes and Java had 3.9 GB of heap space (-Xmx3900m).

Memory usage before I went to bed: 4.3 GB virtual, 3.4 GB in memory. Splitter was calculating the areas then. Hundreds of messages like these followed:

  Area (37.4853515625,-123.5302734375) to (38.4521484375,-122.255859375) contains 439,211 nodes
  split horizontally into:
  (37.4853515625,-123.5302734375) to (38.4521484375,-122.51953125) (159,737 nodes) and
  (37.4853515625,-122.51953125) to (38.4521484375,-122.255859375) (279,474 nodes)

Then finally:

  Area (37.4853515625,-122.255859375) to (38.4521484375,-121.728515625) contains 430,424 nodes
  Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
      at java.lang.Integer.valueOf(Integer.java:601)
      at uk.me.parabola.splitter.AreaSplitter.splitVert(AreaSplitter.java:162)
      at uk.me.parabola.splitter.AreaSplitter.split(AreaSplitter.java:74)
      at uk.me.parabola.splitter.Main.calculateAreas(Main.java:175)
      at uk.me.parabola.splitter.Main.split(Main.java:96)
      at uk.me.parabola.splitter.Main.main(Main.java:79)

I think it was getting close, so perhaps a gigabyte more memory would have been enough...

Chris Miller wrote:
I've now made some changes that remove the 4-area per relation limit and also the 255 tile limit. The 255 tile fix is just a workaround for now, it requires a full reprocess for each set of 255 tiles rather than tackling them all in a single pass. This will still be significantly better than only processing 4 areas at a time however.
I've made some code changes to allow more than 4 areas per relation and more than 255 tiles per split. I won't have time to commit these changes until I get home later this evening. If you want to try it out in the meantime you can download a version from here (please treat this as a very unofficial and beta-quality release!):
http://redyeti.net/splitter.jar
Here's what's changed from the version that's currently in the codestore:
- Replaced the SAX parser with XPP for modest performance and memory benefits
- Improved program output to give more detail about what's going on (work in progress)
- Removed limit of 4 areas per relation (no memory or performance penalty)
- Removed limit of 255 areas per split. When there are more than 255 areas, multiple passes are made with up to 255 areas processed per pass
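
To illustrate the multi-pass idea in the last item, here is a minimal sketch of how a list of areas could be grouped into passes of at most 255. This is not the splitter's actual code: the class name PassPlanner, the method planPasses and the use of plain integers as stand-ins for areas are all hypothetical, and only the 255-per-pass figure comes from the change list above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the real splitter source.
public class PassPlanner {
    // Taken from the change list: up to 255 areas are processed per pass.
    static final int MAX_AREAS_PER_PASS = 255;

    // Groups the areas into consecutive batches of at most maxPerPass each.
    // Each batch would then need its own full read of the input file.
    static <T> List<List<T>> planPasses(List<T> areas, int maxPerPass) {
        if (maxPerPass <= 0) {
            throw new IllegalArgumentException("maxPerPass must be positive");
        }
        List<List<T>> passes = new ArrayList<>();
        for (int start = 0; start < areas.size(); start += maxPerPass) {
            int end = Math.min(start + maxPerPass, areas.size());
            passes.add(new ArrayList<>(areas.subList(start, end)));
        }
        return passes;
    }

    public static void main(String[] args) {
        // Integers stand in for areas; 600 is an arbitrary example count.
        List<Integer> areas = new ArrayList<>();
        for (int i = 0; i < 600; i++) {
            areas.add(i);
        }
        List<List<Integer>> passes = planPasses(areas, MAX_AREAS_PER_PASS);
        for (int i = 0; i < passes.size(); i++) {
            System.out.println("Pass " + (i + 1) + ": " + passes.get(i).size() + " areas");
        }
    }
}
```

With 600 areas this gives three passes (255, 255 and 90 areas), so the input would be read three times rather than once.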
Any feedback, questions or suggestions are welcome. I haven't tried this on anything as big as North/South America yet, would be very interested to hear how it goes.
Chris
Great info, thanks! I'll try this solution, it sounds almost perfect...