Memory limits for mkgmap and splitter

Hi,

I used to process the europe.osm file (from geofabrik.de) using splitter. It no longer finishes in finite time because of the 4 GB limit of my hardware RAM.

I have a suggestion, which comes from my experience years ago, when I had to crunch gigabytes of textual data using Perl. In the old days I had written some scripts that used associative arrays for data storage; I needed access to the whole data set "at once" for report generation. The trick was to test the processing on a small amount of data and, for the big run, "tie" the associative-array functionality to BerkeleyDB. The processing was slow, but possible without spending any more time searching for another solution.

Would it be possible to add an option where the data, instead of living in internal structures, is stored partly on disk and partly in a database engine's cache (see the BerkeleyDB implementation)? I am aware that the processing would be 10 or 100 times slower, but it would not take my workstation down grabbing all RAM and swap space, and it could run in the background, leaving me a substantial amount of RAM and one CPU core free for other tasks.

In Perl this is the built-in "tie" functionality; I have, however, no idea what data structures mkgmap and splitter use or how to translate the trick into Java. If you think the change is trivial and point me to the critical section, I might see if I can get it implemented.

Any ideas?

Paul

--
Don't take life too seriously; you will never get out of it alive.
  -- Elbert Hubbard
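Paul's "tie" idea, swapping an in-memory associative array for disk-backed storage, could be translated to Java along these lines. This is only a hypothetical sketch (the class name and file layout are invented, not splitter code): a long-to-long map that stores each value in a fixed 8-byte slot of a RandomAccessFile, indexed directly by key, which suits dense non-negative IDs such as OSM node IDs.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical sketch of a disk-backed map, analogous to Perl's
// tie %hash, 'DB_File', ...: lookups hit a file instead of the heap.
// Assumes dense non-negative long keys (e.g. OSM node IDs), so each
// key owns a fixed 8-byte slot at offset key * 8.
class DiskBackedLongMap {
    private final RandomAccessFile file;

    DiskBackedLongMap(File f) {
        try {
            file = new RandomAccessFile(f, "rw");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void put(long key, long value) {
        try {
            file.seek(key * 8);     // extends the file if needed
            file.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long get(long key) {
        try {
            if (key * 8 + 8 > file.length())
                return 0;           // slot was never written
            file.seek(key * 8);
            return file.readLong();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void close() {
        try {
            file.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Substituting something like this for an in-memory node map trades RAM for much slower disk seeks, which matches the 10-100x slowdown Paul anticipates.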

Just to get some figures to talk about: on an 8-core machine with 8 GB RAM it takes about 4 hours to split North/South America and about 1 hour to split the rest of the world. The pre-splitting of the planet into those two large sections is done using Osmosis in my case.

So at 10x processing time the splitting process could still be considered workable, but 20 to 100x processing time is not usable in production environments where weekly updates are applied. Getting some more RAM into your machine might be the easiest solution, as long as the chipset/BIOS allows it (some cheap AMD chipsets/motherboards allow even 16 GB RAM). RAM is cheap these days.

2009/7/1 Lambertus <osm@na1400.info>:
My notebook (Thinkpad T500) is already maxed out. The only option would be replacing 2x2GB with 2x4GB, and that is not cheap; it would add another 50% of the original notebook price, otherwise I would already have done it. Buying an additional PC just for splitter and mkgmap is not something I am willing to do.

I have just found some examples of "tie"-like functionality for Java: http://www.arachna.com/roller/spidaman/date/20060208

Paul

Instead of converting code to use RAM/hard disk, why don't you just increase your computer's virtual memory? Whether Linux (swap space) or Windows (paging file), as long as you have spare hard drive space you can increase virtual memory very easily, which will stop program failures due to memory limitations.

2009/7/1 WessexMario <wessexmario-osm@yahoo.co.uk>:
Been there, done that... I have no idea how the JVM manages memory, but as soon as it starts swapping, the whole system becomes unresponsive (waiting a few minutes for a keystroke) and the processing takes DAYS - days during which I cannot use the machine for anything else. I have no problem with the processing taking a few days, as long as I can still use the system for other purposes.

Paul

Hi
In PERL it is builtin "tie" functionality, I have, however, no idea what is the used data structure in mkgmap and splitter and how to translate the trick into Java. If you think that the change is trivial and point me to the critical section I might see if I get it implemented.
I was originally going to use Berkeley DB for the splitter, and if you go back to revision 3 in Subversion there is some code that uses the BerkeleyDB Java Edition. I gave up on it because it seemed it would take vastly more disk space than the planet file and was much slower than I was expecting or hoping. But I didn't persevere much before switching to the current open hash map implementation, so I'm not saying it is unworkable.

..Steve
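For contrast, the open hash map approach Steve switched to can be illustrated like this. This is a simplified, hypothetical sketch (fixed capacity, no resizing), not the splitter's actual class; it just shows why the structure is compact: flat primitive arrays with open addressing, so there are no per-entry objects and millions of node IDs fit in comparatively little memory.

```java
// Simplified sketch of an open-addressed (linear probing) hash map
// from long keys to int values. Capacity is fixed and must comfortably
// exceed the number of entries; the real splitter class differs.
class OpenLongIntMap {
    private final long[] keys;
    private final int[] values;
    private final boolean[] used;
    private int size;

    OpenLongIntMap(int capacity) {
        keys = new long[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    private int slot(long key) {
        int i = (int) ((key ^ (key >>> 32)) & 0x7fffffff) % keys.length;
        while (used[i] && keys[i] != key)
            i = (i + 1) % keys.length;  // linear probing on collision
        return i;
    }

    void put(long key, int value) {
        int i = slot(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int get(long key) {
        int i = slot(key);
        return used[i] ? values[i] : -1;  // -1 means not found
    }

    int size() { return size; }
}
```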

2009/7/1 Steve Ratcliffe <steve@parabola.demon.co.uk>:
Could you please tell me which "Map" would have to be reimplemented? There have been a lot of changes (and file deletions) since revision 3.

Could you also explain why you write that "There is a maximum of 255 output files. This should anyway be enough with the current amount of data."?

Thanks,
Paul

Hi
Could you please tell me which "Map" would have to be "reimplemented"? There was a lot of changes (and file deletions) since version 3.
Yes, it is completely different code. The original code (r3) did nothing except attempt to save the nodes to a Berkeley DB. I then gave up on it and wrote the current code, which uses an open hash map.
Could you explain why you write that "There is a maximum of 255 output files. This should anyway be enough with the current amount of data." ?
Well, at the time Europe was about a quarter of the planet file and took about 60 tiles, so 255 seemed about enough for any area of interest and perhaps even the whole planet. This was without routing, which currently requires smaller tiles. Also the data is growing fast, so it is certainly not true any more.

..Steve

Paul Ortyl wrote:
Could you explain why you write that "There is a maximum of 255 output files. This should anyway be enough with the current amount of data." ?

That statement is not true with the current splitter version. When splitting North/South America I get 318 tiles.

Note that I've started working on overhauling the splitter to try to reduce the memory requirements as much as possible without compromising too much on performance. I'm also hoping to overcome the limitations on the number of tiles (255) and the number of areas for each relation (currently 4). My plan is to rework some of the algorithms being used, tune the memory requirements wherever possible (though Steve's already done a good job here), and switch algorithms and page to disk (using e.g. custom B-trees or R-trees, not a database) when heap space starts to run out.

So far I've cut the memory requirements for the areas.list file generation in half, and hopefully this week I'll check in a further change that slightly speeds up the XML parsing and reduces memory requirements a fraction more. Further gains beyond that will unfortunately take me a while, since it is a reasonable amount of work and I don't have a lot of free time to work on this. Of course I'll be sure to post to this list when I make any significant improvements beyond the above.

As an aside: if you split a file that results in more than 255 tiles, my understanding (from looking at the splitter code) is that it will fail 'silently', i.e. the output will be corrupt (nodes/ways/relations end up in the wrong area files) without it being obvious that this has occurred. Be careful!

Chris

Chris Miller wrote:
Note that I've started working on overhauling the splitter to try to reduce the memory requirements as much as possible without compromising too much on performance. I'm also hoping to overcome the limitations on the number of tiles (255) and number of areas for each relation (currently 4). My plan is to rework some of the algorithms being used, tune the memory requirements wherever possible (though Steve's already done a good job here), and switch algorithms and page to disk (using eg custom B-Trees, R-Trees, not a database) when heap space starts to run out.
What currently happens when not enough memory is available is of course that the heap gets swapped in and out to disk by the OS's memory manager. This is so slow that it's not workable. Do you think that doing this intelligently in Splitter will be much faster? I.e. would it actually be of any real use to switch to temporary disk storage for multiple gigabytes of data?
So far I've cut the memory requirements for the areas.list file generation in half, and hopefully this week I'll check in a further change that slightly speeds up the XML parsing and reduces memory requirements a fraction more. Further gains beyond that will unfortunately take me a while since it is a reasonable amount of work and I don't have a lot of free time to work on this. Of course I'll be sure to post to this list when I make any significant improvements beyond the above.
I assume the actual splitting uses the most memory of the two stages?
As an aside: If you split a file that results in more than 255 tiles, my understanding (from looking at the splitter code) is that it will fail 'silently', ie the output will be corrupt (nodes/ways/relations end up in the wrong area files) without it being obvious that this has occurred. Be careful!
Ah, this might explain the 'POI only' tiles I get without good reason in the NE USA and Canada. Is there any possibility of a quick fix for this behaviour? That would be most welcome... :-)

What currently happens when not enough memory is available is ofcourse that the heap is getting swapped in and out to disk by the os' memory manager. This is so slow that it's not workable. Do you think that doing this intelligently in Splitter will be much faster? I.e. would it actually be of any real use to switch to temporary disk storage for multiple gigabytes of data?
Definitely. Having knowledge of the algorithm in use and having control over exactly what is written out and when will make a huge difference since care can be taken to ensure disk access is kept as low as possible. It's still going to come at a cost, but it should be much less than letting the OS try and guess what the best thing to do is.
I assume the actual splitting uses the most memory of the two stages?
Yes, generating areas.list takes less than half the memory of the splitting stage (with the latest code). It's tricky to reduce the second-stage memory usage much further, so swapping some information to disk is one of the few alternatives. Multiple parsing runs over the planet file could be made instead; however, I can't see that offering very good performance.
Ah, this might explain the 'POI only' tiles I have without good reason in the NE USA and Canada. Is there a possibility for a quick fix for this behavior, as this would be most welcome...? :-)
Not really... currently the limitation is that each node, way and relation can only belong to a maximum of 4 areas. This is because during the split each area is given an ID from 1-255 (8 bits), and up to 4 areas are squeezed into a single 32-bit integer for each node/way/relation. This is really important for saving memory on the nodes and ways, so supporting more than 255 areas would require a lot more memory. What happens with more than 255 areas is that multiple areas map to the same 8 bits and so get mixed up with each other without warning - there's no bounds checking on the bit manipulation.

About the only 'quick fix' is to refuse to split more than 255 areas, or to split them in multiple full passes, which would take significantly longer. In the meantime your best bet is to split your areas.list file into two by hand, making sure there are fewer than 256 areas in each, then run the splitter twice, once for each area file you created.

There's currently another limitation in that a relation can only appear in a maximum of 4 areas - hence all the "Relation 123 in too many areas" messages you probably see when splitting. The result is that the relation only gets written to the first 4 areas it encounters and won't appear in any additional ones. I think I can fix this one fairly easily; I'll take a look in the next few days.

Chris
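The packing scheme described above (four 8-bit area IDs in one 32-bit int) works roughly as follows. This is an illustrative sketch, not the splitter's actual code; it also shows why area IDs above 255 silently collide, since masking with 0xff folds, for example, ID 259 onto ID 3.

```java
// Illustrative sketch: pack up to four 8-bit area IDs (1-255) into one
// 32-bit int. An ID of 0 means "slot unused". With more than 255 areas
// an ID no longer fits in 8 bits, so distinct areas share the same bits.
class AreaPacker {
    // Store areaId in the first free 8-bit slot; return the new packed
    // value, or the old one unchanged if all four slots are taken
    // (the "relation in too many areas" case).
    static int addArea(int packed, int areaId) {
        for (int slot = 0; slot < 4; slot++) {
            int shift = slot * 8;
            if (((packed >>> shift) & 0xff) == 0)
                return packed | ((areaId & 0xff) << shift);
        }
        return packed;  // all slots full, extra area silently dropped
    }

    // Read the 8-bit area ID stored in the given slot (0-3).
    static int getArea(int packed, int slot) {
        return (packed >>> (slot * 8)) & 0xff;
    }
}
```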

Chris Miller wrote:
Crap, this means I'll have to split North America into two sections using Osmosis. At ground level this means broken routing between the sections, and that some cities/villages will be divided in two... Thanks for the thorough explanation.
Currently there's another limitation in that a relation can only appear in a maximum of 4 areas - hence all the "Relation 123 in too many areas" message you probably see when splitting. The result of this is that the relation only gets written to the first 4 areas it encounters and won't appear in any additional ones. I think I can fix this one fairly easily, I'll take a look in the next few days.
That'll be great. Tia.

Crap, this means I'll have to split North America into two sections using Osmosis. At ground level this means broken routing between the sections and that some cities/villages will be divided into two...
I think you may have misunderstood me here. You can run the splitter against as big an osm file as you like, and the areas.list file it generates should still be correct. The problem doesn't happen until the second stage of the split. So to be perfectly clear, I think if you do the following you should be OK:

1) Run the splitter against all of North America (or whatever). Once the areas.list file has been generated, hit Ctrl+C to kill the splitter.
2) Create two (or more) files by hand, areas1.list and areas2.list, each with fewer than 256 areas in them. Together these contain all the areas from areas.list.
3) Run the splitter again, giving it areas1.list as input. This should create osm files for just the areas in areas1.list.
4) Run the splitter again, this time with areas2.list as a parameter. This should create the remaining osm files.

I haven't tried this myself and it's possible you'll run into some hurdles along the way, though off the top of my head I can't think of any. If you do hit a problem please let me know and I'll see if I can create a fix so you can manually work around the 255-area limitation using the above steps until I have something more robust in place.

Chris
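Step 2 above could also be scripted instead of done by hand. A hedged sketch, assuming areas.list holds one area entry per line (the real file may contain comment or header lines that need extra care):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch: split areas.list into areas1.list, areas2.list, ... with at
// most maxAreas entries each. Assumes one area entry per line.
class AreaListSplitter {
    static int split(Path input, Path outDir, int maxAreas) throws IOException {
        List<String> lines = Files.readAllLines(input);
        int fileNum = 0;
        for (int start = 0; start < lines.size(); start += maxAreas) {
            fileNum++;
            int end = Math.min(start + maxAreas, lines.size());
            Files.write(outDir.resolve("areas" + fileNum + ".list"),
                        lines.subList(start, end));
        }
        return fileNum;  // number of chunk files written
    }
}
```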

This works perfectly: I have used it to get all relations into the split files. The total split is very slow because each run extracts only 4 tiles, but compute time is cheap :) Your improvements will be a big step forward.

On Aug 4, 2009, at 10:06 AM, Chris Miller wrote:
_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Great info, thanks! I'll try this solution, it sounds almost perfect... Chris Miller wrote:

I've now made some changes that remove the 4-areas-per-relation limit and also the 255-tile limit. The 255-tile fix is just a workaround for now: it requires a full reprocess for each set of 255 tiles rather than tackling them all in a single pass. This will still be significantly better than only processing 4 areas at a time, however.

I won't have time to commit these changes until I get home later this evening. If you want to try them out in the meantime you can download a version from here (please treat this as a very unofficial and beta-quality release!):

http://redyeti.net/splitter.jar

Here's what's changed from the version that's currently in the codestore:

- Replaced the SAX parser with XPP for modest performance and memory benefits
- Improved program output to give more detail about what's going on (work in progress)
- Removed the limit of 4 areas per relation (no memory or performance penalty)
- Removed the limit of 255 areas per split. When there are more than 255 areas, multiple passes are made with up to 255 areas processed per pass

Any feedback, questions or suggestions are welcome. I haven't tried this on anything as big as North/South America yet; I would be very interested to hear how it goes.

Chris
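The batching arithmetic behind the multi-pass workaround is straightforward; this hypothetical sketch (not the splitter's code) shows how more than 255 areas decompose into passes of at most 255 each:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the multi-pass workaround: with more than maxPerPass areas,
// the source file is reprocessed once per batch of up to maxPerPass areas.
class MultiPass {
    static int passesNeeded(int totalAreas, int maxPerPass) {
        return (totalAreas + maxPerPass - 1) / maxPerPass;  // ceiling division
    }

    // Half-open [start, end) index ranges, one per pass.
    static List<int[]> batches(int totalAreas, int maxPerPass) {
        List<int[]> result = new ArrayList<>();
        for (int start = 0; start < totalAreas; start += maxPerPass) {
            int end = Math.min(start + maxPerPass, totalAreas);
            result.add(new int[] { start, end });
        }
        return result;
    }
}
```

For example, a 500-area split needs two full passes at 255 areas per pass, so roughly double the wall-clock time of a single pass.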

Thanks for your work on this, much appreciated!

I'm trying to get a grip on how to use this version of the splitter and have re-read your mail a few times. I hope I understand it: this version should eliminate the need to split the planet with Osmosis before giving it to the splitter. Just provide the planet, and the splitter will define around 500 tiles and reprocess the planet a few times to get all the tiles extracted.

I'll start a test run with the latest planet tonight. If it doesn't work on my 4GB machine then I'll try the Americas extract made with Osmosis and report back.

Chris Miller wrote:

My latest changes were just quick fixes to get relations and 256+ tiles working, there's still more that needs to be done before this is really useful. In particular, I need to prevent the splitter from hanging on to node and way information that it doesn't need for a given set of 255 areas. In other words, currently the memory requirements will still be very large and you don't stand much chance of splitting the planet file. I'll do some further work to try and reduce this, I think there's a couple of easy wins. You might be able to split America though, if you do attempt it please let me know how it goes.

I've built a new version that *might* be able to handle the planet OK. I don't know how many areas North America breaks into, but if you're able to handle 255 areas (at 1,600,000 nodes each) with an older version of the splitter, then I think this version should work for the whole planet:

http://redyeti.net/splitter.jar

If you still can't get it to work but are able to at least generate areas.list, you could try reducing the number of nodes per area, since that will directly reduce the amount of memory required when the split files are being written out.

Assuming no one finds any serious problems with this build, I'll get it checked in and released properly so everyone can benefit.

Regards,
Chris

Tested this latest version on my American extract with -Xmx4000m: with 1.2 million nodes per tile the Java VM crashed due to lack of memory. Using 1 million nodes, the split succeeded with 367 areas in 3:20 hours. Some swapping was noticed (bad for speed).

Although I'd rather use the 1.2 million setting, as that is a nice balance between the number of failed map builds and map size worldwide, using 1 million now enables me to do all the work myself instead of relying on others. Great, thanks a million!

Next up: splitting the whole world on my laptop and processing the output with mkgmap, then seeing which areas need fixing manually...

Chris Miller wrote:

Thanks for the info. Sounds like you've found about the same limit I have with --max-nodes and -Xmx4000m. This weekend I'm going to add a --max-areas parameter. Setting this to a number below 255 should allow a higher --max-nodes value with the same heap size. In your case with 367 areas, setting --max-areas=184 should allow a higher max-nodes setting without any extra passes being required.

Chris Miller wrote:
Thanks for the info. Sounds like you've found about the same limit I have with --max-nodes and -Xmx4000m. This weekend I'm going to add a --max-areas parameter. Setting this to a number < 255 should allow for higher a --max-nodes value with the same heap size. In your case with 367 areas, setting --max-areas=184 should allow for a higher max-nodes setting without any extra passes required.
Yes, that's a welcome addition. While you're at it, dare I ask you to check whether it's easy to do some of the work in different threads? I notice that one core is at 100% almost constantly while the hard disk is still far from maxed out. There should be room to speed things up a bit further.

While you're on it, dare I ask to check if it's easy to do some stuff in different threads? I notice that one core is at 100% almost constantly while the harddisk is still far from being maxed-out. There should be room to further speed things up a bit.
I've got 4 cores (8 with hyperthreading) so this is something I'm acutely aware of. Watching my PC churn away at only 12.5% CPU for a few hours isn't my idea of resources well spent! Unfortunately there's no quick win, because the XML parsing is very linear, but I have already been considering various options and certainly plan to give them a try at some point. I've got quite a few other improvements planned that'll come first, though.
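One standard way to recover some parallelism despite a strictly linear parser is a producer-consumer pipeline: the parsing thread feeds a bounded queue while another thread does the per-element work, so parsing and processing overlap. A hypothetical sketch of the idea (with numbers standing in for parsed OSM elements), not a description of how the splitter is actually structured:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: one thread "parses" (here it just produces IDs) while the
// calling thread consumes them, so the two stages run concurrently.
class PipelineSketch {
    private static final long POISON = Long.MIN_VALUE;  // end-of-stream marker

    static long run(int n) {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);

        Thread parser = new Thread(() -> {
            try {
                for (long id = 1; id <= n; id++)
                    queue.put(id);      // stand-in for parsed elements
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        parser.start();

        long sum = 0;  // stand-in for real per-element work
        try {
            while (true) {
                long id = queue.take();
                if (id == POISON) break;
                sum += id;
            }
            parser.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return sum;
    }
}
```

The bounded queue provides backpressure: if the consumer falls behind, the parser blocks instead of filling the heap.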

Great, I'll keep an impatient eye on the mailing list ;-)
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

On Fri, Aug 7, 2009 at 2:56 PM, Chris Miller <chris.miller@kbcfp.com> wrote:
[...]
Just out of interest, what performance gains (or disadvantages) would there be from working with uncompressed files instead of bz2 and gz files? Would this be faster for those of us with copious amounts of disk space, or would the extra I/O negate any CPU-related performance gains? I know that Osmosis performance on multi-core systems can apparently be improved by piping the OSM file through a decompression program, but I assume that would not be practical for splitter, which must make several passes through the file. Cheers.

Before the patches I gained about 5% by extracting with the 7z unpacker on the command line before running the splitter (compiled about 10 days ago, however, so excluding the latest changes).
2009/8/7 Clinton Gladstone <clinton.gladstone@googlemail.com>:
[...]

bz2 is *very* slow to decompress, so yes, if you have the space I'd recommend decompressing the .osm first before running the splitter (since the splitter has to make a minimum of two passes over the file, it would otherwise decompress it at least twice). The (limited and simple) benchmarks I tried with .bz2 vs .osm showed that splitting a .bz2 takes ~6 times longer than an uncompressed .osm file.

As for gz: it is quite a lot faster to deal with than bz2, though I haven't done any benchmarking with it as far as the splitter is concerned. My guess is that uncompressed will still win out unless you have fairly slow disks and a very fast CPU.

Interestingly, it's theoretically possible to parallelise the bz2 compression/decompression algorithm to give almost linear performance improvements per core. Implementing this would be a big job, but on a 4+ core machine it would make a pretty significant difference. It's on my todo list but please don't hold your breath!

Chris
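As an aside, the gz case is cheap to support in Java because the JDK ships a gzip codec, while bz2 needs a third-party library, which is part of why bz2 fares so much worse here. A minimal sketch (illustrative only, not splitter code) of opening input either plain or gzipped:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch: pick a plain or gzip-decoding stream based on the file name.
// The class and method names are hypothetical, not from the splitter.
public class OsmOpen {
    public static InputStream open(String name, InputStream raw) throws IOException {
        if (name.endsWith(".gz"))
            return new GZIPInputStream(raw); // built into the JDK
        return raw;                           // uncompressed .osm
    }

    public static void main(String[] args) throws IOException {
        // Round-trip a small XML fragment through gzip in memory.
        byte[] xml = "<osm version=\"0.6\"/>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(xml);
        }
        InputStream in = open("planet.osm.gz", new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        // prints: <osm version="0.6"/>
    }
}
```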

I ran this splitter against my America extract on a Centrino 2 laptop with 4 GB RAM; below are the results. Splitter was working on ~243 million nodes and Java had 3.9 GB heap space (-Xmx3900m). Memory usage before I went to bed: 4.3 GB virtual, 3.4 GB resident; Splitter was calculating the areas then. Hundreds of messages like these followed:

Area (37.4853515625,-123.5302734375) to (38.4521484375,-122.255859375) contains 439,211 nodes split horizontally into: (37.4853515625,-123.5302734375) to (38.4521484375,-122.51953125) (159,737 nodes) and (37.4853515625,-122.51953125) to (38.4521484375,-122.255859375) (279,474 nodes)

Then finally:

Area (37.4853515625,-122.255859375) to (38.4521484375,-121.728515625) contains 430,424 nodes
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.Integer.valueOf(Integer.java:601)
at uk.me.parabola.splitter.AreaSplitter.splitVert(AreaSplitter.java:162)
at uk.me.parabola.splitter.AreaSplitter.split(AreaSplitter.java:74)
at uk.me.parabola.splitter.Main.calculateAreas(Main.java:175)
at uk.me.parabola.splitter.Main.split(Main.java:96)
at uk.me.parabola.splitter.Main.main(Main.java:79)

I think it was getting close, so a gig more memory would've been enough perhaps... Chris Miller wrote:
I've now made some changes that remove the 4-area per relation limit and also the 255 tile limit. The 255 tile fix is just a workaround for now, it requires a full reprocess for each set of 255 tiles rather than tackling them all in a single pass. This will still be significantly better than only processing 4 areas at a time however.
I've made some code changes to allow more than 4 areas per relation and more than 255 tiles per split. I won't have time to commit these changes until I get home later this evening. If you want to try it out in the meantime you can download a version from here (please treat this as a very unofficial and beta-quality release!):
http://redyeti.net/splitter.jar
Here's what's changed from the version that's currently in the codestore:
- Replaced the SAX parser with XPP for modest performance and memory benefits
- Improved program output to give more detail about what's going on (work in progress)
- Removed limit of 4 areas per relation (no memory or performance penalty)
- Removed limit of 255 areas per split. When there are more than 255 areas, multiple passes are made with up to 255 areas processed per pass
Any feedback, questions or suggestions are welcome. I haven't tried this on anything as big as North/South America yet, would be very interested to hear how it goes.
Chris
Great info, thanks! I'll try this solution, it sounds almost perfect...
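A side note on the stack trace in the report above: it dies inside Integer.valueOf, i.e. while boxing ints into a collection. A small illustration of why that matters at this scale (illustrative code only, not the splitter's actual data structures):

```java
import java.util.ArrayList;
import java.util.List;

// Each boxed Integer costs roughly 16 bytes of object header plus a
// 4-8 byte reference, versus a flat 4 bytes per element in an int[].
// For ~243 million nodes, that difference alone is several gigabytes.
public class Boxing {
    // Boxed storage: what a code path failing in Integer.valueOf is doing.
    static long sumBoxed(int n) {
        List<Integer> xs = new ArrayList<>();
        for (int i = 0; i < n; i++) xs.add(i); // Integer.valueOf on every add
        long s = 0;
        for (int x : xs) s += x;
        return s;
    }

    // Primitive storage: same data, a fraction of the memory.
    static long sumPrimitive(int n) {
        int[] xs = new int[n];
        for (int i = 0; i < n; i++) xs[i] = i;
        long s = 0;
        for (int x : xs) s += x;
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000) == sumPrimitive(1000)); // true
    }
}
```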

Oops, I just realized that I ran Splitter with --max-nodes=120k instead of 1.2 million. Dunno if this influences the test results...

Thanks for the info. Given that your run started printing out the areas being split, you were very, very close to getting the areas.list file. After that's generated, the heap size drops right down and starts growing again during the splitting pass. The smaller --max-nodes value you used would have caused a lot more areas to be created, which will have some impact on the memory required during this first phase, though I'm not sure offhand how much. Maybe that's what pushed it over the edge? One thing for sure is that a smaller --max-nodes value will significantly reduce the amount of memory required during the final splitting phase (at the cost of more time, since more areas = more parsing passes).

The good news is I just managed to split the entire planet(!) with the following:

java -Xmx4000m -jar splitter.jar --max-nodes=1000000 planet-090715.osm

It generated 591 areas, so it made 3 passes over the XML during the splitting stage, and in total the run took 3 hours 32 minutes (overclocked Core i7 with 6 GB RAM). Earlier I had tried the split with --max-nodes=1200000 and ran out of memory.

I'll add a --max-areas parameter to limit the number of areas processed on each pass. That way you'll be able to trade off performance vs memory of the split stage a bit better, using a combination of --max-nodes and --max-areas.

Chris

Multiple parsing runs could be made over the planet file instead; however, I can't see that offering very good performance.
Just a thought from reading the thread: multiple parsing runs over a bz2-compressed file could hurt performance even more, since it would mean decompressing the input file multiple times, and in my experience decompressing bz2 costs a lot of resources. (In my case I'm using the .osm.bz2 files from geofabrik directly as input.) Regards, Johann

Exactly: uncompressing is pretty hard on the CPU, so ideally we'd want to make as few passes as possible. There are other factors too, though; e.g. I'm thinking about doing the decompression on one thread and the parsing/splitting on another if there is more than one core available. I haven't yet tried to profile the splitter, but once I've got rid of most of the low-hanging fruit I'll explore options like that a bit further. At this stage it's still hard to know whether additional passes will be better or worse than paging some of the data to disk. Chris
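The decompress-on-one-thread idea can be sketched with the JDK's piped streams. This is illustrative only, not splitter code, and gzip stands in for bz2 since the JDK has no bz2 codec:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch: one thread decompresses and feeds a pipe; the other (which
// would be the XML parser) reads plain bytes off the pipe concurrently.
public class PipedDecompress {
    public static String readViaPipe(byte[] gzipped) throws Exception {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out, 64 * 1024);

        // Producer thread: decompress into the pipe, close it when done.
        Thread producer = new Thread(() -> {
            try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(gzipped));
                 OutputStream o = out) {
                gz.transferTo(o);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        producer.start();

        // Consumer: in real use this is where the parser would run.
        String result = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        producer.join();
        return result;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write("<node id=\"1\"/>".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(readViaPipe(buf.toByteArray()));
        // prints: <node id="1"/>
    }
}
```

Whether this wins in practice depends on having a spare core; the pipe adds copying overhead, so on a single core it would only slow things down.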

Johann Gail <johann.gail@gmx.de> writes:
[...]
Perhaps true, but you should have seen the runtime from running mkgmap on input files that needed more than 1.8 GB per area on a machine with 2 GB of RAM. Redoing bz2 on the input a few times would have been fine compared to paging. I realize this is tough, as the right tuning is heavily dependent on how much memory is available.

Yes, I know this from my job: a Java process that starts swapping is nearly unusable. IMO the reason is the garbage collector, which has to read through the whole virtual memory periodically. It therefore sometimes helps to double the virtual memory size of the Java machine (-Xmx ...), because the GC then has to run less often. But in general you are right: you should do everything to stop Java from swapping.
participants (10)
- Apollinaris Schoell
- Chris Miller
- Clinton Gladstone
- Felix Hartmann
- Greg Troxel
- Johann Gail
- Lambertus
- Paul Ortyl
- Steve Ratcliffe
- WessexMario