
WanMil sent through a very nice patch for the splitter that writes out the results using multiple threads, giving a pretty big speedup during the 2nd half of the split if you have a multicore PC. For me, the time it takes to split the UK is less than half what it used to be.
I've checked his patch in (r109) so feel free to give it a go. Note that by default the code will automatically use all available CPU cores, but there's a new parameter --max-threads that you can use to control the number of cores that are used should you want to reduce them for whatever reason.
Chris

On 05.06.2010 15:32, Chris Miller wrote:
WanMil sent through a very nice patch for the splitter that writes out the results using multiple threads, giving a pretty big speedup during the 2nd half of the split if you have a multicore PC. For me, the time it takes to split the UK is less than half what it used to be.
I've checked his patch in (r109) so feel free to give it a go. Note that by default the code will automatically use all available CPU cores but there's a new parameter --max-threads that you can use to control the number of cores that are used should you want to reduce them for whatever reason.
Chris
What about max memory needs? Is there a big increase, or does it not matter because the first phase of the split is the important part? Does this also work without --cache, or only when using --cache?

What about max memory needs? Is there a big increase, or does it not matter because the first phase of the split is the important part? Does this also work without --cache, or only when using --cache?
It works with all other options of the tile splitter. There should be a very slight memory increase in the second phase if --max-threads > 1, but I don't think it's relevant (correct me if you run into problems). By the way: I am not sure, but I think the second phase needs more memory than the first phase (that's what the output tells me).
WanMil

Hello Felix,
What about max memory needs? Is there a big increase, or does it not matter because the first phase of the split is the important part? Does this also work without --cache, or only when using --cache?
It is the second stage that requires the most memory; however, with this patch the memory requirements shouldn't be too much worse. Each resultant area now has its own queue of nodes/ways/rels that need to be written, but those queues are throttled to a max of 1000 elements each, so even in the worst case the additional overhead is probably a few tens of MB. As before, you can always reduce the memory requirements of the 2nd stage by reducing --max-nodes and/or --max-areas.
The --cache parameter has no effect; it works just as well with or without that.
Chris
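For readers who want to picture the throttled per-area queues described above, here is a minimal sketch. It is written in Python for brevity and is not the splitter's Java code. The names `split`, `area_writer` and `area_of` are hypothetical, and using one writer thread per area is a simplification (the real thread count is governed by --max-threads).

```python
import queue
import threading

QUEUE_LIMIT = 1000  # the per-area throttle mentioned in the message above
_DONE = object()    # sentinel telling a writer thread to stop


def area_writer(q, sink):
    """Drain one area's queue and 'write' each element to its sink."""
    while True:
        element = q.get()
        if element is _DONE:
            break
        sink.append(element)  # stand-in for writing to the area's output file


def split(elements, area_count, area_of):
    """Route each element to its area's bounded queue (assumed design)."""
    queues = [queue.Queue(maxsize=QUEUE_LIMIT) for _ in range(area_count)]
    sinks = [[] for _ in range(area_count)]
    writers = [threading.Thread(target=area_writer, args=(q, s))
               for q, s in zip(queues, sinks)]
    for w in writers:
        w.start()
    for element in elements:
        # put() blocks while the queue is full, so a slow writer throttles
        # the producer instead of letting memory grow without bound.
        queues[area_of(element)].put(element)
    for q in queues:
        q.put(_DONE)
    for w in writers:
        w.join()
    return sinks
```

Because each queue holds at most 1000 pending elements, the extra memory is bounded by the number of areas times the queue limit times the size of an element, which is consistent with the "few tens of MB" estimate above.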

WanMil's patch has inspired me to add some additional threading to the splitter. In r110 the loading of osm files is now performed in a separate thread from the parsing. This provides an especially big benefit when reading in a .bz2 compressed osm file since the decompression happens in parallel. Any .gz or .zip osm files should also benefit significantly. Uncompressed osm files see some speedup but it's much smaller.
You don't need to do anything to enable this feature. As long as --max-threads is set to a value greater than 1 it will be enabled automatically. I'd recommend you just leave out the --max-threads parameter altogether and the splitter will use an appropriate number of threads for your CPU.
Here are some benchmarks I've run when splitting an osm file of the UK on a Core i7 to give you some idea of the impact of the recent changes:
bz2 compressed file, no cache:
    No threading: 238s
    r109 threading: 193s
    r110 threading: 136s
bz2 compressed file, cache:
    No threading: 170s
    r109 threading: 123s
    r110 threading: 92s
Uncompressed osm file, no cache:
    No threading: 132s
    r109 threading: 80s
    r110 threading: 76s
Uncompressed osm file, cache:
    No threading: 107s
    r109 threading: 63s
    r110 threading: 61s
Pre-existing cache (so no osm file parsing required):
    No threading: 69s
    r109 threading: 23s
    r110 threading: 23s (the r110 threading isn't used in this situation)
(for the record, I tried loading the cache in a background thread too but that didn't make any difference to performance)
Enjoy,
Chris
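The idea of loading and decompressing in one thread while parsing in another can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the r110 code. The chunk size, the queue depth and the `count_nodes` stand-in for the parser are all hypothetical.

```python
import bz2
import io
import queue
import threading

CHUNK_SIZE = 64 * 1024  # hypothetical read size
_EOF = None             # sentinel marking end of input


def reader(stream, chunks):
    """Decompress in a background thread and hand chunks to the parser."""
    try:
        with bz2.open(stream, "rb") as f:
            while True:
                data = f.read(CHUNK_SIZE)
                if not data:
                    break
                chunks.put(data)
    finally:
        chunks.put(_EOF)  # always unblock the consumer, even on error


def count_nodes(stream):
    """'Parse' on the calling thread while decompression runs in parallel."""
    chunks = queue.Queue(maxsize=8)  # bounded so the reader cannot run far ahead
    t = threading.Thread(target=reader, args=(stream, chunks))
    t.start()
    count = 0
    tail = b""
    while True:
        data = chunks.get()
        if data is _EOF:
            break
        buf = tail + data
        count += buf.count(b"<node ")
        tail = buf[-5:]  # short overlap so a tag split across chunks is still seen
    t.join()
    return count


# Usage with in-memory data; a real run would pass an open .osm.bz2 file.
sample = bz2.compress(b"<osm><node id='1'/><node id='2'/></osm>")
print(count_nodes(io.BytesIO(sample)))  # prints 2
```

With uncompressed input the reader thread has little work to do, which fits the much smaller speedup reported for plain osm files in the benchmarks above.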

bz2 compressed file, no cache:
    No threading: 238s
    r109 threading: 193s
    r110 threading: 136s
bz2 compressed file, cache:
    No threading: 170s
    r109 threading: 123s
    r110 threading: 92s
Uncompressed osm file, no cache:
    No threading: 132s
    r109 threading: 80s
    r110 threading: 76s
Uncompressed osm file, cache:
    No threading: 107s
    r109 threading: 63s
    r110 threading: 61s
Pre-existing cache (so no osm file parsing required):
    No threading: 69s
    r109 threading: 23s
    r110 threading: 23s (the r110 threading isn't used in this situation)
Thanks for these figures (and improvements). How long does it take to uncompress with bunzip or p7zip? I suppose uncompressing with an external program beforehand is still faster, or has this switched around now? (Until now, the fastest way for me was to uncompress the geofabrik extracts with p7zip and then run mkgmap with no cache on small countries and with cache on larger countries.)

Hello Felix,
How long does it take to uncompress with bunzip or p7zip? I suppose uncompressing with an external program beforehand is still faster, or has this switched around now? (Until now, the fastest way for me was to uncompress the geofabrik extracts with p7zip and then run mkgmap with no cache on small countries and with cache on larger countries.)
Sorry, I haven't tried benchmarking that, so I don't really know. Any numbers you might have would be appreciated. One thing I want to add is the ability to pipe the osm file in via stdin, which (depending on OS buffering) will allow external decompression to run at the same time as processing, and it might prove to be the fastest approach. It will certainly be more reliable - I've seen the Java bz2 decompression fail with strange errors a few times now due to bugs in the Apache Java decompression code.
Chris

On Tue, Jun 08, 2010 at 07:11:23AM +0000, Chris Miller wrote:
One thing I want to add is the ability to pipe the osm file in via stdin, which (depending on OS buffering) will allow external decompression at the same time as processing and it might prove to be the fastest approach.
I would appreciate that, because I have to do some preprocessing (moving end nodes of the coastline to the tile boundary, deleting auto-generated bad multipolygons) after decompressing the osm file. The sed or perl filter could sit between bzip2 and splitter, and all three processes could run in parallel. Marko

Hello Marko,
I would appreciate that, because I have to do some preprocessing (moving end nodes of the coastline to the tile boundary, deleting auto-generated bad multipolygons) after decompressing the osm file. The sed or perl filter could sit between bzip2 and splitter, and all three processes could run in parallel.
I've just checked in r111 which enables osm data to be read from stdin. For it to work, the following three conditions must be met:
1) there must be no osm files specified as parameters
2) a valid --cache parameter must be supplied
3) there must not be an existing cache from a previous run
eg (with an empty cache dir):
    java -Xmx4000m -jar splitter.jar --cache=cache < united_kingdom.osm
If you supply osm files or an existing cache, they'll be used as input instead and stdin will be ignored. If no cache directory is specified, stdin can't be used because it's not possible to perform the second pass - there wouldn't be any data to read.
Hopefully that makes sense (and I've tried to provide clear log messages explaining what's going on), but if anything's not clear or you think of a way to improve it, please let me know.
Chris
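The three conditions above amount to a small decision rule, sketched here in Python purely as an illustration. The function name, its parameters and the returned labels are hypothetical, and the message does not say which input wins when both osm files and an existing cache are present, so the sketch does not distinguish them.

```python
def choose_input(osm_files, cache_dir, cache_is_valid):
    """Return which input the r111 rules described above would select.

    osm_files:      list of osm file names given as parameters
    cache_dir:      value of --cache, or None if it was not supplied
    cache_is_valid: True if a usable cache from a previous run exists
    """
    if osm_files or (cache_dir and cache_is_valid):
        # Files or an existing cache are used as input; stdin is ignored.
        return "files-or-cache"
    if not cache_dir:
        # Without a cache there would be no data to read on the second pass.
        raise ValueError("reading from stdin requires a --cache directory")
    return "stdin"
```

For example, `choose_input([], "cache", False)` selects stdin, which corresponds to the command line shown above run against an empty cache directory.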

Hello Chris,
I've just checked in r111 which enables osm data to be read from stdin. For it to work, the following three conditions must be met:
1) there must be no osm files specified as parameters
2) a valid --cache parameter must be supplied
3) there must not be an existing cache from a previous run
eg (with an empty cache dir):
    java -Xmx4000m -jar splitter.jar --cache=cache < united_kingdom.osm
Thanks, I am trying it out now. This warning seems redundant for this use case:
* WARNING: No valid existing cache found but caching was requested. *
Because I am processing the input several times, for additional layers, I added "tee" to the pipe for writing the uncompressed output:
    rm -fr splitter-cache
    bzip2 -dc "$OSM_BZ2"|
    perl -e ...|
    tee "$OSM"|
    $JAVACMD $JAVACMD_OPTIONS -jar splitter.jar --split-file=areas.list \
      --cache=splitter-cache
It would be nice if mkgmap could produce several layers in a single pass or if the multipolygon processing could be disabled. My script continues like this (simplified):
    java -jar mkgmap.jar --transparent --style=control "$OSM"
    java -jar mkgmap.jar --transparent --style=routes "$OSM"
    java -jar mkgmap.jar -c mkgmap.args # processing the split tiles
    java -jar mkgmap.jar --gmapsupp *.img
The "control" and "routes" maps are very sparse, so they can cover the entire country. I would generate all layers from the same tiles if mkgmap could produce several *.img with one parsing of the *.osm tile (using different output styles).
The "control" and "routes" styles only generate points or lines, no polygons. Could we disable relation=multipolygon processing when there are no polygons defined in the style?
Marko

Hello Marko,
Thanks, I am trying it out now. This warning seems redundant for this use case:
* WARNING: No valid existing cache found but caching was requested. *
Ahh... good catch. I hadn't considered the case where you supply an areas.list file. In that situation, if (and only if) a single pass is required during the split then there's no need to build a cache even when reading from stdin. I'll see if I can do something sensible, ie only create a cache if one is actually required.
It would be nice if mkgmap could produce several layers in a single pass or if the multipolygon processing could be disabled. My script continues like this (simplified):
I'll leave the mkgmap experts to comment on this aspect. Chris

Hi Marko,
* WARNING: No valid existing cache found but caching was requested. *
Ahh... good catch. I hadn't considered the case where you supply an areas.list file. In that situation, if (and only if) a single pass is required during the split then there's no need to build a cache even when reading from stdin. I'll see if I can do something sensible, ie only create a cache if one is actually required.
I've just checked in r112 which should address this. You no longer need to specify a --cache parameter with stdin and --split-file if there's only one pass required. Additionally, if you do specify a --cache parameter but only one pass ends up being required, no cache will be generated anyway since it would only slow things down. Chris
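The r112 behaviour described above can be summarised as two small predicates. This is a hedged Python sketch with hypothetical function names. It assumes `passes` counts every pass over the input that the splitter needs, and it leaves out the r111 rule about existing caches.

```python
def stdin_usable(cache_requested, passes):
    """stdin works if everything fits in one pass, or if a cache can be
    built to feed the later passes."""
    return passes == 1 or cache_requested


def cache_generated(cache_requested, passes):
    """A cache is only written when it was requested and more than one
    pass is needed; for a single pass it would only slow things down."""
    return cache_requested and passes > 1
```

With --split-file and a single pass, `stdin_usable(False, 1)` is true and `cache_generated(True, 1)` is false, matching the two changes listed in the message.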

On 06/10/2010 01:21 AM, Chris Miller wrote:
pass required. Additionally, if you do specify a --cache parameter but only one pass ends up being required, no cache will be generated anyway since it would only slow things down.
I don't think this is a good idea. I often split the Europe excerpt, and only after running mkgmap I find out that one of the tiles is too big so I have to split again with a modified areas.list. Creating a cache during the first pass helps a lot there.

Hello Ralf,
I don't think this is a good idea. I often split the Europe excerpt, and only after running mkgmap I find out that one of the tiles is too big so I have to split again with a modified areas.list. Creating a cache during the first pass helps a lot there.
Hmmm, I was wondering whether that might be a problem for someone but couldn't think of a good use case, so thanks for speaking up. This situation only crops up when --split-file is used - I take it you have a standard areas.list file that you hand edit and use even if you get an updated Europe osm file, hence there's no reason to regenerate areas.list (and hence the cache generation gets skipped)?
I could either revert the change so that a cache is always regenerated, even if only one pass is required, or add a parameter to toggle the behaviour. I'm usually wary of adding more parameters but I guess it's justified in this situation. The decision then becomes what the default behaviour should be. I suppose the default should be to always cache, but --cacheOptional would avoid generating the cache if a single pass was detected. Thoughts?
Chris

Hi Chris,
thanks for the quick response.
On Wed, Jun 09, 2010 at 11:21:21PM +0000, Chris Miller wrote:
I've just checked in r112 which should address this. You no longer need to specify a --cache parameter with stdin and --split-file if there's only one pass required.
I tried that, but splitter wrote almost empty *.osm.gz files (150ish bytes, just containing the <osm ...><bounds .../></osm>) and terminated before consuming all input:
    Processing 5 areas in a single pass
    Starting pass 1 of 1, processing 5 areas (63240001 to 63240005)
    Thread worker-0 has finished
    Thread worker-0 has finished
    Thread worker-0 has finished
    Thread worker-0 has finished
    Thread worker-0 has finished
    Wrote 0 nodes, 0 ways, 0 relations
You can get my osm2img.sh script from http://www.polkupyoraily.net/osm/. Just remove the --cache parameter.
Marko

Hi Marko,
I tried that, but splitter wrote almost empty *.osm.gz files (150ish bytes, just containing the <osm ...><bounds .../></osm>) and terminated before consuming all input:
Hmm, I encountered that during my testing but thought I'd fixed it. I guess I must have missed something; I'll take a look tonight. Sorry for the inconvenience in the meantime!
Chris

Just to say: thanks WanMil and Chris! (now I need to upgrade to a quadcore to benefit from all these goodies...)

Hello Lambertus,
Just to say: thanks WanMil and Chris!
My pleasure :)
(now I need to upgrade to a quadcore to benefit from all these goodies...)
Just for the record, a dual core PC is enough to take full advantage of my osm loading changes. For WanMil's patch, on the other hand, the more cores the better (it's not a linear improvement though, since the more cores you add, the more I/O bound it will become).
Chris
participants (6)
- Chris Miller
- Felix Hartmann
- Lambertus
- Marko Mäkelä
- Ralf Kleineisel
- WanMil