
Date: Wed, 7 Nov 2012 19:19:00 +0100
From: osm@aighes.de
To: mkgmap-dev@lists.mkgmap.org.uk
Subject: Re: [mkgmap-dev] splitter that generates problem list
On 07.11.2012 19:06, WanMil wrote:
The profiling data shows that the CPU is busy most of the time with the pbf read and write routines, so I doubt that your disk is the bottleneck. In my experience, profiling is a very time-consuming (=> CPU-consuming) task. So could it be that the disk is not the bottleneck while you profile the application, but is the bottleneck when you don't profile it? (It's just a guess...) If I can trust Task Manager, splitter uses 17% CPU, which means one core is completely busy.
I only meant the HDD as a bottleneck in general. I'm running splitter and mkgmap (and all input and output) on an SSD, but I think this is not typical. Maybe performance would improve if max-areas is kept smaller and splitter runs multiple times, e.g. two splitter instances with max-areas=512. I will try this.
The parameter max-areas is directly related to the -Xmx parameter and the number of tiles in the split-file. If you want to find the best parameters, I suggest creating the two split-files and executing splitter with --max-areas=1024 and -Xms4G -Xmx8G (one after the other). After that you can examine the log files with e.g. garbagecat http://code.google.com/a/eclipselabs.org/p/garbagecat/downloads/detail?name=... Your goal should be to give each splitter enough heap to process all tiles in the split-file in one pass, so if you have enough heap, max-areas should be higher than the number of tiles in the split-file.
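To illustrate the relationship described above: if splitter handles at most max-areas tiles per pass, the number of passes over the input is the tile count divided by max-areas, rounded up. The following is only a minimal sketch of that arithmetic; the class and method names are made up for this example and are not part of splitter.

```java
// Hypothetical helper for illustration only; not part of splitter.
final class PassEstimate {

    // Number of passes needed when at most maxAreas tiles are handled per pass.
    static int passes(int tiles, int maxAreas) {
        if (tiles < 0 || maxAreas <= 0) {
            throw new IllegalArgumentException("tiles must be >= 0 and maxAreas > 0");
        }
        // Integer ceiling division: ceil(tiles / maxAreas).
        return (tiles + maxAreas - 1) / maxAreas;
    }

    public static void main(String[] args) {
        System.out.println(passes(1500, 512));   // prints 3
        System.out.println(passes(1500, 1024));  // prints 2
        System.out.println(passes(1000, 1024));  // prints 1
    }
}
```

With 1000 tiles in the split-file, max-areas=1024 gives a single pass, provided the heap is large enough to hold all tiles at once.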
I don't know much about Java, but would it be possible for splitter to detect that and run several instances automatically? E.g. read max-heap, max-cpu-cores, number of tiles, and size of input.
I thought about this as well. In the past it didn't work because we did not know anything about the input, but now we read it several times and may use the information somehow... Gerd
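As a rough sketch of what such a detection could look like: the JVM can report the maximum heap and the number of cores through java.lang.Runtime. The heuristic below (one instance per needed pass, capped at the core count) is purely an assumption for illustration, as are the class and method names; it is not how splitter works, and it ignores that parallel instances each need their own heap.

```java
// Hypothetical sketch for illustration only; not part of splitter.
final class ResourceProbe {

    // Assumed heuristic: run one instance per pass that a single instance
    // would need, but never more instances than there are CPU cores.
    static int suggestInstances(int cores, int tiles, int maxAreas) {
        if (cores <= 0 || tiles < 0 || maxAreas <= 0) {
            throw new IllegalArgumentException("invalid arguments");
        }
        int passes = (tiles + maxAreas - 1) / maxAreas; // ceiling division
        return Math.max(1, Math.min(cores, passes));
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long maxHeapMb = rt.maxMemory() / (1024 * 1024); // upper heap limit (-Xmx)
        int cores = rt.availableProcessors();

        System.out.println("max heap (MB): " + maxHeapMb);
        System.out.println("cpu cores:     " + cores);
        // Example values: 1500 tiles in the split-file, max-areas=512.
        System.out.println("instances:     " + suggestInstances(cores, 1500, 512));
    }
}
```

The number of tiles and the size of the input would have to come from splitter's own first read of the data, which this sketch does not attempt.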