Re: [mkgmap-dev] Splitter pbf vs o5m processing

13 Dec 2012

Hi WanMil,

I can't say in detail what the pbf reader is doing, but it certainly creates a
lot more temporary objects which have to be GCed. In the o5m reader I tried
to avoid that. That said, in the current implementation the o5m reader still
reads the tags and saves them to the internal string table, so in that respect
it is similar to the pbf reader.

I'll look at your logs soon. I am working on a tuning guide for splitter, because
I found a lot of nonsense on the net when searching for splitter.

Ciao,
Gerd
...
Date: Thu, 13 Dec 2012 15:27:07 +0100
From: wmgcnfg@web.de
To: mkgmap-dev@lists.mkgmap.org.uk
Subject: [mkgmap-dev] Splitter pbf vs o5m processing
...
Hi Steve,
Steve Ratcliffe wrote
...
Hello Gerd
...
no, it is not (yet). I plan to add o5m support to mkgmap soon. With my
patch you can use splitter
As an aside, what do you think it is about the o5m format that makes
it quicker than pbf?
Well, not easy to say. I think it's a combination of many small points:
1) pbf uses (by default) compressed blocks, so you have to unzip a complete
block before you can use any information in the block.
2) pbf read routines create a lot of temporary objects; this seems to stress
the GC.
3) pbf doesn't allow skipping the processing of node tags or way tags, but
splitter's read passes often don't need them. So, with pbf we create lists
of tags and return them to the GC; with o5m we can simply skip them.
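Point 3 can be illustrated with a small sketch. The record layout used here is hypothetical (an 8-byte id, a one-byte tag count, then length-prefixed key and value strings); it is neither the real o5m nor the real pbf encoding, and splitter itself is written in Java, so this only shows the general difference between decoding every tag and stepping over the tag bytes.

```python
import io
import struct

HEADER = "<qB"  # hypothetical record header: 8-byte id, 1-byte tag count
HEADER_SIZE = struct.calcsize(HEADER)


def write_record(out, node_id, tags):
    """Append one record in the made-up layout described above."""
    out.write(struct.pack(HEADER, node_id, len(tags)))
    for key, value in tags.items():
        for text in (key, value):
            raw = text.encode("utf-8")
            out.write(struct.pack("<B", len(raw)))
            out.write(raw)


def _read_string(stream):
    (length,) = struct.unpack("<B", stream.read(1))
    return stream.read(length).decode("utf-8")


def _skip_string(stream):
    (length,) = struct.unpack("<B", stream.read(1))
    stream.seek(length, io.SEEK_CUR)  # step over the bytes, no string is built


def read_ids_with_tags(data):
    """Decode all tags into temporary strings, then drop them (the costly path)."""
    stream = io.BytesIO(data)
    ids, decoded = [], 0
    while True:
        header = stream.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            break
        node_id, count = struct.unpack(HEADER, header)
        tags = [(_read_string(stream), _read_string(stream)) for _ in range(count)]
        decoded += 2 * len(tags)  # every key and value became a short-lived object
        ids.append(node_id)
    return ids, decoded


def read_ids_skipping_tags(data):
    """Collect the same ids, but skip the tag bytes without decoding them."""
    stream = io.BytesIO(data)
    ids = []
    while True:
        header = stream.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            break
        node_id, count = struct.unpack(HEADER, header)
        for _ in range(2 * count):
            _skip_string(stream)
        ids.append(node_id)
    return ids, 0
```

Both readers return the same ids; only the first one creates a string object for every key and value it passes over.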
To be fair, using the --drop-version parameter in osmconvert removes a lot of
info which is ignored by splitter and mkgmap. I never tried what effect it
has to use pbf input that was created with this parameter.
When writing, o5m is probably only faster because it doesn't zip the data.
As long as mkgmap doesn't understand o5m I see no benefit in using this.
Maybe other computers show different results, esp. if the CPU is much faster
than mine and the disk access is slower.
By the way: my patch also speeds up pbf reading a little bit.
Ciao,
Gerd
Hi Gerd
I've done some tests with the latest splitter version r255.
I have split the Geofabrik Europe extract in pbf and o5m format.
As you pointed out o5m processing is much quicker (8528s vs. 12939s).
I also observed that pbf seemed to use more memory than o5m, so I
activated GC logging and checked the logs with garbagecat.
The interesting values are
Throughput
o5m: 94%
pbf: 61%
So 3400m seems to be too small for pbf processing to work through the Europe
extract, so the GC runs permanently.
Total Pause:
o5m:  527816ms =  528s
pbf: 5093916ms = 5094s
Wow, so for pbf the GC requires 4566s more time.
After subtracting the GC time from the total processing time, o5m and pbf
need nearly the same time:
o5m:  8528s -  528s = 8000s
pbf: 12939s - 5094s = 7845s
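The figures above can be re-derived with a few lines of Python. This is only a check of the arithmetic; it assumes that "throughput" means the share of the total run time not spent in GC pauses and that the GC log covers the whole run. The function name is made up for this sketch.

```python
def gc_summary(total_s, gc_pause_ms):
    """Return (GC seconds, run time without GC, throughput in percent), all rounded."""
    gc_s = gc_pause_ms / 1000.0
    net_s = total_s - gc_s
    throughput = 100.0 * net_s / total_s  # assumed definition, see the note above
    return round(gc_s), round(net_s), round(throughput)


# Total run time in seconds and total GC pause in milliseconds, as reported above.
runs = {"o5m": (8528, 527816), "pbf": (12939, 5093916)}

for name, (total_s, pause_ms) in runs.items():
    gc_s, net_s, throughput = gc_summary(total_s, pause_ms)
    print(f"{name}: GC {gc_s}s, without GC {net_s}s, throughput {throughput}%")
```

Under these assumptions the reported throughput values (94% and 61%) agree with the total pause times, so the two measurements are consistent with each other.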
Obviously part of the difference in GC time can be explained by your
thoughts (pbf must extract all parts and must read tags which are thrown
away directly afterwards). But do you think that the whole difference
can be explained by that?
I will post my logfiles directly to you because they are too big to be 
posted on the mailing list.
WanMil
_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://lists.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd Petermann