Different routing results using osm vs osm.pbf

Yesterday I tested pbf input for mkgmap for the first time. The map was built, apparently without errors, but using the resulting map in MapSource I get a suboptimal route compared with the one I get using osm as input. I used portugal.osm and portugal.osm.pbf from geofabrik for the test. Today geofabrik is offering corrupt excerpts, so I can't run further tests for now.

On Tue, Oct 19, 2010 at 04:58:26PM +0200, Carlos Dávila wrote:
Today geofabrik is offering corrupt excerpts, so I can't make further tests by now.
Today geofabrik is only offering *.osm.pbf files, no *.osm.bz2 files. Do you have any suggestion how to implement the following with the PBF format:
    bzip2 -dc "$OSM_BZ2"| perl -e \
    'my $del=0;
     while(<>){
       $del=1 if (/<relation.* version="1".* user="usm78-gis"/);
       s/(<node id="28954644".*lat=)"60\.51564"/$1"59.326172"/;
       s/(<node id="29193143".*lon=)"24\.12826"/$1"19.072266"/;
       print unless $del;
       $del=0 if m|</relation>|;
     }'| tee "$OSM"|
    $JAVACMD $JAVACMD_OPTIONS -jar splitter.jar --split-file=areas.list
Above, I remove some multipolygons in Russia (mostly broken ones) and move two coastline endpoints for generate-sea. That is done before splitting the map extract. I guess I could do this within the tiles, but it would get a little tricky.
I guess I might want to preserve the *.osm format, or I would want mkgmap to produce multiple map sets from one parsing run. It seems that running mkgmap --style=routes on finland.osm.pbf is several times slower than running it on finland.osm.
Marko
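The Perl filter in this message can also be kept as a small stand-alone script. What follows is a minimal Python sketch of the same logic, assuming line-oriented OSM XML on standard input (one element per line, as in the Geofabrik extracts). The patterns are copied from the one-liner; the function name filter_osm_lines and the script name filter_osm.py are hypothetical.

```python
import re
import sys

# Patterns copied from the Perl one-liner in the message above.
RELATION_START = re.compile(r'<relation.* version="1".* user="usm78-gis"')
RELATION_END = re.compile(r'</relation>')
NODE_FIXES = [
    (re.compile(r'(<node id="28954644".*lat=)"60\.51564"'), r'\1"59.326172"'),
    (re.compile(r'(<node id="29193143".*lon=)"24\.12826"'), r'\1"19.072266"'),
]


def filter_osm_lines(lines):
    """Yield OSM XML lines, skipping the matched relations and moving two nodes."""
    deleting = False
    for line in lines:
        # Start skipping when a version-1 relation by the given user begins.
        if RELATION_START.search(line):
            deleting = True
        # Apply the two coordinate rewrites (first match per line only).
        for pattern, replacement in NODE_FIXES:
            line = pattern.sub(replacement, line, count=1)
        if not deleting:
            yield line
        # Stop skipping after the closing tag of any relation.
        if RELATION_END.search(line):
            deleting = False


if __name__ == "__main__":
    sys.stdout.writelines(filter_osm_lines(sys.stdin))
```

Under these assumptions it would take the place of the perl -e stage in the pipeline, for example:
    bzip2 -dc "$OSM_BZ2" | python3 filter_osm.py | tee "$OSM" | $JAVACMD $JAVACMD_OPTIONS -jar splitter.jar --split-file=areas.list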

On 19.10.2010 21:28, Marko Mäkelä wrote:
I guess I might want to preserve the *.osm format, or I would want mkgmap to produce multiple map sets from one parsing run. It seems that running mkgmap --style=routes on finland.osm.pbf is several times slower than running it on finland.osm.
Should the pbf format only be used as input for the splitter? Or is the splitter not needed anymore as we move onto osm.pbf??

Well mkgmap can now read .pbf, but it doesn't do any splitting, so if the file is large enough you still need to split it. Splitter will have to produce .pbf output (which it doesn't yet?) for the mkgmap support to be particularly useful.
..Steve

On Tue, Oct 19, 2010 at 3:08 PM, Steve Ratcliffe <steve@parabola.me.uk> wrote:
Well mkgmap can now read .pbf, but it doesn't do any splitting, so if the file is large enough you still need to split it. Splitter will have to produce .pbf output (which it doesn't yet?) for the mkgmap support to be particularly useful.
There's a branch in the splitter repository that supports reading pbf files, along with significant improvements in scalability and performance, but it still generates *.osm.gz files for output.
Scott

On Tue, Oct 19, 2010 at 04:22:18PM -0500, Scott Crosby wrote:
There's a branch in the splitter repository that supports reading pbf files, along with significant improvements in scalability and performance, but it still generates *.osm.gz files for output.
Can you give the Subversion URL for that branch? It was not obvious to me when I tried to find it last night.
Would it be possible to add some user-configurable pre-processing in splitter for omitting certain objects or moving nodes around, like my Perl script (posted earlier in this thread) does? Well, I guess it is always possible by patching the source, but I would prefer something plugin- or filter-like that allows me to run unmodified binaries. Something like a dynamic preprocessing library for splitter? It could also take care of rewrites, such as a more sophisticated form of mkgmap's --name-tag-list.
Marko

On Wed, Oct 20, 2010 at 3:17 AM, Marko Mäkelä <marko.makela@iki.fi> wrote:
Can you give the Subversion URL for that branch? It was not obvious to me when I tried to find it last night.
https://svn.mkgmap.org.uk/svn/splitter/branches/crosby_integration That code also contains the various improvements I announced a month or two ago about faster splitter performance and doing >6000 regions/pass.
Would it be possible to add some user-configurable pre-processing in splitter for omitting certain objects or moving nodes around, like my Perl script (posted earlier in this thread) does? Well, I guess it is always possible by patching the source, but I would prefer something plugin- or filter-like that allows me to run unmodified binaries.
Nope. The splitter has no plugins with that functionality. You'll have to edit the code. However, part of my changes includes a refactoring that makes it feasible to put in a small 'shim' where you can get entities before they're processed, so such a module could be added cleanly.
Scott

Marko wrote:
Do you have any suggestion how to implement the following with the PBF format:
At least as a temporary solution, try changing
    bzip2 -dc "$OSM_BZ2"|
to
    osmosis --rb "$OSM_PBF" --wx - |
In other words you use osmosis to convert from .osm.pbf to .osm; and using pipeline features you can avoid writing the .osm file to disk if you wish.
[Incidentally, I believe I have sorted out the problem which caused my replies to start new threads.]

On Wed, Oct 20, 2010 at 02:59:09PM +0100, Adrian wrote:
At least as a temporary solution, try changing bzip2 -dc "$OSM_BZ2"| to osmosis --rb "$OSM_PBF" --wx - |
In other words you use osmosis to convert from .osm.pbf to .osm; and using pipeline features you can avoid writing the .osm file to disk if you wish.
Thanks, I will do that. Ultimately, I may do the custom filtering directly in splitter or osmosis, whichever works better for splitting the tiles for mkgmap. Does the osm.pbf format make Osmosis a more feasible option now? Does anyone use Osmosis for splitting rectangular tiles for mkgmap? If so, with which options?
[Incidentally, I believe I have sorted out the problem which caused my replies to start new threads.]
You seem to have. Something (mailing list software or your MUA) may still be destroying the space-stuffing of my text/plain; format=flowed messages, but I guess we can live with that. :-)
Marko

On 19/10/10 15:58, Carlos Dávila wrote:
Yesterday I tested pbf input for mkgmap for the first time. Map was built apparently without errors, but using the resulting map on MapSource I get a suboptimal route, compared with the one I get using osm as input. I used portugal.osm and portugal.osm.pbf from geofabrik for the test. Today geofabrik is offering corrupt excerpts, so I can't make further tests by now.
That is interesting. If the .osm and .osm.pbf contain the same data then mkgmap should produce exactly the same map in both cases, ignoring timestamps, if you add --preserve-element-order in both cases. In the cases I tested this was true. If it doesn't then it is a bug.
Now, if you don't have --preserve-element-order, there could be differences in the order of the elements within the maps, and I suppose that could affect the routing. If so that would be very interesting and might lead to improvements in routing in general.
..Steve

On 19.10.2010 22:06, Steve Ratcliffe wrote:
If the .osm and .osm.pbf contain the same data then mkgmap should produce exactly the same map in both cases ignoring timestamps if you add --preserve-element-order in both cases. In the cases I tested this was true.
How come --preserve-element-order is still doing anything??? Since inside the style-file you can't place two rules to be enacted at the same time (depending on whatever comes first in the data), --preserve-element-order shouldn't matter anymore (since the style-system got reorganized around half a year ago). From my understanding, if --preserve-element-order still changes something, then there has to be a bug somewhere (because the rules are not run in the order of the osm file; instead the osm file is matched against the rules of the style-file in rule order...).

How comes that --preserve-element-order is still doing anything??? As inside the style-file you can't place to rules to be enacted at the
It has nothing to do with the order in which the style rules take effect. If you use the option then the elements will be written to the map in the same order as they are in the input file. This doesn't matter normally because the order of elements in the .osm file is not significant. The option exists for OSMComposer, as its .osm files are written in a particular order to create the effect of different layers.
..Steve

On Tue, Oct 19, 2010 at 3:06 PM, Steve Ratcliffe <steve@parabola.me.uk> wrote:
That is interesting.
If the .osm and .osm.pbf contain the same data then mkgmap should produce exactly the same map in both cases ignoring timestamps if you add --preserve-element-order in both cases. In the cases I tested this was true.
The results should be identical comparing OSM versus PBF with or without that flag. Converting from osm to pbf with the default flags should preserve everything in the original OSM file, including precision of coordinates, element order, metadata, tags, timestamps, etc. (The format offers some options that produce smaller file sizes at the cost of not preserving everything, but those are not on by default.) If there are any differences between maps with and without --preserve-element-order, that is something related to mkgmap, not PBF.
If it doesn't then it is a bug.
Agreed. Could it be a round-off error? I do all arithmetic in integers, only multiplying by .000000001 at the very end.
Scott
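To make the integer arithmetic concrete, here is a minimal Python sketch of how a PBF coordinate is turned back into degrees. It assumes the formula from the PBF format description, degrees = .000000001 * (offset + granularity * raw), with the default granularity of 100 nanodegrees and an offset of 0. The function names are hypothetical and this is not the splitter's or mkgmap's actual code.

```python
from decimal import Decimal

NANO = 1_000_000_000  # nanodegrees per degree


def encode_coordinate(text, offset=0, granularity=100):
    """Convert a decimal-degree string (as found in .osm XML) to a raw PBF integer.

    Assumes the value is an exact multiple of the granularity, which holds for
    the 7 decimal places used in OSM XML with the default granularity of 100.
    """
    nanodegrees = int(Decimal(text) * NANO)
    return (nanodegrees - offset) // granularity


def decode_coordinate(raw, offset=0, granularity=100):
    """Convert a raw PBF integer back to degrees.

    All arithmetic stays in integers until the single final multiplication.
    """
    nanodegrees = offset + granularity * raw
    return 1e-9 * nanodegrees
```

With these assumptions, encode_coordinate("60.51564") gives 605156400, and formatting decode_coordinate(605156400) to seven decimal places gives back 60.5156400. The only floating-point step is the last multiplication, so any error is far below the seven decimal places that the XML format carries.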

Hi Scott
The results should be identical comparing OSM versus PBF with or without that flag. Converting from osm to pbf with the default flags should preserve everything in the original OSM file, including precision of coordinates, element order, metadata, tags, timestamps, etc. (The format offers some options that produce smaller file sizes at the cost of not preserving everything, but those are not on by default.) If there are any differences between maps with and without --preserve-element-order, that is something related to mkgmap, not PBF.
Yes, preserve-element-order is a mkgmap option. With the option a linked hash map is used, so that the order of the elements in the output is identical to that of the input. Without the option the order of the elements in the output is determined by the iteration order of a hash map and thus may depend on the implementation differences between the pbf and osm readers.
But now you mention it I can't think of why there should be a difference, since all the code that matters is common between the two. I'll take a look.
..Steve

On Tue, Oct 19, 2010 at 11:13:37PM +0100, Steve Ratcliffe wrote:
But now you mention it I can't think of why there should be a difference, since all the code that matters is common between the two. I'll take a look.
Are the .osm.bz2 and .osm.pbf files identical to begin with? Today, Geofabrik offers files with quite different timestamps:
    portugal.osm.bz2 20-Oct-2010 00:20 13M
    portugal.osm.pbf 20-Oct-2010 07:17 7.1M
Is there a tool that converts .osm.bz2 and .osm.pbf to a canonical format? If there is, then you could compare the canonical formats to see if the files are truly equivalent.
Marko

On 20/10/10 11:04, Marko Mäkelä wrote:
Are the .osm.bz2 and .osm.pbf files identical to begin with? Today, Geofabrik offers files with quite different timestamps:
    portugal.osm.bz2 20-Oct-2010 00:20 13M
    portugal.osm.pbf 20-Oct-2010 07:17 7.1M
I can't speak for the equivalence of the geofabrik extracts, but for my recent test I downloaded the pbf and used osmosis to convert between the formats. You can use:
    osmosis --read-pbf foo.osm.pbf --write-xml foo.osm.gz
    osmosis --read-xml foo.osm.gz --write-pbf foo.osm.pbf
I also do not doubt that the produced maps contain the same elements; it's just that they are in a different order depending on the input file. Since my previous message I have discovered one reason why this might be so, but there are still more reasons to be found.
Is there a tool that converts .osm.bz2 and .osm.pbf to a canonical format? If there is, then you could compare the canonical formats to see if the files are truly equivalent.
Marko

On 19/10/10 22:06, Steve Ratcliffe wrote:
That is interesting.
If the .osm and .osm.pbf contain the same data then mkgmap should produce exactly the same map in both cases ignoring timestamps if you add --preserve-element-order in both cases. In the cases I tested this was true.
If it doesn't then it is a bug.
Now the fact that if you don't have --preserve-element-order there could be differences in the order of the elements within the maps and I suppose that it could affect the routing. If so that would be very interesting and might lead to improvements in routing in general.
I have repeated the test with today's portugal osm and pbf files from geofabrik and these are the results:
- Calculated routes are the same with or without --preserve-element-order for each osm pair and pbf pair.
- 2 of 3 tested routes are worse with the pbf generated map.
- The pbf generated map is slightly smaller than the osm one (11.3 vs 11.4 MB), so it seems that some information may be missing in the pbf map.

What version of mkgmap was this with? I have made a change today that ensures that the output no longer depends on --preserve-element-order, for identical input files.
Here is what I get with the following files and with mkgmap-r1719:
I downloaded the two files:
    portugal.osm.pbf (dated 20-Oct-2010 07:17)
    portugal.osm.bz2 (dated 20-Oct-2010 11:21)
from geofabrik and after converting the pbf to the .osm I verified that they were the same. The only difference was the generator and origin attributes due to differing versions of osmosis.
I then converted each file with
    mkgmap --route --remove-short-arcs
The resulting maps were the same apart from the timestamps.
Could you post the exact command line you were using? I assume that any difference must be down to one of the other options.
..Steve
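The kind of equivalence check described in this message can be sketched in Python as follows. It is not the procedure Steve actually used. It assumes both files have already been converted to uncompressed OSM XML and are small enough to parse in memory. It ignores the attributes of the root element (where generator and origin live) and the order of top-level elements, but keeps the order of children such as a way's nd references. The names canonical_elements and same_osm_data are hypothetical.

```python
import xml.etree.ElementTree as ET


def canonical_elements(source):
    """Return a sorted list of canonical tuples for the top-level OSM elements.

    Root attributes (version, generator, origin) are deliberately ignored.
    """
    root = ET.parse(source).getroot()
    items = []
    for element in root:
        # Children (nd, member, tag) keep their order; attributes are sorted.
        children = tuple(
            (child.tag, tuple(sorted(child.attrib.items()))) for child in element
        )
        items.append((element.tag, tuple(sorted(element.attrib.items())), children))
    return sorted(items)


def same_osm_data(source_a, source_b):
    """Compare two OSM XML files (paths or binary file objects) for equal content."""
    return canonical_elements(source_a) == canonical_elements(source_b)
```

A call such as same_osm_data("portugal.osm", "portugal_from_pbf.osm") (file names illustrative) would then return True when only the root attributes or the element order differ. For a whole-country extract a streaming parser would likely be needed instead of loading both files at once.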

On 20/10/10 17:15, Steve Ratcliffe wrote:
What version of mkgmap was this with?
r1719
I have made a change today that ensures that the output no longer depends on --preserve-element-order, for identical input files.
Here is what I get with the following files and with mkgmap-r1719:
I downloaded the two files: portugal.osm.pbf (dated 20-Oct-2010 07:17) portugal.osm.bz2 (dated 20-Oct-2010 11:21)
from geofabrik and after converting the pbf to the .osm I verified that they were the same. The only difference was the generator and origin attributes due to differing version of osmosis.
I then converted each file with mkgmap --route --remove-short-arcs The resulting maps were the same apart from the timestamps.
Could you post the exact command line you were using? I assume that any difference must be down to one of the other options.
Commands are the same in both cases, with the only difference in the content of the files passed by -c (see below portugal.args and portugal_pbf.args)
    java -Xmx600m -enableassertions -Dlog.config=logging.properties -jar mkgmap.jar \
      --generate-sea=polygons,extend-sea-sectors \
      --route \
      --tdbfile \
      --latin1 \
      --code-page=1252 \
      --gmapsupp \
      --series-name="OSM-Portugal" \
      --index \
      --road-name-pois \
      --ignore-maxspeeds \
      --remove-short-arcs \
      --add-pois-to-areas \
      --adjust-turn-headings \
      --report-similar-arcs \
      --link-pois-to-ways \
      --location-autofill=1 \
      --drive-on-right \
      --check-roundabouts \
      --check-roundabout-flares \
      --style=mio \
      typ/PORTU-22.TYP \
      -c portugal.args
args files are also the same, with the only difference in the input file:
    product-id=1
    family-id=22
    family-name=OSM Portugal
    country-name=PORTUGAL
    country-abbr=POR
    area-name=Portugal
    mapname: 63240006
    description: PT-Lisboa
    input-file: portugal.osm / input-file: portugal.osm.pbf

Hi
Thanks, by process of elimination I found that
--link-pois-to-ways \
causes the files to be different. This deals with nodes that have an access, barrier or highway tag, so could well affect routing. Does that make sense given the sub-optimal routes you see?
This is in code that is common to both file formats, so it's not immediately obvious why there should be a difference, but I will look at it some more tomorrow.
..Steve

On 21/10/10 00:09, steve@parabola.me.uk wrote:
Hi
--link-pois-to-ways \
I couldn't resist looking at it, and I have fixed the issue that caused the difference with this option.
Could you check the routing again?
Great!!! Now routing works as expected. Thanks for the fix.
participants (7): Adrian, Carlos Dávila, Felix Hartmann, Marko Mäkelä, Scott Crosby, Steve Ratcliffe, steve@parabola.me.uk