Commit: r1566: Drop all tags from the osm file that are not used

Version 1566 was committed by steve on 2010-02-08 12:33:09 +0000 (Mon, 08 Feb 2010)
BRANCH: style-speed
Drop all tags from the osm file that are not used in the applied style.
Whether this makes a big difference or not depends on the country. In the UK, which is mostly manually tagged, it doesn't make a great deal of difference: there are only an average of 2 tags per way and about zero per node.
In countries with data imports, there are typically many tags per way (and even per node) that refer back to the original data. Having tags on nodes is particularly bad for memory consumption since there are more nodes than ways and most are not POIs. We are forced to allocate a Tags object even though there is likely to be nothing to do as none of the tags are used.
Denmark is the worst in this respect and so will show the best improvement with this patch. See also: http://www.mkgmap.org.uk/pipermail/mkgmap-dev/2009q3/003597.html
For this to be successful you need an accurate list of all the tags that could be used. Tags are examined not just in the style itself; there are also hardwired tag lookups in the style system and non-style related usages as well. For the moment all the built-in usages of tags are held in the builtin-tag-list file. I think I have got them all for normal use, but there may be options that require ones I've missed.
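As a rough illustration of the idea in the commit message, the sketch below filters an element's tags against a set of keys that the style is known to use, and only allocates a container when at least one tag survives. It is not mkgmap's code: the class name TagFilterSketch, the method keepUsedTags and the example tag keys are all hypothetical, and a plain Map stands in for mkgmap's Tags object.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; none of these names come from mkgmap.
final class TagFilterSketch {

    /**
     * Returns the tags whose key appears in usedKeys, or null when none
     * survive, so the caller can avoid keeping any tag container at all.
     */
    static Map<String, String> keepUsedTags(Map<String, String> tags, Set<String> usedKeys) {
        Map<String, String> kept = null;
        for (Map.Entry<String, String> e : tags.entrySet()) {
            if (usedKeys.contains(e.getKey())) {
                if (kept == null) {
                    kept = new LinkedHashMap<>(); // allocate lazily, only when needed
                }
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed list of keys used by the style plus built-in lookups.
        Set<String> usedKeys = Set.of("highway", "name");

        // A node carrying only import bookkeeping tags: nothing is kept.
        Map<String, String> importedNode = new LinkedHashMap<>();
        importedNode.put("source", "some_import");
        importedNode.put("created_by", "some_tool");
        System.out.println(keepUsedTags(importedNode, usedKeys)); // prints "null"

        // A way with one used tag and one unused tag.
        Map<String, String> way = new LinkedHashMap<>();
        way.put("highway", "residential");
        way.put("source", "some_import");
        System.out.println(keepUsedTags(way, usedKeys)); // prints "{highway=residential}"
    }
}
```

The benefit depends entirely on usedKeys being complete: a key missing from that set is dropped silently, which is why the list of built-in tag usages matters.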

svn commit (svn@mkgmap.org.uk) wrote:
Version 1566 was committed by steve on 2010-02-08 12:33:09 +0000 (Mon, 08 Feb 2010) BRANCH: style-speed
[SNIP]
Denmark is the worst in this respect and so will show the best improvement with this patch. See also: http://www.mkgmap.org.uk/pipermail/mkgmap-dev/2009q3/003597.html
Ah, so this probably explains why, when splitting Denmark, I have to use a much, much smaller --max-nodes than for other countries.
-- Charlie

What might be nice is if this stripping can be done before (or during) the split. That way the splitter would have to do less work, and possibly would also be able to perform a better split. The downside would be that any style changes might necessitate another splitter run.
Would it be difficult to refactor out the filtering logic from mkgmap so splitter could use it? Thoughts?
Chris

On Mon, Feb 08, 2010 at 01:30:13PM +0000, Chris Miller wrote:
What might be nice is if this stripping can be done before (or during) the split. That way the splitter would have to do less work, and possibly would also be able to perform a better split. The downside would be that any style changes might necessitate another splitter run. Would it be difficult to refactor out the filtering logic from mkgmap so splitter could use it? Thoughts?
What about this:
    mkgmap --style=some_style --generate-whitelist > whitelist.txt
    splitter --whitelist=whitelist.txt ...
    mkgmap --style=some_style ...
Marko

MM> What about this:
MM>     mkgmap --style=some_style --generate-whitelist > whitelist.txt
MM>     splitter --whitelist=whitelist.txt ...
MM>     mkgmap --style=some_style ...
MM> Marko
Sounds good to me, at least as far as the splitter is concerned. Certainly easier than trying to share the mkgmap code.

On 08.02.2010 14:35, Marko Mäkelä wrote:
    mkgmap --style=some_style --generate-whitelist > whitelist.txt
    splitter --whitelist=whitelist.txt ...
    mkgmap --style=some_style ...
What about that new 'geotagman' OSM pre-processor tool which Toby Speight presented here a few days ago? This looks like an ideal candidate for this kind of work.

In article <4B706623.1080405@kleineisel.de>, Ralf Kleineisel ("Ralf") wrote:
Ralf> On 08.02.2010 14:35, Marko Mäkelä wrote:
    mkgmap --style=some_style --generate-whitelist > whitelist.txt
    splitter --whitelist=whitelist.txt ...
    mkgmap --style=some_style ...
Ralf> What about that new 'geotagman' OSM pre-processor tool which Toby Speight presented here a few days ago? This looks like an ideal candidate for this kind of work.
Yes it is, but the harder problem is getting the list of tags we want to keep.

Seems to work well for me. For Austria I noticed about a 5% decrease in compilation time (it seems that the dreaded plan.at import is being cleaned up more and more...). The drop for Denmark is of course even bigger (around a 35% decrease in time).

The Osm5XMLHandler sometimes throws a NullPointerException in line 397. This is the key.equals("highway") part:
    if ((val.equals("motorway_junction") || val.equals("services")) && key.equals("highway")) {
        exits.add(currentNode);
        currentNode.addTag("osm:id", "" + currentElementId);
    }
It might be fixed by changing it to "highway".equals(key).
WanMil

On 11.02.2010 19:59, WanMil wrote:
The Osm5XMLHandler sometimes throws a NullPointerException in line 397. This is the key.equals("highway") part:
    if ((val.equals("motorway_junction") || val.equals("services")) && key.equals("highway")) {
        exits.add(currentNode);
        currentNode.addTag("osm:id", "" + currentElementId);
    }
It might be fixed by changing it to "highway".equals(key).
Did you run "ant dist clean"? I had the same problem when just using "ant dist make"! It's best to always run "dist clean" before compiling in ant.

On Thu, Feb 11, 2010 at 07:59:01PM +0100, WanMil wrote:
The Osm5XMLHandler sometimes throws a NullPointerException in line 397. This is the key.equals("highway") part:
    if ((val.equals("motorway_junction") || val.equals("services")) && key.equals("highway")) {
        exits.add(currentNode);
        currentNode.addTag("osm:id", "" + currentElementId);
    }
It might be fixed by changing it to "highway".equals(key).
Right, java.lang.Object.equals(Object other) specifically says that you can pass other=null and the result will be false. On the other hand, invoking a method on a null reference will throw a NullPointerException.
Marko
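To make the difference concrete, here is a small self-contained sketch of the two call orders. The class NullSafeEqualsSketch and the helper isExit are hypothetical names for illustration only; they are not the Osm5XMLHandler code, and the sketch says nothing about why key was null there.

```java
// Hypothetical sketch only; not taken from mkgmap.
final class NullSafeEqualsSketch {

    /**
     * Null-safe version of the exit check: the string literals are the
     * receivers, so a null key or val simply yields false.
     */
    static boolean isExit(String key, String val) {
        return "highway".equals(key)
                && ("motorway_junction".equals(val) || "services".equals(val));
    }

    public static void main(String[] args) {
        String key = null;

        // Literal as receiver: equals(null) returns false.
        System.out.println("highway".equals(key)); // prints "false"

        // Null reference as receiver: the call itself throws.
        try {
            key.equals("highway");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException"); // prints "NullPointerException"
        }

        System.out.println(isExit("highway", "services")); // prints "true"
        System.out.println(isExit(null, "services"));      // prints "false"
    }
}
```

Putting the literal first also covers val, which the original condition dereferences before key is ever tested.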

After the commit it's working! The speed improvements are great (I usually create maps using only very few tags).
WanMil
participants (8)
- charlie@cferrero.net
- Chris Miller
- Felix Hartmann
- Marko Mäkelä
- Ralf Kleineisel
- svn commit
- Toby Speight
- WanMil