Making splitter and MultiPolygon code play together

Hi WanMil,

I am considering downloading a bigger map extract and cutting it into rectangular tiles with splitter, so that I can work around the issues with Geofabrik boundaries.

Do you have an idea how to fix the problems at splitter tile boundaries, such as these when splitting at lat=62.226562:

http://www.openstreetmap.org/browse/relation/302897
http://www.openstreetmap.org/browse/relation/306274
http://www.openstreetmap.org/browse/relation/311221

Marko

Ideally, splitter should keep the whole relation in each tile that contains nodes or ways of this relation. mkgmap can fail on incomplete relations.

On Tue, Feb 2, 2010 at 1:28 AM, Marko Mäkelä <marko.makela@iki.fi> wrote:
Hi WanMil,
I am considering to download a bigger map extract and cutting it to rectangular tiles with splitter, so that I can work around the issues with Geofabrik boundaries.
Do you have an idea how to fix the problems at splitter tile boundaries, such as these when splitting at lat=62.226562:
http://www.openstreetmap.org/browse/relation/302897 http://www.openstreetmap.org/browse/relation/306274 http://www.openstreetmap.org/browse/relation/311221
Marko
_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi Apollinaris,

On Tue, Feb 02, 2010 at 02:35:26AM -0800, Apollinaris Schoell wrote:
ideally splitter should keep the whole relation in each tile which contains nodes/ways of this relation. mkgmap can fail on incomplete relations
Currently, splitter seems to preserve the relation, but it may discard relation members or nodes constituting ways in the relation.

I investigated http://www.openstreetmap.org/browse/relation/311221 that mkgmap was complaining about. In the tile file, the multipolygon is defined as at http://www.openstreetmap.org/browse/relation/311221 but the Palosaari island http://www.openstreetmap.org/browse/way/43829168 is not defined in the tile file, because it is too far from the tile boundaries (tile:lat<62.226562, island:lat>62.274). Because mkgmap ignores unresolvable relation member references, it will treat the multipolygon as if it only had the lake in role=outer and no other members.

The lake polygon is intact in the tile file:

  <way id='4717407'>
    <nd ref='30037150'/>
    <nd ref='335160092'/>
    ...
    <tag k='name' v='Palokkajärvi'/>
    <tag k='natural' v='water'/>
  </way>

This matches what I got when downloading and saving in JOSM the following point at the coastline:
http://www.openstreetmap.org/?lat=62.2807084&lon=25.7284731&zoom=18

Last, I grepped the node ids from the tile file. Only 46 of the 182 nodes are defined in the tile file. The northmost included point is this one:

  <node id='30037138' lat='62.2692188' lon='25.7421118'/>

Interestingly, the entire lake falls outside the tile boundaries. The tile is 59.414063,19.116211 to 62.226562,31.596680 but the south tip of the lake is lat=62.2605.

I can suggest two solutions to this issue:

* splitter should flag and preserve all nodes that belong to ways that belong to multipolygon relations (I would not care about route relations, for example)
* mkgmap should discard multipolygon relations that consist of only one way

Best regards,
Marko
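The coordinates quoted in the message above can be checked with a few lines of arithmetic. The following is only a sketch: the `BBox` type and the helper `degrees_north_of_tile` are hypothetical names made up for illustration and are not part of splitter or mkgmap. The numbers are the ones given in the message.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Tile bounds in degrees: south-west corner, then north-east corner."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        """True when the point lies inside or on the edge of the box."""
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Tile bounds as quoted: 59.414063,19.116211 to 62.226562,31.596680
TILE = BBox(59.414063, 19.116211, 62.226562, 31.596680)

# Values quoted in the message.
NORTHMOST_NODE = (62.2692188, 25.7421118)  # node 30037138
LAKE_SOUTH_TIP_LAT = 62.2605
ISLAND_MIN_LAT = 62.274                    # "island:lat>62.274"


def degrees_north_of_tile(lat: float, tile: BBox = TILE) -> float:
    """Positive when lat lies north of the tile's northern edge."""
    return lat - tile.max_lat


if __name__ == "__main__":
    # The northmost node found in the tile file is itself outside the tile.
    print("node 30037138 inside tile:", TILE.contains(*NORTHMOST_NODE))
    for label, lat in [("lake south tip", LAKE_SOUTH_TIP_LAT),
                       ("island", ISLAND_MIN_LAT)]:
        print(label, "is north of the tile:", degrees_north_of_tile(lat) > 0)
```

Both the lake and the island lie north of the tile's northern edge at lat=62.226562, which is consistent with the observation that the whole lake falls outside the tile.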

On Tue, Feb 02, 2010 at 11:34:50PM +0200, Marko Mäkelä wrote:
I can suggest two solutions to this issue:
* splitter should flag and preserve all nodes that belong to ways that belong to multipolygon relations (I would not care about route relations, for example)
* mkgmap should discard multipolygon relations that consist of only one way
I implemented the latter in r1555, and the fix makes the warnings for one of the three multipolygons go away. Splitter should still be fixed.

A fairly cheap work-around in mkgmap could be to discard those ways whose resolvable nodes completely fall outside the bounding box when some of the nodes cannot be resolved.

Marko
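The idea of discarding single-way multipolygons could look roughly like the sketch below. This is not the r1555 code (mkgmap is written in Java and its actual change is not shown in this thread). The dictionary layout, the function names, and the `inner` role of the island are assumptions made for illustration. The sketch also assumes that "only one way" is counted after unresolvable members have been dropped.

```python
def resolvable_way_members(relation, known_way_ids):
    """Return the way members whose ids are present in the tile."""
    return [m for m in relation["members"]
            if m["type"] == "way" and m["ref"] in known_way_ids]


def keep_multipolygon(relation, known_way_ids):
    """Decide whether a relation is worth processing as a multipolygon.

    Relations of other types are left alone. A multipolygon is kept only
    if more than one of its way members can be resolved in this tile.
    """
    if relation["tags"].get("type") != "multipolygon":
        return True
    return len(resolvable_way_members(relation, known_way_ids)) > 1


# Illustrative data modelled on relation 311221: the lake way is present
# in the tile, the island way is not. The "inner" role is an assumption.
relation_311221 = {
    "id": 311221,
    "tags": {"type": "multipolygon"},
    "members": [
        {"type": "way", "ref": 4717407, "role": "outer"},   # lake
        {"type": "way", "ref": 43829168, "role": "inner"},  # island
    ],
}
ways_in_tile = {4717407}

if __name__ == "__main__":
    print(keep_multipolygon(relation_311221, ways_in_tile))
```

With only the lake way resolvable, the relation is dropped and the lake would be handled as an ordinary way.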

It's not a straightforward fix in the splitter; however, I'll see what I can do. I think if I make the cache generation compulsory it will be possible to handle this without too serious an impact on performance.

Chris

Hi Chris,
It's not a straightforward fix in the splitter however I'll see what I can do. I think if I make the cache generation compulsory it will be possible to handle this without too serious an impact on performance.
I understand that this would require deferring the writing of the nodes until the whole input (nodes, ways, and relations) has been consumed. I would greatly appreciate a fix. After that, it would be time to have the Geofabrik dumps corrected.

By the way, I think that you should restrict this inclusion of all nodes only to select relation types (only multipolygons come to my mind). For instance, route relations (such as the international E road network) should be clipped at the tile borders.

For what it is worth, here is my attempt at implementing this workaround:

MM> A fairly cheap work-around in mkgmap could be to discard those ways
MM> whose resolvable nodes completely fall outside the bounding box when
MM> some of the nodes cannot be resolved.

It did not make the warnings for the two other multipolygons go away.

Marko
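The check described in the quoted workaround can be sketched as follows. This is not the patch that was attached to the message, which is not reproduced in this archive. The function names, the tuple layout of the bounding box, and the placeholder node ids 1 and 2 are hypothetical.

```python
def inside(bbox, lat, lon):
    """bbox is (min_lat, min_lon, max_lat, max_lon) in degrees."""
    min_lat, min_lon, max_lat, max_lon = bbox
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def should_discard_way(node_refs, nodes, bbox):
    """Apply the workaround to one way.

    node_refs: node ids referenced by the way, in order.
    nodes:     dict mapping the node ids present in the tile to (lat, lon).

    The way is discarded only if at least one node cannot be resolved
    and none of the resolvable nodes lies inside the bounding box.
    A way with no resolvable nodes at all is discarded as well.
    """
    resolved = [nodes[ref] for ref in node_refs if ref in nodes]
    if len(resolved) == len(node_refs):
        return False  # complete way: leave it to normal clipping
    return not any(inside(bbox, lat, lon) for lat, lon in resolved)


# Tile bounds and node 30037138 as quoted earlier in the thread.
# Ids 1 and 2 are placeholders standing in for unresolved nodes.
tile = (59.414063, 19.116211, 62.226562, 31.596680)
nodes_in_tile = {30037138: (62.2692188, 25.7421118)}

if __name__ == "__main__":
    print(should_discard_way([1, 30037138, 2], nodes_in_tile, tile))
```

A partially resolved way that still has a node inside the tile is kept, so this check alone does not help when an incomplete member really does reach into the tile.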

Hi Marko,

MM> I understand that this would require deferring the writing of the nodes
MM> until the whole input (nodes, ways, and relations) has been consumed.

Currently if either or both of the --mixed and --cache parameters are supplied to the splitter, a complete pass is made over all nodes/ways/rels anyway, so during this pass it should be (almost) possible to determine which nodes belong to multipolygons and therefore need special handling.

I'm thinking the best thing to do is to make the cache compulsory (which in turn would make --mixed redundant). Once the cache is generated and all the multipolygons have been found, an additional pass can be made over the ways cache file to determine which nodes fall in which multipolygons and deal with them accordingly. Without a compulsory cache in place this would be very expensive.

The upside to a compulsory cache is that the code doesn't get too messy and performance doesn't suffer much, plus there will likely be other benefits in the future too. The downside is that a chunk of disk space will always be required by the splitter for writing the cache.

Does anyone have any objections to this? If not I'll take a look sometime in the next few days. I'll also look at fixing the lack of support in the splitter for relations containing other relations.

MM> By the way, I think that you should restrict this inclusion of all nodes
MM> only to select relation types (only multipolygons come to my mind).

OK, I'll do that for starters and see where that gets us. We can always enhance the logic in the future if need be.

Chris
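The two passes described above might be sketched like this. It is a simplified illustration under stated assumptions, not splitter's implementation: splitter is a Java program with its own cache format, while the in-memory lists, dictionary layout, and function names here are hypothetical. Relations that contain other relations are not handled.

```python
def multipolygon_way_ids(relations):
    """Pass 1: collect the ids of ways that are members of a multipolygon.

    Other relation types (routes, for example) are skipped, so their ways
    would still be clipped at the tile borders.
    """
    way_ids = set()
    for rel in relations:
        if rel["tags"].get("type") != "multipolygon":
            continue
        way_ids.update(m["ref"] for m in rel["members"] if m["type"] == "way")
    return way_ids


def nodes_to_preserve(ways, way_ids):
    """Pass 2: collect every node id referenced by the flagged ways."""
    node_ids = set()
    for way in ways:
        if way["id"] in way_ids:
            node_ids.update(way["nodes"])
    return node_ids


# Illustrative input. Way and relation ids 1, 2 and the route are made up;
# way 4717407 and its first two node refs appear earlier in the thread.
relations = [
    {"id": 311221, "tags": {"type": "multipolygon"},
     "members": [{"type": "way", "ref": 4717407, "role": "outer"}]},
    {"id": 1, "tags": {"type": "route"},
     "members": [{"type": "way", "ref": 2, "role": ""}]},
]
ways = [
    {"id": 4717407, "nodes": [30037150, 335160092, 30037150]},
    {"id": 2, "nodes": [100, 101]},
]

if __name__ == "__main__":
    flagged = multipolygon_way_ids(relations)
    print(sorted(nodes_to_preserve(ways, flagged)))
```

The resulting node set is what a later node-writing step would consult so that every node of a multipolygon way is written to each tile that receives the way.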

Chris Miller wrote:
Does anyone have any objections to this? If not I'll take a look sometime in the next few days. I'll also look at fixing the lack of support in the splitter for relations containing other relations.

In all my struggles to deal with osm and mapmaking, disk space has been the least of my troubles. I have the impression that the cache is significantly smaller than the osm.bz2 input file, in which case always having a disk cache seems ok.

Chris Miller wrote:
Does anyone have any objections to this? If not I'll take a look sometime in the next few days. I'll also look at fixing the lack of support in the splitter for relations containing other relations.
No objections here. Disk space due to cache isn't really a problem, even when processing the entire planet file.

That will be a great enhancement; disk space doesn't matter at all.

On Wed, Feb 3, 2010 at 5:48 AM, Chris Miller <chris.miller@kbcfp.com> wrote:
Does anyone have any objections to this? If not I'll take a look sometime in the next few days. I'll also look at fixing the lack of support in the splitter for relations containing other relations.
participants (5)
- Apollinaris Schoell
- Chris Miller
- Greg Troxel
- Lambertus
- Marko Mäkelä