Bug in LocationHook?

GerdP

6 Jan 2012 6 Jan '12

10:26 a.m.

WanMil, LocationHook checks these hard coded tag names: private static final String[] LEVEL2_NAMES = new String[]{"name","name:en","int_name"}; but it doesn't add them in getUsedTags() as long as the user doesn't set them in parameter name-tag-list. Is this intended? Ciao, Gerd -- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7157897.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

Show replies by date

WanMil

6 Jan 6 Jan

6:45 p.m.

Gerd, it is intended. The tag names are used only to retrieve the name from the precompiled boundaries. Loading precompiled boundaries does not care about the getUsedTags() information so it is correct not to add them to this list. WanMil

...

WanMil,

LocationHook checks these hard coded tag names: private static final String[] LEVEL2_NAMES = new String[]{"name","name:en","int_name"};

but it doesn't add them in getUsedTags() as long as the user doesn't set them in parameter name-tag-list.

Is this intended?

Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7157897.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd Petermann

7 Jan 7 Jan

10:52 a.m.

Hi WanMil, okay, I see. By the way: I gave up changing the BoundaryPreparer regarding those admin_level=7 special cases. I got lost in too many complex special cases. So I started again lookin at LocationHook, and I think I found a way to speedup the processing by re-creating the quadtree when the processing of a complete admin_level did not change any element. The new quadtree contains typically < 1000 elements then. I just have to clean the code... Gerd

...

Date: Fri, 6 Jan 2012 19:45:58 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

Gerd,

it is intended. The tag names are used only to retrieve the name from the precompiled boundaries. Loading precompiled boundaries does not care about the getUsedTags() information so it is correct not to add them to this list.

WanMil

...
WanMil,

LocationHook checks these hard coded tag names: private static final String[] LEVEL2_NAMES = new String[]{"name","name:en","int_name"};

but it doesn't add them in getUsedTags() as long as the user doesn't set them in parameter name-tag-list.

Is this intended?

Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7157897.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

WanMil

11:33 a.m.

Hi Gerd, that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case? I am happy to see new and interesting ideas to improve mkgmap so I am curious for your patch :-) WanMil

...

Hi WanMil,

okay, I see. By the way: I gave up changing the BoundaryPreparer regarding those admin_level=7 special cases. I got lost in too many complex special cases. So I started again lookin at LocationHook, and I think I found a way to speedup the processing by re-creating the quadtree when the processing of a complete admin_level did not change any element. The new quadtree contains typically < 1000 elements then. I just have to clean the code...

Gerd

...
Date: Fri, 6 Jan 2012 19:45:58 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

Gerd,

it is intended. The tag names are used only to retrieve the name from the precompiled boundaries. Loading precompiled boundaries does not care about the getUsedTags() information so it is correct not to add them to this list.

WanMil

...
WanMil,

LocationHook checks these hard coded tag names: private static final String[] LEVEL2_NAMES = new String[]{"name","name:en","int_name"};

but it doesn't add them in getUsedTags() as long as the user doesn't set them in parameter name-tag-list.

Is this intended?

Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7157897.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

GerdP

2:21 p.m.

Hi, WanMil wrote

...

that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced. Ciao, Gerd -- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

WanMil

8:32 p.m.

...

Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook. I will give it a try within the next days. WanMil

...

Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

GerdP

8:58 p.m.

Hi WanMil, I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that. Ciao, Gerd WanMil wrote

...

...
Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

GerdP

9:01 p.m.

Oops, that was meant as an answer on http://gis.638310.n2.nabble.com/Question-reg-LocationHook-and-incomplete-dat... Gerd GerdP wrote

...

Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote

...
...
Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162657.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

WanMil

8 Jan 8 Jan

7:16 a.m.

Hi Gerd, I don't see a problem just right now. The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree. Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree. WanMil

...

Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote

...
...
Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

GerdP

7:54 a.m.

WanMil, the problem is this: We 1st process all boundaries with postcode-only tag. That's ok. We start processing boundaries that have an admin-level tag. The first boundary that has an admin-level tag but no postcode-tag switches the currLevel from postcode to something starting with mkgmap:admin-level, but we still may have other boundaries that have the postcode tag coming later. So, I think we must first process all boundaries that have a postcode tag. Gerd WanMil wrote

...

Hi Gerd,

I don't see a problem just right now.

The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree.

Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree.

WanMil

...
Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote

...
...
Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164147.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

WanMil

7:59 a.m.

No. An element can lie only in one admin_level with each number bound (one for admin_level=11,10,9,8,7,etc.). There can be only one bound tagged with postcode. If an admin_level is tagged with a postcode there must be no other bound at the same position tagged with postcode. That means it is sufficient to check for the admin_level. That's tricky but it works if the input data is correct. WanMil

...

WanMil,

the problem is this: We 1st process all boundaries with postcode-only tag. That's ok. We start processing boundaries that have an admin-level tag. The first boundary that has an admin-level tag but no postcode-tag switches the currLevel from postcode to something starting with mkgmap:admin-level, but we still may have other boundaries that have the postcode tag coming later. So, I think we must first process all boundaries that have a postcode tag.

Gerd

WanMil wrote

...
Hi Gerd,

I don't see a problem just right now.

The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree.

Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree.

WanMil

...
Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote

...
...
Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164147.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

GerdP

8:06 a.m.

Hi WanMil, okay, I try to find the source of the error: OSM or LocationHook ;-) Gerd WanMil wrote

...

No. An element can lie only in one admin_level with each number bound (one for admin_level=11,10,9,8,7,etc.). There can be only one bound tagged with postcode. If an admin_level is tagged with a postcode there must be no other bound at the same position tagged with postcode. That means it is sufficient to check for the admin_level.

That's tricky but it works if the input data is correct.

WanMil

...
WanMil,

the problem is this: We 1st process all boundaries with postcode-only tag. That's ok. We start processing boundaries that have an admin-level tag. The first boundary that has an admin-level tag but no postcode-tag switches the currLevel from postcode to something starting with mkgmap:admin-level, but we still may have other boundaries that have the postcode tag coming later. So, I think we must first process all boundaries that have a postcode tag.

Gerd

WanMil wrote

...
Hi Gerd,

I don't see a problem just right now.

The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree.

Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree.

WanMil

...
Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote

...
...
Hi,

WanMil wrote > > that sounds strange. I think recreating the quadtree would only be > benficial if recreating takes less time than removing elements. I > wonder > if that's the case? >

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164147.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164181.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

GerdP

8:35 a.m.

Hi WanMil, here is an example that shows the problem : Way 104052830 lies in two different boundaries with the same admin_level=8: r114493 (Creutzwald) and r1184868 (Völklingen) . I think this is correct, a way can cross boundaries. The way is not found in a postcode-only boundary. For r114493, mkgmap does not find the postcode tag, but for r1184868 it does. mgkmap finds the way first in r114493 when currLevel is admin_level=8 and finds all remaining tags in r114493, so it removes the way from the quadtree. If the way is not removed, we find it again in r1184868 which would set postcode as well. Of course, it makes no sense to set the postcode of r1184868 and the name of r114493, so mkgmap probably works all right. The problem is that it would also be correct to find r1184868 and tag the way with different data, and that makes it hard to compare results of mkgmap if I change the logic so that this happens :-( Anyway: I saw that r114493 contains a tag "addr:postcode=57150" which is ignored by LocationHook. What would be the right way to recognize that as a zip code? Gerd WanMil wrote

...

No. An element can lie only in one admin_level with each number bound (one for admin_level=11,10,9,8,7,etc.). There can be only one bound tagged with postcode. If an admin_level is tagged with a postcode there must be no other bound at the same position tagged with postcode. That means it is sufficient to check for the admin_level.

That's tricky but it works if the input data is correct.

WanMil

...
WanMil,

the problem is this: We 1st process all boundaries with postcode-only tag. That's ok. We start processing boundaries that have an admin-level tag. The first boundary that has an admin-level tag but no postcode-tag switches the currLevel from postcode to something starting with mkgmap:admin-level, but we still may have other boundaries that have the postcode tag coming later. So, I think we must first process all boundaries that have a postcode tag.

Gerd

WanMil wrote

...
Hi Gerd,

I don't see a problem just right now.

The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree.

Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree.

WanMil

...
Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote

...
...
Hi,

WanMil wrote > > that sounds strange. I think recreating the quadtree would only be > benficial if recreating takes less time than removing elements. I > wonder > if that's the case? >

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164147.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164242.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

WanMil

9:34 a.m.

...

Hi WanMil,

here is an example that shows the problem : Way 104052830 lies in two different boundaries with the same admin_level=8: r114493 (Creutzwald) and r1184868 (Völklingen) . I think this is correct, a way can cross boundaries.

The way is not found in a postcode-only boundary.

For r114493, mkgmap does not find the postcode tag, but for r1184868 it does.

mgkmap finds the way first in r114493 when currLevel is admin_level=8 and finds all remaining tags in r114493, so it removes the way from the quadtree.

If the way is not removed, we find it again in r1184868 which would set postcode as well. Of course, it makes no sense to set the postcode of r1184868 and the name of r114493, so mkgmap probably works all right.

The problem is that it would also be correct to find r1184868 and tag the way with different data, and that makes it hard to compare results of mkgmap if I change the logic so that this happens :-(

Ok, that's a very basic problem of the current LocationHook implementation. Elements that belong to more than one boundary are located to one of it. Also if the element only touches on boundary it is possible that it is located to it. It would be possible to change that (which also means ways/polygons needs to be splitted) but the performance would be terrible.

...

Anyway: I saw that r114493 contains a tag "addr:postcode=57150" which is ignored by LocationHook. What would be the right way to recognize that as a zip code?

This must be addressed by the style. LocationHook only adds the mkgmap:XXX tags which can be used by the style. The real assingment is done in the style. (in the default style) There are different rules for each possible tag. The first rule is the most prioritized rule. So if you move the addr:postcode rule before the mkgmap:postcode rule then the addr:postcode tag is preferred.

...

Gerd

WanMil wrote

...
No. An element can lie only in one admin_level with each number bound (one for admin_level=11,10,9,8,7,etc.). There can be only one bound tagged with postcode. If an admin_level is tagged with a postcode there must be no other bound at the same position tagged with postcode. That means it is sufficient to check for the admin_level.

That's tricky but it works if the input data is correct.

WanMil

...
WanMil,

the problem is this: We 1st process all boundaries with postcode-only tag. That's ok. We start processing boundaries that have an admin-level tag. The first boundary that has an admin-level tag but no postcode-tag switches the currLevel from postcode to something starting with mkgmap:admin-level, but we still may have other boundaries that have the postcode tag coming later. So, I think we must first process all boundaries that have a postcode tag.

Gerd

WanMil wrote

...
Hi Gerd,

I don't see a problem just right now.

The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree.

Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree.

WanMil

...
Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote

...
> Hi, > > > WanMil wrote >> >> that sounds strange. I think recreating the quadtree would only be >> benficial if recreating takes less time than removing elements. I >> wonder >> if that's the case? >> > > I think the current quadtree implementation has one problem: As long > as > it > still contains a few elements, a get(...) takes more or less the same > time. > I guess that's because it still calls a lot of intersect() > calculations > to > return a result. > I am still searching for an error which seems to slow down the > program > too > much, but my patch does this: > - It keeps the list elemList > - If currLevel is changed, it checks if the previous level caused any > changes to the elements in the quadTree. If not, the whole elemList > is > tested, all elements that are fully worked out are removed. > If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

> > Ciao, > Gerd > > > -- > View this message in context: > http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html > Sent from the Mkgmap Development mailing list archive at Nabble.com. > _______________________________________________ > mkgmap-dev mailing list > mkgmap-dev@.org > http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164147.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164242.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

GerdP

9:50 a.m.

Hi WanMil, okay, thanks for the details. Reg. addr:postcode: Please note that this tag is set for the boundary, not the way: http://www.openstreetmap.org/browse/relation/114993 The tag "source:addr:postcode = source of postcode is from osm nodes" let's me assume that this is somehow generated and may be used by the LocationHook ? Ciao. Gerd WanMil wrote

...

...
Hi WanMil,

here is an example that shows the problem : Way 104052830 lies in two different boundaries with the same admin_level=8: r114493 (Creutzwald) and r1184868 (Völklingen) . I think this is correct, a way can cross boundaries.

The way is not found in a postcode-only boundary.

For r114493, mkgmap does not find the postcode tag, but for r1184868 it does.

mgkmap finds the way first in r114493 when currLevel is admin_level=8 and finds all remaining tags in r114493, so it removes the way from the quadtree.

If the way is not removed, we find it again in r1184868 which would set postcode as well. Of course, it makes no sense to set the postcode of r1184868 and the name of r114493, so mkgmap probably works all right.

The problem is that it would also be correct to find r1184868 and tag the way with different data, and that makes it hard to compare results of mkgmap if I change the logic so that this happens :-(

Ok, that's a very basic problem of the current LocationHook implementation. Elements that belong to more than one boundary are located to one of it. Also if the element only touches on boundary it is possible that it is located to it. It would be possible to change that (which also means ways/polygons needs to be splitted) but the performance would be terrible.

...
Anyway: I saw that r114493 contains a tag "addr:postcode=57150" which is ignored by LocationHook. What would be the right way to recognize that as a zip code?

This must be addressed by the style. LocationHook only adds the mkgmap:XXX tags which can be used by the style. The real assingment is done in the style. (in the default style) There are different rules for each possible tag. The first rule is the most prioritized rule. So if you move the addr:postcode rule before the mkgmap:postcode rule then the addr:postcode tag is preferred.

...
Gerd

WanMil wrote

...
No. An element can lie only in one admin_level with each number bound (one for admin_level=11,10,9,8,7,etc.). There can be only one bound tagged with postcode. If an admin_level is tagged with a postcode there must be no other bound at the same position tagged with postcode. That means it is sufficient to check for the admin_level.

That's tricky but it works if the input data is correct.

WanMil

...
WanMil,

the problem is this: We 1st process all boundaries with postcode-only tag. That's ok. We start processing boundaries that have an admin-level tag. The first boundary that has an admin-level tag but no postcode-tag switches the currLevel from postcode to something starting with mkgmap:admin-level, but we still may have other boundaries that have the postcode tag coming later. So, I think we must first process all boundaries that have a postcode tag.

Gerd

WanMil wrote

...
Hi Gerd,

I don't see a problem just right now.

The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree.

Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree.

WanMil

...
Hi WanMil,

I think the problem is that we sort the boundaries before analysing the lies_in tag. If we first fill the missing zip code tag (and maybe the admin_level tag as well) from the boundaries found in lies_in, we should not see the problem that elements are removed too early. I'm just trying that.

Ciao, Gerd

WanMil wrote > >> Hi, >> >> >> WanMil wrote >>> >>> that sounds strange. I think recreating the quadtree would only be >>> benficial if recreating takes less time than removing elements. I >>> wonder >>> if that's the case? >>> >> >> I think the current quadtree implementation has one problem: As >> long >> as >> it >> still contains a few elements, a get(...) takes more or less the >> same >> time. >> I guess that's because it still calls a lot of intersect() >> calculations >> to >> return a result. >> I am still searching for an error which seems to slow down the >> program >> too >> much, but my patch does this: >> - It keeps the list elemList >> - If currLevel is changed, it checks if the previous level caused >> any >> changes to the elements in the quadTree. If not, the whole elemList >> is >> tested, all elements that are fully worked out are removed. >> If the remaining list is much smaller, the quadtree is replaced. > > Sounds reasonable. There might be another solution within the > quadtree: > At the moments subtrees are not reduced. At an early stage of > implementation I did have implemented this but it did not have an > advantage. > Now the quadtree uses other datastructures which might be easier to > reduce. So a node which is not a leaf can be reduced after a > successful > remove operation if all childs are leafs and the sum of points in > the > leafs are lower than the maxsize. That should be not too complicted > to > implement and does not require any special handling in the > LocationHook. > > I will give it a try within the next days. > > WanMil > > >> >> Ciao, >> Gerd >> >> >> -- >> View this message in context: >> http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html >> Sent from the Mkgmap Development mailing list archive at >> Nabble.com. >> _______________________________________________ >> mkgmap-dev mailing list >> mkgmap-dev@.org >> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > _______________________________________________ > mkgmap-dev mailing list > mkgmap-dev@.org > http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev >

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164147.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164242.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164352.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

WanMil

10:38 a.m.

Hi Gerd, Oh, I thought the way would be tagged. The problem with postcodes is that the tagging is not very consistent. In Germany lots of zip code areas are relations with boundary=postal_code. I don't know exactly what is the preffered tagging for other countries. In the end we should collect all ways to tag postalcodes and implement them in the precompilation step. This also means that you might get two or more areas with postalcode=12345 and you have to merge them in the precompilation. Because the postal codes are not very well supported by the garmin format mkgmap is able to write I did not continue to improve the handling. Maybe it's worthy to have a deeper look on it. WanMil

...

Hi WanMil,

okay, thanks for the details. Reg. addr:postcode: Please note that this tag is set for the boundary, not the way: http://www.openstreetmap.org/browse/relation/114993 The tag "source:addr:postcode = source of postcode is from osm nodes" let's me assume that this is somehow generated and may be used by the LocationHook ?

Ciao. Gerd

WanMil wrote

...
...
Hi WanMil,

here is an example that shows the problem : Way 104052830 lies in two different boundaries with the same admin_level=8: r114493 (Creutzwald) and r1184868 (Völklingen) . I think this is correct, a way can cross boundaries.

The way is not found in a postcode-only boundary.

For r114493, mkgmap does not find the postcode tag, but for r1184868 it does.

mgkmap finds the way first in r114493 when currLevel is admin_level=8 and finds all remaining tags in r114493, so it removes the way from the quadtree.

If the way is not removed, we find it again in r1184868 which would set postcode as well. Of course, it makes no sense to set the postcode of r1184868 and the name of r114493, so mkgmap probably works all right.

The problem is that it would also be correct to find r1184868 and tag the way with different data, and that makes it hard to compare results of mkgmap if I change the logic so that this happens :-(

Ok, that's a very basic problem of the current LocationHook implementation. Elements that belong to more than one boundary are located to one of it. Also if the element only touches on boundary it is possible that it is located to it. It would be possible to change that (which also means ways/polygons needs to be splitted) but the performance would be terrible.

...
Anyway: I saw that r114493 contains a tag "addr:postcode=57150" which is ignored by LocationHook. What would be the right way to recognize that as a zip code?

This must be addressed by the style. LocationHook only adds the mkgmap:XXX tags which can be used by the style. The real assingment is done in the style. (in the default style) There are different rules for each possible tag. The first rule is the most prioritized rule. So if you move the addr:postcode rule before the mkgmap:postcode rule then the addr:postcode tag is preferred.

...
Gerd

WanMil wrote

...
No. An element can lie only in one admin_level with each number bound (one for admin_level=11,10,9,8,7,etc.). There can be only one bound tagged with postcode. If an admin_level is tagged with a postcode there must be no other bound at the same position tagged with postcode. That means it is sufficient to check for the admin_level.

That's tricky but it works if the input data is correct.

WanMil

...
WanMil,

the problem is this: We 1st process all boundaries with postcode-only tag. That's ok. We start processing boundaries that have an admin-level tag. The first boundary that has an admin-level tag but no postcode-tag switches the currLevel from postcode to something starting with mkgmap:admin-level, but we still may have other boundaries that have the postcode tag coming later. So, I think we must first process all boundaries that have a postcode tag.

Gerd

WanMil wrote

...
Hi Gerd,

I don't see a problem just right now.

The algorithm is as follows: 1. Load the precompiled bounds 2. Sort the bounds in the order 1. all bounds tagged with a postcode tag only 2. all bounds tagged with admin_level=11 3. all bounds tagged with admin_level=10 4. .. 5. all bounds tagged with admin_level=2 3. Go through the bounds and assign all tags and assign the referenced bounds tags also. 4. If an element is tagged with all remaining bounds tags it can be removed from the quadtree.

Now comes the tricky thing: If a boundary with admin_level=4 also has a postcode tag it is ensured that no element without postcode tag is removed from the quadtree. Why? I assume that there must not be two overlapping postcode areas. So tagging admin_level=10 with postcode 12345 and admin_level=9 with postcode 12345 too would be an OSM error. There should be only one postcode area defining the whole area. With this assumption there can be only one bounds tagged with postcode only or one bounds tagged with an admin_level and the postcode. Postcode only bounds are handled first so after that it is only possible that an element lies in an area tagged with admin_level and postcode. So it's sufficient to check that the admin_level is set before removing it from the quadtree.

WanMil

> Hi WanMil, > > I think the problem is that we sort the boundaries before analysing > the > lies_in tag. If we first fill the missing zip code tag (and maybe the > admin_level tag as well) from the boundaries found in lies_in, we > should > not > see the problem that elements are removed too early. > I'm just trying that. > > Ciao, > Gerd > > > WanMil wrote >> >>> Hi, >>> >>> >>> WanMil wrote >>>> >>>> that sounds strange. I think recreating the quadtree would only be >>>> benficial if recreating takes less time than removing elements. I >>>> wonder >>>> if that's the case? >>>> >>> >>> I think the current quadtree implementation has one problem: As >>> long >>> as >>> it >>> still contains a few elements, a get(...) takes more or less the >>> same >>> time. >>> I guess that's because it still calls a lot of intersect() >>> calculations >>> to >>> return a result. >>> I am still searching for an error which seems to slow down the >>> program >>> too >>> much, but my patch does this: >>> - It keeps the list elemList >>> - If currLevel is changed, it checks if the previous level caused >>> any >>> changes to the elements in the quadTree. If not, the whole elemList >>> is >>> tested, all elements that are fully worked out are removed. >>> If the remaining list is much smaller, the quadtree is replaced. >> >> Sounds reasonable. There might be another solution within the >> quadtree: >> At the moments subtrees are not reduced. At an early stage of >> implementation I did have implemented this but it did not have an >> advantage. >> Now the quadtree uses other datastructures which might be easier to >> reduce. So a node which is not a leaf can be reduced after a >> successful >> remove operation if all childs are leafs and the sum of points in >> the >> leafs are lower than the maxsize. That should be not too complicted >> to >> implement and does not require any special handling in the >> LocationHook. >> >> I will give it a try within the next days. >> >> WanMil >> >> >>> >>> Ciao, >>> Gerd >>> >>> >>> -- >>> View this message in context: >>> http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html >>> Sent from the Mkgmap Development mailing list archive at >>> Nabble.com. >>> _______________________________________________ >>> mkgmap-dev mailing list >>> mkgmap-dev@.org >>> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev >> >> _______________________________________________ >> mkgmap-dev mailing list >> mkgmap-dev@.org >> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev >> > > > -- > View this message in context: > http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7162654.html > Sent from the Mkgmap Development mailing list archive at Nabble.com. > _______________________________________________ > mkgmap-dev mailing list > mkgmap-dev@.org > http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164147.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164242.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7164352.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd Petermann

6:31 a.m.

Hello WanMil, Currently we are placing the elements into the quadtree and iterate over the boundaries, actually we place each point of a way into the quadtree. Did you ever consider to do the search the other way around? I mean, put the boundaries into something like a simple grid or quadtree or r-tree and iterate over the elements to search a boundary? I did not do a deep analysis of the complexity of both strategies, but I have the feeling that the latter could be much faster. ciao, Gerd

...

Date: Sat, 7 Jan 2012 21:32:40 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

...
Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I wonder if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

WanMil

7:04 a.m.

Hi Gerd, I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation. I also tried to use other data structures with the help of other libraries (like JSI or JTS). Some of the data structures were little faster than the ElementQuadTree (remember: the old one before our optimizations). But the improvements were too small to use an additional library in mkgmap. WanMil

...

Hello WanMil,

Currently we are placing the elements into the quadtree and iterate over the boundaries, actually we place each point of a way into the quadtree. Did you ever consider to do the search the other way around? I mean, put the boundaries into something like a simple grid or quadtree or r-tree and iterate over the elements to search a boundary? I did not do a deep analysis of the complexity of both strategies, but I have the feeling that the latter could be much faster.

ciao, Gerd

...
Date: Sat, 7 Jan 2012 21:32:40 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

...
Hi,

WanMil wrote

...
that sounds strange. I think recreating the quadtree would only be benficial if recreating takes less time than removing elements. I

wonder

...
if that's the case?

I think the current quadtree implementation has one problem: As long as it still contains a few elements, a get(...) takes more or less the same time. I guess that's because it still calls a lot of intersect() calculations to return a result. I am still searching for an error which seems to slow down the program too much, but my patch does this: - It keeps the list elemList - If currLevel is changed, it checks if the previous level caused any changes to the elements in the quadTree. If not, the whole elemList is tested, all elements that are fully worked out are removed. If the remaining list is much smaller, the quadtree is replaced.

Sounds reasonable. There might be another solution within the quadtree: At the moments subtrees are not reduced. At an early stage of implementation I did have implemented this but it did not have an advantage. Now the quadtree uses other datastructures which might be easier to reduce. So a node which is not a leaf can be reduced after a successful remove operation if all childs are leafs and the sum of points in the leafs are lower than the maxsize. That should be not too complicted to implement and does not require any special handling in the LocationHook.

I will give it a try within the next days.

WanMil

...
Ciao, Gerd

-- View this message in context:

http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7161772.html

...
Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd Petermann

4:47 p.m.

Hi WanMil

...

I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation.

I tried it anyway and found some interesting results: Sometimes it is ~ 20% faster, sometimes it is much slower (maybe 100% ). So, I think it might be a good alternative if we can avoid the bad cases. I created a simple grid that stores all boundaries that intersect with each element (similar to the grid in splitter). This is done quicker than the creation of the quadtree. Using the grid, for each Coord I get a list of boundaries to test: ... Boundary b = currentBoundaries.get(blist.get(i)); if (b.getBbox().contains(co)){ if (b.getArea().contains(co.getLongitude(),co.getLatitude())){ return b; // return the area that contains the point } } ... the performance problem is in the b.getArea().contains(...) For some boundaries, this test performs very slowly (> 1ms for one test), and it seems to be related to the number of points that form the shape of the boundary. Some areas are built with < 20 points, others with > 8000. So, what I need is a quicker contains() or some kind of tree that allows to avoid it. Any idea? Ciao, Gerd

WanMil

9:17 p.m.

...

Hi WanMil

...
I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation.

I tried it anyway and found some interesting results: Sometimes it is ~ 20% faster, sometimes it is much slower (maybe 100% ). So, I think it might be a good alternative if we can avoid the bad cases.

I created a simple grid that stores all boundaries that intersect with each element (similar to the grid in splitter). This is done quicker than the creation of the quadtree. Using the grid, for each Coord I get a list of boundaries to test: ... Boundary b = currentBoundaries.get(blist.get(i)); if (b.getBbox().contains(co)){ if (b.getArea().contains(co.getLongitude(),co.getLatitude())){ return b; // return the area that contains the point } } ...

the performance problem is in the b.getArea().contains(...) For some boundaries, this test performs very slowly (> 1ms for one test), and it seems to be related to the number of points that form the shape of the boundary. Some areas are built with < 20 points, others with > 8000.

So, what I need is a quicker contains() or some kind of tree that allows to avoid it. Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling. You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object. WanMil

...

Ciao, Gerd

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

WanMil

9:36 p.m.

...

...
Any idea?

Another (small) idea would be to improve the raster of the precompiled bounds. At the moment a bounds tile is 50000x50000 Garmin units big (BoundaryUtil.RASTER). I broadly remember that splitter uses a 2^n raster (?). Maybe it's beneficial to change the boundary raster to the splitter raster? WanMil

Marko Mäkelä

9 Jan 9 Jan

7:53 a.m.

On Sun, Jan 08, 2012 at 10:36:13PM +0100, WanMil wrote:

...

I broadly remember that splitter uses a 2^n raster (?). Maybe it's beneficial to change the boundary raster to the splitter raster?

Would it make any sense to combine splitter and mkgmap to a single process? I guess that splitter would still have to write output to disk in many cases, but in others it could directly pipe the tiles to the mkgmap code. Would it help if multipolygons and location info were processed directly in splitter? Marko

GerdP

9:25 a.m.

Hi Marko, I had the same idea while I was optimizing splitter, but I wanted to achive better throughput. I don't know if it also would allow better quality. Regaring the performance of the LocationHook I see no advantage doing this. I think WanMil introduced the precompiled bounds just because the normal input doesn't contain enough information (unclosed boundaries). Ciao, Gerd Marko Mäkelä wrote

...

On Sun, Jan 08, 2012 at 10:36:13PM +0100, WanMil wrote:

...
I broadly remember that splitter uses a 2^n raster (?). Maybe it's beneficial to change the boundary raster to the splitter raster?

Would it make any sense to combine splitter and mkgmap to a single process? I guess that splitter would still have to write output to disk in many cases, but in others it could directly pipe the tiles to the mkgmap code. Would it help if multipolygons and location info were processed directly in splitter?

Marko _______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7167257.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

GerdP

9:04 a.m.

Hi WanMil, WanMil wrote

...

...
...
Any idea?

Another (small) idea would be to improve the raster of the precompiled bounds. At the moment a bounds tile is 50000x50000 Garmin units big (BoundaryUtil.RASTER). I broadly remember that splitter uses a 2^n raster (?). Maybe it's beneficial to change the boundary raster to the splitter raster?

I think it is not needed to change the RASTER constant. Splitter calculates the tile size depending on the max-nodes parameter. It doesn't use fixed tile sizes. You can use --write-kml=xyz.kml to create a kml file which can be displayed in google earth. Gerd -- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7167197.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

Gerd Petermann

8 Jan 8 Jan

10:11 p.m.

Hi WanMil,

...

...
Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling.

Okay, it seems I have to look more closely to this. I did not understand that our quadtree allows to save areas (shapes). Or did I get this wrong? Splitting into subareas seems also to be a good option.

...

You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object.

Okay, I'll look at this as well. Thanks for your help! Gerd

WanMil

10 Jan 10 Jan

9:45 p.m.

...

Hi WanMil,

...
...
Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling.

Okay, it seems I have to look more closely to this. I did not understand that our quadtree allows to save areas (shapes). Or did I get this wrong?

I think you already got it. I can add an area into the quadtree but only the points of the shapes are used for the query. So it's not a real support for areas (and lines).

...

Splitting into subareas seems also to be a good option.

...
You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object.

Okay, I'll look at this as well.

Thanks for your help!

Gerd

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

GerdP

9 Jan 9 Jan

9:31 a.m.

Hi WanMil, I've coded a quadtree that allows to store the boundary areas. Now I always see better throughput, but memory requirement is a bit higher. I think the way is correct, but I'll need some time for testing and fine tuning. Ciao, Gerd WanMil wrote

...

...
Hi WanMil

...
I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation.

I tried it anyway and found some interesting results: Sometimes it is ~ 20% faster, sometimes it is much slower (maybe 100% ). So, I think it might be a good alternative if we can avoid the bad cases.

I created a simple grid that stores all boundaries that intersect with each element (similar to the grid in splitter). This is done quicker than the creation of the quadtree. Using the grid, for each Coord I get a list of boundaries to test: ... Boundary b = currentBoundaries.get(blist.get(i)); if (b.getBbox().contains(co)){ if (b.getArea().contains(co.getLongitude(),co.getLatitude())){ return b; // return the area that contains the point } } ...

the performance problem is in the b.getArea().contains(...) For some boundaries, this test performs very slowly (> 1ms for one test), and it seems to be related to the number of points that form the shape of the boundary. Some areas are built with < 20 points, others with > 8000.

So, what I need is a quicker contains() or some kind of tree that allows to avoid it. Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling.

You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object.

WanMil

...
Ciao, Gerd

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7167271.html Sent from the Mkgmap Development mailing list archive at Nabble.com.

WanMil

10 Jan 10 Jan

9:42 p.m.

Hi Gerd, if you manage to provide a high performance data structure to check in which areas an element lies in then it might also be possible to move the LocationHook into the style system. So there might be a rule like: mkgmap:admin_level2 lies_in 'DEU' { set mkgmap:country='mkgmap:admin_level2' } Maybe this reduces the costs for assigning the boundary information a lot because only the boundaries really referenced by the style must be checked. WanMil

...

Hi WanMil,

I've coded a quadtree that allows to store the boundary areas. Now I always see better throughput, but memory requirement is a bit higher. I think the way is correct, but I'll need some time for testing and fine tuning.

Ciao, Gerd

WanMil wrote

...
...
Hi WanMil

...
I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation.

I tried it anyway and found some interesting results: Sometimes it is ~ 20% faster, sometimes it is much slower (maybe 100% ). So, I think it might be a good alternative if we can avoid the bad cases.

I created a simple grid that stores all boundaries that intersect with each element (similar to the grid in splitter). This is done quicker than the creation of the quadtree. Using the grid, for each Coord I get a list of boundaries to test: ... Boundary b = currentBoundaries.get(blist.get(i)); if (b.getBbox().contains(co)){ if (b.getArea().contains(co.getLongitude(),co.getLatitude())){ return b; // return the area that contains the point } } ...

the performance problem is in the b.getArea().contains(...) For some boundaries, this test performs very slowly (> 1ms for one test), and it seems to be related to the number of points that form the shape of the boundary. Some areas are built with< 20 points, others with> 8000.

So, what I need is a quicker contains() or some kind of tree that allows to avoid it. Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling.

You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object.

WanMil

...
Ciao, Gerd

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7167271.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd Petermann

11 Jan 11 Jan

12:24 a.m.

Hi WanMil, hmm, sounds interesting, but I don't yet understand why we need a different structure for this. If I got this right, you want to avoid assigning tags that are not needed by the style. I think this can also be done with the existing structure, we just have to filter the map boundarySetTags . Or, if we know that only level=4 and level=2 are referenced, maybe we can omit searching some of the boundaries in the ElementquadTree. Besides that, my quadtree works, but it doesn't save as much time as expected, and I found a small bug in the Area.contains() method: If I search for a point that lies exactly on the right most corner or line of the area, contains() doesn't always find it. I thought this is because of rounding errors, but if I got it right, it is simply a wrong test (x + w < y instead of x + w <= y). If you like, I can prepare a small sample to show that. I thought that the result of my implementation of LocationHook should be exactly the same as that from trunk when I make sure that both do only use the same point of a way for searching. In many cases, it was like that, but because of the above error, it sometimes was not. Took me quite a while to track that down ;-) My solution is still not much faster than r2165 (maybe 5%-10%) because I create a new quadtree for each level and then iterate through the elements. This requires more or less the same number of calculations as r2165, when only one point of a way is put into the quadtree. I want to have a solution that creates the quadtree so that it stops if the (sub) tree is fully covered by one or more areas. It should be possible to do this in a short time. If I can manage to identify the intersections of different boundaries, I could merge the tags of them and that would mean that each elem has to be searched only once. That's what I am working on now. Ciao, Gerd

...

Date: Tue, 10 Jan 2012 22:42:21 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

Hi Gerd,

if you manage to provide a high performance data structure to check in which areas an element lies in then it might also be possible to move the LocationHook into the style system. So there might be a rule like:

mkgmap:admin_level2 lies_in 'DEU' { set mkgmap:country='mkgmap:admin_level2' }

Maybe this reduces the costs for assigning the boundary information a lot because only the boundaries really referenced by the style must be checked.

WanMil

...
Hi WanMil,

I've coded a quadtree that allows to store the boundary areas. Now I always see better throughput, but memory requirement is a bit higher. I think the way is correct, but I'll need some time for testing and fine tuning.

Ciao, Gerd

WanMil wrote

...
...
Hi WanMil

...
I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation.

I tried it anyway and found some interesting results: Sometimes it is ~ 20% faster, sometimes it is much slower (maybe 100% ). So, I think it might be a good alternative if we can avoid the bad cases.

I created a simple grid that stores all boundaries that intersect with each element (similar to the grid in splitter). This is done quicker than the creation of the quadtree. Using the grid, for each Coord I get a list of boundaries to test: ... Boundary b = currentBoundaries.get(blist.get(i)); if (b.getBbox().contains(co)){ if (b.getArea().contains(co.getLongitude(),co.getLatitude())){ return b; // return the area that contains the point } } ...

the performance problem is in the b.getArea().contains(...) For some boundaries, this test performs very slowly (> 1ms for one test), and it seems to be related to the number of points that form the shape of the boundary. Some areas are built with< 20 points, others with> 8000.

So, what I need is a quicker contains() or some kind of tree that allows to avoid it. Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling.

You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object.

WanMil

...
Ciao, Gerd

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7167271.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

WanMil

7:56 p.m.

...

Besides that, my quadtree works, but it doesn't save as much time as expected, and I found a small bug in the Area.contains() method: If I search for a point that lies exactly on the right most corner or line of the area, contains() doesn't always find it. I thought this is because of rounding errors, but if I got it right, it is simply a wrong test (x + w < y instead of x + w <= y). If you like, I can prepare a small sample to show that. I thought that the result of my implementation of LocationHook should be exactly the same as that from trunk when I make sure that both do only use the same point of a way for searching. In many cases, it was like that, but because of the above error, it sometimes was not. Took me quite a while to track that down ;-)

Have a look on the definition of "insideness" of the Shape interface: http://docs.oracle.com/javase/6/docs/api/java/awt/Shape.html This gives an explanation of the contains behaviour. WanMil

Gerd Petermann

12 Jan 12 Jan

9:54 a.m.

Hello WanMil, okay, thanks for the hint. I read it before but did not fully understand it , had to lookup some vocabels ;-) So, the behaiour of contains() is exactly as it is described there. The problem is that we hit this case quite often, because many coords are lying exactly on the way that build the boundary. I'll try to understand how this is solved in ElementQuadTree. Ciao, Gerd

...

Date: Wed, 11 Jan 2012 20:56:36 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

...
Besides that, my quadtree works, but it doesn't save as much time as expected, and I found a small bug in the Area.contains() method: If I search for a point that lies exactly on the right most corner or line of the area, contains() doesn't always find it. I thought this is because of rounding errors, but if I got it right, it is simply a wrong test (x + w < y instead of x + w <= y). If you like, I can prepare a small sample to show that. I thought that the result of my implementation of LocationHook should be exactly the same as that from trunk when I make sure that both do only use the same point of a way for searching. In many cases, it was like that, but because of the above error, it sometimes was not. Took me quite a while to track that down ;-)

Have a look on the definition of "insideness" of the Shape interface: http://docs.oracle.com/javase/6/docs/api/java/awt/Shape.html

This gives an explanation of the contains behaviour.

WanMil _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd Petermann

11 Jan 11 Jan

8:37 a.m.

Hi WanMil, attached is my current implementation of the BoundaryQuadTree. Maybe you (or others) can use it as it is. I'm may have to change the code heavily to be able to store (and retrieve) the information of overlapping areas of different admin_levels. ciao, Gerd

...

Date: Tue, 10 Jan 2012 22:42:21 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

Hi Gerd,

if you manage to provide a high performance data structure to check in which areas an element lies in then it might also be possible to move the LocationHook into the style system. So there might be a rule like:

mkgmap:admin_level2 lies_in 'DEU' { set mkgmap:country='mkgmap:admin_level2' }

Maybe this reduces the costs for assigning the boundary information a lot because only the boundaries really referenced by the style must be checked.

WanMil

...
Hi WanMil,

I've coded a quadtree that allows to store the boundary areas. Now I always see better throughput, but memory requirement is a bit higher. I think the way is correct, but I'll need some time for testing and fine tuning.

Ciao, Gerd

WanMil wrote

...
...
Hi WanMil

...
I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation.

I tried it anyway and found some interesting results: Sometimes it is ~ 20% faster, sometimes it is much slower (maybe 100% ). So, I think it might be a good alternative if we can avoid the bad cases.

I created a simple grid that stores all boundaries that intersect with each element (similar to the grid in splitter). This is done quicker than the creation of the quadtree. Using the grid, for each Coord I get a list of boundaries to test: ... Boundary b = currentBoundaries.get(blist.get(i)); if (b.getBbox().contains(co)){ if (b.getArea().contains(co.getLongitude(),co.getLatitude())){ return b; // return the area that contains the point } } ...

the performance problem is in the b.getArea().contains(...) For some boundaries, this test performs very slowly (> 1ms for one test), and it seems to be related to the number of points that form the shape of the boundary. Some areas are built with< 20 points, others with> 8000.

So, what I need is a quicker contains() or some kind of tree that allows to avoid it. Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling.

You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object.

WanMil

...
Ciao, Gerd

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@.org http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

-- View this message in context: http://gis.638310.n2.nabble.com/Bug-in-LocationHook-tp7157897p7167271.html Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd Petermann

8:16 a.m.

Hi WanMil, I tried omitting the merge step, it did not show more than 1% improvement, and I wasn't sure about the side effects regarding the lies_in processing. I will continue analyzing that later. Gerd

...

Date: Sun, 8 Jan 2012 22:17:25 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Bug in LocationHook?

...
Hi WanMil

...
I remember that I tried the other strategy. I don't remember how I did organize the boundaries. But it was *magnitudes* slower (not just slower but magnitudes slower) so that I didn't started to improve my poor implementation.

I tried it anyway and found some interesting results: Sometimes it is ~ 20% faster, sometimes it is much slower (maybe 100% ). So, I think it might be a good alternative if we can avoid the bad cases.

I created a simple grid that stores all boundaries that intersect with each element (similar to the grid in splitter). This is done quicker than the creation of the quadtree. Using the grid, for each Coord I get a list of boundaries to test: ... Boundary b = currentBoundaries.get(blist.get(i)); if (b.getBbox().contains(co)){ if (b.getArea().contains(co.getLongitude(),co.getLatitude())){ return b; // return the area that contains the point } } ...

the performance problem is in the b.getArea().contains(...) For some boundaries, this test performs very slowly (> 1ms for one test), and it seems to be related to the number of points that form the shape of the boundary. Some areas are built with < 20 points, others with > 8000.

So, what I need is a quicker contains() or some kind of tree that allows to avoid it. Any idea?

I think the problem does not exist in the current quad tree because the areas are splitted into subareas. This also reduces the number of points and makes it quicker and easier to test. You could get the number of points from the boundary and decide if an area should be splitted into subareas. It would also be possible to save the splitted subareas as precompiled bounds. If your grid always have the same dimension this might improve the handling.

You might also remove the merge step. For most tiles more than one precompiled file is loaded. The bounds are merged afterwards. This step sounds superfluous as the LocationHook only checks if one point of a way is in the boundary and not for all points. Therefore it is not necessary to have the boundary area as one complete area object.

WanMil

...
Ciao, Gerd

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

4791

Age (days ago)

4797

Last active (days ago)

List overview

32 comments

4 participants

participants (4)

Gerd Petermann
GerdP
Marko Mäkelä
WanMil