
Gerds patches inspired me to look for more things that could be improved. I found that the Quadtree used in the LocationHook is not very optimal. The patch is a first try to increase the performance. The time required for the LocationHook is reduced by 10-50% which is great. Warning: I haven't checked so far if the results are equal. So maybe there are big bugs in the patch... (and the speedup comes from the poor implementation) I will do some more tests and optimizations but maybe some of you can have a look on it, test it and comment it. Have fun! WanMil

Hello WanMil, I tested the patch on r2153, but sorry, I did not see any real improvement. Here are some numbers: Verzeichnis von f:\osm_out_ref\niedersachsen 28.10.2011 15:41 11.629.857 63240001.osm.pbf 28.10.2011 15:41 13.454.668 63240002.osm.pbf 28.10.2011 15:41 15.077.982 63240003.osm.pbf 28.10.2011 15:41 11.447.198 63240004.osm.pbf 28.10.2011 15:41 12.717.149 63240005.osm.pbf 28.10.2011 15:41 12.601.470 63240006.osm.pbf 28.10.2011 15:41 10.123.728 63240007.osm.pbf 7 Datei(en) 87.052.052 Bytes 0 Verzeichnis(se), 171.072.065.536 Bytes frei r2153 with original ElementQuadTreeNode.java: G:\mkgmap_all\trunk\dist>java -Xmx1600m -Xms1600m -jar mkgmap.jar --max-jobs=1 -c f:\osm\gerd_mkgmap.cfg -c f:\osm_out_ref\niedersachsen\template.args Time started: Fri Dec 30 09:43:09 CET 2011 SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240001.osm.pbf: Creating quadtree took 2594 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240001.osm.pbf: Location hook finished in 15294 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240002.osm.pbf: Creating quadtree took 3250 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240002.osm.pbf: Location hook finished in 20574 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240003.osm.pbf: Creating quadtree took 3906 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240003.osm.pbf: Location hook finished in 18184 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240004.osm.pbf: Creating quadtree took 2718 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240004.osm.pbf: Location hook finished in 12935 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240005.osm.pbf: Creating quadtree took 3905 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240005.osm.pbf: Location hook finished in 13372 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240006.osm.pbf: Creating quadtree took 2390 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240006.osm.pbf: Location hook finished in 12560 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240007.osm.pbf: Creating quadtree took 1578 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240007.osm.pbf: Location hook finished in 8123 ms Time finished: Fri Dec 30 09:50:22 CET 2011 Total time taken: 432735ms r2153 with patched ElementQuadTreeNode.java: G:\mkgmap_all\trunk\dist>java -Xmx1600m -Xms1600m -jar mkgmap.jar --max-jobs=1 -c f:\osm\gerd_mkgmap.cfg -c f:\osm_out_ref\niedersachsen\template.args Time started: Fri Dec 30 09:34:16 CET 2011 SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240001.osm.pbf: Creating quadtree took 3155 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240001.osm.pbf: Location hook finished in 12840 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240002.osm.pbf: Creating quadtree took 3155 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240002.osm.pbf: Location hook finished in 17402 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240003.osm.pbf: Creating quadtree took 3780 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240003.osm.pbf: Location hook finished in 15996 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240004.osm.pbf: Creating quadtree took 2296 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240004.osm.pbf: Location hook finished in 10732 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240005.osm.pbf: Creating quadtree took 2859 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240005.osm.pbf: Location hook finished in 12513 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240006.osm.pbf: Creating quadtree took 2561 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240006.osm.pbf: Location hook finished in 10309 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240007.osm.pbf: Creating quadtree took 2077 ms SCHWERWIEGEND (LocationHook): f:\osm_out_ref\niedersachsen\63240007.osm.pbf: Location hook finished in 7154 ms Time finished: Fri Dec 30 09:41:30 CET 2011 Total time taken: 433328ms Gerd -- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7137... Sent from the Mkgmap Development mailing list archive at Nabble.com.

Hello WanMil, I also thought about reducing the CPU time in LocationHook, but with a different approach: I think a lot of time is used because the quadtree first adds elements to resultList, (a HashSet) and 2nd a loop takes all these elements to add tags. Two ideas: 1) Optimization: It might save time if we could pass a function pointer to the quadtree. The function does the tagging for one element, and we don't have to manage resultlists. I implemented that, but the program was slower. I assume that was caused by the fact that I did not implement the part that removes elements from the quadtree when they were "fully worked out". 2) An completely different approach: I did not yet fully understand the algorithmn in LocationHook, but I got the impression that it might be possible to create a dictionary with combinations of admin-level tags, and save a reference to that dictionary instead of adding similar tags to each element. This could save space and time, but would require additional logic in class Element. Do you think this is worth trying? Ciao, Gerd -- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7137... Sent from the Mkgmap Development mailing list archive at Nabble.com.

Hi Gerd, the results are a bit heterogeneous. In some cases the LocationHook finishes much quicker, in some cases it is slower. I have some ideas what can be improved, so I will change the patch and have a additional try.
Hello WanMil,
I also thought about reducing the CPU time in LocationHook, but with a different approach: I think a lot of time is used because the quadtree first adds elements to resultList, (a HashSet) and 2nd a loop takes all these elements to add tags.
Two ideas: 1) Optimization: It might save time if we could pass a function pointer to the quadtree. The function does the tagging for one element, and we don't have to manage resultlists. I implemented that, but the program was slower. I assume that was caused by the fact that I did not implement the part that removes elements from the quadtree when they were "fully worked out".
I am not sure if managing the resultlists is slow. In the end it consists of adding elements to a HashSet (which I think should be quite fast?!)
2) An completely different approach: I did not yet fully understand the algorithmn in LocationHook, but I got the impression that it might be possible to create a dictionary with combinations of admin-level tags, and save a reference to that dictionary instead of adding similar tags to each element. This could save space and time, but would require additional logic in class Element. Do you think this is worth trying?
Of course you can try although I don't expect that it will improve much. The algorithm in LocationHook works as follows: 1. All elements (points and lines) having at least one tag are added to the QuadTree. This is expensive - it takes 20-50% of the complete LocationHook time. 2. All preprocessed bounds that overlaps the tile boundaries are loaded from file. There is only little improvement possible (maybe zipping could improve disk access time). 3. The bounds are sorted by their admin_level tag, so that the greatest admin_level (or postcode) comes first. 4. For all elements of the preprocessed bounds the covered elements (points and lines) are searched. All matching elements are tagged with the preprocessed bounds. The preprocessed bounds also contains internal references (mkgmap:lies_in) which are tagged also (e.g. the bounds of the town Cologne contains the information that it is located in region North Rhine Westfalia and Germany). 5. Elements having all tags from the remaining preprocessed bounds are removed from the quadtree because no more information can be added to them. Adding tags is a cheap method with a small memory footprint. At least the footprint is not much bigger than a reference to a dictionary. And you have to maintain the dictionary which I think is expensive. But I don't know exactly how you want to build up the dictionary so I cannot say for sure. Maybe one the following things can contain an improvement: * Analyzing the style rules might help to sort out some of the preprocessed boundaries * Changing the parameters of the Quadtree (max node size 1000) * The Quadtree query for a polygon (ElementQuadtreeNode: public Set<Element> get(ElementQuadTreePolygon polygon, Set<Element> resultList)) divides the polygon into bounds of the four subnodes. I expect that this is quite expensive. Maybe there is a better way to reduce the number of quadtree nodes that have to be queried. WanMil
Ciao, Gerd
-- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7137... Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, thanks for the details regarding the algorithmn. WanMil wrote
Two ideas: 1) Optimization: It might save time if we could pass a function pointer to the quadtree. The function does the tagging for one element, and we don't have to manage resultlists. I implemented that, but the program was slower. I assume that was caused by the fact that I did not implement the part that removes elements from the quadtree when they were "fully worked out".
I am not sure if managing the resultlists is slow. In the end it consists of adding elements to a HashSet (which I think should be quite fast?!)
Well, sometimes the resultlist is quite big, e.g. 50% of all nodes in the quadtree. This involves some rehashing etc. These actions appear first in my java.hprof. Anyway, it will not give a performance boost to change that, only a few percent.
2) An completely different approach: I did not yet fully understand the algorithmn in LocationHook, but I got the impression that it might be possible to create a dictionary with combinations of admin-level tags, and save a reference to that dictionary instead of adding similar tags to each element. This could save space and time, but would require additional logic in class Element. Do you think this is worth trying?
Of course you can try although I don't expect that it will improve much.
I have to collect some numbers first to be sure about it. The dictionary that I have in mind is more or less what is kept in List<Boundary> boundaries What I don't understand yet regarding LocationHook is that most elements are NOT fully worked out after they appeared 1st in a result list. I'll investigate that today. If any element was worked out after the first time, we simply could save a reference to the entry in the boundary list instead of creating result lists and adding similar tags to many elements. Gerd -- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7139... Sent from the Mkgmap Development mailing list archive at Nabble.com.

Am 31.12.2011 09:06, schrieb GerdP:
Hello WanMil,
thanks for the details regarding the algorithmn.
WanMil wrote
Two ideas: 1) Optimization: It might save time if we could pass a function pointer to the quadtree. The function does the tagging for one element, and we don't have to manage resultlists. I implemented that, but the program was slower. I assume that was caused by the fact that I did not implement the part that removes elements from the quadtree when they were "fully worked out".
I am not sure if managing the resultlists is slow. In the end it consists of adding elements to a HashSet (which I think should be quite fast?!)
Well, sometimes the resultlist is quite big, e.g. 50% of all nodes in the quadtree. This involves some rehashing etc. These actions appear first in my java.hprof. Anyway, it will not give a performance boost to change that, only a few percent.
2) An completely different approach: I did not yet fully understand the algorithmn in LocationHook, but I got the impression that it might be possible to create a dictionary with combinations of admin-level tags, and save a reference to that dictionary instead of adding similar tags to each element. This could save space and time, but would require additional logic in class Element. Do you think this is worth trying?
Of course you can try although I don't expect that it will improve much.
I have to collect some numbers first to be sure about it. The dictionary that I have in mind is more or less what is kept in List<Boundary> boundaries
What I don't understand yet regarding LocationHook is that most elements are NOT fully worked out after they appeared 1st in a result list. I'll investigate that today. If any element was worked out after the first time, we simply could save a reference to the entry in the boundary list instead of creating result lists and adding similar tags to many elements.
The reason is that the boundaries are not complete. Complete in two meanings: 1. Some boundaries are missing due to incomplete OSM data 2. In some areas there might be a boundary with admin_level=7, other areas do not have that admin_level. The LocationHook cannot know about that and therefore it can remove elements only if they contain all remaining tags. WanMil
Gerd

Hello WanMil, WanMil wrote
The reason is that the boundaries are not complete. Complete in two meanings: 1. Some boundaries are missing due to incomplete OSM data 2. In some areas there might be a boundary with admin_level=7, other areas do not have that admin_level. The LocationHook cannot know about that and therefore it can remove elements only if they contain all remaining tags.
ahh, incomplete data always makes things that much more complicated :-( I also just got to that admin_level=7. In "Saarland", one of my test cases, I see "Verbandsgemeinde Waldmohr" http://www.openstreetmap.org/browse/relation/897709 which is the only boundary that has admin_level=7. The interesting thing is: the returned result-list for this boundary is always empty. So, maybe there is a cheaper way to detect this first or does it means that there is a bug in quadtree or the selection of the boundaries? Gerd -- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7140... Sent from the Mkgmap Development mailing list archive at Nabble.com.

Hello WanMil,
WanMil wrote
The reason is that the boundaries are not complete. Complete in two meanings: 1. Some boundaries are missing due to incomplete OSM data 2. In some areas there might be a boundary with admin_level=7, other areas do not have that admin_level. The LocationHook cannot know about that and therefore it can remove elements only if they contain all remaining tags.
ahh, incomplete data always makes things that much more complicated :-(
I also just got to that admin_level=7. In "Saarland", one of my test cases,
admin_level=7 just an example. admin_levels are used differently in countries and in regions etc.
I see "Verbandsgemeinde Waldmohr" http://www.openstreetmap.org/browse/relation/897709 which is the only boundary that has admin_level=7. The interesting thing is: the returned result-list for this boundary is always empty. So, maybe there is a cheaper way to detect this first or does it means that there is a bug in quadtree or the selection of the boundaries?
Sounds more like a bug! But it's also possible that the higher admin_levels (8,9,10,11) references all other admin_levels and therefore there all elements of Waldmohr have been removed before querying for it. WanMil
Gerd
-- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7140... Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil,
admin_level=7 just an example. admin_levels are used differently in countries and in regions etc.
I see "Verbandsgemeinde Waldmohr" http://www.openstreetmap.org/browse/relation/897709 which is the only boundary that has admin_level=7. The interesting thing is: the returned result-list for this boundary is always empty. So, maybe there is a cheaper way to detect this first or does it means that there is a bug in quadtree or the selection of the boundaries?
Sounds more like a bug! But it's also possible that the higher admin_levels (8,9,10,11) references all other admin_levels and therefore there all elements of Waldmohr have been removed before querying for it.
hmm, I think you missed the point that we start with lower admin_levels: // Reverse the sorting because we want to start with the lowest admin level Collections.reverse(boundaries); and I found Waldmohr because it prevented the elements from being "fully worked out". So, I agree that this seems to be a bug. By the way, I use your bounds from europe_bounds_20111106.zip Ciao, Gerd

Am 31.12.2011 14:51, schrieb Gerd Petermann:
Hello WanMil,
admin_level=7 just an example. admin_levels are used differently in countries and in regions etc.
I see "Verbandsgemeinde Waldmohr" http://www.openstreetmap.org/browse/relation/897709 which is the only boundary that has admin_level=7. The interesting thing is: the returned result-list for this boundary is always empty. So, maybe there is a cheaper way to detect this first or does it means that there is a bug in quadtree or the selection of the boundaries?
Sounds more like a bug! But it's also possible that the higher admin_levels (8,9,10,11) references all other admin_levels and therefore there all elements of Waldmohr have been removed before querying for it.
hmm, I think you missed the point that we start with lower admin_levels: // Reverse the sorting because we want to start with the lowest admin level Collections.reverse(boundaries);
No, I didn't. I should have better written admin_levels (11,10,9,8) but the meaning is the same.
and I found Waldmohr because it prevented the elements from being "fully worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it?
So, I agree that this seems to be a bug. By the way, I use your bounds from europe_bounds_20111106.zip
Ciao, Gerd

Hi WanMil,
Date: Sat, 31 Dec 2011 14:55:08 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Am 31.12.2011 14:51, schrieb Gerd Petermann:
Hello WanMil,
admin_level=7 just an example. admin_levels are used differently in countries and in regions etc.
I see "Verbandsgemeinde Waldmohr" http://www.openstreetmap.org/browse/relation/897709 which is the only boundary that has admin_level=7. The interesting thing is: the returned result-list for this boundary is always empty. So, maybe there is a cheaper way to detect this first or does it means that there is a bug in quadtree or the selection of the boundaries?
Sounds more like a bug! But it's also possible that the higher admin_levels (8,9,10,11) references all other admin_levels and therefore there all elements of Waldmohr have been removed before querying for it.
hmm, I think you missed the point that we start with lower admin_levels: // Reverse the sorting because we want to start with the lowest admin level Collections.reverse(boundaries);
No, I didn't. I should have better written admin_levels (11,10,9,8) but the meaning is the same. No, not really. We check 2,4,5,6 first.
and I found Waldmohr because it prevented the elements from being "fully worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it?
No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case. Gerd

Am 31.12.2011 15:04, schrieb Gerd Petermann:
Hi WanMil,
Date: Sat, 31 Dec 2011 14:55:08 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Am 31.12.2011 14:51, schrieb Gerd Petermann:
Hello WanMil,
admin_level=7 just an example. admin_levels are used differently in countries and in regions etc.
I see "Verbandsgemeinde Waldmohr" http://www.openstreetmap.org/browse/relation/897709 which is the only boundary that has admin_level=7. The interesting thing is: the returned result-list for this boundary is always empty. So, maybe there is a cheaper way to detect this first or does it means that there is a bug in quadtree or the selection of the boundaries?
Sounds more like a bug! But it's also possible that the higher admin_levels (8,9,10,11) references all other admin_levels and therefore there all elements of Waldmohr have been removed before querying for it.
hmm, I think you missed the point that we start with lower admin_levels: // Reverse the sorting because we want to start with the lowest admin level Collections.reverse(boundaries);
No, I didn't. I should have better written admin_levels (11,10,9,8) but the meaning is the same. No, not really. We check 2,4,5,6 first.
That would be a bug. But I am very sure that the algorithm starts with admin_level 11. The BoundaryLevelCollator sorts the bounds with admin_level=2 first but then the bounds are reversed. Can you please check if I am wrong? That's important!
and I found Waldmohr because it prevented the elements from being
"fully
worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.
Gerd
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi WanMil, sorry, you are right. I must have looked at availableLevels instead of remainingLevels :-( ciao, Gerd
Date: Sat, 31 Dec 2011 15:09:03 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Am 31.12.2011 15:04, schrieb Gerd Petermann:
Hi WanMil,
Date: Sat, 31 Dec 2011 14:55:08 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Am 31.12.2011 14:51, schrieb Gerd Petermann:
Hello WanMil,
admin_level=7 just an example. admin_levels are used differently in countries and in regions etc.
I see "Verbandsgemeinde Waldmohr" http://www.openstreetmap.org/browse/relation/897709 which is the only boundary that has admin_level=7. The interesting thing is: the returned result-list for this boundary is always empty. So, maybe there is a cheaper way to detect this first or does it means that there is a bug in quadtree or the selection of the boundaries?
Sounds more like a bug! But it's also possible that the higher admin_levels (8,9,10,11) references all other admin_levels and therefore there all elements of Waldmohr have been removed before querying for it.
hmm, I think you missed the point that we start with lower admin_levels: // Reverse the sorting because we want to start with the lowest admin level Collections.reverse(boundaries);
No, I didn't. I should have better written admin_levels (11,10,9,8) but the meaning is the same. No, not really. We check 2,4,5,6 first.
That would be a bug. But I am very sure that the algorithm starts with admin_level 11. The BoundaryLevelCollator sorts the bounds with admin_level=2 first but then the bounds are reversed.
Can you please check if I am wrong? That's important!
and I found Waldmohr because it prevented the elements from being
"fully
worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.
Gerd
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Puuuh, but it's also a pity because that would have been an easy improvement ;-)
Hi WanMil,
sorry, you are right. I must have looked at availableLevels instead of remainingLevels :-(
ciao, Gerd
Date: Sat, 31 Dec 2011 15:09:03 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Am 31.12.2011 15:04, schrieb Gerd Petermann:
Hi WanMil,
Date: Sat, 31 Dec 2011 14:55:08 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Am 31.12.2011 14:51, schrieb Gerd Petermann:
Hello WanMil,
admin_level=7 just an example. admin_levels are used differently in countries and in regions etc.
> I see > "Verbandsgemeinde Waldmohr" > http://www.openstreetmap.org/browse/relation/897709 > which is the only boundary that has admin_level=7. The interesting thing is: > the returned result-list for this boundary is always empty. So, maybe there > is a cheaper way to detect this first or does it means that there is a bug > in quadtree or the selection of the boundaries?
Sounds more like a bug! But it's also possible that the higher admin_levels (8,9,10,11) references all other admin_levels and therefore there all elements of Waldmohr have been removed before querying for it.
hmm, I think you missed the point that we start with lower admin_levels: // Reverse the sorting because we want to start with the lowest admin level Collections.reverse(boundaries);
No, I didn't. I should have better written admin_levels (11,10,9,8) but the meaning is the same. No, not really. We check 2,4,5,6 first.
That would be a bug. But I am very sure that the algorithm starts with admin_level 11. The BoundaryLevelCollator sorts the bounds with admin_level=2 first but then the bounds are reversed.
Can you please check if I am wrong? That's important!
and I found Waldmohr because it prevented the elements from being
"fully
worked out".
Sounds reasonable. But then querying for Waldmohr should have
returned
some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.
Gerd
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, I think I understand now what is so special with these admin_level=7 boundaries. They form a collection of admin_level=8 boundaries. So, after working through all admin_level=8 boundaries, the admin_level=7 is also done because the admin_level=8 boundaries have the lies_in tag for them, but LocationHook doesn't recognize that and continues to work through the admin_level=6 boundaries. For each of the level=6 boundaries we retrieve a large number of elements from quadTree, and most of them are already fully worked out. So, if I got this right, we could save a lot of time if we can detect that admin_level=7 should not be added to remainingLevels which would reduce the size of the quadtree earlier. I think we would need a function that determines if a boundary x is completely overlaped by boundaries with a higher admin_level that have the lies_in for x. Does that make sense? Do we have such a function? Gerd
and I found Waldmohr because it prevented the elements from being
"fully
worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.

Hi Gerd, good idea! I am not sure if this additional preprocessing information has a great effect but it's worth trying that. Up to now we don't have such a function. The place for such a function would be to extend the BoundaryPreparer.workoutBoundaryRelations method. WanMil
Hello WanMil,
I think I understand now what is so special with these admin_level=7 boundaries. They form a collection of admin_level=8 boundaries. So, after working through all admin_level=8 boundaries, the admin_level=7 is also done because the admin_level=8 boundaries have the lies_in tag for them, but LocationHook doesn't recognize that and continues to work through the admin_level=6 boundaries. For each of the level=6 boundaries we retrieve a large number of elements from quadTree, and most of them are already fully worked out. So, if I got this right, we could save a lot of time if we can detect that admin_level=7 should not be added to remainingLevels which would reduce the size of the quadtree earlier. I think we would need a function that determines if a boundary x is completely overlaped by boundaries with a higher admin_level that have the lies_in for x.
Does that make sense? Do we have such a function?
Gerd
and I found Waldmohr because it prevented the elements from being
"fully
worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, OK, I'll first try the effect by hard coding the wanted boundary list. If I really see a big improvement, I know how much complexety the new function can have. My first approach was to use subtract() for the Areas that are in the level=7 boundary, but that it much too slow. Ciao, Gerd
Date: Mon, 2 Jan 2012 20:40:32 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
good idea! I am not sure if this additional preprocessing information has a great effect but it's worth trying that.
Up to now we don't have such a function. The place for such a function would be to extend the BoundaryPreparer.workoutBoundaryRelations method.
WanMil
Hello WanMil,
I think I understand now what is so special with these admin_level=7 boundaries. They form a collection of admin_level=8 boundaries. So, after working through all admin_level=8 boundaries, the admin_level=7 is also done because the admin_level=8 boundaries have the lies_in tag for them, but LocationHook doesn't recognize that and continues to work through the admin_level=6 boundaries. For each of the level=6 boundaries we retrieve a large number of elements from quadTree, and most of them are already fully worked out. So, if I got this right, we could save a lot of time if we can detect that admin_level=7 should not be added to remainingLevels which would reduce the size of the quadtree earlier. I think we would need a function that determines if a boundary x is completely overlaped by boundaries with a higher admin_level that have the lies_in for x.
Does that make sense? Do we have such a function?
Gerd
and I found Waldmohr because it prevented the elements from being
"fully
worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi Gerd, you have to do this step during the preprocessing. Time used in preprocessing doesn't matter because that have to be performed once only and most of the mkgmap users just download them and never have to run it themselves. All other aproaches won't work (I am quite sure ;-) WanMil
Hello WanMil,
OK, I'll first try the effect by hard coding the wanted boundary list. If I really see a big improvement, I know how much complexety the new function can have. My first approach was to use subtract() for the Areas that are in the level=7 boundary, but that it much too slow.
Ciao, Gerd
Date: Mon, 2 Jan 2012 20:40:32 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
good idea! I am not sure if this additional preprocessing information has a great effect but it's worth trying that.
Up to now we don't have such a function. The place for such a function would be to extend the BoundaryPreparer.workoutBoundaryRelations method.
WanMil
Hello WanMil,
I think I understand now what is so special with these admin_level=7 boundaries. They form a collection of admin_level=8 boundaries. So, after working through all admin_level=8 boundaries, the admin_level=7 is also done because the admin_level=8 boundaries have the lies_in tag for them, but LocationHook doesn't recognize that and continues to work through the admin_level=6 boundaries. For each of the level=6 boundaries we retrieve a large number of elements from quadTree, and most of them are already fully worked out. So, if I got this right, we could save a lot of time if we can detect that admin_level=7 should not be added to remainingLevels which would reduce the size of the quadtree earlier. I think we would need a function that determines if a boundary x is completely overlaped by boundaries with a higher admin_level that have the lies_in for x.
Does that make sense? Do we have such a function?
Gerd
> > and I found Waldmohr because it prevented the elements from being "fully > worked out".
Sounds reasonable. But then querying for Waldmohr should have returned some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, okay, I see. This will allow complex calculations. I have to read again how you create that boundary files for europe... Gerd
Date: Mon, 2 Jan 2012 20:59:52 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
you have to do this step during the preprocessing. Time used in preprocessing doesn't matter because that have to be performed once only and most of the mkgmap users just download them and never have to run it themselves.
All other aproaches won't work (I am quite sure ;-)
WanMil
Hello WanMil,
OK, I'll first try the effect by hard coding the wanted boundary list. If I really see a big improvement, I know how much complexety the new function can have. My first approach was to use subtract() for the Areas that are in the level=7 boundary, but that it much too slow.
Ciao, Gerd
Date: Mon, 2 Jan 2012 20:40:32 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
good idea! I am not sure if this additional preprocessing information has a great effect but it's worth trying that.
Up to now we don't have such a function. The place for such a function would be to extend the BoundaryPreparer.workoutBoundaryRelations method.
WanMil
Hello WanMil,
I think I understand now what is so special with these admin_level=7 boundaries. They form a collection of admin_level=8 boundaries. So, after working through all admin_level=8 boundaries, the admin_level=7 is also done because the admin_level=8 boundaries have the lies_in tag for them, but LocationHook doesn't recognize that and continues to work through the admin_level=6 boundaries. For each of the level=6 boundaries we retrieve a large number of elements from quadTree, and most of them are already fully worked out. So, if I got this right, we could save a lot of time if we can detect that admin_level=7 should not be added to remainingLevels which would reduce the size of the quadtree earlier. I think we would need a function that determines if a boundary x is completely overlaped by boundaries with a higher admin_level that have the lies_in for x.
Does that make sense? Do we have such a function?
Gerd
> > > > and I found Waldmohr because it prevented the elements from being "fully > > worked out". > > Sounds reasonable. But then querying for Waldmohr should have returned > some elements, shouldn't it? No, all elements are kept in the quadtree because they are not "fully worked out" until this admin_level=7 part is done, or to be more precise, until they are processed for admin_level=8 or higher, because for admin_level=7 nothing is changed in this special case.
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd, attached BoundaryLister is a useful util when working with bounds. It lists the tags of all bounds in the preprocessed bounds files. Just give the bounds dir as parameter and the BoundaryLister creates a bounds.txt file containing all the information. WanMil
Hello WanMil,
okay, I see. This will allow complex calculations. I have to read again how you create that boundary files for europe...
Gerd
Date: Mon, 2 Jan 2012 20:59:52 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
you have to do this step during the preprocessing. Time used in preprocessing doesn't matter because that have to be performed once only and most of the mkgmap users just download them and never have to run it themselves.
All other aproaches won't work (I am quite sure ;-)
WanMil
Hello WanMil,
OK, I'll first try the effect by hard coding the wanted boundary list. If I really see a big improvement, I know how much complexety the new function can have. My first approach was to use subtract() for the Areas that are in the level=7 boundary, but that it much too slow.
Ciao, Gerd
Date: Mon, 2 Jan 2012 20:40:32 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
good idea! I am not sure if this additional preprocessing information has a great effect but it's worth trying that.
Up to now we don't have such a function. The place for such a function would be to extend the BoundaryPreparer.workoutBoundaryRelations method.
WanMil
Hello WanMil,
I think I understand now what is so special with these admin_level=7 boundaries. They form a collection of admin_level=8 boundaries. So, after working through all admin_level=8 boundaries, the admin_level=7 is also done because the admin_level=8 boundaries have the lies_in tag for them, but LocationHook doesn't recognize that and continues to work through the admin_level=6 boundaries. For each of the level=6 boundaries we retrieve a large number of elements from quadTree, and most of them are already fully worked out. So, if I got this right, we could save a lot of time if we can detect that admin_level=7 should not be added to remainingLevels which would reduce the size of the quadtree earlier. I think we would need a function that determines if a boundary x is completely overlaped by boundaries with a higher admin_level that have the lies_in for x.
Does that make sense? Do we have such a function?
Gerd
> > > > > > and I found Waldmohr because it prevented the elements from being > "fully > > > worked out". > > > > Sounds reasonable. But then querying for Waldmohr should have returned > > some elements, shouldn't it? > No, all elements are kept in the quadtree because they are not "fully > worked out" until this admin_level=7 part is done, or to be more > precise, until they are processed for admin_level=8 or higher, because > for admin_level=7 nothing is changed in this special case.
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, thanks for the util. I am still trying to produce the input for mkgmap with osmosis. The program produces very large temp files, and after hours it crashed because the disk was full :-( Maybe I am not using the best parms? Or doesn't it work on windows? I use this: osmosis -v --rb f:\osm\europe.osm.pbf --tf accept-relations boundary=administrative --used-node idTrackerType=BitSet --used-way idTrackerType=BitSet --wb boundary_relations.osm.pbf Maybe I can download the result somewhere? Ciao, Gerd -- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7147... Sent from the Mkgmap Development mailing list archive at Nabble.com.

Hi Gerd, don't use osmosis: it's *very* slow for such a job. Better use the osmconvert/osmfilter toolchain. That's magnitudes faster... You can find a how to at: http://wiki.openstreetmap.org/wiki/Mkgmap/help/options#Using_precompiled_bou... WanMil
Hello WanMil,
thanks for the util. I am still trying to produce the input for mkgmap with osmosis. The program produces very large temp files, and after hours it crashed because the disk was full :-(
Maybe I am not using the best parms? Or doesn't it work on windows? I use this: osmosis -v --rb f:\osm\europe.osm.pbf --tf accept-relations boundary=administrative --used-node idTrackerType=BitSet --used-way idTrackerType=BitSet --wb boundary_relations.osm.pbf
Maybe I can download the result somewhere?
Ciao, Gerd
-- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7147... Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, Shame on me. I finally found that the description is on the new help page, the place I should have visited first ... By the way: On windows, r2159 is not able to process the european boundaries, but the patch for Coord.java seems to allow the action on windows. So I think this is an argument for it. Working with these large files is still no fun with mkgmap, so I hope I can find something which really improves performance ;-) Gerd G:\mkgmap_all\trunk\dist>java -Xmx1600m -jar mkgmap.jar --createboundsfile=f:\osm\europe-boundaries.osm Time started: Tue Jan 03 20:05:54 CET 2012 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at sun.awt.geom.Order1.getReversedCurve(Order1.java:198) at sun.awt.geom.Curve.getWithDirection(Curve.java:706) at sun.awt.geom.CurveLink.getSubCurve(CurveLink.java:56) at sun.awt.geom.AreaOp.pruneEdges(AreaOp.java:389) at sun.awt.geom.AreaOp.calculate(AreaOp.java:141) at java.awt.geom.Area.pathToCurves(Area.java:177) at java.awt.geom.Area.<init>(Area.java:108) at uk.me.parabola.util.Java2DConverter.createArea(Java2DConverter.java:60) at uk.me.parabola.mkgmap.reader.osm.boundary.BoundaryRelation.processElements(BoundaryRelation.java:276) at uk.me.parabola.mkgmap.reader.osm.boundary.BoundaryElementSaver.addRelation(BoundaryElementSaver.java:132) at uk.me.parabola.mkgmap.reader.osm.xml.Osm5XmlHandler$SaxHandler.endElement(Osm5XmlHandler.java:182) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601) at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.endElement(XIncludeHandler.java:1016) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScan nerImpl.java:1782) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocument FragmentScannerImpl.java:2939) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648) at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScanne rImpl.java:511) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205) at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522) at javax.xml.parsers.SAXParser.parse(SAXParser.java:395) at javax.xml.parsers.SAXParser.parse(SAXParser.java:198) at uk.me.parabola.mkgmap.reader.osm.xml.Osm5MapDataSource.load(Osm5MapDataSource.java:72) at uk.me.parabola.mkgmap.reader.osm.boundary.BoundaryPreparer.run(BoundaryPreparer.java:105) at uk.me.parabola.mkgmap.main.Main.endOptions(Main.java:341) at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:126) at uk.me.parabola.mkgmap.main.Main.main(Main.java:114) Gerd
Date: Tue, 3 Jan 2012 20:38:42 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
don't use osmosis: it's *very* slow for such a job.
Better use the osmconvert/osmfilter toolchain. That's magnitudes faster... You can find a how to at: http://wiki.openstreetmap.org/wiki/Mkgmap/help/options#Using_precompiled_bou...
WanMil
Hello WanMil,
thanks for the util. I am still trying to produce the input for mkgmap with osmosis. The program produces very large temp files, and after hours it crashed because the disk was full :-(
Maybe I am not using the best parms? Or doesn't it work on windows? I use this: osmosis -v --rb f:\osm\europe.osm.pbf --tf accept-relations boundary=administrative --used-node idTrackerType=BitSet --used-way idTrackerType=BitSet --wb boundary_relations.osm.pbf
Maybe I can download the result somewhere?
Ciao, Gerd
-- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7147... Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Gerd, don't worry. The link to the description was broken on the new help page. I've just corrected it before I replied to you.... The last time (beginning of December) I've created the boundaries for europe with -Xmx3g and that was *very* close to the OOME. But there is a workaround which I also use to create the world boundaries: One can workout smaller overlapping parts (containing complete countries!) and merge the results of the smaller parts: java uk.me.parabola.mkgmap.reader.osm.boundary.BoundaryMerger <boundsdir1> <boundsdir2> <resultdir> WanMil
Hello WanMil,
Shame on me. I finally found that the description is on the new help page, the place I should have visited first ...
By the way: On windows, r2159 is not able to process the european boundaries, but the patch for Coord.java seems to allow the action on windows. So I think this is an argument for it. Working with these large files is still no fun with mkgmap, so I hope I can find something which really improves performance ;-)
Gerd
G:\mkgmap_all\trunk\dist>java -Xmx1600m -jar mkgmap.jar --createboundsfile=f:\osm\europe-boundaries.osm Time started: Tue Jan 03 20:05:54 CET 2012 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at sun.awt.geom.Order1.getReversedCurve(Order1.java:198) at sun.awt.geom.Curve.getWithDirection(Curve.java:706) at sun.awt.geom.CurveLink.getSubCurve(CurveLink.java:56) at sun.awt.geom.AreaOp.pruneEdges(AreaOp.java:389) at sun.awt.geom.AreaOp.calculate(AreaOp.java:141) at java.awt.geom.Area.pathToCurves(Area.java:177) at java.awt.geom.Area.<init>(Area.java:108) at uk.me.parabola.util.Java2DConverter.createArea(Java2DConverter.java:60) at uk.me.parabola.mkgmap.reader.osm.boundary.BoundaryRelation.processElements(BoundaryRelation.java:276) at uk.me.parabola.mkgmap.reader.osm.boundary.BoundaryElementSaver.addRelation(BoundaryElementSaver.java:132) at uk.me.parabola.mkgmap.reader.osm.xml.Osm5XmlHandler$SaxHandler.endElement(Osm5XmlHandler.java:182) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:601) at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.endElement(XIncludeHandler.java:1016) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScan nerImpl.java:1782) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocument FragmentScannerImpl.java:2939) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648) at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScanne rImpl.java:511) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205) at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522) at javax.xml.parsers.SAXParser.parse(SAXParser.java:395) at javax.xml.parsers.SAXParser.parse(SAXParser.java:198) at uk.me.parabola.mkgmap.reader.osm.xml.Osm5MapDataSource.load(Osm5MapDataSource.java:72) at uk.me.parabola.mkgmap.reader.osm.boundary.BoundaryPreparer.run(BoundaryPreparer.java:105) at uk.me.parabola.mkgmap.main.Main.endOptions(Main.java:341) at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:126) at uk.me.parabola.mkgmap.main.Main.main(Main.java:114)
Gerd
Date: Tue, 3 Jan 2012 20:38:42 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] [PATCH v1] LocationHook speedup
Hi Gerd,
don't use osmosis: it's *very* slow for such a job.
Better use the osmconvert/osmfilter toolchain. That's magnitudes faster... You can find a how to at:
http://wiki.openstreetmap.org/wiki/Mkgmap/help/options#Using_precompiled_bou...
WanMil
Hello WanMil,
thanks for the util. I am still trying to produce the input for
mkgmap with
osmosis. The program produces very large temp files, and after hours it crashed because the disk was full :-(
Maybe I am not using the best parms? Or doesn't it work on windows? I use this: osmosis -v --rb f:\osm\europe.osm.pbf --tf accept-relations boundary=administrative --used-node idTrackerType=BitSet --used-way idTrackerType=BitSet --wb boundary_relations.osm.pbf
Maybe I can download the result somewhere?
Ciao, Gerd
-- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7147... Sent from the Mkgmap Development mailing list archive at Nabble.com. _______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

I tried to improve the first patch by removing anything not required in the Quadtree and by using a different internal data structure. I've seen performance improvements but please try and test yourself :-) The most time is now spend in the creation of the Quadtree. So if you want to search for more performance just start there. WanMil
Gerds patches inspired me to look for more things that could be improved.
I found that the Quadtree used in the LocationHook is not very optimal. The patch is a first try to increase the performance. The time required for the LocationHook is reduced by 10-50% which is great.
Warning: I haven't checked so far if the results are equal. So maybe there are big bugs in the patch... (and the speedup comes from the poor implementation)
I will do some more tests and optimizations but maybe some of you can have a look on it, test it and comment it.
Have fun! WanMil
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, I tried it. With small input files I see no change. With larger tiles, it seems to be a bit faster, e.g. runtime decreased 270 to 265 secs. Gerd Date: Sat, 31 Dec 2011 01:18:18 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: [mkgmap-dev] [PATCH v2] LocationHook speedup I tried to improve the first patch by removing anything not required in the Quadtree and by using a different internal data structure. I've seen performance improvements but please try and test yourself :-) The most time is now spend in the creation of the Quadtree. So if you want to search for more performance just start there. WanMil
Gerds patches inspired me to look for more things that could be improved.
I found that the Quadtree used in the LocationHook is not very optimal. The patch is a first try to increase the performance. The time required for the LocationHook is reduced by 10-50% which is great.
Warning: I haven't checked so far if the results are equal. So maybe there are big bugs in the patch... (and the speedup comes from the poor implementation)
I will do some more tests and optimizations but maybe some of you can have a look on it, test it and comment it.
Have fun! WanMil
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi Gerd, now we have the problem of how to measure the runtime performance. I have compared unpatched and patched mkgmap. Both versions contain the time output of the patched version. I have used one thread only with 15 european tiles (widely spread over europe) and compare the summarized runtime of the LocationHook only because that's what should be improved by the patch. I have done 4 runs of each version. The mean values are: r2154: 65280 ms for LocationHook, 335325 ms overall runtime patch: 55173 ms for LocationHook, 313082 ms overall runtime diff: 10107 ms improvement Hook, 22243 ms improvement overall The overall improvement is a bit problematic because I would expect that it is close to the LocationHook improvement but its twice. The patch uses less memory and therefore the GC (which probably runs outside the Hook) must do less work. But I am unsure if that's the reason for the good overall improvement. WanMil
Hello WanMil,
I tried it. With small input files I see no change. With larger tiles, it seems to be a bit faster, e.g. runtime decreased 270 to 265 secs.
Gerd
Date: Sat, 31 Dec 2011 01:18:18 +0100 From: wmgcnfg@web.de To: mkgmap-dev@lists.mkgmap.org.uk Subject: [mkgmap-dev] [PATCH v2] LocationHook speedup
I tried to improve the first patch by removing anything not required in the Quadtree and by using a different internal data structure.
I've seen performance improvements but please try and test yourself :-)
The most time is now spend in the creation of the Quadtree. So if you want to search for more performance just start there.
WanMil
Gerds patches inspired me to look for more things that could be improved.
I found that the Quadtree used in the LocationHook is not very optimal. The patch is a first try to increase the performance. The time required for the LocationHook is reduced by 10-50% which is great.
Warning: I haven't checked so far if the results are equal. So maybe there are big bugs in the patch... (and the speedup comes from the poor implementation)
I will do some more tests and optimizations but maybe some of you can have a look on it, test it and comment it.
Have fun! WanMil
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hello WanMil, I think you alew -- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7140... Sent from the Mkgmap Development mailing list archive at Nabble.com.

Hi WanMil, Yes , I think you always have to look at both values, esp. with java and GC. If you optimize a function by allocating a lot of temp. objects, you might see better runtime in this function but overall runtime may be worse because GC gets very busy after executing the function. Gerd WanMil wrote
now we have the problem of how to measure the runtime performance.
I have compared unpatched and patched mkgmap. Both versions contain the time output of the patched version. I have used one thread only with 15 european tiles (widely spread over europe) and compare the summarized runtime of the LocationHook only because that's what should be improved by the patch.
I have done 4 runs of each version. The mean values are:
r2154: 65280 ms for LocationHook, 335325 ms overall runtime patch: 55173 ms for LocationHook, 313082 ms overall runtime
diff: 10107 ms improvement Hook, 22243 ms improvement overall
The overall improvement is a bit problematic because I would expect that it is close to the LocationHook improvement but its twice. The patch uses less memory and therefore the GC (which probably runs outside the Hook) must do less work. But I am unsure if that's the reason for the good overall improvement.
-- View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7140... Sent from the Mkgmap Development mailing list archive at Nabble.com.

I've found and fixed another performance bottleneck: The init method of the LocationHook contained a check if there is any bounds file in the boundary directory. On my computer this check requires ~1200ms. This is done for each tile and must be perfomed once only. So attached patch saves additionally 1200ms for each tile :-) WanMil
I tried to improve the first patch by removing anything not required in the Quadtree and by using a different internal data structure.
I've seen performance improvements but please try and test yourself :-)
The most time is now spend in the creation of the Quadtree. So if you want to search for more performance just start there.
WanMil
Gerds patches inspired me to look for more things that could be improved.
I found that the Quadtree used in the LocationHook is not very optimal. The patch is a first try to increase the performance. The time required for the LocationHook is reduced by 10-50% which is great.
Warning: I haven't checked so far if the results are equal. So maybe there are big bugs in the patch... (and the speedup comes from the poor implementation)
I will do some more tests and optimizations but maybe some of you can have a look on it, test it and comment it.
Have fun! WanMil
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

I used another way to optimize that part. On my machine, the list() method is much faster than listFiles(): String [] boundaryFileNames = boundaryDir.list(); boolean foundBndFile = false; for (String s: boundaryFileNames){ if (s.endsWith(".bnd")) { foundBndFile = true; break; } } if (!foundBndFile) { log.error("Disable LocationHook because boundary directory contains files. Dir: " + boundaryDir); return false; } Gerd I've found and fixed another performance bottleneck: The init method of the LocationHook contained a check if there is any bounds file in the boundary directory. On my computer this check requires ~1200ms. This is done for each tile and must be perfomed once only. So attached patch saves additionally 1200ms for each tile :-) WanMil
I tried to improve the first patch by removing anything not required in the Quadtree and by using a different internal data structure.
I've seen performance improvements but please try and test yourself :-)
The most time is now spend in the creation of the Quadtree. So if you want to search for more performance just start there.
WanMil
Gerds patches inspired me to look for more things that could be improved.
I found that the Quadtree used in the LocationHook is not very optimal. The patch is a first try to increase the performance. The time required for the LocationHook is reduced by 10-50% which is great.
Warning: I haven't checked so far if the results are equal. So maybe there are big bugs in the patch... (and the speedup comes from the poor implementation)
I will do some more tests and optimizations but maybe some of you can have a look on it, test it and comment it.
Have fun! WanMil

Am 31.12.2011 14:58, schrieb Gerd Petermann:
I used another way to optimize that part. On my machine, the list() method is much faster than listFiles():
String [] boundaryFileNames = boundaryDir.list(); boolean foundBndFile = false; for (String s: boundaryFileNames){ if (s.endsWith(".bnd")) { foundBndFile = true; break; } } if (!foundBndFile) { log.error("Disable LocationHook because boundary directory contains files. Dir: " + boundaryDir); return false; }
Gerd Hi, you missed a "no" in the errormessage ;)
Henning

Great! (Although for a nitpicker your patch is wrong. One could create a directory containing one subdir named "123.bnd". This directory would be detected as sufficient bounds dir by your patch. But I think we can ignore this :-) WanMil
I used another way to optimize that part. On my machine, the list() method is much faster than listFiles():
String [] boundaryFileNames = boundaryDir.list(); boolean foundBndFile = false; for (String s: boundaryFileNames){ if (s.endsWith(".bnd")) { foundBndFile = true; break; } } if (!foundBndFile) { log.error("Disable LocationHook because boundary directory contains files. Dir: " + boundaryDir); return false; }
Gerd
I've found and fixed another performance bottleneck: The init method of the LocationHook contained a check if there is any bounds file in the boundary directory. On my computer this check requires ~1200ms. This is done for each tile and must be perfomed once only. So attached patch saves additionally 1200ms for each tile :-)
WanMil
I tried to improve the first patch by removing anything not required in the Quadtree and by using a different internal data structure.
I've seen performance improvements but please try and test yourself :-)
The most time is now spend in the creation of the Quadtree. So if you want to search for more performance just start there.
WanMil
Gerds patches inspired me to look for more things that could be improved.
I found that the Quadtree used in the LocationHook is not very optimal. The patch is a first try to increase the performance. The time required for the LocationHook is reduced by 10-50% which is great.
Warning: I haven't checked so far if the results are equal. So maybe there are big bugs in the patch... (and the speedup comes from the poor implementation)
I will do some more tests and optimizations but maybe some of you can have a look on it, test it and comment it.
Have fun! WanMil
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
participants (4)
-
Gerd Petermann
-
GerdP
-
Henning Scholland
-
WanMil