
In the last days there have been two questions around the mkgmap message "Area (xxx,yyy) to (zzz,qqq) contains ... but can't be split further". The message is printed during the subdivision creation process which means a tile is splitted into several subdivisions which do not exceed a special dimension and does not exceed a maximum size (which means a max number of nodes, lines and/or shapes). I have started to look how this works because I think there is some work to do. Such errors will occurr more often with increasing density of map information. The split process does not seem to be optimal. I want to start with a list of questions and observations and hope someone of you is able to answer/comment them or is willing to dive into the source code to get an idea of what's going on there. 1. Class Area definition An area is defined with minLat, minLong, maxLat, maxLong. It is not possible to define an area with minLat==maxLat or minLong==maxLong. Is there any reason for this? 2. Class Area split The split method divides the area into a given number of (rather) equal sized subareas. An area [(0,0) to (100,100)] would be split (xsplit==2, ysplit==1) into [(0,0) to (50,100)] [(50,0) to (100,100)] I think this is wrong, because the two areas are overlapping at x==50. Does anyone know why the split method does not split into distinct areas? 3. Class MapSplitter As far as I understand lines and shapes are put to the subdivision that contain their center points. Assuming the situation (I describe a line example but the same applies to shapes) E ------- + | | + | x | + ------- + + S+++++++++ S = start point of a line E = end point of a line + = line x = center point -| = borders of the subdivision The line is defined in the subdivision although it does not intersect the subdivision. But its center point is located in the subdivision. Is that correct? The coordinates of the line points are stored as diffs to the center of the subdivision as a kind of compression. The diffs have a higher and lower limit. If a line is quite long and one point is far away from the center of the subdivision it could not be stored. What happens in such a case? I think a warning message is printed in the constructor of the Subdivision class but what happens to the line point? Would it be better to split a line in such a case? Or would it be better to split a line so that the complete line fits into a subdivision? 4. Subdivision splitting In case a subdivision is too full it is splitted into equal sized areas. Would it be a good first approach to improve the splitting so that the algorithm tries to split the subdivision into rather equal filled subdivisions (e.g. each splitted subdivision should contain the same number of points plus center points of lines/shapes? Have fun! WanMil

Hi It would be good if we had a tool to visualise the strategies used to allocate elements to the subdivisions. Something that displays the subdivision boundaries and highlights the elements that belong to a subdivision in some way. I really have no idea of the best way to do this.
1. Class Area definition An area is defined with minLat, minLong, maxLat, maxLong. It is not possible to define an area with minLat==maxLat or minLong==maxLong. Is there any reason for this?
Perhaps it is but then it would not be able to contain anything except perhaps a vertical or horizontal line. Why do you think it is important that it should be possible.
2. Class Area split The split method divides the area into a given number of (rather) equal sized subareas. An area [(0,0) to (100,100)] would be split (xsplit==2, ysplit==1) into [(0,0) to (50,100)] [(50,0) to (100,100)] I think this is wrong, because the two areas are overlapping at x==50. Does anyone know why the split method does not split into distinct areas?
This is just a grid used to allocate elements to subdivision. In reality the subdivision is expanded to contain its elements. So they always overlap by quite a bit. When a division is split, then each element is allocated in one of the two new divisions. When the splitting is finished the real subdivision bounds are calculated based on the actual contents. The only important thing is that each element can be allocated into exactly one of the new divisions.
3. Class MapSplitter As far as I understand lines and shapes are put to the subdivision that contain their center points.
I can't remember the whole history here, I think at one time I used the first point in the line, but it didn't produce particularly uniform results.
Assuming the situation (I describe a line example but the same applies to shapes)
E ------- + | | + | x | + ------- + + S+++++++++
S = start point of a line E = end point of a line + = line x = center point -| = borders of the subdivision
The line is defined in the subdivision although it does not intersect the subdivision. But its center point is located in the subdivision. Is that correct?
Because the grid is only used to allocate elements to subdivisions, it doesn't matter that no point is actually in the grid section. The subdivision will always be sized to contain its contents. I'm not saying that its a good thing though. Possible room for improvement there if it happens often.
The coordinates of the line points are stored as diffs to the center of the subdivision as a kind of compression. The diffs have a higher and lower limit. If a line is quite long and one point is far away from the center of the subdivision it could not be stored. What happens in such a case? I think a warning message is printed in the constructor of the Subdivision class but what happens to the line point? Would it be better to split a line in such a case? Or would it be better to split a line so that the complete line fits into a subdivision?
Well this is the main problem. We split lines earlier on. Originally I split lines so that in the worst case they would never be too big to fit into a subdivision. Perhaps we should split them after we know roughly where the subdivision boundaries are going to be.
4. Subdivision splitting In case a subdivision is too full it is splitted into equal sized areas. Would it be a good first approach to improve the splitting so that the algorithm tries to split the subdivision into rather equal filled subdivisions (e.g. each splitted subdivision should contain the same number of points plus center points of lines/shapes?
Probably yes! I think our existing approach gives sub-divisions that overlap a lot more than Garmin or cgpsmapper produced maps. It is certainly something that needs looking at again and I encourage you to do so. ..Steve

Hi
It would be good if we had a tool to visualise the strategies used to allocate elements to the subdivisions. Something that displays the subdivision boundaries and highlights the elements that belong to a subdivision in some way.
I really have no idea of the best way to do this.
Sounds complicated. The only idea I have is to create a separate osm file for each subdivision. The osm file(s) could be loaded into JOSM for inspection. But this is only useful to check some special subdivisions and not the complete process.
1. Class Area definition An area is defined with minLat, minLong, maxLat, maxLong. It is not possible to define an area with minLat==maxLat or minLong==maxLong. Is there any reason for this?
Perhaps it is but then it would not be able to contain anything except perhaps a vertical or horizontal line. Why do you think it is important that it should be possible.
Nothing special. I just wondered if that's correct...
2. Class Area split The split method divides the area into a given number of (rather) equal sized subareas. An area [(0,0) to (100,100)] would be split (xsplit==2, ysplit==1) into [(0,0) to (50,100)] [(50,0) to (100,100)] I think this is wrong, because the two areas are overlapping at x==50. Does anyone know why the split method does not split into distinct areas?
This is just a grid used to allocate elements to subdivision. In reality the subdivision is expanded to contain its elements. So they always overlap by quite a bit.
When a division is split, then each element is allocated in one of the two new divisions. When the splitting is finished the real subdivision bounds are calculated based on the actual contents. The only important thing is that each element can be allocated into exactly one of the new divisions.
Yeah, I see. Summary: it is allowed that subdivision sizes overlap. Maybe there is room for improvement to reduce overlapping size. While thinking about it I guess that less overlapping means faster map drawing because MapSource and the devices have to read less data if subdivisions do not overlap so much?!
3. Class MapSplitter As far as I understand lines and shapes are put to the subdivision that contain their center points.
I can't remember the whole history here, I think at one time I used the first point in the line, but it didn't produce particularly uniform results.
Center point is a good choice because that minimizes the maximum value of the lat/long diffs.
Assuming the situation (I describe a line example but the same applies to shapes)
E ------- + | | + | x | + ------- + + S+++++++++
S = start point of a line E = end point of a line + = line x = center point -| = borders of the subdivision
The line is defined in the subdivision although it does not intersect the subdivision. But its center point is located in the subdivision. Is that correct?
Because the grid is only used to allocate elements to subdivisions, it doesn't matter that no point is actually in the grid section. The subdivision will always be sized to contain its contents.
I'm not saying that its a good thing though. Possible room for improvement there if it happens often.
I agree.
The coordinates of the line points are stored as diffs to the center of the subdivision as a kind of compression. The diffs have a higher and lower limit. If a line is quite long and one point is far away from the center of the subdivision it could not be stored. What happens in such a case? I think a warning message is printed in the constructor of the Subdivision class but what happens to the line point? Would it be better to split a line in such a case? Or would it be better to split a line so that the complete line fits into a subdivision?
Well this is the main problem. We split lines earlier on. Originally I split lines so that in the worst case they would never be too big to fit into a subdivision. Perhaps we should split them after we know roughly where the subdivision boundaries are going to be.
Yes, that's it. The filters are processed AFTER the subdivisions are created. From my point of view this should be the other way round. I have a tile containing a polygon with 38350 points. This single polygon exceeds the limits of a subdivision. Because the polygon filters run after the subdivison creation the algorithm to create the subdivisions fails. It would be better to run some or all filters before so that the smaller elements can be used to create the subdivisions. I will try to change that as a next step.
4. Subdivision splitting In case a subdivision is too full it is splitted into equal sized areas. Would it be a good first approach to improve the splitting so that the algorithm tries to split the subdivision into rather equal filled subdivisions (e.g. each splitted subdivision should contain the same number of points plus center points of lines/shapes?
Probably yes!
I tried to do that and implemented an easy approach. But this works only after it is ensured that there is no element that exceeds the subdivision limits by putting the filter chain before the subdivision creation.
I think our existing approach gives sub-divisions that overlap a lot more than Garmin or cgpsmapper produced maps.
It is certainly something that needs looking at again and I encourage you to do so.
..Steve
WanMil

The patch changes the way how subdivisions are splitted. 1.: The LineMergeFilter, the LineSplitterFilter and the PolygonSplitterFilter runs now before subdivions are built. This avoids the problem that single lines/polygons are bigger than allowed in one subdivision. ToDo: The filters parameters use very small values (e.g. 250 points for lines) which is only needed after a subdivision has been built. So maybe it is good to run these filters twice. First with subdivision limits before creating subdivisions and second after splitting each subdivision with lower RGN(?) limits. 2.: Subdivisions are no longer split in equal bounds sizes but into rather equal item sizes. ToDo: At the moment a too large subdivision is split into 4 subdivisions. This algorithm might be improved to compute it dynamically. Also points, lines and shapes should have a different weight in the split algorithm. This patch is a very early patch but if fixes the "Area (xxx,yyy) to (zzz,qqq) contains ... but can't be split further" messages. At the moment I am thinking about how to continue: Maybe it is reasonable to run an initial filter (filters that are not resolution dependant) set just before starting the subdivision creation. [This is what the patch already does.] The result of this filter set will not be changed afterwards and is used as input of each subdivision creation step. For each resolution the subdivisions of the previous resolution are copied but the contents are removed. Then the resolution dependant filter sets are applied (DP filter, SizeFilter, etc.) and the results are assigned to the subdivisions. As a last step too big subdivisions are splitted. [This swaps the current processing.] Possible problems: A line l in resolution 15 is subdivision A. l is splitted into 3 parts l1 to l3 in resolution 18. Is it necessary that l1 to l3 are assigned to child subdivisions of A or is it possible that e.g. l3 is assigned to a child B1 of subdivision B because the splitted line l3 fits better to B1? Are there any routing problems? I have found the thread http://www.mail-archive.com/mkgmap-dev@lists.mkgmap.org.uk/msg07335.html which indicates that my idea does not work. WanMil
participants (2)
-
Steve Ratcliffe
-
WanMil