open issues in the housenumber2 branch

Hi all,
during the last few days I've analysed the reasons for the error messages reported by some of you. I came to the conclusion that I have to
- change the handling of addr:interpolation ways completely
- add code to detect the "random number" case earlier and, if detected, use different methods to split the number intervals
Both are rather complex changes, so I'll need a few days to code this.
One open question for you: How should we handle "missing" information? If mkgmap finds the numbers 1,3,9,11,17 in that order on the left side of a road called ABC, it can create different house number information. One could be like "odd numbers from 1 to 17 on the left", another could be a more complex sequence
1) "odd numbers from 1 to 3 on the left",
2) "odd numbers from 9 to 11 on the left",
3) "odd number 17 on the left"
A search for ABC 5 would either show a point between 3 and 9 or two entries with the numbers 3 and 9. The latter tells you that OSM probably doesn't contain the exact information and lets you decide where to search for ABC 5.
The trunk version tends to the simple info, while r3486 is more likely to produce the complex one. I think trunk is better here; the complex case should only occur if the "random number" case was detected.
Do you agree?
Gerd
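The two variants in the question above can be sketched in a few lines of Python. This is only an illustration, not mkgmap code: the function names are made up, the input is assumed to be sorted ascending and all odd, and intervals are plain (start, end) tuples.

```python
def simple_interval(numbers):
    """Variant 1: a single interval from the lowest to the highest number."""
    return [(min(numbers), max(numbers))]


def split_at_gaps(numbers, step=2):
    """Variant 2: start a new interval wherever the next number is not prev + step."""
    intervals = []
    start = prev = numbers[0]
    for n in numbers[1:]:
        if n - prev != step:
            intervals.append((start, prev))
            start = n
        prev = n
    intervals.append((start, prev))
    return intervals


def hits(intervals, n, step=2):
    """Intervals that contain n and match its odd/even parity."""
    return [(lo, hi) for lo, hi in intervals
            if lo <= n <= hi and (n - lo) % step == 0]


found = [1, 3, 9, 11, 17]          # the numbers from the example, left side of ABC
simple = simple_interval(found)    # [(1, 17)]
complex_ = split_at_gaps(found)    # [(1, 3), (9, 11), (17, 17)]

print(hits(simple, 5))    # [(1, 17)]: ABC 5 gets an interpolated position
print(hits(complex_, 5))  # []: no interval claims 5, only neighbours can be offered
```

With the simple variant a search for ABC 5 always finds an interpolated point; with the complex variant nothing matches, which is what makes the gap in the OSM data visible to the user.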

Gerd,
can you explain how addresses are stored in the Garmin maps? Are they millions of individual nodes, or "interpolation ways" with a start and an end? Are they just coordinates, or are they internally linked to a road in some way? Can the format handle non-numeric house numbers?
Understanding the "target data model" might help to assess all the possibilities we have in interpreting the data from OSM.
Colin
_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi Colin,
I think I have tried to explain this before, but here is my current knowledge:
The Garmin format stores addresses like interpolation ways along roads (= routable ways). The corresponding information in mkgmap is stored in class Numbers, which has the fields start, end, and style, once for the left and once for the right side of the road. The style tells you which numbers lie between start and end, one of "odd,even,both,none". The log shows this as e.g.
(n4),B,183,194,B,174,182
which means something like "road segment number 4 has B="both" numbers from 183 to 194 on the left and B="both" numbers from 174 to 182 on the right".
Another example:
(n3),O,1587,1587,E,764,894
means "road segment number 3 has O="odd" number 1587 on the left and E="even" numbers from 764 to 894 on the right".
The special case
(n4),N,0,0,N,0,0
means "no numbers in road segment 4".
As you can see, we can describe single numbers as well as intervals. The numbers for the whole road are stored as a list of the above intervals, starting with the first segment. This looks like
0,N,0,0,B,782,219 + (n1),B,240,1119,B,300,1161 + (n2),...
The tricky part: Where does a segment start and end? Each (number) segment starts and ends with a so-called number node. Each crossing or road junction is always a number node; any other point on the way can be flagged as a number node.
When you search for an address using a street name and house number, the Garmin software seems to check each possible road, starting with the first segment. If the number falls into one of the segments (and matches the odd/even criteria), the corresponding position in that road segment is interpolated and shown as a result; otherwise the closest matches are shown.
My goal is to produce a reasonable list of segments so that the interpolated positions are close to the point on the road where you would like to "park your car", without blowing up the img size too much and without adding number nodes that cause visible angles (zig-zagging).
Gerd
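To make the log notation above concrete, here is a small Python sketch that parses one entry and answers "does this side contain number n, and where along the segment would it fall?". It is not the mkgmap Numbers class: the names Side, Entry and parse_entry are hypothetical, linear interpolation between the two number nodes is an assumption about what the Garmin software does, and placing a single number in the middle of the segment is a guess.

```python
from dataclasses import dataclass


@dataclass
class Side:
    style: str  # 'O' = odd, 'E' = even, 'B' = both, 'N' = none
    start: int
    end: int

    def contains(self, n):
        """True if house number n lies in this interval and matches the style."""
        if self.style == "N":
            return False
        lo, hi = sorted((self.start, self.end))  # numbers may run downwards
        if not lo <= n <= hi:
            return False
        if self.style == "O":
            return n % 2 == 1
        if self.style == "E":
            return n % 2 == 0
        return True

    def fraction(self, n):
        """Assumed linear position of n between the number nodes (0.0 to 1.0)."""
        if self.start == self.end:
            return 0.5  # single number: middle of the segment (a guess)
        return (n - self.start) / (self.end - self.start)


@dataclass
class Entry:
    segment: int
    left: Side
    right: Side


def parse_entry(text):
    """Parse one log entry such as '(n3),O,1587,1587,E,764,894'."""
    f = text.strip().split(",")
    return Entry(
        segment=int(f[0].strip("()n")),
        left=Side(f[1], int(f[2]), int(f[3])),
        right=Side(f[4], int(f[5]), int(f[6])),
    )


entry = parse_entry("(n3),O,1587,1587,E,764,894")
print(entry.segment, entry.left.contains(1587), entry.right.contains(800))
```

A whole road would then be a list of such entries, one per pair of number nodes, searched from the first segment onwards.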

Gerd,
Thanks for the explanation, and sorry if I missed it before.
So we have two extremes: on the one hand accuracy, which would best be served by having individual addresses in the output; on the other hand efficiency (storage and lookup performance?), which would call for "interpolation ways" that are as long as possible. I guess Garmin assumes the intermediate numbers along such a way are equally spaced. Combining a set of individual address nodes for e.g. 1,3,5,7,9 into a single segment will lose the exact placement of {3,5,7}. In practice I guess the inaccuracy will only be a matter of a few metres, so acceptable for most purposes.
Maybe we could have two modes: an accuracy mode, where individual addresses are mapped to individual Garmin address nodes (interpolation ways could stay as they are, but would be split at any explicit intermediate nodes with an address), and a small mode, where an attempt is made to match the individual addresses into ranges corresponding to the routable way segments, which stay as they are? In the first case the road segments get chopped up by the address information; in the second case the address information is made to fit the road segments.
If we are to decide between these two approaches on a per-segment basis, could we look at the results given by the second approach (consolidating individual addresses into a range) and examine the positional error for any intermediate explicit address nodes? If the error is less than, say, 10m, then we can accept the consolidated range; otherwise we consider splitting at the address with the largest positional error and re-assessing.
Take a road segment and find the numbers in OSM closest to the end points on each side of the road. Interpolate between the end points. For each intermediate address, if OSM has an explicit location AND this is more than 10m from the interpolated location: complete the scan to find the biggest such error, split the segment there and recurse.
Colin
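The split-and-recurse idea from the last paragraph of Colin's message can be sketched as follows. This is a simplified model, not mkgmap code: the road is treated as one-dimensional, each house is a hypothetical (number, position_in_metres) pair for one side of the road, the list is assumed to be sorted by position with strictly increasing numbers, and the function name is made up. After a split, the house with the largest error closes the first range and the next house opens the second.

```python
def split_by_error(houses, max_error=10.0):
    """Return (first_number, last_number) ranges whose interpolated positions
    stay within max_error metres of every known house position."""
    if len(houses) <= 2:
        return [(houses[0][0], houses[-1][0])]

    (n0, p0), (n1, p1) = houses[0], houses[-1]
    worst_idx, worst_err = None, max_error
    for i in range(1, len(houses) - 1):
        n, p = houses[i]
        # position Garmin would presumably show: linear between the end points
        expected = p0 + (p1 - p0) * (n - n0) / (n1 - n0)
        err = abs(p - expected)
        if err > worst_err:  # complete the scan, remember the biggest error
            worst_idx, worst_err = i, err

    if worst_idx is None:
        return [(n0, n1)]  # every intermediate house is close enough

    left = split_by_error(houses[:worst_idx + 1], max_error)
    right = split_by_error(houses[worst_idx + 1:], max_error)
    return left + right


# Numbers 1..11 within the first 50 m, number 17 at 500 m
houses = [(1, 0), (3, 10), (5, 20), (7, 30), (9, 40), (11, 50), (17, 500)]
print(split_by_error(houses))  # [(1, 11), (17, 17)]
```

In this example the largest error is at number 11 (interpolated at about 312 m, really at 50 m), so the segment is split there and neither half needs further splitting.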

Hi Colin,
regarding the algorithm, my approach is roughly like this:
1) find all roads having the same name
2) if there are multiple roads, find out if they form clusters
3) for each cluster, find the house numbers which are close
4) for each house number element, find the closest road
5) handle obvious OSM errors like duplicate house numbers at different places, or single even numbers that are on the odd side
6) handle cases where a house is close to different segments of an (L-shaped) road, e.g. No. 12 appears within a sequence 2,4,6 and is missing later in 10,14. This can happen in one road, but also in two different roads forming the L-shape. If the house number fits better to the other segment, it is "moved" there.
7) build the intervals
8) check if any interval in the cluster overlaps another one. If so, split one. This happens quite often when a bunch of houses is only reachable via an unnamed service road, but also when numbers have no order (the "random" case). Repeat 8) until no overlap is found.
9) optimize interval lengths to further improve results. This part is not that complex.
So, in short, the big problem is to separate wrong OSM data from wrongly interpreted correct OSM data; the smaller problem is the final optimization.
Gerd
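The overlap check in step 8 could look roughly like the sketch below. It only covers the detection; how to choose which interval to split is not described in the message and is left out. The tuple layout (road, style, start, end) and the function names are assumptions for illustration, with the style letters O, E, B, N used as in the log format discussed earlier in the thread.

```python
from itertools import combinations


def numbers_in(style, start, end):
    """Set of house numbers an interval claims, honouring its odd/even style."""
    if style == "N":
        return set()
    lo, hi = sorted((start, end))
    return {n for n in range(lo, hi + 1)
            if style == "B" or (n % 2 == 1) == (style == "O")}


def find_overlaps(intervals):
    """Pairs of intervals in one cluster that claim at least one common number."""
    return [(a, b) for a, b in combinations(intervals, 2)
            if numbers_in(*a[1:]) & numbers_in(*b[1:])]


cluster = [
    ("ABC", "E", 2, 20),      # even side of the named road
    ("ABC", "O", 1, 19),      # odd side of the named road
    ("service", "E", 10, 14)  # houses reachable only via a service road
]
for a, b in find_overlaps(cluster):
    print("overlap:", a, b)
```

Here the even interval 2..20 and the service-road interval 10..14 both claim 10, 12 and 14, so one of them would have to be split before the check is repeated.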

Hi Gerd,
I agree with you that the trunk approach is the better solution. I think that the second case is difficult to develop and maintain and adds little value in terms of navigation.
Alexandre

Hi Alexandre,
thanks. Maybe I should point out in which cases the complex variant is better:
1) Assume a second road also named ABC, very close to the first one, contains the numbers 13,5,15 in that order. This is what I call the "random number" case, for which mkgmap has to produce many small intervals in both roads; otherwise the Garmin software will show multiple possible places for each number.
2) Assume there is no other road named ABC nearby, but the numbers 1 to 11 are all at the beginning of the road (say the first 50 m) while 17 is 500 m away at the very end of the road. This is the case where I think mkgmap should add a number node so that the interpolated positions are closer to the houses.
Gerd
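Case 1 can be illustrated with a toy search over two roads named ABC. The sketch is an assumption-laden model, not Garmin's or mkgmap's logic: intervals are hypothetical (road, low, high) tuples, all numbers are odd so parity is ignored, and the coarse interval for the second road is taken as lowest to highest number.

```python
def matches(intervals, n):
    """All intervals that would claim house number n (parity ignored here)."""
    return [(road, lo, hi) for road, lo, hi in intervals if lo <= n <= hi]


# One long interval per road: road 1 has 1,3,9,11,17; road 2 has 13,5,15
coarse = [("ABC-1", 1, 17), ("ABC-2", 5, 15)]

# Many small intervals, as needed for the "random number" case
fine = [
    ("ABC-1", 1, 3), ("ABC-1", 9, 11), ("ABC-1", 17, 17),
    ("ABC-2", 13, 13), ("ABC-2", 5, 5), ("ABC-2", 15, 15),
]

print(len(matches(coarse, 13)))  # 2: both roads offer a place for ABC 13
print(matches(fine, 13))         # [('ABC-2', 13, 13)]
```

With the coarse intervals every number between 5 and 15 is found on both roads; with the small intervals each known number resolves to exactly one place.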

Hi Gerd,
How should we handle "missing" information?
You can't tell whether there is no house with the missing number or there is one that is simply not mapped in OSM. So no solution would cover all cases. I would take the solution which looks better in your code ;)
What do you think about implementing precise address search in mkgmap? I mean: change each address point into a short line (like 10m) of a predefined type and give a single house number to this line. More complex cases, like house number "3/5", could be supported too. The line for an address doesn't need to contain routing data in NOD, only the house number in NET.
--
Best regards,
Andrzej

Hi Andrzej,
regarding missing data: the question is whether users prefer to see that we are guessing or not.
regarding "precise address search": interesting idea, I had not thought about that until now. Is it possible to have roads in NET and the index etc. without referencing them in NOD? I guess yes, because I don't see any line in the housenumber code that makes sure that the road is rendered at level 0.
Gerd

Hi Gerd,
The question is whether users prefer to see that we are guessing or not.
I think the general rule should be to create maps faithful to the input data. But in this case I'm not against relaxing the rule if this simplifies the map a bit.
Is it possible to have roads in NET and the index etc. without referencing them in NOD?
It works in practice. You can strip NOD from the img and the map remains functional, with working address search; only routing disappears. I don't know the details of the NOD structure and I'm not sure about a hybrid map, where only selected roads are referenced in NOD, but I guess it could work too.
--
Best regards,
Andrzej
participants (5)
- Alexandre Loss
- Andrzej Popowski
- Colin Smale
- Gerd Petermann
- GerdP