address search and case significance of street name
Hi all,

in some areas, esp. in CZ, I see quite a lot of houses where the addr:street name is slightly different from that of the nearest road. Sample: http://www.openstreetmap.org/way/296555634 has name="K Zahrádkám" (with capital "Z") while some houses, e.g. http://www.openstreetmap.org/way/53658380, have addr:street="K zahrádkám".

I thought that this was handled by a bot, but this difference has existed for quite a while now. I assume that mkgmap should always ignore case when it compares these tags? Or should that depend on the option "--lower-case"?

Gerd
Hi all,

sorry, answering my own question again...

The same problem can occur with city names, place names and so on. Two connected roads may also have that problem, so RoadMerger would not merge them. If mkgmap could change all names to upper (or lower) case, there would be no problem, but that is probably not what we want. On the other hand, mkgmap cannot decide which spelling is correct.

The only simple solution that I see is to use a data structure like this: for each kind of string (city name, place name, region name, street name, ...) we create a map with TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER). The maps are filled while processing the OSM elements, so the first entry is used for all elements.

Example: Relations are processed first, so imagine that one type=street relation has name=Abc Street. This string is saved in the map. If any element processed later has a name like "ABC street" or "abc Street" which we consider as a street name, we will use "Abc Street" again.

I've just tried that and the performance impact is small, so I think this is better, and the additional code for that is only ~30 lines.

Gerd

From: gpetermann_muenchen@hotmail.com
To: mkgmap-dev@lists.mkgmap.org.uk
Date: Sat, 18 Apr 2015 07:32:21 +0200
Subject: [mkgmap-dev] address search and case significance of street name

> Hi all, in some areas, esp. in CZ, I see quite a lot of houses where the addr:street name is slightly different to that of the nearest road. Sample: http://www.openstreetmap.org/way/296555634 has name="K Zahrádkám" (with capital "Z") while some houses, e.g. http://www.openstreetmap.org/way/53658380 have addr:street="K zahrádkám" I thought that this is handled by a bot, but this difference exists now for quite a while. I assume that mkgmap should always ignore case when it compares these tags? Or should that depend on the option "--lower-case" ?
> Gerd

_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
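The "first spelling wins" map that Gerd describes might look roughly like the sketch below. This is not the code from mkgmap; the class name CaseInsensitiveNames and the method canonical are hypothetical, and only one kind of name (streets) is shown. Note also that String.CASE_INSENSITIVE_ORDER compares character by character and is not locale-aware.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not mkgmap code): the first spelling seen for a name wins. */
final class CaseInsensitiveNames {
    // Keys are compared without regard to case; the value keeps the first spelling seen.
    private final Map<String, String> firstSeen =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    /** Returns the spelling stored first for this name, storing it if it is new. */
    String canonical(String name) {
        String previous = firstSeen.putIfAbsent(name, name);
        return previous != null ? previous : name;
    }

    public static void main(String[] args) {
        CaseInsensitiveNames streets = new CaseInsensitiveNames();
        // The relation is processed first, so its spelling is the one that is kept.
        System.out.println(streets.canonical("Abc Street"));
        // Later elements with a different case are mapped to the stored spelling.
        System.out.println(streets.canonical("ABC street"));
        System.out.println(streets.canonical("abc Street"));
    }
}
```

In a sketch like this, which spelling ends up on the map depends only on processing order, not on which spelling is right.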
Hi Gerd,
> If any element processed later has a name like "ABC street" or "abc Street" which we consider as a street name, we will use "Abc Street" again.

I'm not sure what the names from this table are used for. I don't think that case could be important for comparison of street names. But I would prefer to see the street name on a map with the original spelling. Otherwise mkgmap could propagate spelling errors, which would be difficult to trace.

--
Best regards,
Andrzej
Hi Andrzej,

On Sat, Apr 18, Andrzej Popowski wrote:
>> If any element processed later has a name like "ABC street" or "abc Street" which we consider as a street name, we will use "Abc Street" again.
> I'm not sure what the names from this table are used for. I don't think that case could be important for comparison of street names. But I would prefer to see the street name on a map with the original spelling.

But what is the original spelling? And more importantly, what's the right one? Name of the highway? Name of route=street? addr:street? Yes, some people add addr:street to highways ...

During the last few days I looked at the warnings from mkgmap regarding mismatches of street names, and I can only say that typos are everywhere.

Thorsten

--
Thorsten Kukuk, Senior Architect SLES & Common Code Base
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nürnberg)
Hi Gert,

yes, I am sure these problems were partly solved in each company to deduplicate addresses, but we probably cannot code that in Java for the whole world. Besides that, I now think that mkgmap is not the right place to do it, or at least not the housenumber2 branch ;-)

Gerd

From: thesurveyor@wolke7.net
To: mkgmap-dev@lists.mkgmap.org.uk
Date: Sat, 18 Apr 2015 14:40:48 +0200
Subject: Re: [mkgmap-dev] address search and case significance of street name

Hi,

oooh, comparing street names, that's a never-ending story :-( I assume you will find every typographical error you can think of. And I'm sure you/we won't find a rule to correct those errors. So from my point of view the only thing we can do is to simplify the name to eliminate the typical typos.

I did this for a database of street names, just for Germany, a long time ago. The data was not from OSM; it was long before OSM started. In that system we
- replaced all special chars like ".,;-<>!§$%&/()=?#*+:" with nothing
- replaced more than one blank with one blank
- replaced other whitespace characters, like TAB, with a blank
- replaced the German "ß" (scharfes s) with "s", "ä" with "ae" and so on
- replaced all double chars, e.g. "aa", "bb", "cc", ..., with the single char "a", "b", "c", ...
- compared all street names case-insensitively (in fact we did this by replacing all upper-case chars with the lower-case char)

and then we used that string for all comparisons of the street names. But we displayed the original string.

Maybe that helps you a little bit.

Regards,
Gert

Sent: Saturday, 18 April 2015, 13:53
From: "Thorsten Kukuk" <kukuk@suse.de>
To: mkgmap-dev@lists.mkgmap.org.uk
Subject: Re: [mkgmap-dev] address search and case significance of street name

> Hi Andrzej,
> On Sat, Apr 18, Andrzej Popowski wrote:
>>> If any element processed later has a name like "ABC street" or "abc Street" which we consider as a street name, we will use "Abc Street" again.
>> I'm not sure what the names from this table are used for. I don't think that case could be important for comparison of street names. But I would prefer to see the street name on a map with the original spelling.
> But what is the original spelling? And more importantly, what's the right one? Name of the highway? Name of route=street? addr:street? Yes, some people add addr:street to highways ... During the last few days I looked at the warnings from mkgmap regarding mismatches of street names, and I can only say that typos are everywhere.
> Thorsten
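The normalisation steps from Gert's list (strip special characters, collapse whitespace, replace "ß" and umlauts, collapse doubled characters, ignore case) could be sketched in Java as follows. This is only an illustration of the idea, not Gert's original system and not mkgmap code; the class name StreetNameKey and the method of are hypothetical, and the umlaut table covers only the German characters named in the mail. The resulting key is meant for comparison only, while the original string is still the one displayed.

```java
import java.util.Locale;

/** Hypothetical helper: builds a comparison key for a street name. */
final class StreetNameKey {
    static String of(String name) {
        // Case-insensitive comparison by lower-casing everything first.
        String s = name.toLowerCase(Locale.ROOT);
        // German special letters: "ß" -> "s", umlauts -> two-letter form.
        s = s.replace("ß", "s").replace("ä", "ae").replace("ö", "oe").replace("ü", "ue");
        // Remove the special characters from the list in the mail.
        s = s.replaceAll("[.,;\\-<>!§$%&/()=?#*+:]", "");
        // TAB and other whitespace become a blank; runs of blanks become one blank.
        s = s.replaceAll("\\s+", " ").trim();
        // Doubled characters ("aa", "ss", ...) become a single character.
        s = s.replaceAll("(.)\\1+", "$1");
        return s;
    }

    public static void main(String[] args) {
        // Both spellings should produce the same key.
        System.out.println(of("Müller-Straße"));
        System.out.println(of("MUELLERSTRASSE"));
    }
}
```

A key like this is deliberately lossy, so two genuinely different streets can collapse to the same key; that is why it would only be used for matching and never for display.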
Hi,

okay, good points. For those who use mkgmap logs for quality improvements I could make sure to report the mismatches. Sample:

INFO: uk.me.parabola.mkgmap.osmstyle.StyledConverter e:\osm_out_work\czech\20150223_092244\63240079.o5m: case difference: using Za ovčínem instead of Za Ovčínem for mkgmap:street http://www.openstreetmap.org/node/296650164
INFO: uk.me.parabola.mkgmap.osmstyle.StyledConverter e:\osm_out_work\czech\20150223_092244\63240079.o5m: case difference: using Za ovčínem instead of Za Ovčínem for mkgmap:street http://www.openstreetmap.org/node/296650164
INFO: uk.me.parabola.mkgmap.osmstyle.StyledConverter e:\osm_out_work\czech\20150223_092244\63240079.o5m: case difference: using Na Chobotě instead of Na chobotě for mkgmap:street http://www.openstreetmap.org/node/296660550
INFO: uk.me.parabola.mkgmap.osmstyle.StyledConverter e:\osm_out_work\czech\20150223_092244\63240079.o5m: case difference: using Na Chobotě instead of Na chobotě for mkgmap:street http://www.openstreetmap.org/node/296660550

BUT: I have problems removing the duplicates here, as some tags are evaluated multiple times, and I cannot show where the original value comes from without blowing up the data structures, and I see no way to find out which spelling is correct. Of course one could count occurrences and use the one that appears most often, but that would mean one or more additional loops and would still allow errors when 10 houses give the wrong name while only one highway element shows the right one.

Another problem is that other obvious typos are not handled, e.g. "Ilmer Weg" and "Ilmerweg". Is it "Bahnhofstraße" or "Bahnhofsstraße"?

So, I'll remove that code again and simply report a name mismatch, no matter what the reason is :-( I'll try to put as much information as possible into messages like these

WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address Rymáň 63(6) http://www.openstreetmap.org/node/1440812595
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address Rymáň 67(6) http://www.openstreetmap.org/node/1440812610
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address Jungmannova 102(0) http://www.openstreetmap.org/node/1444790204
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address Jungmannova 191(0) http://www.openstreetmap.org/node/1444791033
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address Melicharova 199(2) http://www.openstreetmap.org/node/1444791045
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address Jungmannova 224(0) http://www.openstreetmap.org/node/1444791086
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address Jungmannova 243(0) http://www.openstreetmap.org/node/1444791117
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberRoad e:\osm_out_work\czech\20150223_092244\63240079.o5m: found no plausible street for address K Ovčínu 116(14) http://www.openstreetmap.org/node/3050704625

but one has to analyse the element to find out what's wrong (program or OSM data or boundary data, to start with that).

Gerd
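The "count occurrences" idea that Gerd rejects can be illustrated with a small sketch. It is hypothetical code, not something from mkgmap: the class MajoritySpelling and the method mostFrequent are made up, and the input is assumed to be a list of spellings already known to be case variants of one name. The example also assumes, purely for illustration, that the single highway carries the right spelling.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: picks the spelling that occurs most often. */
final class MajoritySpelling {
    /** Returns the most frequent spelling, or null for an empty list. */
    static String mostFrequent(List<String> spellings) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String spelling : spellings) {
            // merge() returns the updated count for this exact spelling.
            int count = counts.merge(spelling, 1, Integer::sum);
            if (count > bestCount) {
                best = spelling;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        seen.add("Abc Street");                             // one highway (assumed correct)
        seen.addAll(Collections.nCopies(10, "abc street")); // ten houses with a case typo
        System.out.println(mostFrequent(seen));
    }
}
```

Here the vote returns the houses' spelling, because ten identical typos outvote the one highway, which is exactly the failure case described in the mail.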
Hi Thorsten,
> But what is the original spelling?

Original is the spelling from the source data. What I mean: errors can be introduced by OSM mappers or by a program that processes the data. A good program shouldn't add errors. If we can't verify the spelling, then we should display the original string from the source data.

--
Best regards,
Andrzej
On Sat, Apr 18, Andrzej Popowski wrote:
> Hi Thorsten,
>> But what is the original spelling?
> Original is the spelling from the source data.
> What I mean: errors can be introduced by OSM mappers or by a program that processes the data. A good program shouldn't add errors. If we can't verify the spelling, then we should display the original string from the source data.

Exactly that was the question: what is the source data? For a highway, you have a name and a route=street with a name. According to the wiki, both are correct names for the same street. But what if they differ?

--
Thorsten Kukuk
>> Hi Thorsten,
>>> But what is the original spelling?
>> Original is the spelling from the source data.
>> What I mean: errors can be introduced by OSM mappers or by a program that processes the data. A good program shouldn't add errors. If we can't verify the spelling, then we should display the original string from the source data.
> Exactly that was the question: what is the source data? For a highway, you have a name and a route=street with a name. According to the wiki, both are correct names for the same street. But what if they differ?

Yes, same problem with type=street and type=associatedStreet relations. I've added various checks to make sure that they are ignored when the names do not match, but these checks will only work when your style rules don't use the relations to name relation members without checking. So, I think rules like

type=associatedStreet { apply role=house { add addr:street='${name}' }}

should be avoided since r3359 if you want to use the messages produced by mkgmap to correct wrong data.

So, my conclusion: it is the style author who decides what's right or wrong; the value in mkgmap:street gives the street name for address search.

Gerd
participants (4)
- Andrzej Popowski
- Gerd Petermann
- thesurveyor@wolke7.net
- Thorsten Kukuk