Search addresses for Latin countries (help with regexp)
Folks, as you know (this comes up from time to time), address search is impractical in most Latin countries, where the street/square name usually starts with the type (Via, Viale, Corso, Piazza etc. [IT]; Avenida, Calle, Plaza etc. [ES]; Avenue, Boulevard, Rue, Place etc. [FR]) followed by the full name of, usually, the person the street is named after. Moreover, the street type sometimes appears abbreviated (V.le, Av.da, Bld. etc.), sometimes the middle name is skipped, and sometimes the word "of" is used (Avenue de Bobigny, Corso del Popolo etc.).
So what is a simple Mozartstrasse in Austria would look like "Via Wolfgang Amadeus Mozart" in Italy or "Rue Wolfgang Amadeus Mozart" in France, but possibly also "Av.da de Mozart" etc.
Now, everyone knows the street/square by its last name, and it would be much more practical to search by it: I'd like to have a style that just picks the last full word of the street/square name and puts it in front, followed by a comma and the original name.
This would really boost address search for Latin countries, so it might be a default style to add for IT, FR, ES, BR, MX etc.
Could you help me write that regular expression for the style?
"str1 str2... strN" -> "strN, str1 str2... strN"
Thanks!
Enrico
On 05/08/13 12:50, Enrico Liboni wrote:
Could you help me write that regular expression for the style?
"str1 str2... strN" -> "strN, str1 str2... strN"
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://lists.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
I have been working on this for a few weeks for Spanish and Catalan, but it takes quite a long time to find all the possible combinations and also to fix all the errors I am finding in the names in the OSM data. As an example, here is one rule:
highway=* & name ~ '[Aa]venida [Dd]e [Ee]l .*' { add streettype:movedend='${name|subst:Avenida De El |subst:Avenida De el |subst:Avenida de El |subst:Avenida de el |subst:avenida De El |subst:avenida De el |subst:avenida de El |subst:avenida de el }, Avenida de El'}
It would be great if we could build and share rules for several languages.
On 05.08.2013 13:00, Carlos Dávila wrote:
highway=* & name ~ '[Aa]venida [Dd]e [Ee]l .*' { add streettype:movedend='${name|subst:Avenida De El |subst:Avenida De el |subst:Avenida de El |subst:Avenida de el |subst:avenida De El |subst:avenida De el |subst:avenida de El |subst:avenida de el }, Avenida de El'}
It would be a huge improvement if subst: could also accept regular expressions.
Henning
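To illustrate Henning's point outside mkgmap, here is a minimal Python sketch of what a regex-capable substitution would buy: one case-insensitive pattern stands in for the eight literal subst variants in Carlos's rule. This is not mkgmap style syntax, and the function name move_prefix_to_end is hypothetical.

```python
import re

# One case-insensitive pattern covers every capitalisation of "Avenida de el ".
PREFIX = re.compile(r"^avenida de el ", re.IGNORECASE)

def move_prefix_to_end(name: str) -> str:
    """Return 'rest, Avenida de El' when the name starts with the prefix."""
    if PREFIX.match(name):
        return PREFIX.sub("", name) + ", Avenida de El"
    # Names without the prefix are left untouched.
    return name

print(move_prefix_to_end("avenida De el Ferrocarril"))
```

The same idea would extend to other prefixes by adding patterns, which is where the per-language effort Carlos describes comes back in.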
Carlos, Bueno! Thanks for your reply. Actually, I believe trying to find all the combinations in the various languages would be really hard, and I'm concerned about hard-coding rules for each language. The approach I'd like to follow to make searching easier is just to take the last word of a street name and put it in front of the street name itself, so the regex should do something like this:
1. Get the string after the last occurrence of a blank in the full street name.
2. If it exists, rename the street using this string, plus a comma, plus the full street name.
So, say, "Calle Doctor Maranon" will become "Maranon, Calle Doctor Maranon": no matter how "calle" is spelled, or whether it is followed by a "de" or "de el", it would make the search easy. This would require a regexp wizard, but I'm sure we can find one here ;)
Enrico
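The transformation "str1 str2... strN" -> "strN, str1 str2... strN" can be sketched in Python as follows. This only illustrates the regular expression itself; it is not mkgmap style syntax, and the function name last_word_first is hypothetical.

```python
import re

# Greedy prefix, one run of whitespace, then the final run of non-space characters.
LAST_WORD = re.compile(r"^(.*\S)\s+(\S+)$")

def last_word_first(name: str) -> str:
    """'str1 str2 ... strN' -> 'strN, str1 str2 ... strN'.

    Names consisting of a single word are returned unchanged.
    """
    name = name.strip()
    m = LAST_WORD.match(name)
    if m is None:
        return name
    return f"{m.group(2)}, {name}"

print(last_word_first("Calle Doctor Maranon"))
```

A single-word name such as "Mozartstrasse" has no blank to split on, so it passes through as it is.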
On 05.08.2013 15:17, Enrico Liboni wrote:
Carlos, Bueno! Thanks for your reply. Actually, I believe trying to find all the combinations in the various languages would be really hard, and I'm concerned about hard-coding rules for each language.
I don't think that this is a good solution.
1. You would still have to find all the combinations.
2. It is not very flexible or transparent.
3. Styles can be shared, and everyone can include them in their own style.
4. Changing a style is pretty easy; changing the code is pretty hard, because you have to know Java.
Henning
Hi,
I think there are two problems here.
The first is how the name looks on the map and how it is pronounced by navigation devices. There could be different rules depending on the country. To some extent, names could be standardized in style definitions.
The other problem is address search. This is a matter of properly indexing street names, and it can't be done in a style definition. I think mkgmap currently uses the whole string for indexing. A better way would be to create the index from all the words in the street name, or maybe even from all sequences of words in the street name.
--
Best regards,
Andrzej Popowski
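As a rough illustration of indexing by word sequences, the Python sketch below stores every tail of a name that starts at a word boundary and then does a prefix search over those tails. It says nothing about how mkgmap or the Garmin format actually store the index; the function names are hypothetical.

```python
from collections import defaultdict

def word_suffixes(name: str) -> list[str]:
    """Every tail of the name that starts at a word boundary."""
    words = name.split()
    return [" ".join(words[i:]) for i in range(len(words))]

def build_index(names: list[str]) -> dict[str, set[str]]:
    """Map each lower-cased tail to the full street names that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name in names:
        for tail in word_suffixes(name):
            index[tail.lower()].add(name)
    return index

def search(index: dict[str, set[str]], typed: str) -> set[str]:
    """Prefix search: the typed text matches any tail that starts with it."""
    typed = typed.lower()
    found: set[str] = set()
    for key, streets in index.items():
        if key.startswith(typed):
            found |= streets
    return found

index = build_index(["Via Wolfgang Amadeus Mozart", "Corso del Popolo"])
print(search(index, "moz"))
```

The price is one index key per word of every name, so the index grows with the average number of words per street.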
Hi
The Garmin index format has a way of dealing with this problem, and earlier this year I made a branch that creates an index with the extra information to show where the interesting part of the name starts.
The latest version indexes every word in the name separately, so you could find 'corso del popolo' by typing 'corso', 'del' or 'popolo'.
So this will always work for any language, but at the cost of a much larger index.
It would be great if someone could try it out as it is; then, if it is useful, it's more likely that someone would improve it by devising a suitable way to cut down the useless entries.
Download it as mkgmap-mixed-index-r2662.jar at the bottom of the download page.
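The details of the Garmin index format are not given in this thread, so the following Python sketch is only a conceptual model of "one entry per word, plus the position where the interesting part starts". The IndexEntry fields and function names are assumptions made for illustration, not the branch's actual data structures.

```python
import re
from typing import NamedTuple

class IndexEntry(NamedTuple):
    sort_key: str  # text from the indexed word to the end of the name, upper-cased
    name: str      # full street name as stored in the map
    offset: int    # character position where the indexed word starts

def entries_for(name: str) -> list[IndexEntry]:
    """One entry per word, each remembering where that word starts in the name."""
    return [
        IndexEntry(name[m.start():].upper(), name, m.start())
        for m in re.finditer(r"\S+", name)
    ]

def build_sorted_index(names: list[str]) -> list[IndexEntry]:
    """All entries for all names, ordered by sort key."""
    return sorted(e for n in names for e in entries_for(n))

for entry in build_sorted_index(["Corso del Popolo", "Via Roma"]):
    print(entry.offset, entry.sort_key)
```

Two street names produce five entries here, which shows where the larger index comes from and why pruning words such as 'del' would help.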
On 05/08/13 19:42, Steve Ratcliffe wrote:
Download it as mkgmap-mixed-index-r2662.jar at the bottom of the download page.
I knew you had worked on it, but I didn't know about that branch. I'll test it later this evening. Working on it is surely the right direction towards a definitive solution to this problem.
Steve, you are the man! I'm rebuilding my map now. By the way, I believe that indexing just the last word makes more sense, to avoid a lot of useless entries, since the words in the middle are usually first names or prepositions. I'll give it a try and let you know. Thanks!
Enrico
I tried r2662 but... no luck. Index search is worse. Using "Via Wolfgang Amadeus Mozart" as an example: if I type "Via W" and tap Done, nothing is found; the same happens for "Via Wo". Then, if I type "Via Wol", two entries magically appear without tapping Done (indeed, just the two entries in the city starting with "Via Wol"). If I search for "Mozart" or "Wolfgang" or anything else and tap Done, nothing is found. So only the "autosearch" seems to work, and only on the full street name :( I have a Nuvi510.
Not sure if it helps, but the final mkgmap output was:
=== FIRST
t1=0, t2=139294 first av 197031/20, last 0/10
ALLEE : 15180
VIALE : 27286
CHEMIN : 68618
PIAZZA : 22592
ROUTE : 39198
RUE : 98746
STRADA : 57822
AVENUE : 26016
VICOLO : 12536
VIA : 720774
IMPASSE : 19896
BOULEVARD : 8114
=== LAST
STRASSE : 15568
ULICA : 7700
Here are the file sizes; the r2662 image is roughly 57 MB bigger (as expected):
-rw-r--r-- 1 enrico enrico 1069744128 Jul 14 16:01 gmapsupp.img r2656
-rw-r--r-- 1 enrico enrico 1126793216 Aug 5 22:33 gmapsupp.img r2662
I also got an error related to SeaGenerator that did not appear with r2656:
SEVERE (SeaGenerator): ./data/63240001.osm.pbf: Disable precompiled sea due to missing index.txt file in precompiled sea directory sea_20130701.zip
but the file index.txt is there (in .gz format) in WanMil's zipped sea_20130701.zip.
Any clue?
Thanks again
Enrico
On 05/08/13 22:03, Enrico Liboni wrote:
I tried r2662 but... no luck. Index search is worse. Using "Via Wolfgang Amadeus Mozart" as an example: if I type "Via W" and tap Done, nothing is found; the same happens for "Via Wo". Then, if I type "Via Wol", two entries magically appear without tapping Done (indeed, just the two entries in the city starting with "Via Wol"). If I search for "Mozart" or "Wolfgang" or anything else and tap Done, nothing is found. So only the "autosearch" seems to work, and only on the full street name :( I have a Nuvi510.
I must apologise: the code doesn't create an entry for every word, just for the first and second ones. So a search for Via Wolfgang... and for Wolfgang... could be expected to work.
I will change it so that all words are indexed.
Also, I would recommend trying a small area first, because there may be a problem caused by so many 'Via' names, and it is also easier to track down problems in a smaller area.
..Steve
On 05/08/13 23:03, Henning Scholland wrote:
On 05.08.2013 21:50, Enrico Liboni wrote:
I believe that indexing just the last word makes more sense, to avoid a lot of useless entries, since the words in the middle are usually first names or prepositions.
At least in Germany this won't work well. ;)
Henning
Nor in Spain. Using only the last word is too restrictive.
Henning, I see: in Germany the last word is sometimes "strasse", so it would not be useful there... but search by full street name should already work reasonably well.
Carlos, I'm not sure about Spain. I agree it can sometimes be restrictive ("Calle Naciones Unidas" is an example), but if one knows that searching by the last word is possible, besides the full street name search, I believe it would be extremely useful.
On the other hand, having the index built for each word, as r2662 is supposed to do, would really be the final solution, provided search performance is not heavily impacted, which I suppose it isn't.
On Tue, Aug 06, Enrico Liboni wrote:
Henning, I see, for Germany sometimes the last word is "strasse" so it would not be useful... but the search by full street name should be almost fine right now.
Some time ago I already made this proposal:
1. Split a street name into its individual words.
2. Remove all common names for that country, like "Street", "way", "via", ...
3. Add the remaining words to the index for that street.
With mkgmap:country we should be able to decide, for step 2, which list of words should be ignored in the index for this country.
Thorsten
--
Thorsten Kukuk, Senior Architect SLES & Common Code Base
SUSE LINUX Products GmbH, Maxfeldstr. 5, D-90409 Nuernberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
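The three steps can be sketched in Python as below. The ignore lists are hypothetical and deliberately incomplete (they reuse the street types mentioned earlier in the thread), and the country code is passed in as a plain string rather than read from mkgmap:country.

```python
# Hypothetical, incomplete ignore lists keyed by country code.
IGNORED_WORDS = {
    "IT": {"via", "viale", "corso", "piazza"},
    "ES": {"avenida", "calle", "plaza"},
    "FR": {"avenue", "boulevard", "rue", "place"},
}

def index_words(street_name: str, country: str) -> list[str]:
    """Split the name, drop the country's common street-type words, keep the rest."""
    ignored = IGNORED_WORDS.get(country, set())
    return [w for w in street_name.split() if w.lower() not in ignored]

print(index_words("Via Wolfgang Amadeus Mozart", "IT"))
```

Prepositions such as "de" or "del" are not in these lists, so they would still be indexed unless they are added as well.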
On 2013-08-06 09:51, Thorsten Kukuk wrote:
With mkgmap:country we should be able to decide on 2, which list of words should be ignored for this country in the index.
Would it not also need to depend on a language selection for multilingual countries such as Belgium? The treatment of name:nl=Koning Albertlaan will need to be different from that of name:fr=Avenue du Roi Albert.
Colin
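One way to picture a language-dependent choice is to key the ignore list on the language suffix of the tag rather than on the country. The Python sketch below assumes exactly that; the lists and the function name index_words_for_tag are hypothetical.

```python
# Hypothetical ignore lists keyed by language code instead of country.
IGNORED_BY_LANGUAGE = {
    "fr": {"avenue", "boulevard", "rue", "place"},
    "nl": set(),  # in the example the street type is part of the last word
}

def index_words_for_tag(tag_key: str, value: str) -> list[str]:
    """Pick the ignore list from the language suffix of a 'name:xx' tag key."""
    _, _, language = tag_key.partition(":")
    ignored = IGNORED_BY_LANGUAGE.get(language, set())
    return [w for w in value.split() if w.lower() not in ignored]

print(index_words_for_tag("name:fr", "Avenue du Roi Albert"))
print(index_words_for_tag("name:nl", "Koning Albertlaan"))
```

A plain name tag carries no language suffix, so a real implementation would still need a per-country or per-region default.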
On 05/08/13 19:42, Steve Ratcliffe wrote:
Download it as mkgmap-mixed-index-r2662.jar at the bottom of the download page.
First results with the mixed-index branch, processing Spain with the default style:
Total time taken: 391216ms vs 449649ms with r2661
Index size: 29 MB vs 21.6 MB with r2661
Apart from the numbers, the address search doesn't work for now. Entries in the index are not unique and are not ordered (see screenshot 1). When you type a letter, the search results don't change accordingly (screenshot 2). This is the console output, in case it is of any help:
=== FIRST
t1=0, t2=55013 first av 96203/24, last 0/12
AVENIDA : 32380
CAMINO : 14816
PLAZA : 12864
CARRETERA : 28180
CALLE : 288500
RÚA : 9130
CARRER : 117140
AVINGUDA : 11602
=== LAST
KALEA : 9682
AUZOA : 11604
On 05/08/13 23:09, Carlos Dávila wrote:
Apart from the numbers, the address search doesn't work for now. Entries in the index are not unique and are not ordered (see screenshot 1).
I have compiled the same input data again with the same command and, strangely, it now seems to work better. Typing "C" in the search field selects all the streets whose name has a "C" as the first letter after calle, avenida or whatever (see screenshot), apart from the first 3 entries in the list.
On 06/08/13 00:02, Carlos Dávila wrote:
I have compiled the same input data again with the same command and, strangely, it now seems to work better.
After some more tests, it seems that the new index is able to find streets by their second word. For example, Calle Naciones Unidas (United Nations Street) is found by typing either "calle nac" or "nacione", but not by typing "unidas".
Hi
Doing some more test, it seems that the new index is able to find streets by their second word. For example, searching for Calle Naciones
Yes, sorry, it was indeed only working for the second word. The new version I just committed now really works with all words. (Well, it also omits the word CALLE altogether, but that was just an experiment.) I made another change that seems to improve the sorting and the number of streets found. It's still not perfect.
..Steve
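To illustrate what "works with all words" means for searching, here is a small Python sketch of an index keyed on every word of a name. It is only a conceptual model: the real Garmin index is a binary format and mkgmap's implementation is not shown in this thread, so `build_word_index` and `search` are hypothetical names used for illustration.

```python
from collections import defaultdict


def build_word_index(names):
    """Map each upper-cased word to the sorted full names that contain it."""
    index = defaultdict(set)
    for name in names:
        for word in name.split():
            index[word.upper()].add(name)
    return {word: sorted(found) for word, found in index.items()}


def search(index, prefix):
    """Return every name that has at least one word starting with prefix."""
    prefix = prefix.upper()
    hits = set()
    for word, names in index.items():
        if word.startswith(prefix):
            hits.update(names)
    return sorted(hits)


if __name__ == "__main__":
    idx = build_word_index(["Calle Naciones Unidas", "Corso del Popolo"])
    print(search(idx, "unidas"))
    print(search(idx, "pop"))
```

With one entry per word, "Calle Naciones Unidas" is reachable from "calle", "naciones" or "unidas", which is also why the index grows so much when every word is kept.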
Hi Steve,
When you collect the data for the index you could also increment a count for each word. Then only add the word to the index if the count is less than an optional value (default, say, 10000). This should work for most languages and reduce the size of the index, although it will require more memory for compiling the map.
Regards,
Geoff.
Steve Ratcliffe <steve@parabola.me.uk> wrote:
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://lists.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
On 06/08/13 18:57, Geoff Sherlock wrote:
Hi Steve,
When you collect the data for the index you could also increment a count for each word. Then only add the word to the index if the count is less than an optional value (default, say, 10000). This should work for most languages and reduce the size of the index, although it will require more memory for compiling the map.
I was looking into doing something like that. It turns out, though, that it is not as easy as it sounds. For example, in English, the words 'the' and 'square' are top words that could be removed. Yet there are names such as 'The Square', and there are a whole bunch of similar problems. Ideally we need methods that fail in a safe way, by only rejecting a word if it is (reasonably) certain that it should not be there. At the moment I am thinking that this will probably require language-specific rules.
..Steve
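The counting idea and the "fail safe" concern can be sketched together in Python. This is only an illustration under stated assumptions, not how mkgmap does it: the function name `words_to_index` and the fallback rule (keep all words of a name when every one of them is over the limit) are assumptions made here, and `max_count` stands for the optional value Geoff suggested.

```python
from collections import Counter


def words_to_index(names, max_count=10000):
    """Choose, for each street name, the words that would go into the index."""
    # First pass: count how often each word occurs over all names.
    counts = Counter(word.upper() for name in names for word in name.split())

    chosen = {}
    for name in names:
        words = [word.upper() for word in name.split()]
        # Only keep words whose count is less than the limit.
        rare = [word for word in words if counts[word] < max_count]
        # Fail safe (assumed rule): if every word is "common", keep them all
        # so that a name like "The Square" does not vanish from the index.
        chosen[name] = rare if rare else words
    return chosen


if __name__ == "__main__":
    streets = ["The Square", "The Avenue", "Square Road", "Victoria Square"]
    for street, words in words_to_index(streets, max_count=2).items():
        print(street, "->", words)
```

The fallback only covers the case where a whole name would be lost; it does not address the other problems mentioned above, which may still need language-specific rules.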
Yes, I thought of High Street and Victoria Street shortly after sending the email. But you could get rid of High, Victoria and Street from the index and still keep the full name in the index. It would work in English, but not very well where the word for street or avenue is at the beginning. Hmm, it is probably better to have a per-country exclusion list, so that the likes of Victoria and High are not discarded in the UK if the counting algorithm is used.
Geoff.
Steve Ratcliffe <steve@parabola.me.uk> wrote:
I tried with 2663 but I did not succeed :( Address search returns nothing when I type some letters and tap on "done". I tried with my usual "Via Wolfgang Amadeus Mozart", entering "Via", "Via W", "Via Wo", "Wolf", "Wolfgang", "Amadeus", "Mozart" etc., and the result is always the same: no street found.
However, if I don't type anything and tap on "done", the (unordered) list of streets appears, and by scrolling I can find it. Also, if I start typing the full street name, at "Via Wol" some results appear, i.e. the two streets in the city starting with "Via Wolf...". So for me the behaviour is the same as I was experiencing with 2662.
Not sure if I can do some further tests. I tried to reduce the pbf size by limiting it to a single Italian region (pbf < 150MB); it took just 5 minutes to build, apparently with no errors.
Time started: Tue Aug 06 21:44:44 CEST 2013
=== FIRST
t1=0, t2=17727
first av 80657/66, last 0/15
VIA : 227700
=== LAST
Time finished: Tue Aug 06 21:49:59 CEST 2013
On Tue, Aug 6, 2013 at 3:46 PM, Steve Ratcliffe <steve@parabola.me.uk> wrote:
Hi
I tried with 2663 but I did not succeed :( Address search returns nothing when I type some letters and tap on "done". I tried with my usual "Via Wolfgang Amadeus Mozart", entering "Via", "Via W", "Via Wo", "Wolf", "Wolfgang", "Amadeus", "Mozart" etc., and the result is always the same: no street found.
I should have mentioned that I am testing on MapSource. It is very possible that it does not work as well on a GPS device, since the index is quite different. In fact, thinking about it, it may not work at all. I will have to make it work on MapSource/BaseCamp first and then see what differences there are on the devices if it still doesn't work there.
..Steve
Got it, this explains why I was getting different results compared to Carlos... By the way, I got another idea in the meantime that could partially help. I'm opening another thread for it.
Thanks,
Enrico
On Wed, Aug 7, 2013 at 12:13 AM, Steve Ratcliffe <steve@parabola.me.uk> wrote:
El 06/08/13 15:46, Steve Ratcliffe escribió:
Hi
Doing some more test, it seems that the new index is able to find streets by their second word. For example, searching for Calle Naciones
Yes, sorry, it was indeed only working for the second word. The new version I just committed now really works with all words. (Well, it also omits the word CALLE altogether, but that was just an experiment.)
I made another change that seems to improve the sorting and the number of streets found.
It's still not perfect.
..Steve
A test with smaller input data shows the following:
Typing "a" gives a list of streets, all of them with a word starting with "a" in different positions of the name and correctly sorted (Aceuchal, Acevedo, Achicorial, Acim, Adarve), but some streets that should be at the beginning of the list (Abades, Abadía, Abajo, etc.) are missing. Screenshot 4.
When "ac" is typed, the streets in the previous list go down several positions and new ones containing Acacia, Acapulco, Acceso, Acebo, etc. appear at the top of the list. Screenshot 5.
Results vary quite a lot depending on the letter you search for. Typing "g" seems to work very well: the displayed names are sorted and start with the correct street. Screenshot 6. But typing "m" doesn't show any street starting with that letter, although they are present in the map. Typing "l" shows a lot of streets whose only word starting with "l" is "la", an article that should be kept out of the index.
Apart from that behavior, it's important to note that if any of the streets displayed is selected and the Find button is clicked, nothing is found.
Hi
Thanks for testing. I'm testing on a single tile at the moment and the results are variable, although I do find many streets.
Apart from that behavior, it's important to note that if any of the streets displayed is selected and Find button is clicked, nothing is found.
For any name? In my tests I do find the street in most, although not all, cases.
..Steve
That's weird... we did the same tests and they failed, but now it seems it is partially working for you... I'll give it another try tonight.
On Tue, Aug 6, 2013 at 12:02 AM, Carlos Dávila <cdavilam@orangecorreo.es> wrote:
participants (8)
- Andrzej Popowski
- Carlos Dávila
- Colin Smale
- Enrico Liboni
- Geoff Sherlock
- Henning Scholland
- Steve Ratcliffe
- Thorsten Kukuk