1. Yes - had I set notepad to default to UTF-8 I would probably have avoided the bug (as long as you don't use the "create new document" entry in the Windows right-click menu - files created that way are always in ANSI unless you apply some registry hacks).
And yes - the mkgmap style-file is in UTF-8 - but as a Windows user you usually don't notice, because it comes without BOM. As long as there is no umlaut or other special character in it, notepad++ (and probably most Windows editors) will open the file as ANSI, because without such characters the file is byte-for-byte identical in both encodings. Were the mkgmap style-file in UTF-8 with BOM, it would be clearer... (but I don't want to start a with-or-without-BOM discussion here).
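
To illustrate why an ASCII-only style-file looks the same in both encodings, here is a minimal Python sketch. It assumes that "ANSI" means Windows-1252 (cp1252), which is only true on Western European Windows installations, and the two sample lines are made up for the illustration:

```python
import codecs

# Hypothetical sample lines, not taken from a real style-file.
ascii_only = "highway=path { name '${name}' }\n"
with_umlaut = "# Höhenlinien\n"

# Pure ASCII text yields exactly the same bytes in UTF-8 and cp1252.
same_for_ascii = ascii_only.encode("utf-8") == ascii_only.encode("cp1252")

# A single umlaut is enough to make the two byte sequences differ.
same_for_umlaut = with_umlaut.encode("utf-8") == with_umlaut.encode("cp1252")

# A BOM is three extra bytes (EF BB BF) in front of the UTF-8 data.
bom_prefixed = codecs.BOM_UTF8 + ascii_only.encode("utf-8")
starts_with_bom = bom_prefixed.startswith(b"\xef\xbb\xbf")

print(same_for_ascii, same_for_umlaut, starts_with_bom)
```

So an editor that guesses the encoding from the content has nothing to go on until the first non-ASCII character (or a BOM) shows up.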

So right now only the address file in the style is reasonably safe, because some special characters were recently added to it:
mkgmap:country=POL & mkgmap:region!=* & mkgmap:admin_level4=* { set mkgmap:region='${mkgmap:admin_level4|subst:województwo =>}' }


But as long as there is no working check - and the mkgmap default style-file comes in UTF-8 without BOM - there is quite a big danger that the bug will happen to others too... For my style I have now set the encoding to UTF-8, and for added security (though it won't matter) I added a line: #this is a UTF-8 check - ÖÄÜè
so should any editor actually change the encoding to ANSI, I would notice directly... Such a line at the start could be an alternative to UTF-8 with BOM.
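
The check-line idea could also be automated before a build. The following Python sketch is only an illustration and not part of mkgmap: the function name `canary_intact` and the sample style rule are hypothetical, and "ANSI" is again assumed to be cp1252.

```python
CANARY = "#this is a UTF-8 check - ÖÄÜè"

def canary_intact(raw: bytes) -> bool:
    """Return True if raw is valid UTF-8 and still contains the check line."""
    try:
        # "utf-8-sig" accepts files with or without a BOM.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return False
    return any(line.strip() == CANARY for line in text.splitlines())

# A style-file saved correctly as UTF-8 (the rule line is a made-up example).
good = (CANARY + "\nhighway=path [0x16 resolution 22]\n").encode("utf-8")

# The same file after an editor silently re-saved it as ANSI.
resaved_as_ansi = good.decode("utf-8").encode("cp1252")

print(canary_intact(good))
print(canary_intact(resaved_as_ansi))
```

For a real file the bytes would come from something like `open(path, "rb").read()`. The check only helps if the line contains non-ASCII characters, which is exactly why the umlauts are in it.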


2. About the patch:
Mmmh - that patch goes a bit too far... I think it actually stops on errors in the input file (not the style) too (note the time stamp 30 seconds later):
14:49:25 china cn 6555 this is run101 starting to compile openmtmbap with mkgmap
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
        at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
        at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
        at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
        at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
        at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
        at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
        at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
        at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
        at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
        at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
        at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img
14:49:55 china cn 6555 Finished Compiling Openmtbmap - this is run101
mapsetbuilding failed - to few maxnodes??
Press any key to continue . . .


vs (input file in ANSI):
15:11:38 china cn 6555 this is run101 starting to compile openmtmbap with mkgmap
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
        at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
        at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
        at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
        at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
        at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
        at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
        at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
        at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
        at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
        at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
        at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img
15:11:42 china cn 6555 Finished Compiling Openmtbmap - this is run101
mapsetbuilding failed - to few maxnodes??



However, now that I have once had a file in ANSI (even though I changed it back to UTF-8), some residue in memory means I now always get the error straight away - even with the default style...

C:\OpenMTBMap\maps>start /low /b /wait java -jar -XX:StringTableSize=100003 -Xms6000M -Xmx10300M c:\openmtbmap\mkgmap.jar --max-jobs=8 "--generate-sea" "--code-page=65001" "--precomp-sea=c:\openmtbmap\maps\sea.zip"  --nsis --index --levels="0:24, 1:23, 2:22, 3:21, 4:20, 5:19, 6:18" --overview-levels="7:17, 8:16, 9:15, 10:14, 11:13, 12:12" --adjust-turn-headings --add-pois-to-areas --reduce-point-density=3.4 --reduce-point-density-polygon=6 --housenumbers --link-pois-to-ways --ignore-turn-restrictions --polygon-size-limits="24:16, 23:14, 22:12, 21:11, 20:10, 19:9, 18:8, 17:7, 16:6, 15:5, 14:4, 13:3, 12:2, 11:0, 10:0" --description=openmtbmap_gcc --show-profiles=1  --location-autofill=bounds,is_in,nearest  --bounds=c:\openmtbmap\maps\bounds.zip --route --country-abbr=gcc --country-name=gcc-states --mapname=65560000 --family-id=6556 --product-id=1 --series-name=openmtbmap_gcc-states_27.07.2014 --family-name=mtbmap_gcc_27.07.2014 --tdbfile --overview-mapname=mapsetc --keep-going --area-name="gcc-states_27.07.2014_openmtbmap.org" -c e:\openmtbmap\maps\template.gcc-states 7*.img  1>NUL
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
        at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
        at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
        at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
        at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
        at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
        at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
        at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
        at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
        at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
        at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
        at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
        at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)


On 27.07.2014 12:32, Steve Ratcliffe wrote:
On 26/07/14 18:43, Felix Hartmann wrote:
Okay - I used ANSI. Could there maybe be a check for this in the check
styles routine, or in general?
I do suppose that must have been the problem.

Although it is not always possible to tell if a file is in the wrong
encoding, it should have been in this case.  I see that the ì
character gets converted to a unicode replacement character (0xfffd)

If you had done:
    echo 'Shì'

it would have come out something like: Sh� (hope that works in email)
and shown the problem.
yes - clearly. (and works in email somehow).

There are a couple of ways to make bad characters an error, rather
than getting replaced.  The attached patch allows them to
be replaced and then throws an error when seen. This has the
advantage of giving you file name and line number of the error.
It might interfere with something valid, so give it a try.
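
The "replace first, then complain" approach described above can be sketched in Python as follows. This is not the attached patch (which is Java code inside mkgmap), only an illustration of the idea: the function name `find_bad_characters` is hypothetical, the message text merely imitates the one in the log, and ANSI is assumed to be cp1252.

```python
def find_bad_characters(raw: bytes, name: str = "stream"):
    """Decode as UTF-8 with replacement, then report lines holding U+FFFD."""
    # Invalid byte sequences become U+FFFD instead of raising an error,
    # so decoding always completes and line numbers stay available.
    text = raw.decode("utf-8", errors="replace")
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if "\ufffd" in line:
            problems.append(
                f"({name}:{lineno}): Bad character in input, "
                "file probably not in utf-8"
            )
    return problems

# Two lines saved as ANSI; only the second has a non-ASCII character.
ansi_bytes = "name=ok\nname=Shì\n".encode("cp1252")
print(find_bad_characters(ansi_bytes))
```

A file that legitimately contains U+FFFD would be flagged as well, which is one way such a check could interfere with something valid.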

I don't use notepad++, but these links might be useful:

http://superuser.com/questions/292086/how-can-i-enforce-so-notepad-uses-utf-8-every-time-i-create-a-new-file

http://stackoverflow.com/questions/5090845/change-the-default-encoding-for-notepad

..Steve

-- 
keep on biking and discovering new trails

Felix
openmtbmap.org & www.velomap.org