
On 08/10/13 13:42, keenonkites wrote:
Converting the file to UTF-8 without BOM runs through properly. The resulting TYP is usable but doesn't show correct strings... looks like it interprets the encoding incorrectly, though. Do you need a screenprint?
I have a patch for detecting files that begin with a BOM and forcing them to be read as utf-8.

The second issue is that we allow the typ txt file to be written in the same character set as its declared CodePage as well as in utf-8. As it is not possible to tell for certain what encoding/character set a file uses, this leads to errors if the file does not follow those rules. The patch now makes utf-8 a much stronger default, so if trying to read with the CodePage fails it will revert to utf-8.

The third issue is what should happen if the --code-page is different to the CodePage in the file. It would be consistent with the behaviour in other parts of the program for any file CodePage to be ignored if a command line --code-page is given. This still needs doing.

..Steve
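
For illustration only, the decoding order described above could be sketched roughly as follows. This is not the patch itself, and the real code may well differ. The function name decode_typ_text and its parameters are hypothetical, and it assumes the code page has already been mapped to a codec name such as "cp1252". The command-line override is the proposed behaviour from the third point, which is not done yet.

```python
import codecs


def decode_typ_text(raw, file_codepage=None, cli_codepage=None):
    """Decode the bytes of a TYP text file; return (text, encoding_used).

    Hypothetical helper: the names and the codec-name convention
    (e.g. "cp1252") are assumptions made for this sketch.
    """
    # A leading UTF-8 BOM forces utf-8, whatever code page is declared.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8"

    # Proposed, not yet implemented: a command line code page
    # takes precedence over the CodePage declared in the file.
    codepage = cli_codepage or file_codepage

    # Try the code page first; if that fails, revert to utf-8.
    if codepage:
        try:
            return raw.decode(codepage), codepage
        except (UnicodeDecodeError, LookupError):
            pass
    return raw.decode("utf-8"), "utf-8"
```

Note that a single-byte code page accepts almost any byte sequence, so a utf-8 file without a BOM can still decode "successfully" into the wrong characters. The fallback only helps when the bytes are actually invalid in the code page.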