Hi Ticker
The problem is that resources/sort/cp65001.txt doesn't define an ordering for many characters; it looks like it covers only about 10,500 of the 1,112,064 possible code-points. Many of these unordered characters are used by the names in the tile in question.
I used the program in extra/src/uk/me/parabola/util/CollationRules.java to generate some of the tables. This uses the file "allkeys.txt", which can be obtained from https://www.unicode.org/Public/UCA/latest/allkeys.txt. The document that explains the Unicode collation rules and references that file is http://www.unicode.org/reports/tr10/. It includes a section on programmatically deriving the weights for characters that do not have explicit entries in the table.
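For reference, a rough sketch of that derivation as I read the implicit-weights section of tr10: a code-point without an entry gets two primary weights computed from its value and a base that depends on the kind of character. The class and method names are made up for illustration and are not part of mkgmap; the base values are the ones I recall from tr10 and should be checked against the current version, which also has extra bases for a few scripts.

```java
public class ImplicitWeights {
    // Bases as recalled from tr10 (assumption: verify against the current report).
    public static final int BASE_CORE_HAN = 0xFB40;   // CJK Unified / Compatibility Ideographs blocks
    public static final int BASE_OTHER_HAN = 0xFB80;  // other unified ideographs (extensions)
    public static final int BASE_UNASSIGNED = 0xFBC0; // any other code-point

    /**
     * Derive the two primary weights for a code-point with no explicit entry.
     * The caller chooses the base; classifying the code-point is out of scope here.
     */
    public static int[] primaryWeights(int codePoint, int base) {
        int aaaa = base + (codePoint >> 15);          // high bits, offset by the base
        int bbbb = (codePoint & 0x7FFF) | 0x8000;     // low 15 bits, top bit always set
        return new int[] { aaaa, bbbb };
    }
}
```

Because the first weight starts at 0xFB40 or above, these characters end up after everything that has an explicit entry, and among themselves they sort by code-point within each base.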
Assuming the actual ordering of unspecified code-points doesn't really matter, I propose to change the logic slightly so that characters with no defined ordering are sorted on their 16-bit value, after the range of known sorts.
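A minimal sketch of the idea, not the actual mkgmap code: the class FallbackOrder, its map of known positions and the method names are all hypothetical, standing in for whatever the sort table loader really provides.

```java
import java.util.HashMap;
import java.util.Map;

public class FallbackOrder {
    // Hypothetical stand-in for the positions read from the sort table.
    private final Map<Character, Integer> known = new HashMap<>();
    private int maxKnown = 0;

    /** Register a character that has an explicit position in the table. */
    public void add(char c, int position) {
        known.put(c, position);
        maxKnown = Math.max(maxKnown, position);
    }

    /**
     * Position used for comparison. Characters with no entry are placed
     * after every known position, ordered by their 16-bit char value.
     * Assumes all known characters were added before the first lookup.
     */
    public int sortPosition(char c) {
        Integer pos = known.get(c);
        if (pos != null)
            return pos;
        return maxKnown + 1 + c;
    }
}
```

With 'a' and 'b' registered at positions 1 and 2, a character such as '\u4e00' gets a position above 2, and '\u4e00' still sorts before '\u4e01'.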
I think that is a good initial approach to get things working.
Steve