
Hi Gerd
Here it is.
Ticker
On Tue, 2021-10-19 at 09:22 +0000, Gerd Petermann wrote:
Hi Ticker,
yes, please remove all unrelated optimizations.
Gerd
________________________________________
From: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> on behalf of Ticker Berkin <rwb-mkgmap@jagit.co.uk>
Sent: Tuesday, 19 October 2021 11:03
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] java.lang.AssertionError while building index from unicode tiles
Hi Gerd
I've removed the change that cleared the reference to the Sort object to allow garbage collection; as you said, this won't happen because Sort is shared. I do notice, however, that on a typical mkgmap run Sort is created/read 3 times, so it isn't shared as fully as possible.
The other changes (LargeListSorter) are slight improvements to memory usage and/or processing time - I can remove them if you want.
Ticker
On Tue, 2021-10-19 at 08:13 +0000, Gerd Petermann wrote:
Hi Ticker,
please remove the unrelated changes. I think we discussed them with patch mdrSort.patch in May, subject "MDR building out-of-memory".
Gerd
________________________________________
From: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> on behalf of Ticker Berkin <rwb-mkgmap@jagit.co.uk>
Sent: Monday, 18 October 2021 16:36
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] java.lang.AssertionError while building index from unicode tiles
Hi Gerd
Here is the first version of the changes to improve MDR unicode handling and stop the crash.
It always provides a PRIMARY strength sort value, both in the key used for sorting and in direct comparison when using the collator. Previously neither of these would have anything for a unicode character not mentioned in the sort/cp65001.txt file.
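As a rough illustration of the two paths mentioned above (a precomputed sort key versus direct comparison through the collator), here is a minimal sketch using the JDK's java.text.Collator. This is a stand-in only: mkgmap uses its own Sort class, whose API is not shown in this thread, and the class and method names below are hypothetical.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

// Illustration only: uses the JDK collator, not mkgmap's Sort class.
class PrimaryKeySketch {
    // A collator that looks at PRIMARY differences only (base letters).
    static Collator primaryCollator() {
        Collator c = Collator.getInstance(Locale.ENGLISH);
        c.setStrength(Collator.PRIMARY);
        return c;
    }

    // Path 1: direct comparison through the collator.
    static int compareDirect(Collator c, String a, String b) {
        return Integer.signum(c.compare(a, b));
    }

    // Path 2: comparison of precomputed sort keys.
    static int compareByKey(Collator c, String a, String b) {
        CollationKey ka = c.getCollationKey(a);
        CollationKey kb = c.getCollationKey(b);
        return Integer.signum(ka.compareTo(kb));
    }

    public static void main(String[] args) {
        Collator c = primaryCollator();
        // Both paths should agree for any pair of strings.
        System.out.println(compareDirect(c, "abc", "ABC") == compareByKey(c, "abc", "ABC"));
        System.out.println(compareDirect(c, "abc", "abd") == compareByKey(c, "abc", "abd"));
    }
}
```

The point of the sketch is only that the key path and the direct path must give the same ordering for every character, including ones the sort file does not list.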
In an attempt to stop ordering clashes between the specified sort and the ones fudged from the actual unicode value, it orders anything unknown after the known values. Unfortunately these can then become larger than 2 bytes - and, as this is all the space available without re-structuring, they have to wrap onto the known sort region. I only found 1 character that did this and I don't know if it conflicted with an existing sort.
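The wrap-around described above can be sketched with some arithmetic. Everything here is assumed for illustration: the constant MAX_KNOWN_PRIMARY, the offset formula and the method names are hypothetical and are not taken from the patch; only the 2-byte limit comes from the description.

```java
// Illustration only: hypothetical values, not the patch's actual mapping.
class UnknownPrimarySketch {
    // Assumed highest primary value assigned by the sort file.
    static final int MAX_KNOWN_PRIMARY = 0x3000;

    // Place an unknown character after all known values,
    // then truncate to the 2 bytes available for the primary.
    static int primaryForUnknown(int codePoint) {
        int primary = MAX_KNOWN_PRIMARY + 1 + codePoint;
        return primary & 0xFFFF;
    }

    // True when the untruncated value no longer fits in 2 bytes.
    static boolean wraps(int codePoint) {
        return MAX_KNOWN_PRIMARY + 1 + codePoint > 0xFFFF;
    }

    public static void main(String[] args) {
        // Fits: lands after the known region.
        System.out.printf("%04X wraps=%b%n", primaryForUnknown(0x0100), wraps(0x0100));
        // Does not fit: wraps back onto the known region.
        System.out.printf("%04X wraps=%b%n", primaryForUnknown(0xF000), wraps(0xF000));
    }
}
```

With these assumed numbers, a code point of 0x0100 maps to 0x3101, above the known region, while 0xF000 overflows and lands at 0x2001, inside it, which is the kind of possible clash with an existing sort value that the message refers to.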
Regardless of the character set used, in all the places where sorting is used for de-dupe, I've used the SECONDARY strength collator to detect similar records instead of name.equals(lastName)
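A minimal sketch of the de-dupe idea, again using the JDK's java.text.Collator as a stand-in for mkgmap's own collator (the class and method names are hypothetical). It assumes the input is already sorted, so similar names are adjacent, and it drops a name when the SECONDARY strength collator reports it equal to the previous one.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: uses the JDK collator, not mkgmap's Sort class.
class SecondaryDedupeSketch {
    // A collator that ignores TERTIARY differences such as letter case.
    static Collator secondaryCollator() {
        Collator c = Collator.getInstance(Locale.ENGLISH);
        c.setStrength(Collator.SECONDARY);
        return c;
    }

    // Keep the first name of each run of names the collator treats as equal.
    // The input must already be sorted so that similar names are adjacent.
    static List<String> dedupe(List<String> sortedNames, Collator collator) {
        List<String> kept = new ArrayList<>();
        String lastName = null;
        for (String name : sortedNames) {
            // Replaces the stricter test !name.equals(lastName).
            if (lastName == null || collator.compare(name, lastName) != 0) {
                kept.add(name);
            }
            lastName = name;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Main Street", "MAIN STREET", "Mill Lane");
        System.out.println(dedupe(names, secondaryCollator()));
    }
}
```

With name.equals(lastName) the two spellings of "Main Street" would both be kept; with the SECONDARY strength comparison they count as the same record.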
I also noticed that my source base included optimisation for LargeListSorter, its use of a key cache and some tidy-up of this in mdr7 & mdr11 so these are here as well.
Ticker
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev