
On Wed, 31 Mar 2010 21:13:49 +0200, WanMil <wmgcnfg@web.de> writes:
I noticed that mkgmap does not intern any strings. In particular, this tile, generated by the splitter, fails to build with -Xmx3000m on a 64-bit JDK under Linux. With my patch, mkgmap generates the tile with -Xmx1000m.
<bounds minlat='55.1953125' minlon='9.4921875' maxlat='56.6015625' maxlon='11.513671875'/>
This tile has 1M nodes. Among the nodes and ways on this tile, there are 12M tags, yet only 100k distinct tag key/value pairs; on average each value occurs 120 times.
I deliberately do not use normal string interning, because String.intern() strings are kept forever and I want these strings to be GC'able after the tile is done. To get that, I flush the interning table every 10k unique strings, at the cost of the occasional string being duplicated in memory.
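For illustration, a minimal sketch of the lossy interning idea described above. This is not the code from the patch: the class name LossyInterner, the maxEntries parameter, and the choice of a plain HashMap are all assumptions made for the example. The table is cleared when it reaches the limit, so earlier canonical strings become collectable and a later equal string may be stored a second time.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a lossy interning table; not the class from the patch. */
final class LossyInterner {
    private final int maxEntries;                      // assumed flush threshold, e.g. 10_000
    private final Map<String, String> table = new HashMap<>();

    LossyInterner(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Returns a canonical instance equal to s. Not thread-safe. */
    String intern(String s) {
        if (s == null) {
            return null;
        }
        String canonical = table.get(s);
        if (canonical != null) {
            return canonical;                          // reuse the instance seen earlier
        }
        if (table.size() >= maxEntries) {
            // Flush: the table no longer pins old strings, so the GC can
            // reclaim them once nothing else references them.
            table.clear();
        }
        table.put(s, s);
        return s;
    }
}
```

With such a table, a parser would call intern() on each tag key and value as it is read, and the table itself can simply be dropped once the tile is finished.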
This code is not currently thread-safe; ideally there should be one string interning table for each parser/thread.
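One possible way to get a table per thread, sketched here under assumptions: the class name PerThreadInterner, the MAX_ENTRIES constant, and the use of ThreadLocal are illustrative choices, not something taken from the patch. Each thread sees only its own map, so no locking is needed, at the price of strings being duplicated across threads.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: one lossy interning table per thread via ThreadLocal. */
final class PerThreadInterner {
    private static final int MAX_ENTRIES = 10_000;     // assumed flush threshold

    // Each thread lazily gets its own private HashMap.
    private static final ThreadLocal<Map<String, String>> TABLE =
            ThreadLocal.withInitial(HashMap::new);

    private PerThreadInterner() {
    }

    /** Returns the calling thread's canonical instance equal to s. */
    static String intern(String s) {
        if (s == null) {
            return null;
        }
        Map<String, String> table = TABLE.get();
        String canonical = table.get(s);
        if (canonical != null) {
            return canonical;
        }
        if (table.size() >= MAX_ENTRIES) {
            table.clear();                             // lossy flush, as in the single-table case
        }
        table.put(s, s);
        return s;
    }

    /** Drops the calling thread's table, e.g. after a tile is done. */
    static void reset() {
        TABLE.remove();
    }
}
```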
Scott
Hi Scott!
I think it's a good idea to intern the strings. As far as I know, the LossyIntern class is not needed: the .intern() function of a string does exactly the same.
You are right. String.intern() has not kept strings forever since at least Java 1.2.
Some time ago I sent a very similar patch to the mailing list, which is not yet committed. Could you please test whether it gives a similar memory reduction for your use case?
You can run it if you want, but from the numbers I gave above for this tile, interning values as in my patch will decrease the number of strings in RAM from 12M to <100k values. Interning only keys would reduce the number of Strings in RAM from 24M to 12M.
The patch is thread safe and does not intern all strings. In my opinion the value of a name tag should not be interned because there is a high probability that this tag is used once only.
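To make the selective approach concrete, here is a rough sketch of interning everything except the values of tags that are expected to be unique. It is not the patch referred to above: the class TagInterning, its method names, and the SKIP_VALUE_KEYS set containing only "name" are assumptions for illustration. String.intern() is the standard JDK method and is safe to call from multiple threads.

```java
import java.util.Set;

/** Hypothetical sketch of selective interning; not the patch from the mailing list. */
final class TagInterning {
    // Assumption: tag keys whose values are mostly unique and not worth interning.
    private static final Set<String> SKIP_VALUE_KEYS = Set.of("name");

    private TagInterning() {
    }

    /** Keys repeat constantly, so they are always interned. */
    static String internKey(String key) {
        return key.intern();
    }

    /** Values are interned unless the key is on the skip list. */
    static String internValue(String key, String value) {
        return SKIP_VALUE_KEYS.contains(key) ? value : value.intern();
    }
}
```

Whether skipping a key pays off depends on the data: on a tile where values repeat heavily, leaving them out of the table gives up most of the saving.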
That's probably true for many or most tiles, but not for the tile I referenced above, where on average each value occurs 120 times. That tile is unbuildable with a 3 GB heap without my patch and buildable with a 1 GB heap with my patch. Shall I post an updated patch without FuzzyIntern?
Scott