
Hello, I cannot help with the details of the SparseMultiMap data types, but I can confirm that the splitter's memory usage depends on the maximum node id in the input file, which should be nearly the same regardless of the total size of your input. For example, if you add a fake node with an id above 2^31, the splitter's memory usage explodes (observed on a 64-bit Linux machine); below that, memory usage seems to be nearly constant, even for very large input files. This will become a problem in the near future (six months to a year), when OSM node ids reach this limit. hasemann
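To put rough numbers on the 2^31 threshold, here is a small back-of-the-envelope sketch. It assumes a structure that reserves one 2-byte slot per possible id, which is only an illustration and not a description of how the splitter actually lays out its data. The class and method names (IdRangeFootprint, flatTableBytes, fitsInInt) are made up for this example. The only hard fact used is that a Java int cannot hold values above 2^31 - 1.

```java
public class IdRangeFootprint {
    // Bytes needed for a flat table with one 2-byte (short) slot
    // for every id in the range [0, maxId]. Hypothetical layout.
    static long flatTableBytes(long maxId) {
        return (maxId + 1) * Short.BYTES;
    }

    // True if the id can still be represented as a Java int.
    static boolean fitsInInt(long id) {
        return id >= Integer.MIN_VALUE && id <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long limit = 1L << 31; // 2^31, the threshold discussed above

        // Footprint of the hypothetical flat table for ids just below 2^31.
        System.out.println(flatTableBytes(limit - 1)); // prints 4294967296

        // The largest int is 2^31 - 1, so 2^31 itself no longer fits.
        System.out.println(fitsInInt(limit - 1)); // prints true
        System.out.println(fitsInInt(limit));     // prints false
    }
}
```

In this model the footprint is the same whether the file contains a thousand nodes or a hundred million; only the largest id matters.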
For a few days I have been trying to understand the idea behind the class SparseInt2ShortMapInline in splitter, because it seems to waste memory although it is documented to save it. If I understand it correctly, the amount of memory needed depends directly on the highest id that is stored. As a result, even the data for a small region such as the German "Bundesland" Saarland (from Geofabrik) cannot be processed with
    java -Xmx700 -jar splitter.jar --max-areas=2 --max-nodes=200000 e:\dwnload\saarland.osm.pbf
I tried this with R181 and R180. The program runs without problems when I change the code so that SparseInt2ShortMultiMap always uses Int2ShortOpenHashMap instead of SparseInt2ShortMapInline. Was this code optimized for a particular kind of input (e.g. with small id values)?
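To make the suspicion concrete, here is a minimal sketch of a chunked id-to-short map whose top-level index is an array covering every chunk up to the highest id seen. This is not the actual SparseInt2ShortMapInline code. The class name ChunkedShortMapSketch, the chunk size of 64 and the UNASSIGNED marker are assumptions chosen for illustration. It only shows how such a layout ties part of the memory to the highest id, whereas a hash map such as Int2ShortOpenHashMap grows with the number of stored entries.

```java
import java.util.Arrays;

public class ChunkedShortMapSketch {
    static final int CHUNK_SIZE = 64;                 // ids per chunk (illustrative value)
    static final short UNASSIGNED = Short.MIN_VALUE;  // marker for "no value stored"

    // Top-level index: one entry per chunk number up to the highest id seen.
    private short[][] top = new short[0][];
    private int allocatedChunks = 0;

    void put(int id, short value) {
        if (id < 0) throw new IllegalArgumentException("negative id: " + id);
        int chunkIndex = id / CHUNK_SIZE;
        // The index has to reach the chunk of the highest id, even if
        // all chunks in between stay empty.
        if (chunkIndex >= top.length) {
            top = Arrays.copyOf(top, chunkIndex + 1);
        }
        // A chunk is only allocated when the first id in its range is stored.
        if (top[chunkIndex] == null) {
            short[] chunk = new short[CHUNK_SIZE];
            Arrays.fill(chunk, UNASSIGNED);
            top[chunkIndex] = chunk;
            allocatedChunks++;
        }
        top[chunkIndex][id % CHUNK_SIZE] = value;
    }

    short get(int id) {
        if (id < 0) return UNASSIGNED;
        int chunkIndex = id / CHUNK_SIZE;
        if (chunkIndex >= top.length || top[chunkIndex] == null) return UNASSIGNED;
        return top[chunkIndex][id % CHUNK_SIZE];
    }

    int topLevelEntries() { return top.length; }
    int allocatedChunks() { return allocatedChunks; }

    public static void main(String[] args) {
        // 1000 contiguous small ids: compact index, few chunks.
        ChunkedShortMapSketch dense = new ChunkedShortMapSketch();
        for (int id = 0; id < 1000; id++) dense.put(id, (short) 1);
        System.out.println(dense.topLevelEntries() + " index entries, "
                + dense.allocatedChunks() + " chunks");

        // A single large id: one chunk, but the index spans the whole id range.
        ChunkedShortMapSketch sparse = new ChunkedShortMapSketch();
        sparse.put(1_000_000, (short) 2);
        System.out.println(sparse.topLevelEntries() + " index entries, "
                + sparse.allocatedChunks() + " chunks");
    }
}
```

With this kind of layout a small extract that still contains high node ids pays for an index sized by those ids, which would match the behaviour described for the Saarland file, assuming the real class works in a broadly similar way.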