> I've seen that the problematic_polygons file is growing quite fast
> in the wiki.
> May it affect splitter performance? If so, would it make sense to
> split the file by continents or countries?
Regarding performance:
a) If you specify the --problem-file parameter, splitter has to do
a lot more work, so that affects performance no matter how
many ids you put into your list.
b) I don't expect a measurable performance impact for any id
that does NOT occur in your input OSM file(s), as long as the
list doesn't contain millions of ids. The information is stored in
HashMaps, so the only negative impacts are a higher probability of
hash collisions and slightly higher memory usage for these HashMaps.
Most of the additional time required to handle the list comes from
the fact that splitter has to read the input file more often
(3 times). With o5m and pbf format, only parts of the file(s) are
read; with XML input the complete file is processed.
Of course, those ids that occur in your input data will
require more heap and probably produce more output data, so
that will affect performance.
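
To illustrate point b), here is a minimal sketch of a hash-based lookup. It is not splitter's actual code: the class and method names are made up for this mail, and I use a plain java.util.HashSet where splitter keeps its data in HashMaps. The point is only that checking an input id against the list costs about the same whether the list holds two ids or many thousands.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not taken from the splitter sources.
final class ProblemIdLookup {
    /** Counts the input ids that are also present in the problem list. */
    static int countMatches(Set<Long> problemIds, long[] inputIds) {
        int matches = 0;
        for (long id : inputIds) {
            // Expected constant-time lookup, independent of the list size.
            if (problemIds.contains(id)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Set<Long> problemRels = new HashSet<>();
        problemRels.add(52822L);    // Border Sweden
        problemRels.add(1059668L);  // Border Norway
        long[] relsInInput = {52822L, 42L}; // 42 is a made-up id
        System.out.println(countMatches(problemRels, relsInInput));
    }
}
```

Only the ids that actually match cause the extra heap usage and output mentioned above.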
OK?
The bigger problem that I see is this:
The list now contains some relations like
rel:52822 # Border Sweden
rel:1059668 # Border Norway
If you use the complete list to split e.g. finland.osm.pbf,
the input file will only contain parts of the needed data for
these polygons.
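
For reference, the entries quoted above have the form rel:<id>, optionally followed by a # comment. A rough sketch of how such lines could be read is shown here. It assumes that way:<id> is the only other accepted prefix, and the class, field and method names are hypothetical, not splitter's.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical reader for problem-list lines such as "rel:52822 # Border Sweden".
final class ProblemListSketch {
    final Set<Long> wayIds = new HashSet<>();
    final Set<Long> relIds = new HashSet<>();
    final List<String> rejected = new ArrayList<>();

    void addLine(String line) {
        // Everything after '#' is a comment.
        int hash = line.indexOf('#');
        String entry = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (entry.isEmpty()) {
            return; // blank line or comment-only line
        }
        String[] parts = entry.split(":", 2);
        try {
            if (parts.length == 2 && parts[0].trim().equals("rel")) {
                relIds.add(Long.parseLong(parts[1].trim()));
            } else if (parts.length == 2 && parts[0].trim().equals("way")) {
                wayIds.add(Long.parseLong(parts[1].trim()));
            } else {
                rejected.add(line); // unknown prefix (assumption: only way/rel are valid)
            }
        } catch (NumberFormatException e) {
            rejected.add(line); // the id is not a number
        }
    }
}
```

Nothing in such a line says which extract the relation belongs to, so a shared list will always contain ids that are only partly covered by a given input file.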
I wonder what splitter should do in these cases?
Currently it might print a few messages regarding missing
nodes or ways, but it will use the incomplete data, and it might
write the incomplete relation to more tiles than r202, so
mkgmap might produce even more error messages.
I think it would be better to change this so that splitter
falls back to the default handling for every relation or way that
is not complete, with a corresponding message.
Would that be better?
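
To make the proposal a bit more concrete, here is a rough sketch of the check I have in mind. It is not existing splitter code: all names are hypothetical, and for simplicity it only looks at the member ways of each relation, while a real check would also have to cover nodes and sub-relations.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed fallback, not existing splitter code.
final class CompletenessCheckSketch {
    /**
     * Returns the ids of the problem relations whose member ways are all
     * present in the input. For every other relation a message is added
     * and the id is left out, so the caller would use default handling.
     */
    static Set<Long> completeRelations(Map<Long, long[]> memberWaysByRel,
                                       Set<Long> waysInInput,
                                       List<String> messages) {
        Set<Long> complete = new HashSet<>();
        for (Map.Entry<Long, long[]> e : memberWaysByRel.entrySet()) {
            int missing = 0;
            for (long wayId : e.getValue()) {
                if (!waysInInput.contains(wayId)) {
                    missing++;
                }
            }
            if (missing == 0) {
                complete.add(e.getKey());
            } else {
                messages.add("rel:" + e.getKey() + " is incomplete ("
                        + missing + " member ways missing), using default handling");
            }
        }
        return complete;
    }
}
```

With finland.osm.pbf as input, a relation like the Swedish border would end up in the messages instead of in the returned set.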