Hi Carlos,

> I've seen problematic_polygons file in growing quite fast in the wiki.
> May it affect splitter performance? If so, would it make sense to split
> the file by continents or countries?

Regarding performance:
a) If you specify the --problem-file parameter, splitter has to do a lot more work, so that affects performance no matter how many ids you put into your list.
b) I don't expect a measurable performance impact for any id that does NOT occur in your input OSM file(s), as long as the list doesn't contain
millions of ids. The information is stored in HashMaps, so the only negative impact is a higher probability of hash collisions and slightly higher
memory usage for these HashMaps. Most of the additional time required to handle the list is caused by the fact that splitter has to read the input
file more often (3 times). With o5m and pbf format, only parts of the file(s) are read; with XML input the complete file is processed.

Of course, those ids that occur in your input data will require more heap and probably produce more output data, so they will affect performance.

OK?

The bigger problem that I see is this:
The list now contains some relations like

rel:52822 # Border Sweden
rel:1059668 # Border Norway
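
To make the format concrete, here is a minimal sketch of how such entries could be read into id sets. It is written in Python for brevity and is not splitter's actual (Java) code; the function name parse_problem_list is made up, and silently skipping malformed lines is an assumption for this example.

```python
def parse_problem_list(lines):
    """Collect way and relation ids from problem-list lines.

    Hypothetical helper: assumes entries look like 'way:<id>' or 'rel:<id>',
    optionally followed by a '#' comment.
    """
    ways, rels = set(), set()
    for raw in lines:
        # Drop the comment part and surrounding whitespace.
        entry = raw.split("#", 1)[0].strip()
        if not entry:
            continue  # blank line or comment-only line
        kind, sep, value = entry.partition(":")
        kind, value = kind.strip(), value.strip()
        if not sep or not value.isdigit():
            continue  # malformed entry; skipped here by assumption
        if kind == "way":
            ways.add(int(value))
        elif kind == "rel":
            rels.add(int(value))
    return ways, rels


sample = [
    "rel:52822 # Border Sweden",
    "rel:1059668 # Border Norway",
]
ways, rels = parse_problem_list(sample)
```

Lookups against such sets are hash-based, which is why ids that never occur in the input cost almost nothing.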

If you use the complete list to split e.g. finland.osm.pbf, the input file will contain only part of the data needed for these polygons.
I wonder what splitter should do in these cases?
Currently it might print a few messages regarding missing nodes or ways, but it will use the incomplete data and it might
write the incomplete relation to more tiles than r202, thus mkgmap might produce even more error messages.

I think it would be better to change this so that splitter returns to the default handling for every relation or way that is not
complete, with a corresponding message.
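
A rough sketch of the idea, in Python for brevity. This is not splitter code: the function name, the data structures and the message text are all made up for illustration, and it assumes we already know which member ids were found in the input.

```python
def select_complete_relations(problem_rels, members_by_rel, present_ids):
    """Split the listed relations into 'keep special handling' and 'fall back'.

    problem_rels   : relation ids taken from the problem list
    members_by_rel : relation id -> list of member ids, for relations seen in the input
    present_ids    : ids of the nodes/ways that actually occur in the input
    All three are hypothetical structures for this example.
    """
    keep = set()
    messages = []
    for rel_id in sorted(problem_rels):
        members = members_by_rel.get(rel_id)
        if members is None:
            continue  # relation does not occur in the input, nothing to do
        missing = [m for m in members if m not in present_ids]
        if missing:
            # Incomplete: return to the default handling and report it.
            messages.append(
                "rel:%d is incomplete (%d members missing), using default handling"
                % (rel_id, len(missing))
            )
        else:
            keep.add(rel_id)
    return keep, messages


keep, messages = select_complete_relations(
    problem_rels={52822, 1059668},
    members_by_rel={52822: [1, 2, 3], 1059668: [4, 5]},
    present_ids={1, 2, 4, 5},
)
```

In this example rel:52822 misses one member and falls back to the default handling, while rel:1059668 is complete and keeps the special handling.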

Would that be better?