On Fri, Jun 11, 2010 at 4:17 AM, Chris Miller <chris_overseas@hotmail.com> wrote:
 
> The limitation I see with a density map approach is that parts
> of the planet that have the highest node density would by definition contain
> the smallest split areas. This means the density map would need to be high
> enough resolution that we didn't have to test against too many candidate
> split areas in these dense parts (especially since these would be the most
> common tests).

A 256x256 map would have 65536 cells, each about 120km on a side. Looking at the overview map of a max-nodes=1000000 split in qlandkarte, which shows tile boundaries, I'd guess the worst case would be 10-15 tiles in a cell. That would prune away at least 97.5% of the bounds checks.
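
Something along these lines is what I have in mind. This is only a rough sketch in Java; the class and method names (SplitArea, DensityMap, areaFor) are made up for illustration, not splitter's real ones:

import java.util.ArrayList;
import java.util.List;

// A split area is just a lat/lon bounding box here.
class SplitArea {
    final double minLat, maxLat, minLon, maxLon;
    SplitArea(double minLat, double maxLat, double minLon, double maxLon) {
        this.minLat = minLat; this.maxLat = maxLat;
        this.minLon = minLon; this.maxLon = maxLon;
    }
    boolean contains(double lat, double lon) {
        return lat >= minLat && lat < maxLat && lon >= minLon && lon < maxLon;
    }
}

class DensityMap {
    static final int GRID = 256;
    private final List<SplitArea> areas;
    // candidate area indexes per cell, built once up front
    private final List<List<Integer>> cells = new ArrayList<>(GRID * GRID);

    DensityMap(List<SplitArea> areas) {
        this.areas = areas;
        for (int c = 0; c < GRID * GRID; c++) cells.add(new ArrayList<>());
        for (int i = 0; i < areas.size(); i++) {
            SplitArea a = areas.get(i);
            // register the area in every cell its bounding box touches
            for (int y = cellY(a.minLat); y <= cellY(Math.nextDown(a.maxLat)); y++)
                for (int x = cellX(a.minLon); x <= cellX(Math.nextDown(a.maxLon)); x++)
                    cells.get(y * GRID + x).add(i);
        }
    }

    // Only the handful of candidates registered in the node's cell
    // (10-15 in the worst case) need a full bounds check.
    int areaFor(double lat, double lon) {
        for (int i : cells.get(cellY(lat) * GRID + cellX(lon)))
            if (areas.get(i).contains(lat, lon)) return i;
        return -1;
    }

    private static int cellY(double lat) { return clamp((int) ((lat + 90) / 180 * GRID)); }
    private static int cellX(double lon) { return clamp((int) ((lon + 180) / 360 * GRID)); }
    private static int clamp(int v) { return Math.max(0, Math.min(GRID - 1, v)); }
}

Building the per-cell candidate lists is a single pass over the areas, so the cost is all paid once at startup.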

With the planet file, at this node count and cell size, an R-tree traversal would likely incur more bounds checks than that. I could see an R-tree being a win if tile sizes got a lot smaller, say if people started running max-nodes=50000 splits, or split files with insane node densities, such as contour lines.

> To reduce the memory requirements, sets of candidate areas
> would probably need to be pooled and reused across multiple points on the
> density map.

I wouldn't worry too much about memory. The absolute worst case is every cell listing all 255 areas, which as an IntList would be 4 bytes * 255 areas * (256*256 cells) = 64MB. Use a ByteList and it's 16MB. A bit vector and it's 2MB. More realistically, I'd expect it to be around a tenth of the worst case.
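
Spelling that arithmetic out (worst case, every cell holding all 255 candidates):

public class SizeEstimate {
    public static void main(String[] args) {
        long cells = 256L * 256;                 // 65536 cells
        long asIntList  = cells * 255 * 4;       // ~64MB: 4-byte index per candidate
        long asByteList = cells * 255 * 1;       // ~16MB: 1-byte index, max 255 areas
        long asBitVec   = cells * 256 / 8;       // ~2MB: one bit per area per cell
        System.out.printf("int list %dMB, byte list %dMB, bit vector %dMB%n",
                asIntList >> 20, asByteList >> 20, asBitVec >> 20);
    }
}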

Scott