Performance with large files

How is mkgmap expected to behave when input files grow in size? Is a linear increase in calculation time, i.e. O(n), expected, or an increase beyond linearity? For example, when I create a map with routable lines for bicycles, mkgmap takes some 30 minutes for Germany alone (3 GB pbf file, resulting in an 850 MB img file), but more than 2 hours for Germany and some neighboring countries (7 GB o5m file, resulting in a 1.4 GB img file). Are there many calculations at O(n^2) or beyond in mkgmap, or is this due to other factors, e.g. memory limitations?

Notes: mkgmap is called by %JAVA% -Xmx6800M -ea -jar %MKGMAP% .... on 64-bit Windows 7; swapping to disk does not occur.

But I am more interested in a general rule than in hints for improving the performance in this concrete case, e.g. how I could estimate the duration if I add some further countries.

Thanks for your hints.

Hi Bernhard,

the calculation time per tile can depend on the map elements, e.g. complex multipolygons may cost additional seconds. Apart from that, the calculation time should be O(n). However, the final step is typically the calculation of the index, and that step reads data from all tiles, so it depends on the total amount of data. Note the hint about --index in the help:

If both the --gmapsupp and --tdbfile options are given alongside the --index option, then both indexes will be created. Note that this will require roughly twice as much memory.

Gerd

In reply to: Bernhard Hiller <bhil@gmx.de>, Sunday, 19 March 2017, 12:06:31, subject "[mkgmap-dev] Performance with large files"

_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi Bernhard,

if you see a different picture with the current version, please let me know. One effect that might cause longer run times for tiles that are processed later is the String.intern() method, or garbage collection in general.

Gerd

In reply to: Gerd Petermann <GPetermann_muenchen@hotmail.com>, Sunday, 19 March 2017, 12:16:54

I eventually updated Java and tried the latest version of mkgmap, 3847 (and also splitter 580). The extracts were retrieved from Geofabrik on Saturday the 18th.

Germany: 2.93 GB; with additional countries: 5.62 GB (incl. Germany), which were combined into a 7.57 GB o5m file.

splitter: Germany 14 minutes, Central Europe 20 minutes
mkgmap: Germany 30 minutes, Central Europe 133 minutes

While splitter performed better than O(n) (more like O(sqrt(n))), mkgmap performed worse than O(n^2).

In reply to: Bernhard Hiller, 19 March 2017, 12:06
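The comparison above can be checked with a small calculation. This is only a rough sketch: it assumes that the comparable input sizes are the 2.93 GB and 5.62 GB figures quoted in the message, and that run time follows a simple power law t ~ n^k. Two measurements give an empirical exponent for these two runs, not an asymptotic complexity. The function names are illustrative and not part of mkgmap or splitter.

```python
import math


def scaling_exponent(size_small, time_small, size_large, time_large):
    """Return k such that time is proportional to size**k for the two runs."""
    return math.log(time_large / time_small) / math.log(size_large / size_small)


def predict_time(size_ref, time_ref, size_new, k):
    """Extrapolate a run time from a reference run, assuming the same power law."""
    return time_ref * (size_new / size_ref) ** k


# Figures quoted in the message (sizes in GB, times in minutes).
GERMANY_GB = 2.93
CENTRAL_EUROPE_GB = 5.62

splitter_k = scaling_exponent(GERMANY_GB, 14, CENTRAL_EUROPE_GB, 20)
mkgmap_k = scaling_exponent(GERMANY_GB, 30, CENTRAL_EUROPE_GB, 133)

if __name__ == "__main__":
    print(f"splitter exponent: {splitter_k:.2f}")
    print(f"mkgmap exponent:   {mkgmap_k:.2f}")
```

With these numbers the splitter exponent comes out a little above 0.5 and the mkgmap exponent a little above 2, which matches the "O(sqrt(n))" and "worse than O(n^2)" reading. Such an extrapolation is only meaningful if the extra time is spread evenly over the data, which a single slow tile would invalidate.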

OK, quite surprising numbers. I'll see if I can reproduce them and whether something is wrong. It seems you are using splitter with keep-complete=false? It would be great if you could post the areas.list from splitter and the options for both programs.

Gerd

In reply to: Bernhard Hiller <bhil@gmx.de>, Sunday, 19 March 2017, 19:46:32

Hi Bernhard,

reg. splitter: If I got that right, you used pbf as the input format for the "Germany" result and *.o5m for "Central Europe". Since the o5m format allows faster processing, the splitter times look okay to me.

reg. mkgmap: I've added a few lines to mkgmap to report the calculation time for each tile. As you might know, mkgmap starts a few threads (depending on the max-jobs option) and each thread processes one tile at a time. The threads share only a few data structures, and mkgmap doesn't keep much information about each tile in memory, so there is no good reason for the run time per processed tile to increase.

My system is probably close to yours: a machine with 8 GB of memory and a 4-core CPU (i5) running 64-bit Windows.

The attached patch adds a few lines to report the calculation time for a tile. The patched version is here: http://files.mkgmap.org.uk/download/339/mkgmap.jar

I've used the patched version to compile a map of an area around (and including) Germany with the OpenfietsMap Lite style. I used the attached dach-x.poly file with osmconvert and a planet.o5m file from 2017-01-17, and splitter with max-nodes=1000000 and keep-complete=true, and found what I expected. Times are between 7 and 35 seconds per tile, most are around 20 seconds, and there is no increase in time per tile. The time for the creation of the index / gmapsupp is rather short compared to the overall run time. The complete mkgmap log is here: http://files.mkgmap.org.uk/download/340/mkgmap_2017-03-20-064126.log

So, I cannot reproduce your result for mkgmap. Maybe your machine was busy with other work like installing updates, maybe some tiles in the added area require much more time, maybe the times depend on program options or style.

I just noticed that I did not enable assertions (-ea), so I am now trying a variant with max-nodes=1400000 and -ea.

Gerd

In reply to: Bernhard Hiller <bhil@gmx.de>, Sunday, 19 March 2017, 19:46:32
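The idea of timing each tile while a fixed number of worker threads each handle one tile at a time can be illustrated as follows. This is not the patch mentioned above (which is Java code inside mkgmap); it is a minimal Python sketch in which `process_tile` is a hypothetical stand-in for the real per-tile work and `max_jobs` merely mirrors the role of the max-jobs option.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_tile(tile_id):
    """Hypothetical placeholder for the real per-tile work."""
    time.sleep(0.01)
    return tile_id


def timed_tile(tile_id):
    """Process one tile and return its id with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    process_tile(tile_id)
    return tile_id, time.perf_counter() - start


def run_tiles(tile_ids, max_jobs):
    """Process tiles with at most max_jobs worker threads, one tile per thread at a time."""
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        # pool.map returns results in input order, whatever the completion order.
        return list(pool.map(timed_tile, tile_ids))


if __name__ == "__main__":
    for tile_id, seconds in run_tiles(["43100001", "43100002", "43100003"], max_jobs=2):
        print(f"{tile_id} took {seconds:.2f} s")
```

A per-tile report of this kind makes it easy to tell whether the time per tile drifts upwards over a run or whether a single outlier tile dominates the total.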

Hi Bernhard,

other tests did not show new results. Any idea why you got such different numbers?

Gerd

In reply to: Gerd Petermann <GPetermann_muenchen@hotmail.com>, Monday, 20 March 2017, 07:40:00

Hi Gerd,

mkgmap is running now; I'll likely report tomorrow. But I already saw that the first tile took very long, and in the areas file it is:

43100001: 1765376,-935936 to 2822144,139264
# : 37.880859,-20.083008 to 60.556641,2.988281

So I suspect that the problem could be here: one of the newly added extracts inflates the area covered by the map enormously. I first thought of the Dutch Antilles in the Caribbean, but they are farther west and farther south. I cannot see how that area gets into the map.

The maps I created before typically contained Germany, Czechia, Austria, Liechtenstein, and Switzerland. That normally took less than an hour per map. Now I have also added Luxembourg, Belgium and the Netherlands, which does not increase the file size a lot. But the time for map creation increased enormously.

Bernhard

In reply to: Gerd Petermann, 21 March 2017, 09:17
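To spot oversized tiles quickly, the areas entry quoted above can be parsed and converted to degrees. The sketch below rests on an inference from the quoted line itself: the integer pairs appear to be latitude,longitude in units of 360/2^24 degrees, because converting them that way reproduces the degree values in the comment part. The function names are illustrative only and are not part of splitter.

```python
import re

# Assumption inferred from the quoted entry: one map unit = 360 / 2**24 degrees.
MAP_UNIT_DEGREES = 360.0 / (1 << 24)

AREA_RE = re.compile(r"^(\d+):\s*(-?\d+),(-?\d+)\s+to\s+(-?\d+),(-?\d+)")


def parse_area(line):
    """Return (tile_id, (min_lat, min_lon, max_lat, max_lon)) in degrees."""
    match = AREA_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an area entry: {line!r}")
    tile_id = match.group(1)
    bounds = tuple(int(value) * MAP_UNIT_DEGREES for value in match.groups()[1:])
    return tile_id, bounds


def extent_degrees(bounds):
    """Return (height, width) of the tile in degrees of latitude and longitude."""
    min_lat, min_lon, max_lat, max_lon = bounds
    return max_lat - min_lat, max_lon - min_lon


if __name__ == "__main__":
    entry = "43100001: 1765376,-935936 to 2822144,139264"
    tile_id, bounds = parse_area(entry)
    height, width = extent_degrees(bounds)
    print(f"{tile_id}: {height:.2f} deg of latitude x {width:.2f} deg of longitude")
```

Sorting all entries of an areas file by extent would show at a glance which tiles cover far more ground than the others.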

Hi Bernhard,

okay, that is a good guess. How do you combine the files? Do you download single-country extracts from Geofabrik and merge them with osmconvert? Or with another tool? The splitter log file may give some hints as to why the area is so large, and maybe you should check the splitter option no-trim.

Gerd

In reply to: Bernhard Hiller <bhil@gmx.de>, Tuesday, 21 March 2017, 18:56:51

Hi Bernhard, I assume that the merged file doesn't contain a bbox. In that case you may see a few nodes from ferry routes in the input file and splitter will calculate a bbox containing them. I did not see this problem when merging Belgium + Netherlands with osmconvert. Anyway, you may solve the problem in the future by using appropriate bounds, e.g. with osmconvert or with the --polygon-file option in spltter. Gerd ________________________________________ Von: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> im Auftrag von Gerd Petermann <GPetermann_muenchen@hotmail.com> Gesendet: Dienstag, 21. März 2017 19:15:13 An: Development list for mkgmap Betreff: Re: [mkgmap-dev] Performance with large files Hi Bernhard, okay, that is a good guess. How do you combine the files? Do you download single country extracts from geofabrik and merge them with osmconvert? Or maybe another tool? The splitter log file may show some hints why the area is so large, and maybe you should check splitter option no-trim. Gerd ________________________________________ Von: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> im Auftrag von Bernhard Hiller <bhil@gmx.de> Gesendet: Dienstag, 21. März 2017 18:56:51 An: mkgmap-dev@lists.mkgmap.org.uk Betreff: Re: [mkgmap-dev] Performance with large files Hi Gerd, mkgmap is running now, I'll likely report tomorrow. But I already saw that the first tile took very long, and in the areas file it is: 43100001: 1765376,-935936 to 2822144,139264 # : 37.880859,-20.083008 to 60.556641,2.988281 So, I suspect that here could be a problem: some of the newly added extracts inflates the area covered by the map extremely. I first thought of the Dutch Antilles in the Caribean, but that's farther west and farther south. I cannot see how that area comes into the map. The maps I created before, typically contained Germany, Chechia, Austria, Liechtenstein, and Switzerland. That took normally less than an hour per map. 
Now I added Luxemburg, Belgium and Netherlands also, which does not increase file size a lot. But the time for map creation was increased enormously. Bernhard Am 21.03.2017 um 09:17 schrieb Gerd Petermann:
Hi Bernhard,
other tests did not show new results. Any idea why you got so different numbers?
Gerd ________________________________________ Von: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> im Auftrag von Gerd Petermann <GPetermann_muenchen@hotmail.com> Gesendet: Montag, 20. März 2017 07:40:00 An: Development list for mkgmap Betreff: Re: [mkgmap-dev] Performance with large files
Hi Bernhard,
reg. splitter: If I got that right you used pbf as input format for the "Germany" result and *.o5m for "Central Europe". Since the o5m format allows faster processing the splitter times are okay for me.
reg. mkgmap: I've added a few lines in mkgmap to report the calculation time for each tile. As you might know, mkgmap starts a few threads (depending on the max-jobs option) and each thread processes on tile at a time. The threads share only a few data structures, and mkgmap doesn't collect much information for each tile in memory, so there is no good reason for an increase of run time per processed tile.
My system is probably close to yours, a machine with 8 GB Memory and a 4 core CPU (i5) running a 64 Windows.
The attached patch adds a few lines to report the calculation time for a tile. The patched version is here: http://files.mkgmap.org.uk/download/339/mkgmap.jar
I've used the patched version to compile a map with the OpenfietsMap Lite style of an area around (and including) Germany . I used the attached dach-x.poly file with osmconvert and a planet.o5m file from 2017-01-17. and splitter with max-nodes00000 & keep-complete=true and found what I expected. Times are between 7 and 35 seconds per tile, most are around ~20 secs, no increase in time/tile. The time for the creation of the index / gmapsupp is rather short compared to the overall run time. The complete mkgmap log is here : http://files.mkgmap.org.uk/download/340/mkgmap_2017-03-20-064126.log
So, I cannot reproduce your result for mkgmap. Maybe your machine was busy with other work like installing updates, maybe some tiles in the added area require much more time, maybe times depend on program options or style.
I just noticed that I did not enable assertions (-ea) so I now try a variant with a max-nodes00000 and -ea.
Gerd
________________________________________ Von: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> im Auftrag von Bernhard Hiller <bhil@gmx.de> Gesendet: Sonntag, 19. März 2017 19:46:32 An: mkgmap-dev@lists.mkgmap.org.uk Betreff: Re: [mkgmap-dev] Performance with large files
Eventually I updated Java and tried the latest version of mkgmap 3847 (and also splitter 580). The extracts were retrieved from Geofabrik on Saturday 18. Germany: 2.93 GB plus additional countries: 5.62 GB (incl. Germany) which were combined to a 7.57 GB o5m file. splitter: Germany 14 minutes - Central Europe 20 minutes mkgmap: Germany 30 minutes - Central Europe 133 minutes While splitter performed better than O(n) (more like O(sqrt(n))), mkgmap performed worse than O(n^2).
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Hi Gerd,
it was one single tile that took more than 2 hours: 43100003.osm.pbf took ~7242 sec.
It covers a large area, but not as big as the tile mentioned previously:
43100003: 1765376,741376 to 2203648,1310720 # : 37.880859,15.908203 to 47.285156,28.125000
The input file "43100003.osm.pbf" measures 12.5 MB, quite a normal size; the resulting img file is 4.2 MB, which is below average. Strange that it takes so long.
The --polygon-file option (with a polygon from 2-19°E, 45-55°N) reduced the time spent by mkgmap to just below 1 hour, and no tile took more than 100 s. Thanks for that trick.
I am still curious why creating a tile that covers a large area but contains hardly any nodes requires so much time. Even with the polygon mentioned above, a large, almost empty tile exists (south of Belgium, because I did not include France), and it took the longest (97 s).
Bernhard
On 22.03.2017 at 06:26, Gerd Petermann wrote:
Hi Bernhard,
I assume that the merged file doesn't contain a bbox. In that case you may see a few nodes from ferry routes in the input file and splitter will calculate a bbox containing them. I did not see this problem when merging Belgium + Netherlands with osmconvert.
Anyway, you can avoid the problem in the future by using appropriate bounds, e.g. with osmconvert or with the --polygon-file option in splitter.
Gerd
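For a simple rectangular bound, a polygon file can be generated with a few lines. The sketch below assumes the Osmosis polygon filter (.poly) text format with one "longitude latitude" pair per line; the function name and the polygon name are hypothetical, and the corner values are the 2-19°E, 45-55°N rectangle mentioned in this thread.

```python
def rectangle_poly(name, west, south, east, north):
    """Return the text of a .poly file describing one closed rectangular ring."""
    corners = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    lines = [name, "1"]  # polygon name, then the name of the single ring
    lines += [f"   {lon:.6f}   {lat:.6f}" for lon, lat in corners]
    lines += ["END", "END"]  # first END closes the ring, second closes the file
    return "\n".join(lines) + "\n"

print(rectangle_poly("central_europe", 2.0, 45.0, 19.0, 55.0))
```

The returned text can be written to a file and passed to the tool that does the clipping; whether a plain rectangle is a good enough bound depends on which border regions should end up in the map.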
________________________________________ From: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> on behalf of Gerd Petermann <GPetermann_muenchen@hotmail.com> Sent: Tuesday, 21 March 2017 19:15:13 To: Development list for mkgmap Subject: Re: [mkgmap-dev] Performance with large files
Hi Bernhard,
okay, that is a good guess. How do you combine the files? Do you download single country extracts from Geofabrik and merge them with osmconvert, or do you use another tool?
The splitter log file may show some hints why the area is so large, and maybe you should check splitter option no-trim.
Gerd ________________________________________ From: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> on behalf of Bernhard Hiller <bhil@gmx.de> Sent: Tuesday, 21 March 2017 18:56:51 To: mkgmap-dev@lists.mkgmap.org.uk Subject: Re: [mkgmap-dev] Performance with large files
Hi Gerd,
mkgmap is running now, I'll likely report tomorrow.
But I already saw that the first tile took very long, and in the areas file it is: 43100001: 1765376,-935936 to 2822144,139264 # : 37.880859,-20.083008 to 60.556641,2.988281
So I suspect the problem could be here: one of the newly added extracts inflates the area covered by the map enormously. I first thought of the Dutch Antilles in the Caribbean, but they are farther west and farther south. I cannot see how that area comes into the map.
The maps I created before typically contained Germany, Czechia, Austria, Liechtenstein, and Switzerland. That normally took less than an hour per map.
Now I have also added Luxembourg, Belgium and the Netherlands, which does not increase the file size a lot. But the time for map creation increased enormously.
Bernhard
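The degree values in the comment part of an areas line follow from the integer bounds. A minimal sketch, assuming the integers are 24-bit map units (360 degrees divided into 2^24 steps), an assumption that reproduces the degree values quoted above; the function names are hypothetical.

```python
import re

UNITS_PER_CIRCLE = 1 << 24  # assumption: bounds are given in 24-bit map units

def to_degrees(units):
    """Convert an integer map-unit coordinate to degrees."""
    return units * 360.0 / UNITS_PER_CIRCLE

def parse_area(line):
    """Parse 'id: lat1,lon1 to lat2,lon2' and return (id, bounds in degrees)."""
    m = re.match(r"\s*(\d+):\s*(-?\d+),(-?\d+)\s+to\s+(-?\d+),(-?\d+)", line)
    if m is None:
        raise ValueError(f"not an area line: {line!r}")
    tile_id = m.group(1)
    min_lat, min_lon, max_lat, max_lon = (to_degrees(int(g)) for g in m.groups()[1:])
    return tile_id, (min_lat, min_lon, max_lat, max_lon)

tile_id, (min_lat, min_lon, max_lat, max_lon) = parse_area(
    "43100001: 1765376,-935936 to 2822144,139264")
print(tile_id, f"{max_lat - min_lat:.2f} x {max_lon - min_lon:.2f} degrees")
```

Printing the extent of every tile this way makes an unexpectedly large tile easy to spot before mkgmap is started.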
On 21.03.2017 at 09:17, Gerd Petermann wrote:
Hi Bernhard,
other tests did not show any new results. Any idea why you got such different numbers?
Gerd ________________________________________ From: mkgmap-dev <mkgmap-dev-bounces@lists.mkgmap.org.uk> on behalf of Gerd Petermann <GPetermann_muenchen@hotmail.com> Sent: Monday, 20 March 2017 07:40:00 To: Development list for mkgmap Subject: Re: [mkgmap-dev] Performance with large files
Hi Bernhard,
reg. splitter: If I got that right you used pbf as input format for the "Germany" result and *.o5m for "Central Europe". Since the o5m format allows faster processing the splitter times are okay for me.

Hi Bernhard,
thanks for the feedback. I found a reason for the long run time; please check my posts regarding r3861 and also this one: http://www.mkgmap.org.uk/pipermail/mkgmap-dev/2017q1/026489.html
Gerd
participants (2)
- Bernhard Hiller
- Gerd Petermann