[PATCH v10] make maps in parallel

18 May 2009

Changed the default number of threads to 1. If you specify --max-jobs
without a value, you get one thread per core. --max-jobs=N means use N
threads.
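
The rule above can be sketched as follows. This is not the code from the
patch: the class name, the method name and the clamping of N to at least 1
are assumptions made for illustration.

```java
/** Hypothetical helper, not taken from the patch. */
final class MaxJobsOption {
    private MaxJobsOption() {}

    /**
     * @param present true if --max-jobs appeared on the command line
     * @param value   the text after '=', or null/empty if none was given
     * @param cores   core count, e.g. Runtime.getRuntime().availableProcessors()
     */
    static int resolve(boolean present, String value, int cores) {
        if (!present) {
            return 1;                    // option absent: one thread
        }
        if (value == null || value.isEmpty()) {
            return Math.max(1, cores);   // bare --max-jobs: one thread per core
        }
        // --max-jobs=N: use N threads. Clamping to 1 is an assumption;
        // parseInt throws NumberFormatException if N is not a number.
        return Math.max(1, Integer.parseInt(value.trim()));
    }
}
```

For example, MaxJobsOption.resolve(true, null, 8) gives 8 and
MaxJobsOption.resolve(true, "3", 8) gives 3.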

With regard to comparing the output with known good maps to see if the
parallel processing is corrupting anything, one problem is that the
files contain timestamps. I have test code that zeros the timestamps,
and with it I have been able to compare the output from different runs.
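
A minimal sketch of that kind of comparison, assuming the timestamp fields
sit at known byte offsets and all have the same length. The real offsets
are not given here, so they are left as parameters, and the class and
method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical comparison helper, not the test code mentioned above. */
final class TimestampBlindCompare {
    private TimestampBlindCompare() {}

    /** Returns a copy of data with len bytes zeroed at each given offset. */
    static byte[] zeroFields(byte[] data, int[] offsets, int len) {
        byte[] copy = Arrays.copyOf(data, data.length);
        for (int off : offsets) {
            int end = Math.min(copy.length, off + len);
            if (off >= 0 && off < end) {
                Arrays.fill(copy, off, end, (byte) 0);
            }
        }
        return copy;
    }

    /** True if a and b are identical once the timestamp fields are zeroed. */
    static boolean sameIgnoringTimestamps(byte[] a, byte[] b, int[] offsets, int len) {
        return Arrays.equals(zeroFields(a, offsets, len), zeroFields(b, offsets, len));
    }
}
```

The inputs are copied before zeroing, so the files read into memory (for
instance with java.nio.file.Files.readAllBytes) are left unchanged.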

What I have seen is that sometimes there are differences that appear
to be due to the order in which the labels are written to the output
file. If only the order is changing, that is harmless, but it would be
nice to understand how it's happening (I have a theory about this, yet
to be proven).

---------

Now preserves the order in which files are combined (thanks Steve for
the tweak).
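
One common way to keep the combine order stable, sketched here on the
assumption of a java.util.concurrent thread pool (the tweak in the patch
may work differently): keep the futures in submission order and read them
back in that order, whichever job happens to finish first. The class and
method names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch, not taken from the patch. */
final class OrderedJobs {
    private OrderedJobs() {}

    /** Runs jobs on maxJobs threads; results come back in submission order. */
    static <T> List<T> runInOrder(List<Callable<T>> jobs, int maxJobs) {
        ExecutorService pool = Executors.newFixedThreadPool(maxJobs);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> job : jobs) {
                futures.add(pool.submit(job));   // list index = submission order
            }
            List<T> results = new ArrayList<>();
            for (Future<T> f : futures) {
                results.add(f.get());            // blocks until this job is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Even if the first job is the slowest, its result is still first in the
returned list, so whatever combines the files sees them in the order
they were given.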

---------

Now serialises reading of style files and map source to avoid a
reentrancy issue in GType.
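
Serialising one phase while leaving the rest parallel can be done with a
single shared lock. The sketch below assumes that approach; the lock, the
class name and the loader callback are all hypothetical and say nothing
about how the patch actually guards GType.

```java
import java.util.function.Supplier;

/** Hypothetical sketch, not taken from the patch. */
final class SerialisedLoad {
    // One lock shared by every worker thread.
    private static final Object LOAD_LOCK = new Object();

    private SerialisedLoad() {}

    /** Runs the loader with the lock held, so only one load runs at a time. */
    static <T> T load(Supplier<T> loader) {
        synchronized (LOAD_LOCK) {
            return loader.get();
        }
    }
}
```

Only the code inside load() is serialised; work done after the call
returns still runs concurrently across the worker threads.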

Reworked the top-level loop that waits for the parallel jobs to
complete. It appears to use a lot less CPU and could possibly influence
the weird problems some were reporting on Windows/Mac - please retest
with this version.
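
For comparison, here is one way to wait for jobs without spinning. It
assumes a java.util.concurrent pool and uses awaitTermination; the
reworked loop in the patch is not shown here and may be written quite
differently. All names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not taken from the patch. */
final class BlockingWait {
    private BlockingWait() {}

    /** Runs the jobs; returns true if all finished within the timeout. */
    static boolean runAndWait(List<Runnable> jobs, int maxJobs, long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(maxJobs);
        for (Runnable job : jobs) {
            pool.execute(job);
        }
        pool.shutdown();   // accept no new jobs; queued ones still run
        try {
            // The caller blocks here instead of repeatedly polling for completion.
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            return false;
        }
    }
}
```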

Steve, I haven't incorporated your changed options handling stuff yet
but will do in the future if (a) you don't commit it separately and (b)
we can fix the reliability issues with this parallelisation code.

---------

Now respects --num-jobs again (broken in last patch).

---------

Now reports exceptions in the worker threads.
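
With a java.util.concurrent pool, an exception thrown inside a worker is
held by the Future and only surfaces when get() is called, which is why
it is easy to lose. The sketch below assumes that setup and collects one
message per failed job; the names are hypothetical and the patch may
report errors in another way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch, not taken from the patch. */
final class WorkerErrors {
    private WorkerErrors() {}

    /** Runs the jobs and returns one message per job that threw. */
    static List<String> collectFailures(List<Callable<Void>> jobs, int maxJobs) {
        ExecutorService pool = Executors.newFixedThreadPool(maxJobs);
        List<Future<Void>> futures = new ArrayList<>();
        for (Callable<Void> job : jobs) {
            futures.add(pool.submit(job));
        }
        pool.shutdown();
        List<String> failures = new ArrayList<>();
        for (int i = 0; i < futures.size(); i++) {
            try {
                futures.get(i).get();
            } catch (ExecutionException e) {
                // getCause() is the exception thrown inside the worker thread
                failures.add("job " + i + ": " + e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failures.add("job " + i + ": interrupted while waiting");
                break;
            }
        }
        return failures;
    }
}
```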

---------

Here's a better fix than last night's effort for the problem where the
mapname and description for each job were getting clobbered due to the
way that the command args are processed. Each job now gets a "snapshot"
of the command args so it doesn't matter if they subsequently get
changed.
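
The snapshot idea in miniature, assuming the command args can be treated
as a simple string map (the real args type may differ; MapJob and the map
names below are invented for the example):

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not taken from the patch. */
final class MapJob {
    private final Map<String, String> args;   // private copy, never shared

    MapJob(Map<String, String> liveArgs) {
        // Copy at job-creation time; later changes to liveArgs are not seen.
        this.args = new HashMap<>(liveArgs);
    }

    String get(String key) {
        return args.get(key);
    }
}
```

If a job is created while mapname is 63240001 and the live args are then
changed to 63240002 for the next job, the first job still reads 63240001.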

---------

Whoops! Fixed a bad bug whereby each map was being output to the same
file. Not sure if the fix is very elegant but at least it's not being
silly any more.

Now limits the default value of max-jobs to 4 no matter how many cores
you have, as further testing shows that having more threads just burns
CPU cycles without finishing any quicker. I guess the memory system is
limiting the performance and the CPUs are spinning waiting for access.
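
As a sketch, the default in this revision amounts to a capped core count.
The constant and method names are hypothetical, and the lower bound of 1
is an assumption.

```java
/** Hypothetical sketch of this revision's default, not taken from the patch. */
final class DefaultJobs {
    static final int CAP = 4;   // the limit described above

    private DefaultJobs() {}

    /** cores would normally be Runtime.getRuntime().availableProcessors(). */
    static int defaultMaxJobs(int cores) {
        return Math.max(1, Math.min(cores, CAP));
    }
}
```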

Now showing a real speedup of around 240% (my earlier higher claim
was based on CPU usage and I now realise that was erroneous, sorry).

---------

Now defaults to creating a thread per core, so without doing anything
you should see a speedup on an SMP box when processing multiple maps.

You can use --max-jobs=N to limit the concurrency - you may
want to specify that if you can't increase the VM size to what is
required. However, it occurs to me that if you can afford a box with
more than 2 cores, then you can probably afford a reasonable amount of
memory (otherwise, what's the point in having more cores?)

Added help blurb.

---------

OK, let it not be said that I don't listen to others!

The attached patch provides support for making maps in parallel. By
default, the behaviour is the same as before but if you specify
--num-threads=N where N is greater than 1, it will process N maps at
the same time and then combine the results (if required). Don't forget
to increase the heap size appropriately.

A quick test on the big box, specifying --num-threads=4 and a 2GB VM
size, shows a good speedup. I was seeing better than 380% utilisation
with 8 cores in use.

I suspect the performance limitation here will be VM size and memory
system bandwidth.

BTW - I don't think num-threads is actually the best name for the
option, so please suggest alternatives.

Cheers,

Mark

Mark Burton
