Using extra cores in gmp?

Torbjorn Granlund tg at gmplib.org
Sun Mar 30 15:25:59 UTC 2014


Ronald Bruck Jr <bruck at usc.edu> writes:

  Now that 12-core Intel processors are available (for a mind-boggling
  combined 24 cores on the right mobo), let me ask whether there are any
  experimental versions of gmp which can use this many cores. In
  particular, many of my operations are carried out to thousands of digits
  (though seldom more than 32000), and it would seem that multiplication
  and division would greatly benefit from multithreading. (Even lowly
  addition!)
  
I assume you are talking about decimal digits.  32000 decimal digits is
about 1700 64-bit words.  I very much doubt one could get much speedup
from multiplies of that size, where each operation takes around 3 ms on a
modern high-end CPU.  One problem is that an intermediate result from one
CPU will live in its L1 cache.  Accessing this from another CPU is very
expensive.
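
The word count above can be checked with a few lines of Python.  This is
only a sketch of the arithmetic; the function name and the limb_bits
parameter are made up for illustration and are not part of GMP.

```python
import math

def limbs_for_decimal_digits(digits, limb_bits=64):
    """Estimate how many machine words are needed for `digits` decimal digits.

    Hypothetical helper for illustration only, not a GMP function.
    """
    # Each decimal digit carries log2(10), roughly 3.32, bits of information.
    bits = math.ceil(digits * math.log2(10))
    # Round up to a whole number of words.
    return math.ceil(bits / limb_bits)

if __name__ == "__main__":
    print(limbs_for_decimal_digits(32000))  # prints 1661
```

So 32000 decimal digits come to 1661 words of 64 bits, which is the
"about 1700" used above.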
  
One could get speedup for multiplies of large enough operands, though.
Unfortunately that would be a lot of hard work, and the utility would be
limited, as huge operands are not common.

  As it turns out, most of my use of multiprecision wouldn't benefit much
  from such parallelism. Most of my uses involve (hundreds of) thousands
  of repetitions of a single suite of programs, and it's fine to launch 20
  or so invocations at a time on single threads. Each individual program
  takes much longer to run than if gmp were multithreaded, but the time
  for the whole collection will be about the same (or even faster).
  
I believe that's the typical scenario for GMP number crunching
applications.
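
A minimal sketch of that pattern, assuming the jobs are fully independent
and each one is an ordinary single-threaded program.  The JOB one-liner
below is a hypothetical stand-in for a real number-crunching program, and
run_job, run_batch and max_parallel are illustrative names, not anything
GMP provides.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one single-threaded program: it prints the
# bit length of 3**n, where n is given on the command line.
JOB = "import sys; print((3 ** int(sys.argv[1])).bit_length())"

def run_job(n):
    """Run one independent job in its own process and return its result."""
    result = subprocess.run(
        [sys.executable, "-c", JOB, str(n)],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout)

def run_batch(inputs, max_parallel=4):
    """Run all jobs, keeping at most max_parallel processes alive at once."""
    # The threads only wait on child processes; the arithmetic itself
    # happens in the separate single-threaded processes.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_job, inputs))

if __name__ == "__main__":
    print(run_batch([1, 2, 3]))  # prints [2, 4, 5]
```

Setting max_parallel close to the number of cores keeps every core busy
without any intermediate results having to move between CPUs.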
  
  But I can foresee future situations where single invocations would be
  useful. I once thought that GPU's could accelerate computations, but I
  quickly discovered that these are largely memory-bound. Someone from
  Bailey's (competing) multiprecision group wrote me, several years ago,
  that THEY found the same, and thought the future was in the increasing
  number of cores.

Using current GPUs does not help GMP much.  I made a quite careful
feasibility study of porting GMP to Nvidia and AMD GPUs some years ago,
and determined that even a high-end GPU does not provide enough multiply
bandwidth to outperform a CPU by a very large margin.  That might change,
but it would require some unlikely architectural changes to the GPUs.


Torbjörn
Please encrypt, key id 0xC8601622

