GMP on the Cell processor

Torbjorn Granlund tege at swox.com
Thu Apr 19 19:37:29 CEST 2007


Paul Leyland <paul at leyland.vispa.com> writes:

  My interest is primarily in integer factorization, several algorithms
  for which are trivially parallelizable and computationally demanding (as
  opposed to memory-intensive).  ECM is probably the best example but
  there are others, including some which are subroutines for other
  algorithms such as NFS.  Other algorithms are not entirely trivial to
  parallelize but the Chinese Remainder Theorem provides an obvious entry
  point into their parallelization.
  
  Running 7 copies of stage 1 of ECM suimultaneously, each with the same N
  but different curves, on the SPUs of a PS3 using their local memory is
  *very* attractive.   Their second stages, which are very memory-hungry,
  would then be farmed out either to the main PPC or to other more
  conventional machines.
  
This is out-of-scope for GMP.  You'd need to write a library for "SIMD
bignum" arithmetic, i.e., a lib for performing the same operations on
a set of bignums.  That would probably make the best use of Cell for
ECM factoring.

  GMP for the SPU, a stripped down version if necessary to get it to fit
  in the limited memory, would be very welcome indeed.

I would not expect it to perform well at all, unfortunately.  Reasons:

* IIRC, the Cell doesn't have a 32x32->64 bit multiply instruction.
  The lack of such an operation is a disaster for the performance.

* Making good use of SIMD for the most critical GMP operations is hard.
  (See the mpn/cray and perhaps mpn/powerpc*/vmx for some examples.)

* Using multiple cores for any real GMP peformance benefit is probably
  very difficult.

I think Cell is a very cool processor, although not for GMP.

-- 
Torbjörn


More information about the gmp-discuss mailing list