gmpbench -- how to utilize all CPU cores?

Sun Sep 29 15:31:16 CEST 2013

Among several similar statements, somebody wrote:

  I agree with you. Someone skilled shall parallelize the code and at the
  same time teach the people through the open source code how to properly
  parallelize the code in GMP.

Requests for parallelisation of GMP have been made again and again over
the years.  It is not a lack of willingness on the part of us VOLUNTEERS
that keeps GMP non-parallel.  Note that GMP is re-entrant, as described
in the manual.

Thanks to the latest flurry of messages, we know that "serious users"
will never use GMP due to GMP's lack of internal parallelism.  Oh my,
there are a lot of un-serious users out there!

I suggest that someone who expects a parallel GMP to be feasible start
with some simple case of bignum arithmetic and parallelise it on (say) a
4-core x86-64 processor.  Please let me know when your parallel
functions get less than 100x SLOWDOWN compared to GMP for operands which
are of common sizes.

Before you start, make sure you understand caches, and cache concepts
like false sharing, replacement, and inter-cache bandwidth.  And please
read up on parallelising and the significance of granularity.
Without that knowledge, reaching a mere 100x slowdown might be too hard
for you.
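
To make the granularity point concrete, here is a minimal sketch of
about the simplest case there is: adding two n-limb numbers with two
threads.  It is plain C with POSIX threads, not GMP code, and the names
limb_t, add_n, add_half and add_n_2threads are made up for
illustration.  The high half is added speculatively with carry-in 0
while the low half is added in the calling thread; the low half's carry
is then rippled into the high half.

```c
/* Sketch: two-thread addition of n-limb numbers, least significant
   limb first.  Hypothetical names throughout; not how GMP does it. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;

/* rp[0..n-1] = ap[] + bp[] + cy; returns the carry out (0 or 1). */
static limb_t add_n(limb_t *rp, const limb_t *ap, const limb_t *bp,
                    size_t n, limb_t cy)
{
  for (size_t i = 0; i < n; i++) {
    limb_t s = ap[i] + cy;
    cy = s < cy;            /* carry from adding the incoming carry */
    s += bp[i];
    cy += s < bp[i];        /* carry from adding bp[i] */
    rp[i] = s;
  }
  return cy;
}

struct half {
  limb_t *rp;
  const limb_t *ap, *bp;
  size_t n;
  limb_t cy_out;
};

static void *add_half(void *arg)
{
  struct half *h = arg;
  h->cy_out = add_n(h->rp, h->ap, h->bp, h->n, 0);
  return NULL;
}

static limb_t add_n_2threads(limb_t *rp, const limb_t *ap,
                             const limb_t *bp, size_t n)
{
  size_t lo = n / 2;
  struct half hi = { rp + lo, ap + lo, bp + lo, n - lo, 0 };
  pthread_t t;

  /* Starting a thread is the expensive part; fall back if it fails. */
  if (pthread_create(&t, NULL, add_half, &hi) != 0)
    return add_n(rp, ap, bp, n, 0);

  /* rp[lo-1] and rp[lo] will usually share a cache line, so the two
     threads write to the same line at the boundary (false sharing). */
  limb_t cy = add_n(rp, ap, bp, lo, 0);
  pthread_join(t, NULL);

  /* Fix-up: ripple the low half's carry into the speculated high half. */
  for (size_t i = lo; cy != 0 && i < n; i++) {
    rp[i] += 1;
    cy = rp[i] == 0;
  }
  assert(hi.cy_out + cy <= 1);   /* both cannot carry at the same time */
  return hi.cy_out + cy;
}
```

The result is the same as calling add_n over all n limbs.  For operands
of a few limbs, though, the serial loop is a handful of instructions,
while creating and joining a thread typically costs orders of magnitude
more, before counting the cache traffic between cores.  A thread pool
removes the creation cost but not the synchronisation.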

If you want deeper reasoning on the subject of parallelising GMP, I
suggest that you search the gmp mailing list archives.

PS. Please remember that GMP is a gift from us to you.  Complaining
about it in a quite ignorant way, adding a degree of rudeness, might not
trigger us to give you a better gift.

Torbjörn