Risc V greatly underperforms

Torbjörn Granlund tg at gmplib.org
Wed Oct 6 08:48:43 UTC 2021


Hans Petter Selasky <hps at selasky.org> writes:

  If the GMP could utilize multiple cores when doing bignum
  multiplication and addition, I think the picture would look different.

  For example for addition, you could split the number in two parts, and
  then speculate if there is an addition for the higher part or not.

And if the guess is wrong, then what?

It is well known that, in a model which ignores caches and memory
bandwidth, one can get 2n/k + log(k) word operation steps for n-word
addition on k execution agents.  Agent i computes the sum of block i
with both carry-in = 1 and carry-in = 0 and saves both results.  The
log(k) term is for serially choosing the proper result for each block
depending on whether a carry-in happened to that block.
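
To make the scheme above concrete, here is a minimal sketch in Python,
not GMP code.  All names (add_block, carry_select_add, WORD_BITS) are
hypothetical, and the 64-bit word size is an assumption.  Each block is
added twice, once per possible carry-in, which is where the 2n/k term
comes from.  The selection phase here is a plain linear scan over the k
blocks; reaching the log(k) bound would need a parallel-prefix style
carry resolution, which this sketch does not attempt.

```python
# Sketch of block-wise carry-select addition. Hypothetical names throughout.
WORD_BITS = 64  # assumed word size
WORD_MASK = (1 << WORD_BITS) - 1


def add_block(a, b, carry_in):
    """Add two equal-length little-endian word lists.

    Returns (sum_words, carry_out).
    """
    out = []
    carry = carry_in
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s & WORD_MASK)
        carry = s >> WORD_BITS
    return out, carry


def carry_select_add(a, b, k):
    """Add n-word numbers a and b split into at most k blocks.

    Returns (sum_words, carry_out).
    """
    n = len(a)
    assert len(b) == n and k >= 1
    size = max(1, -(-n // k))  # ceil(n / k) words per block, at least 1

    # Phase 1: every "agent" adds its block for both possible carry-ins.
    # The iterations are independent, so they could run concurrently.
    candidates = []
    for lo in range(0, n, size):
        hi = lo + size
        candidates.append((add_block(a[lo:hi], b[lo:hi], 0),
                           add_block(a[lo:hi], b[lo:hi], 1)))

    # Phase 2: select the correct candidate per block (linear scan here).
    result = []
    carry = 0
    for with_carry0, with_carry1 in candidates:
        words, carry = with_carry1 if carry else with_carry0
        result.extend(words)
    return result, carry
```

Note that phase 1 reads every input word once but writes every result
word twice, so on real hardware the memory traffic grows even though
the step count in the idealized model shrinks.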

On a cached system, I would expect this algorithm to just slow things
down.

  I thought that RISC-V would produce cheaper and more cores, and that
  single core performance was not that critical.

Slow cores are useful in some applications, sure.

  Talking about x86, don't forget that there is microcode below each
  instruction.

This is a false statement.  Even if it were true, how is that relevant
for this discussion?  The relevant instructions run in one cycle.

-- 
Torbjörn
Please encrypt, key id 0xC8601622

