Risc V greatly underperforms
Hans Petter Selasky
hps at selasky.org
Wed Oct 6 10:12:18 UTC 2021
On 10/6/21 10:48 AM, Torbjörn Granlund wrote:
> Hans Petter Selasky <hps at selasky.org> writes:
>
> If GMP could utilize multiple cores when doing bignum
> multiplication and addition, I think the picture would look different.
>
> For example for addition, you could split the number in two parts, and
> then speculate on whether there is a carry into the higher part or not.
>
> And if the guess is wrong, then what?
Hi,
Then you get a penalty, but the penalty might not be so big assuming
random input. Adding one to a number is pretty cheap: you only need
to continue traversing the data words making up the number while the
increment overflows, which gives you a variable number of iterations.
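To make the idea concrete, here is a minimal sketch in C of the split addition with a fix-up on a wrong guess. It is not GMP code: the names add_words, increment_words and speculative_add are made up for illustration, words are assumed to be 64 bits stored least significant first, and the two half additions simply run one after the other where a real version would hand them to two cores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add n words of a and b into r (least significant word first).
   Returns the carry out, 0 or 1. */
static uint64_t add_words(uint64_t *r, const uint64_t *a,
                          const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = a[i] + carry;
        uint64_t c1 = s < carry;      /* wrapped while adding the carry */
        r[i] = s + b[i];
        carry = c1 | (r[i] < s);      /* wrapped while adding b[i] */
    }
    return carry;
}

/* Add 1 to the n-word number r in place. Stops at the first word that
   does not wrap to zero. Returns the carry out, 0 or 1. */
static uint64_t increment_words(uint64_t *r, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (++r[i] != 0)
            return 0;
    return 1;
}

/* Hypothetical split addition: guess that no carry enters the high half,
   then repair the high half with an increment if the guess was wrong. */
uint64_t speculative_add(uint64_t *r, const uint64_t *a,
                         const uint64_t *b, size_t n)
{
    assert(n >= 2);
    size_t lo = n / 2;

    /* These two calls do not depend on each other. */
    uint64_t c_lo = add_words(r, a, b, lo);
    uint64_t c_hi = add_words(r + lo, a + lo, b + lo, n - lo);

    /* Penalty path: the low half produced a carry after all. */
    if (c_lo)
        c_hi |= increment_words(r + lo, n - lo);
    return c_hi;
}
```

The penalty path touches only as many high words as are all ones after the speculative sum, so for random input it usually ends after the first word. The worst case still walks the whole high half.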
> It is well known that, in a model which ignores caches and memory
> bandwidth, one can get 2n/k + log(k) word operation steps for n-word
> addition on k execution agents. Agent j computes the sum of block j with
> both carry-in = 1 and carry-in = 0 and saves both results. The log(k) term is
> for serially choosing the proper block depending on whether carry-in
> happened to specific blocks.
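The scheme described above can be sketched in C as follows. This is only an illustration of the data flow, not GMP code: block_add, carry_select_add and MAX_AGENTS are made-up names, words are assumed to be 64 bits stored least significant first, n is assumed to be a multiple of k, and the "agents" are just loop iterations in one thread. The selection pass here is a plain linear scan over the k blocks, so a parallel prefix step would be needed to actually reach the log(k) term.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_AGENTS 64   /* arbitrary limit for this sketch */

/* r = a + b + cin over n words (least significant word first).
   Returns the carry out, 0 or 1. */
static uint64_t block_add(uint64_t *r, const uint64_t *a,
                          const uint64_t *b, size_t n, uint64_t cin)
{
    uint64_t carry = cin;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = a[i] + carry;
        uint64_t c1 = s < carry;
        r[i] = s + b[i];
        carry = c1 | (r[i] < s);
    }
    return carry;
}

/* Hypothetical carry-select addition over k blocks of n/k words.
   scratch must hold n words; r and scratch must not overlap a or b. */
uint64_t carry_select_add(uint64_t *r, uint64_t *scratch,
                          const uint64_t *a, const uint64_t *b,
                          size_t n, size_t k)
{
    assert(k >= 1 && k <= MAX_AGENTS && n % k == 0);
    size_t len = n / k;
    uint64_t c0[MAX_AGENTS], c1[MAX_AGENTS];

    /* Phase 1: every block is independent. Each agent performs two
       additions of n/k words, which is where 2n/k comes from. */
    for (size_t j = 0; j < k; j++) {
        size_t off = j * len;
        c0[j] = block_add(r + off, a + off, b + off, len, 0);
        c1[j] = block_add(scratch + off, a + off, b + off, len, 1);
    }

    /* Phase 2: walk the blocks and keep the variant matching the real
       carry-in of each block. */
    uint64_t carry = 0;
    for (size_t j = 0; j < k; j++) {
        if (carry) {
            memcpy(r + j * len, scratch + j * len, len * sizeof *r);
            carry = c1[j];
        } else {
            carry = c0[j];
        }
    }
    return carry;
}
```

Note that phase 1 reads each input block twice and writes two result blocks, and phase 2 may copy blocks again. That extra memory traffic is exactly what the model ignores.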
>
> On a cached system, I would expect this algorithm to just slow things
> down.
>
> I thought that RISC-V would produce cheaper and more cores, and that
> single core performance was not that critical.
>
> Slow cores are useful in some applications, sure.
>
> Talking about x86, don't forget that there is microcode below each
> instruction.
>
> This is a false statement. Even if it were true, how is that relevant
> for this discussion? The relevant instructions run in one cycle.
How microcode works and what instruction sequences are optimal for a
bignum adder, I will not go into. My point is just that x86 instructions
are parsed before they are executed. Almost like a VM.
I would guess that if RISC-V executed "N" instructions at a time on the
same logical core w/o using microcode, the performance would be
comparable to x86. Then it would be up to the compiler, and not the
microcode, to lay out the instructions correctly.
--HPS