Performance on riscv32
Torbjörn Granlund
tg at gmplib.org
Mon Mar 30 18:28:34 CEST 2026
Niels Möller <nisse at lysator.liu.se> writes:
I'm not that familiar with riscv, but to me the generated code looks
pretty good under the architectural limitations, and I see no obvious
micro-optimizations (only a single move instruction that appears a bit
redundant). But I may be missing something.
The one thing you could do is to unroll the code (by means of
-funroll-loops, presumably). That would mitigate the problem with
RiscV's weak addressing.
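To illustrate why unrolling helps, here is a hand-unrolled sketch of what -funroll-loops might produce for a simple limb loop. It is not the code under discussion in this thread; the function name and the carry-free add are made up for the example. The point is only that the base ISA offers register+immediate addressing and nothing else, so an unrolled body can reach four limbs through constant offsets and pay for the pointer updates once per four limbs rather than once per limb.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: rp[i] = ap[i] + bp[i], no carry handling.
   Unrolled 4x so that each load/store uses a constant offset
   (0, 4, 8, 12 bytes) from a pointer that is bumped once per pass. */
void limb_add_unrolled4(uint32_t *rp, const uint32_t *ap,
                        const uint32_t *bp, size_t n)
{
    while (n >= 4) {
        rp[0] = ap[0] + bp[0];
        rp[1] = ap[1] + bp[1];
        rp[2] = ap[2] + bp[2];
        rp[3] = ap[3] + bp[3];
        ap += 4;
        bp += 4;
        rp += 4;
        n -= 4;
    }
    /* Tail: handle the remaining 0..3 limbs one at a time. */
    while (n > 0) {
        *rp++ = *ap++ + *bp++;
        n--;
    }
}
```

Whether the compiler's own unrolling ends up as tight as this depends on the compiler and flags, so the generated assembly is worth checking.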
When benchmarking, my ed25519 code is about 50% slower than the
monocypher C library, for the ed25519 signing operation (10 million
cycles vs 7 million). That library appears to use arithmetic based on
nail bits (in GMP terminology), to avoid dealing with low-level carry
propagation (and it also has the advantage of specialized code for the
size of interest). So I wonder, is it possible to get reasonable speed
with full-size limbs (no nails) on this platform? If I could switch from
mini-gmp to full gmp (a bit challenging due to the rather limited
environment with no normal libc), and revive GMP nails code, would that
make sense for performance?
Just like for Alpha and MIPS, nails are necessary. In fact, the quite
new RiscV is not all that different from those decades-old
architectures.
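A rough sketch of the difference, in C. This is not GMP's nails code (GMP's macros are GMP_NAIL_BITS, GMP_NUMB_BITS and GMP_NUMB_MASK); the names below and the choice of 4 nail bits are assumptions made for the example. With full limbs and no carry flag, each carry has to be recovered with unsigned compares. With nails, the spare top bits absorb the carries, so a few additions can be done with plain adds and the carries propagated once afterwards.

```c
#include <stddef.h>
#include <stdint.h>

/* Full 32-bit limbs: no carry flag, so each carry is recovered
   with an unsigned compare. Returns the carry out. */
uint32_t add_n_full(uint32_t *rp, const uint32_t *ap,
                    const uint32_t *bp, size_t n)
{
    uint32_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t s = ap[i] + bp[i];
        uint32_t c1 = s < ap[i];      /* carry from a + b */
        uint32_t r = s + cy;
        uint32_t c2 = r < s;          /* carry from adding carry-in */
        rp[i] = r;
        cy = c1 | c2;                 /* at most one of them is set */
    }
    return cy;
}

/* Assumed layout for the sketch: 4 nail bits, 28 data bits per limb. */
#define NAIL_BITS 4
#define NUMB_BITS (32 - NAIL_BITS)
#define NUMB_MASK ((UINT32_C(1) << NUMB_BITS) - 1)

/* Nails: plain limb-wise add, carries pile up in the nail bits. */
void add_n_nails_lazy(uint32_t *rp, const uint32_t *ap,
                      const uint32_t *bp, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = ap[i] + bp[i];
}

/* Propagate the accumulated carries. Assumes each limb is the sum of
   at most 15 normalized (28-bit) values, so limb + carry cannot wrap. */
uint32_t normalize_nails(uint32_t *rp, size_t n)
{
    uint32_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t t = rp[i] + cy;
        rp[i] = t & NUMB_MASK;
        cy = t >> NUMB_BITS;
    }
    return cy;
}
```

The price of nails is more limbs for the same number of bits (ten 28-bit limbs rather than eight 32-bit limbs for a 255-bit field element, say), so whether it wins has to be measured.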
Unfortunately, we've let nails rot in GMP. I don't expect it to be
terribly hard to make it work again.
There are optional SIMD instructions, and I believe they tried to address
some of the shortcomings of the basic instruction set there. They even
have some carry-support, IIRC. I have no idea if it is well-designed
enough to be practically useful, though.
And beware of the "modular" design of RiscV! Not only is this SIMD stuff
optional (which is unsurprising). Need rotate instructions? Those are
optional too! The full instruction set is weak, and the mandatory basic
instruction set is extremely limited.
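For the rotate case, portable code ends up looking something like the sketch below (the function name is just for illustration). On the base integer ISA this turns into two shifts and an OR, plus the arithmetic for the complementary shift count. As far as I know, the rotate instructions live in the Zbb bit-manipulation extension, and a compiler can only use them if the target is declared to have it.

```c
#include <stdint.h>

/* Rotate left by k bits, written so that k == 0 (and any k >= 32)
   is well defined in C. Without a rotate instruction this is
   shift-left, shift-right and OR. */
uint32_t rotl32(uint32_t x, unsigned k)
{
    k &= 31;
    return (x << k) | (x >> ((32 - k) & 31));
}
```

Compilers generally recognize this idiom, so the same source should pick up a native rotate when one is available.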
--
Torbjörn
Please encrypt, key id 0xC8601622