Best way to carry on 2-input architecture?

Torbjörn Granlund tg at
Mon Aug 18 13:39:00 UTC 2014

nisse at (Niels Möller) writes:

  I also don't know the ARM internals. But short latency between carry in
  and carry out is important to make the umaal and umlal instructions
  useful for bignum multiplication.

One usually cannot afford to have the addend c of an a*b+c operation on
the critical path. But these operations are useful anyway, since the
needed operation is actually a*b+x+y, where x is not on the critical
path, while y is.

The solution is of course to choose the proper one of x and y for the
a*b+c op (that is x, the one off the critical path), and use a plain old
add op for the other.
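
To make the split concrete, here is a minimal C sketch of an addmul_1-style
loop. It is not taken from the GMP sources. The function name
addmul_1_sketch and the 32-bit limb size are assumptions chosen for
illustration. Here a*b is up[i]*v, x is the limb loaded from rp[i] (off the
critical path), and y is the carry-in from the previous iteration (on the
critical path). Whether a compiler actually maps the first statement to a
multiply-accumulate instruction depends on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: rp[0..n-1] += up[0..n-1] * v, returns the carry-out limb.
   Assumes 32-bit limbs so that a 64-bit type holds every intermediate value. */
static uint32_t addmul_1_sketch(uint32_t *rp, const uint32_t *up, size_t n,
                                uint32_t v)
{
    uint32_t cy = 0;                 /* y: carry, the critical-path value */
    for (size_t i = 0; i < n; i++) {
        /* a*b+x: the addend x = rp[i] comes from memory and does not
           depend on the previous iteration. */
        uint64_t t = (uint64_t)up[i] * v + rp[i];
        /* + y: the carry-in is added with a plain add afterwards.
           Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64-1. */
        t += cy;
        rp[i] = (uint32_t)t;         /* low limb of the result */
        cy = (uint32_t)(t >> 32);    /* high limb is the next carry */
    }
    return cy;
}
```

With this arrangement only the plain add and the extraction of the high
limb sit on the loop-carried dependency chain, while the multiplies of
successive iterations are independent of each other.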

There are a few examples of how GMP does this in the ia64 and arm GMP
assembly directories.

Please encrypt, key id 0xC8601622
