Best way to carry on 2-input architecture?
Torbjörn Granlund
tg at gmplib.org
Sun Aug 17 15:05:14 UTC 2014
Both Niels and I have looked into ISAs which support GMP operations
well.
My work is available here: https://gmplib.org/~tege/fisa.pdf
You're right that "umulhi" and addition as well as subtraction with
carry/borrow are critical operations. And for multiply, throughput is
more important than low latency, except that division with few-word
divisors depends on short latency.
I think a separate carry flag might not be good for modern designs.
Carry/borrow state is better to keep in a plain register. You might
want to take a look at the Itanic GMP assembly code, which achieves 1 c/l
(cycle per limb) without any separate carry flag (albeit with some
Itanic-specific condition bit trickery).
You might consider the alternative of a = b + c + (d bitand 1) and a
corresponding subtract, with both low and high (i.e. carry) variants.
Of course, encoding 4 separate operands and needing 3 register reads has
a cost. A trick for denser encoding is to require either that e.g.
a == d, or that d be restricted to (say) the low 8 registers.
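
To make the semantics of the add variants concrete, here is a minimal C
sketch, assuming 64-bit words. The names addc_lo, addc_hi and add_n are
hypothetical and only model the proposed instructions; a real ISA would
produce each result in a single operation.

```c
#include <stddef.h>
#include <stdint.h>

/* Low variant: the sum word of b + c + (d & 1), wrapped mod 2^64. */
static uint64_t addc_lo(uint64_t b, uint64_t c, uint64_t d)
{
    return b + c + (d & 1);
}

/* High variant: the carry out (0 or 1) of b + c + (d & 1). */
static uint64_t addc_hi(uint64_t b, uint64_t c, uint64_t d)
{
    uint64_t s = b + c;
    uint64_t c1 = s < b;        /* carry out of b + c */
    uint64_t t = s + (d & 1);
    uint64_t c2 = t < s;        /* carry out of adding the incoming bit */
    return c1 | c2;             /* at most one of the two can be set */
}

/* n-word addition with the carry kept in a plain register (cy).
   Returns the final carry. rp may be the same array as ap or bp. */
static uint64_t add_n(uint64_t *rp, const uint64_t *ap,
                      const uint64_t *bp, size_t n)
{
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t lo = addc_lo(ap[i], bp[i], cy);
        cy = addc_hi(ap[i], bp[i], cy);
        rp[i] = lo;
    }
    return cy;
}
```

Note that the loop in add_n uses the a == d style of restriction for the
high variant: the carry register is both an input and the destination.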
For multiply, d = (a * b + c) mod B (B being the word base) and
d = [(a * b + c) / B] are very useful. Again, the encoding and read-port
problems might arise.
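
The two multiply-add variants can be sketched in C as follows, assuming
B = 2^64 and a compiler with the unsigned __int128 extension (GCC and
Clang provide it). The names umuladd_lo, umuladd_hi and mul_1 are
hypothetical; mul_1 is only meant to resemble the kind of loop found in
GMP's mpn_mul_1.

```c
#include <stddef.h>
#include <stdint.h>

/* d = (a * b + c) mod B, with B = 2^64. */
static uint64_t umuladd_lo(uint64_t a, uint64_t b, uint64_t c)
{
    return a * b + c;           /* unsigned arithmetic wraps mod 2^64 */
}

/* d = [(a * b + c) / B]. The full value is at most (B-1)^2 + (B-1),
   which is below B^2, so it always fits in 128 bits. */
static uint64_t umuladd_hi(uint64_t a, uint64_t b, uint64_t c)
{
    unsigned __int128 p = (unsigned __int128)a * b + c;
    return (uint64_t)(p >> 64);
}

/* Multiply the n-word number at ap by the single word v, storing the
   low n words at rp and returning the most significant word. */
static uint64_t mul_1(uint64_t *rp, const uint64_t *ap, size_t n,
                      uint64_t v)
{
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t lo = umuladd_lo(ap[i], v, cy);
        cy = umuladd_hi(ap[i], v, cy);
        rp[i] = lo;
    }
    return cy;
}
```

With these two operations the carry word from one step feeds directly
into the next as c, so no separate add-with-carry is needed in the loop.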
--
Torbjörn
Please encrypt, key id 0xC8601622