tg at gmplib.org
Wed Jul 19 22:37:30 UTC 2017
I pushed new files for 2-adic division, using the agreed-upon semantics
and interface. I expanded _q and _qr to a pure _r variant, in order to
lower the register pressure for asm variants (no need for a qp
The only asm file so far is for AMD Zen. This is a more thorough
implementation than our old redc_1.asm code. The new code has special
loops for operands up to 8 limbs, and also does software pipelining of
the quotient computation. (The final quotient limb computation will be
wasted, but that's no real harm.)
The new code runs just a tad bit slower than plain mul_basecase.
We should use this in lieu of redc_1 in mpn/generic/powm.c and
mpn/generic/sec_powm.c. It is a non-trivial thing, as redc_1 and
sbpi1_bdiv_r leave the remainder in different places; redc_1 puts it in
place of the low input dividend limbs while sbpi1_bdiv_r puts it
toward the upper end of the same operand.
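
To make the placement issue concrete, here is a rough sketch of an
in-place 2-adic remainder on toy 32-bit limbs. It is not the GMP code or
its interface: the names toy_binvert, toy_addmul_1 and toy_bdiv_r, the
limb size, and the choice of passing the negated inverse -1/d mod B are
all assumptions made for this illustration only. It shows the low limbs
being cleared one at a time, so that the remainder ends up in the upper
dn limbs of the operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;          /* toy limb, B = 2^32 (assumption) */

/* Inverse of odd d modulo B, by Newton iteration. */
static limb_t toy_binvert(limb_t d)
{
    limb_t x = d;                 /* d*d == 1 (mod 8): 3 correct bits */
    for (int i = 0; i < 4; i++)
        x *= 2 - d * x;           /* doubles the number of correct bits */
    return x;
}

/* {up,n} += {dp,n} * q; returns the carry-out limb. */
static limb_t toy_addmul_1(limb_t *up, const limb_t *dp, size_t n, limb_t q)
{
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)dp[i] * q + up[i] + cy;
        up[i] = (limb_t)t;
        cy = t >> 32;
    }
    return (limb_t)cy;
}

/* In-place 2-adic remainder. Clears the low un-dn limbs of {up,un} by
   adding multiples of {dp,dn}; ninv must be -1/dp[0] mod B. The value R
   with U + Q*D = R * B^(un-dn) is left in the top dn limbs, and the
   return value is the carry out of those limbs (0 or 1). */
static limb_t toy_bdiv_r(limb_t *up, size_t un,
                         const limb_t *dp, size_t dn, limb_t ninv)
{
    assert(dn >= 1 && un >= dn && (dp[0] & 1));
    limb_t top = 0;
    for (size_t i = 0; i + dn < un; i++) {
        limb_t q = up[i] * ninv;                  /* next quotient limb */
        uint64_t cy = toy_addmul_1(up + i, dp, dn, q);  /* up[i] is now 0 */
        for (size_t j = i + dn; j < un && cy; j++) {    /* propagate carry */
            uint64_t t = (uint64_t)up[j] + cy;
            up[j] = (limb_t)t;
            cy = t >> 32;
        }
        top += (limb_t)cy;
    }
    return top;
}
```

For instance, with d = 7 and U = 100 held in two limbs, the single
quotient limb is 2454267012, and 100 + 2454267012*7 = 4 * 2^32, so the
call leaves up[0] = 0 and the remainder 4 in up[1]. A caller written for
the redc_1 placement would expect that value in the low limb instead,
which is the adjustment powm.c and sec_powm.c would need.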
The end goal is to get rid of the redc_* interfaces completely.
Please encrypt, key id 0xC8601622