GCD project status?
Niels Möller
nisse at lysator.liu.se
Thu Sep 19 08:14:45 UTC 2019
tg at gmplib.org (Torbjörn Granlund) writes:
> With some thought, I am sure the table size could come down while the
> accuracy could improve.
I see two ways. Either just do an inversion table, which will cost an
additional multiplication when used. Than one could handle quotients up
to 7 or 8 bits bits with just 256 bytes of table.
Or shift both ni and di, and get quotient with something like
q = tabp[(ni << NBITS) + di] >> (NBITS - (dcnt - ncnt))
Then one could handle quotients up to 5 bits with a table of 1024 bytes
(indexing by 5 significant bits of each of n and d, excluding the most
significant one bit).
> q0 = tabp[(ni << NBITS) + di];
> r0 = n0 - q0 * d0;
> mask = -(mp_limb_t) (r0 >= d0);
> q0 -= mask;
> r0 -= d0 & mask;
You round tabulated q down, and adjust up. I guess one can save one
cycle if one instead rounds it up and adjusts down,
q0 = tabp[(ni << NBITS) + di];
t = q0 * d0;
r0 = n0 - t;
mask = -(mp_limb_t) (n0 < t);
q0 += mask;
r0 += d0 & mask;
since computation of mask and initial r0 can then run in parallel (or
the same instruction, if the compiler is clever enough to use use carry
out). Or is there a risk for overflow in q0 * d0?
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list