gcd_22
Niels Möller
nisse at lysator.liu.se
Tue Aug 20 06:55:05 UTC 2019
tg at gmplib.org (Torbjörn Granlund) writes:
> I think we might get more speed from this algorithm:
>
> while (highlimb(U) | highlimb(V)) != 0
> S = U - V
> T = V - U
> cnt = count_trailing_zeros(lowlimb(S))
> cnd1 = U < V
> cnd2 = highlimb(U) < highlimb(V)
> if (cnd2)
> V = U
> U = T
> else
> U = S
> if (cnd1 != cnd2) { fix mistake! }
> U >>= cnt;
>
> Why would it be faster? Because while cnd1 needs to wait for the
> previous iteration's double-limb shifting, cnd2 depends on just the
> single-limb shifting of the high U world.
Interesting. I did something related in the u±v gcd_22 variant, for
other reasons. We probably don't need any fixup for the V assignment, I
would expect that
V <-- min(U,V)
can be replaced with
V <-- (highlimb(U) < highlimb(V)) ? U : V
with no measurable impact on performance. Since it will make a
difference for at most one iteration, and we still make reasonable
progress when it happens.
--- a/mpn/generic/gcd_22.c Sat Aug 17 00:59:04 2019 +0200
+++ b/mpn/generic/gcd_22.c Tue Aug 20 07:51:25 2019 +0200
@@ -52,7 +53,7 @@ mpn_gcd_22 (mp_limb_t u1, mp_limb_t u0,
{
mp_limb_t vgtu, t1, t0;
sub_ddmmss (t1, t0, u1, u0, v1, v0);
- vgtu = LIMB_HIGHBIT_TO_MASK(t1);
+ vgtu = LIMB_HIGHBIT_TO_MASK(u1 - v1);
if (UNLIKELY (t0 == 0))
{
@@ -91,6 +92,9 @@ mpn_gcd_22 (mp_limb_t u1, mp_limb_t u0,
since t0 != 0. */
u0 = (t0 ^ vgtu) - vgtu;
u1 = t1 ^ vgtu;
+ if (UNLIKELY ((mp_limb_signed_t) u1 < 0))
+ {
+ u0 = ~u0;
+ u1 = -u1;
+ }
if (UNLIKELY (c == GMP_LIMB_BITS))
{
u0 = u1;
I take it the idea is to have an unlikely branch for this case;
attempting a branchless correction of vgtu would miss the point?
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list