bodrato at mail.dm.unipi.it
Fri Dec 29 16:58:53 UTC 2017
On Thu, 28 December 2017 8:14 am, Niels Möller wrote:
> Replacing multiplies by squares is no gain for scalar numbers, but
Of course, but this gives me an idea...
> Let's look closer at one of the functions.
>> mul6 (mp_limb_t m)
>> - mp_limb_t m01 = (m + 0) * (m + 1);
>> - mp_limb_t m23 = (m + 2) * (m + 3);
>> - mp_limb_t m45 = (m + 4) * (m + 5) >> 1;
>> - mp_limb_t m0123 = m01 * m23 >> 3;
>> - return m0123 * m45;
>> + mp_limb_t m05 = (m + 0) * (m + 5);
>> + mp_limb_t m1234 = (m05 + 4) * (m05 + 6) >> 3;
>> + return m1234 * (m05 >> 1);
(m05 + 4) * (m05 + 6) = (m05 + 5) * (m05 + 5) - 1
But m05 is even, so (m05 + 5) is odd, and (m05 + 5)^2 = 1 mod 8.
Hence the -1 only clears the lowest bit, which the shift by 3 discards anyway.
We can write
mp_limb_t m1234 = (m05 + 5) * (m05 + 5) >> 3;
The square is no gain, but one fewer addition is needed.
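
As a quick sanity check of the identity above, here is a minimal sketch
comparing the three variants. It assumes a 64-bit limb (uint64_t standing in
for mp_limb_t); the names limb_t, mul6_orig, mul6_pair, mul6_square and
check_mul6_variants are made up for this example, and m should be kept small
enough that no intermediate product overflows.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* assumption: stand-in for mp_limb_t */

/* Original: three pairwise products, then combine. */
limb_t
mul6_orig (limb_t m)
{
  limb_t m01 = (m + 0) * (m + 1);
  limb_t m23 = (m + 2) * (m + 3);
  limb_t m45 = (m + 4) * (m + 5) >> 1;
  limb_t m0123 = m01 * m23 >> 3;
  return m0123 * m45;
}

/* Proposed: (m+1)(m+4) = m05 + 4 and (m+2)(m+3) = m05 + 6. */
limb_t
mul6_pair (limb_t m)
{
  limb_t m05 = (m + 0) * (m + 5);
  limb_t m1234 = (m05 + 4) * (m05 + 6) >> 3;
  return m1234 * (m05 >> 1);
}

/* Square variant: (m05 + 4)(m05 + 6) = (m05 + 5)^2 - 1, and the -1
   is lost in the shift because the square is 1 mod 8. */
limb_t
mul6_square (limb_t m)
{
  limb_t m05 = (m + 0) * (m + 5);
  limb_t t = m05 + 5;
  limb_t m1234 = t * t >> 3;
  return m1234 * (m05 >> 1);
}

/* Aborts via assert if the variants disagree for some m < limit. */
void
check_mul6_variants (limb_t limit)
{
  for (limb_t m = 0; m < limit; m++)
    {
      assert (mul6_orig (m) == mul6_pair (m));
      assert (mul6_orig (m) == mul6_square (m));
    }
}
```

Calling check_mul6_variants (1000) should pass silently; for m = 1 all three
return 6!/16 = 45.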
> The dependency depth is unchanged, though, since before the change, the
There is one more addition in the chain in my proposed variation.
Torbjörn wrote all those functions, so I'd like to have his opinion.
Just for fun I also wrote a variant of mpz_fac_ui for mini-gmp that halves
void
mpz_fac_ui (mpz_t x, unsigned long n)
{
  unsigned long p = n >> (1 ^ (n & 1));
  long i = n >>= 1;
  mpz_set_ui (x, p + (p == 0));
  /* p can overflow in the loop below,
   * if n > sqrt(ULONG_MAX*8) - 2
   */
  while (--i > 0)
    mpz_mul_ui (x, x, p += i);
  mpz_mul_2exp (x, x, n);
}
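
To see the loop at work without linking mini-gmp, here is a rough sketch of
the same scheme on plain 64-bit integers. It is only an illustration: uint64_t
replaces mpz_t, so it is limited to n <= 20, and the names fac_halved and
fac_naive are invented for this example. As far as I can tell, each value of p
is half the product of one small and one large factor of n!, which is why
about n/2 multiplications plus a final shift by n/2 bits are enough.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the usual n - 1 multiplications. */
uint64_t
fac_naive (unsigned long n)
{
  uint64_t r = 1;
  while (n > 1)
    r *= n--;
  return r;
}

/* Same structure as the mini-gmp variant, with uint64_t as accumulator. */
uint64_t
fac_halved (unsigned long n)
{
  assert (n <= 20);                       /* 21! does not fit in 64 bits */
  unsigned long p = n >> (1 ^ (n & 1));   /* n if n is odd, n/2 if even */
  long i = (long) (n >>= 1);              /* n now holds floor(n/2) */
  uint64_t x = p + (p == 0);              /* start from 1 when n == 0 */
  while (--i > 0)
    x *= (p += (unsigned long) i);        /* next halved pair product */
  return x << n;                          /* restore the factors of 2 */
}
```

For example, fac_halved (5) multiplies 5 by 6 and shifts left by 2, giving
120, and fac_halved (6) computes 3 * 5 * 6 shifted left by 3, giving 720.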
PS: because of a temporary problem with my e-mail, I did not receive
Niels' message and I took it from the archive... that's why the thread was