# Fast mpz_bin_ui

Marco Bodrato bodrato at mail.dm.unipi.it
Fri Dec 29 16:58:53 UTC 2017

```Ciao,

Il Gio, 28 Dicembre 2017 8:14 am, Niels Möller ha scritto:
> Replacing multiplies by squares is no gain for scalar numbers, but

Of course, but this gives me an idea...

> Let's look closer at one of the functions.

>>  mul6 (mp_limb_t m)
>>  {
>> -  mp_limb_t m01 = (m + 0) * (m + 1);
>> -  mp_limb_t m23 = (m + 2) * (m + 3);
>> -  mp_limb_t m45 = (m + 4) * (m + 5) >> 1;
>> -  mp_limb_t m0123 = m01 * m23 >> 3;
>> -  return m0123 * m45;
>> +  mp_limb_t m05 = (m + 0) * (m + 5);
>> +  mp_limb_t m1234 = (m05 + 4) * (m05 + 6) >> 3;
>> +  return m1234 * (m05 >> 1);
>>  }

(m05 + 4) * (m05 + 6) = (m05 + 5) * (m05 + 5) - 1
But m05 is even, so (m05 + 5) is odd, and (m05 + 5)^2 = 1 mod 8

We can write
mp_limb_t m1234 = (m05 + 5) * (m05 + 5) >> 3;

The square is no gain, but one less addition is needed.

> The dependency depth is unchanged, though, since before the change, the

There is one more addition in the chain in my proposed variation.

Torbjörn wrote all those function, so I'd like to have his opinion.

Just for fun I wrote also a variant of mpz_fac_ui for mini-gmp that halves
the multiplications...

void
mpz_fac_ui (mpz_t x, unsigned long n)
{
unsigned long p = n >> (1 ^ n & 1);
long i = n >>= 1;
mpz_set_ui (x, p + (p == 0));
/* p can overflow in the loop below,
* if n > sqrt(ULONG_MAX*8) - 2
*/
while (--i > 0)
mpz_mul_ui (x, x, p += i);
mpz_mul_2exp (x, x, n);
}

Ĝis,
m

PS: because of a temporary problem with my e-mail, I did not receive
Niels' message and I took it from the archive... that's why the thread was
broken.

--
http://bodrato.it/papers/

```