Fast mpz_bin_ui

Marco Bodrato bodrato at
Wed Dec 27 23:31:41 UTC 2017


Il Gio, 12 Ottobre 2017 9:03 pm, Niels Möller ha scritto:
> So for k!, I'd suggest just using mpz_oddfac_1 (and it would make sense
> to me to change it to return the exponent for the omitted power of two,
> which from the implementation of mpz_fac_ui seems to be k -
> popcount(k)).

> For n^_{k}, I would suggest a function to compute the odd part and
> return the exponent of the power of two, which would repeatedly split
> in even and odd numbers and compute products of the form

Some years ago I published some code recursively computing the odd part of
the numerator and denominator to compute the binomial...

...maybe i should revive it, but it uses some complex thecnique and I'm
not sure it's worth doing.

>>   n (n-2)(n-4)(n-6) = [(n-3)^2 - 5]^2 - 25
> I like this trick, replacing three multiplies by two squares (but it's

An interesting idea in Johansson's code is not just the fact that it uses
a square instead of a multiplication, but the fact that the result of that
square is recycled many times; updating it instead of recomputing it again
and again.

His idea means replacing half of the multiplications with a single square
and some linear algebra.

That idea can be improved.

Instead of computing n^2 and "correct" it to obtain (n)(n-1), then
(n-2)(n-3) and so on...

We can compute p=(n)(n-k+1) and observe that

p' = (n-1)(n-k+2)= p + k - 2

p" = (n-2)(n-k+3)= p'+ k - 4

p"'= (n-3)(n-k+4)= p"+ k - 6

... and so on, with a minimal update cost; we can replace half of the

I wrote a code implementing this idea obtaining a farily fast alternative
to current code. It is attached, it's a drop-in replacement for the
current mpz/bin_ui.c

> One could try some sieving and instead produce n^_{k} as a product
> of prime powers for all primes <= k and a residual, to be able to
> easily subtract the corresonding prime-power representation of k!.

The old code I wrote in 2011 was able to remove many factors before
multiplying. Is it worth doing? If we have A, B, C and D. A and B
approximatively balanced, both much larger than C and D, with C exactly
dividing both A and B...

Should we compute (A/C)*(B/C) / D or it is faster to compute (A*B)/(C^2*D) ?

By the way, my next step will be to develop your idea of grouping factors
four by four. If k=2i is even, using the notation above we can write

p*p'= p(p+2i-2) = (p+i-1)^2 - (i-1)^2

p"*p"'= p"(p"+2i-6) = (p"+i-3)^2 - (i-3)^2

... and if k = 0 mod 4, we can compute each product by updating the
previous one...


-------------- next part --------------
A non-text attachment was scrubbed...
Name: bin_ui.c
Type: text/x-csrc
Size: 6551 bytes
Desc: not available
URL: <>

More information about the gmp-devel mailing list