Divide and conquer binomial

Niels Möller nisse at lysator.liu.se
Wed Dec 14 22:42:39 CET 2011


Let me summarize my understanding of the binomial, and see if anybody
has any additions. I'm aware of a couple of different algorithms:

1. When the result is small (single limb, possibly with extensions to a
   few limbs):

   Tabulate the odd part of n! mod B and the corresponding mod B
   inverse. Then one can compute the odd part of bin(n, k) using n! /
   (k! (n-k)!) and these tables. To deal with the powers of two, use the
   cute formula

     ctz(n!) = n - popcount(n)

   (ctz is count trailing zeros, aka 2-adic valuation, often denoted
   v_2).

2. When k is small and n is large, just multiply together n (n-1) ...
   (n-k+1), preferably in an order using roughly balanced
   multiplications, and divide by k! using tabulated values and
   inverses.

3. When k and n are of moderate size, so that it is practical to iterate
   over all primes <= n, count multiplicity of each prime in dividend
   and divisor. I think Marco has implemented this.

4. When k is of moderate size, and n is too large for iterating over the
   primes, but n and k still small enough so that bin(n, k) fits in
   available storage, it would make sense to use some divide and conquer
   algorithm. For example, bin(2^80, 2^20), which is approximately
   2^{64427324.069447006302169138472946275466} (using pari/gp with
   default precision and the lngamma function) and needs 61 Mb of
   storage.

   First, one question is what complexity to expect. The result is of
   size N = O(k log n), so I don't think one can't hope for anything
   better than O(M(k log n)). I think one should be happy if one get
   within a log N factor of that.
   
   One possibly useful identity (assume for simplicity that k is
   even) is:

     bin(n, k) = bin(n, k/2) bin(n-k/2, k/2) / bin(k, k/2)

   Here, we halve the size of k, getting two recursive calls with
   essentially the same n (which means that we will unfortunately not
   get to use (3) as a base case, only (2). And one recursive call for
   the adjustment factor bin(k, k/2).

   I think Marco has tried this or something similar.

   * If the complexity is supposed to be close to linear in k, it seems
     right to divide into two pieces with size k/2. It should save work
     by both (i) getting a reasonably balanced multiplication tree, and
     (ii) divide out many common factors early.

   * What about bin(k, k/2)? I imagine that k is typically in range for
     method (3). Or are there any other shortcuts for this particular
     form?

   * The factor bin(k, k/2) is still pretty large, of size close to k
     (i.e., value on the order of 2^k, or 2^{2^20} in the example
     bin(2^80, 2^20)).

     If we can somehow divide out most of those factors earlier, that
     should save considerable work.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.



More information about the gmp-devel mailing list