Binomial improvements

Torbjorn Granlund tg at gmplib.org
Sun Dec 11 23:03:24 CET 2011


nisse at lysator.liu.se (Niels Möller) writes:

  Would that work (I haven't thought very carefully about the math)? The
  size of the needed tables is the same, and it ought to give a decent
  speedup. fac_inv[k] holds the mod B inverse of the odd part of k!.
  
Sorry about the mail crossing...it works, albeit in a slightly narrower
range.

I choose to start with the "division", initialising the c variable from
the table.

This looses 4 bits, resulting from the shifts, compared to the last
code.  (I think I understand why this needs 4 bit extra, but I don't
understand why the last code does not need that...)

I suppose we could speed things up 10%-20%, and presumably get back the
4 bits if we shifted i and not c in the loop.  (Then the recurrency path
would consist of just back-to-back multiply insns.)
  
I think this code is a good obfuscation of n-choose-k...  :-)

Here is the new code:

static mp_limb_t facinv[] = {
#if GMP_NUMB_BITS == 64
  0x0000000000000001,  /*  1 */  0x0000000000000001,  /*  2 */
  0xaaaaaaaaaaaaaaab,  /*  3 */  0xaaaaaaaaaaaaaaab,  /*  4 */
  0xeeeeeeeeeeeeeeef,  /*  5 */  0x4fa4fa4fa4fa4fa5,  /*  6 */
  0x2ff2ff2ff2ff2ff3,  /*  7 */  0x2ff2ff2ff2ff2ff3,  /*  8 */
  0x938cc70553e3771b,  /*  9 */  0xb71c27cddd93e49f,  /* 10 */
  0xb38e3229fcdee63d,  /* 11 */  0xe684bb63544a4cbf,  /* 12 */
  0xc2f684917ca340fb,  /* 13 */  0xf747c9cba417526d,  /* 14 */
  0xbb26eb51d7bd49c3,  /* 15 */  0xbb26eb51d7bd49c3,  /* 16 */
  0xb0a7efb985294093,  /* 17 */  0xbe4b8c69f259eabb,  /* 18 */
  0x6854d17ed6dc4fb9,  /* 19 */  0xe1aa904c915f4325,  /* 20 */
  0x3b8206df131cead1,  /* 21 */  0x79c6009fea76fe13,  /* 22 */
  0xd8c5d381633cd365,  /* 23 */  0x4841f12b21144677,  /* 24 */
  0x4a91ff68200b0d0f,  /* 25 */  0x8f9513a58c4f9e8b,  /* 26 */
  0x2b3e690621a42251,  /* 27 */  0x4f520f00e03c04e7,  /* 28 */
  0x2edf84ee600211d3,  /* 29 */  0xadcaa2764aaacdfd,  /* 30 */
  0x161f4f9033f4fe63,  /* 31 */  0x161f4f9033f4fe63   /* 32 */
#endif
#if GMP_NUMB_BITS == 32
  0x00000001,  /*  1 */  0x00000001,  /*  2 */
  0xaaaaaaab,  /*  3 */  0xaaaaaaab,  /*  4 */
  0xeeeeeeef,  /*  5 */  0xa4fa4fa5,  /*  6 */
  0xf2ff2ff3,  /*  7 */  0xf2ff2ff3,  /*  8 */
  0x53e3771b,  /*  9 */  0xdd93e49f,  /* 10 */
  0xfcdee63d,  /* 11 */  0x544a4cbf,  /* 12 */
  0x7ca340fb,  /* 13 */  0xa417526d,  /* 14 */
  0xd7bd49c3,  /* 15 */  0xd7bd49c3   /* 16 */
#endif
};

static unsigned char ctz[] = {0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,5};

void
mpz_bc_bin_uiui (mpz_ptr r, unsigned long int n, unsigned long int k)
{
  mp_limb_t i, j, c;

  c = facinv[k - 1];
  i = n - k + 1;
  for (j = 0; j < k; j++)
    {
      c = c * i;
      c = c >> ctz[j];
      i++;
    }

  c = c & ((CNST_LIMB(1) << GMP_NUMB_BITS-4) - 1);

  PTR(r)[0] = c;
  SIZ(r) = 1;
}

-- 
Torbjörn


More information about the gmp-devel mailing list