Binomial improvements
Torbjorn Granlund
tg at gmplib.org
Sun Dec 11 23:03:24 CET 2011
nisse at lysator.liu.se (Niels Möller) writes:
Would that work (I haven't thought very carefully about the math)? The
size of the needed tables is the same, and it ought to give a decent
speedup. fac_inv[k] holds the mod B inverse of the odd part of k!.
Sorry about the mail crossing...it works, albeit in a slightly narrower
range.
I choose to start with the "division", initialising the c variable from
the table.
This looses 4 bits, resulting from the shifts, compared to the last
code. (I think I understand why this needs 4 bit extra, but I don't
understand why the last code does not need that...)
I suppose we could speed things up 10%-20%, and presumably get back the
4 bits if we shifted i and not c in the loop. (Then the recurrency path
would consist of just back-to-back multiply insns.)
I think this code is a good obfuscation of n-choose-k... :-)
Here is the new code:
static mp_limb_t facinv[] = {
#if GMP_NUMB_BITS == 64
0x0000000000000001, /* 1 */ 0x0000000000000001, /* 2 */
0xaaaaaaaaaaaaaaab, /* 3 */ 0xaaaaaaaaaaaaaaab, /* 4 */
0xeeeeeeeeeeeeeeef, /* 5 */ 0x4fa4fa4fa4fa4fa5, /* 6 */
0x2ff2ff2ff2ff2ff3, /* 7 */ 0x2ff2ff2ff2ff2ff3, /* 8 */
0x938cc70553e3771b, /* 9 */ 0xb71c27cddd93e49f, /* 10 */
0xb38e3229fcdee63d, /* 11 */ 0xe684bb63544a4cbf, /* 12 */
0xc2f684917ca340fb, /* 13 */ 0xf747c9cba417526d, /* 14 */
0xbb26eb51d7bd49c3, /* 15 */ 0xbb26eb51d7bd49c3, /* 16 */
0xb0a7efb985294093, /* 17 */ 0xbe4b8c69f259eabb, /* 18 */
0x6854d17ed6dc4fb9, /* 19 */ 0xe1aa904c915f4325, /* 20 */
0x3b8206df131cead1, /* 21 */ 0x79c6009fea76fe13, /* 22 */
0xd8c5d381633cd365, /* 23 */ 0x4841f12b21144677, /* 24 */
0x4a91ff68200b0d0f, /* 25 */ 0x8f9513a58c4f9e8b, /* 26 */
0x2b3e690621a42251, /* 27 */ 0x4f520f00e03c04e7, /* 28 */
0x2edf84ee600211d3, /* 29 */ 0xadcaa2764aaacdfd, /* 30 */
0x161f4f9033f4fe63, /* 31 */ 0x161f4f9033f4fe63 /* 32 */
#endif
#if GMP_NUMB_BITS == 32
0x00000001, /* 1 */ 0x00000001, /* 2 */
0xaaaaaaab, /* 3 */ 0xaaaaaaab, /* 4 */
0xeeeeeeef, /* 5 */ 0xa4fa4fa5, /* 6 */
0xf2ff2ff3, /* 7 */ 0xf2ff2ff3, /* 8 */
0x53e3771b, /* 9 */ 0xdd93e49f, /* 10 */
0xfcdee63d, /* 11 */ 0x544a4cbf, /* 12 */
0x7ca340fb, /* 13 */ 0xa417526d, /* 14 */
0xd7bd49c3, /* 15 */ 0xd7bd49c3 /* 16 */
#endif
};
static unsigned char ctz[] = {0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,5};
void
mpz_bc_bin_uiui (mpz_ptr r, unsigned long int n, unsigned long int k)
{
mp_limb_t i, j, c;
c = facinv[k - 1];
i = n - k + 1;
for (j = 0; j < k; j++)
{
c = c * i;
c = c >> ctz[j];
i++;
}
c = c & ((CNST_LIMB(1) << GMP_NUMB_BITS-4) - 1);
PTR(r)[0] = c;
SIZ(r) = 1;
}
--
Torbjörn
More information about the gmp-devel
mailing list