mpz_jacobi

Niels Möller nisse at lysator.liu.se
Thu Apr 22 11:44:24 CEST 2010


Torbjorn Granlund <tg at gmplib.org> writes:

>   * div1 and div2 identical to those in hgcd2.c,
>   
> These are typically inlined, aren't they?  Then the only problem is
> source code duplication.

Right.

>   * mpn_hgcd2_jacobi almost the same as mpn_hgcd2 (but for performance
>     this makes some sense. Benchmarking of jacobi vs gcd in the Lehmer
>     range would be of some interest).
>   
> Cannot speed already do that for us?

It can. Before:

overhead 0.000000010 secs, precision 10000 units of 1.67e-09 secs, CPU freq 600.00 MHz
              mpn_gcd    mpz_jacobi
1        #0.000000743        1.0570
2        #0.000001509        3.0255
3        #0.000005574        1.4893
4        #0.000008193        1.5063
5        #0.000010814        1.5662
6        #0.000013368        1.5956
7        #0.000016032        1.6998
8        #0.000018702        1.6430
9        #0.000021455        1.7903
10       #0.000024659        1.7064
11       #0.000027480        1.8332
12       #0.000030912        1.7154
13       #0.000034158        1.8432
14       #0.000037302        1.7355
15       #0.000039882        1.8945
16       #0.000042906        1.8057
17       #0.000045804        1.8985
18       #0.000049174        1.7665
19       #0.000052844        1.9063
20       #0.000056709        1.7568
21       #0.000059450        1.9489
22       #0.000062424        1.8196
23       #0.000065798        1.9711
24       #0.000069897        1.8136
25       #0.000073083        2.0123
26       #0.000077093        1.8539
27       #0.000080217        2.0286
28       #0.000083045        1.9071
29       #0.000086297        2.0778
30       #0.000091596        1.8784

After:

overhead 0.000000010 secs, precision 10000 units of 1.67e-09 secs, CPU freq 600.00 MHz
              mpn_gcd    mpz_jacobi
1        #0.000000743        1.2433
2        #0.000001512        1.3652
3        #0.000005549        1.0678
4        #0.000008194        1.0657
5        #0.000010799        1.0695
6        #0.000013371        1.0557
7        #0.000016004        1.0543
8        #0.000018701        1.0539
9        #0.000021446        1.0475
10       #0.000024703        1.0419
11       #0.000027499        1.0584
12       #0.000030861        1.0490
13       #0.000034139        1.0377
14       #0.000037136        1.0399
15       #0.000039961        1.0417
16       #0.000042900        1.0509
17       #0.000045740        1.0469
18       #0.000049126        1.0469
19       #0.000052971        1.0331
20       #0.000056848        1.0257
21       #0.000059378        1.0400
22       #0.000062482        1.0476
23       #0.000065990        1.0470
24       #0.000069855        1.0388
25       #0.000073120        1.0455
26       #0.000076873        1.0381
27       #0.000080559        1.0373
28       #0.000083166        1.0501
29       #0.000086714        1.0561
30       #0.000091266        1.0447

It looks like the linear term is quite significant, and that Jacobi is
some 4% slower than gcd (if I run speed -C -d, I get the difference to
about 70 cycles per limb. All this running on my x86_32 laptop. The
direct additional cost is one o two table lookups per quotient, but
there may also be the indirect cost of higher register pressure in
hgcd2.

I haven't looked at the assembler code for mpn_hgcd2 and
mpn_hgcd2_jacobi (these functions are the main part of for the linear
term of the running time).

>   * The basecase function mpn_jacobi_2 is longer and more complicated than
>     I like. Maybe it can be simplified to eliminate some of the cases.
>   
> Is the JACOBI_BASE_METHOD stuff still relevant?

There are two basecase functions:

  jacbase.c:mpn_jacobi_base, which handles single limb numbers, and ses
  JACOBI_BASE_METHOD to select between 4 different variants.

  jacobi_lehmer.c:mpn_jacobi_2, which handles double limb numbers (and
  calls mpn_jacobi_base once they are reduced to single limbs) Uses
  JACOBI_2_METHOD to select between two different variants.

BTW, it would be nice if the thresholds web page showed
JACOBI_BASE_METHOD and other tuned *_METHOD. It would be even better if
it also showed the next best method (needs additional output from
tuneup).

Currently, I have no idea which values of JACOBI_BASE_METHOD are still
relevant.

> Oh, so the new code is never used from mpz, currently?

Correct.

> Are here any tests of the new code now, at the mpn level?

No. To test it, it must be enabled in the source code and then
tests/mpz/t-jac will test it.

/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list