Jacobi symbol using Lehmer's algorithm.
bodrato at mail.dm.unipi.it
bodrato at mail.dm.unipi.it
Wed Jan 27 19:26:08 CET 2010
Hi!
I have again some time for the project.
> > New version (the file is getting rather large now) at
> > http://www.lysator.liu.se/~nisse/misc/jacobi.c
While configuring and tuning GMP on an atom-32, I tested Niels' code.
Great! It works smoothly and a lot faster than current code!
> And here's a benchmark on shell:
> n old lehmer subquadratic
> 1000 36827.000 0.121 0.099
> 3000 300266.000 0.118 0.058
> 10000 3564051.000 0.103 0.029
On the 32-bits atom I got (bits per limbs are halved, then I doubled limbs)
2000 396025.000 0.182 0.111
6000 3712232.000 0.176 0.064
20000 42794674.000 0.169 0.029
The speed-up for the subquadratic algorithm is basically the same!
> So...what is the speedup @ n=100000?
You mean 6400000 bits? I got:
200000 4773830346.00 0.15876 0.00531
it's a 188x speed-up, current code took 1 hour and 20 minutes...
Niels' used 26 seconds!
> Very nice, a factor 34! But Paul got a factor 100. :-)
I tested also Paul's code. With the same (20.000 and 200.000) sizes. Its
speed is comparable with the new subquadratic code by Niels, but slightly
slower.
Obviously the comparison is unfair, because Niels' code uses the two
thresholds:
#define HGCD_JACOBI_THRESHOLD HGCD_THRESHOLD
#define JACOBI_DC_THRESHOLD GCD_DC_THRESHOLD
based on tuned threshold.
Paul's code needs some thresholds too, but their value is hard-coded and I
did not even try to tune them...
For sure their speed are comparable. That's why I decided to run a test
comparing _results_ given by the two sub-quadratic codes (with even a and
odd b), they hopefully do not share the same bugs. No differences have
been found, yet.
Regards,
Marco
--
http://bodrato.it/
More information about the gmp-devel
mailing list