div_qr_2

Niels Möller nisse at lysator.liu.se
Mon Mar 21 23:03:58 CET 2011


I've now added a 4/2 division loop, based on earlier work of Torbjörn's.
It generates two quotient limbs at a time, while the old 3/2 loop
generates one quotient limb at a time.
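
To illustrate the difference in loop structure, here is a toy sketch in
C. It is not the GMP code: it assumes 16-bit limbs so that a four-limb
value fits in a uint64_t, it uses plain hardware division rather than
the precomputed inverse a real implementation would presumably use, and
all names (limb_t, toy_div_3by2, toy_div_4by2) are made up for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy limb type (assumption): 16 bits, so 4 limbs fit in a uint64_t. */
typedef uint16_t limb_t;

/* 3/2 style: each iteration divides a 3-limb value by the 2-limb
   divisor d and produces ONE quotient limb.
   np[0..nn-1] is the numerator, least significant limb first.
   Writes nn quotient limbs to qp and returns the remainder (< d). */
static uint32_t
toy_div_3by2 (limb_t *qp, const limb_t *np, size_t nn, uint32_t d)
{
  uint64_t r = 0;
  assert (d != 0);
  for (size_t i = nn; i-- > 0; )
    {
      uint64_t t = (r << 16) | np[i];   /* r < d, so t / d < 2^16 */
      qp[i] = (limb_t) (t / d);
      r = t % d;
    }
  return (uint32_t) r;
}

/* 4/2 style: each iteration divides a 4-limb value by d and produces
   TWO quotient limbs. An odd leading limb is handled separately. */
static uint32_t
toy_div_4by2 (limb_t *qp, const limb_t *np, size_t nn, uint32_t d)
{
  uint64_t r = 0;
  size_t i = nn;
  assert (d != 0);
  if (i & 1)
    {
      i--;
      qp[i] = (limb_t) (np[i] / d);
      r = np[i] % d;
    }
  while (i >= 2)
    {
      i -= 2;
      /* r < d < 2^32, so t fits in 64 bits and t / d < 2^32. */
      uint64_t t = (r << 32) | ((uint64_t) np[i + 1] << 16) | np[i];
      uint64_t q = t / d;
      qp[i + 1] = (limb_t) (q >> 16);
      qp[i] = (limb_t) q;
      r = t % d;
    }
  return (uint32_t) r;
}
```

Both functions compute the same quotient and remainder; the second one
just needs about half as many iterations.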

The choice between the two loops is tuned via DIV_QR_2_PI2_THRESHOLD,
which by default is infinity. I'm actually not sure if the 4/2 C code
can beat the 3/2 division anywhere, but at least Torbjörn said that an
assembler implementation can be competitive.

But I added tuneup code (with check_size=500, so if the 4/2 loop is not
better there, the threshold is set to infinity). It will be interesting
to see the results from the nightly build.
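
As a rough sketch of what an "infinite" threshold and the check_size
fallback mean in practice, consider the following C fragment. All names
here (TOY_THRESHOLD_INFINITY, toy_choose_loop, toy_tune_threshold) are
hypothetical and the timings are passed in as plain numbers; this is
not how the real tuneup program is written, only the decision it is
described as making above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an "infinite" threshold: no size reaches it. */
#define TOY_THRESHOLD_INFINITY SIZE_MAX

enum toy_loop { TOY_LOOP_3BY2, TOY_LOOP_4BY2 };

/* Select the loop for an nn-limb numerator: the 4/2 loop is used only
   at or above the threshold, and never if the threshold is infinite. */
static enum toy_loop
toy_choose_loop (size_t nn, size_t threshold)
{
  if (threshold == TOY_THRESHOLD_INFINITY || nn < threshold)
    return TOY_LOOP_3BY2;
  return TOY_LOOP_4BY2;
}

/* Fallback rule: if the 4/2 loop is not faster at the check size,
   disable it by returning infinity; otherwise keep the crossover
   that was measured (assumed to be supplied by the caller). */
static size_t
toy_tune_threshold (double t_3by2_at_check, double t_4by2_at_check,
                    size_t measured_crossover)
{
  if (t_4by2_at_check >= t_3by2_at_check)
    return TOY_THRESHOLD_INFINITY;
  return measured_crossover;
}
```

With the default (infinite) threshold, toy_choose_loop always returns
TOY_LOOP_3BY2, which matches the current behaviour when no tuned value
is available.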

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.


