Alternative div_qr_1

Niels Möller nisse at lysator.liu.se
Tue Jun 15 20:17:58 CEST 2010


Torbjorn Granlund <tg at gmplib.org> writes:

> I can only reproduce *some* data dependency, but not quite like that:

This is what I get on shell:

$ ./speed -s1000-1005,2000-2005 -C mpn_mod_1_1.0xabcdef0123456780 mpn_mod_1_1.0xbcdef0123456780
overhead 6.06 cycles, precision 10000 units of 3.75e-10 secs, CPU freq 2668.53 MHz
        mpn_mod_1_1.0xabcdef0123456780 mpn_mod_1_1.0xbcdef0123456780
1000          11.5400      #11.2120
1001          11.5405      #11.2128
1002          11.5449      #11.2096
1003          11.5454      #11.2104
1004          11.5418      #11.2112
1005          11.5423      #11.2080
2000          11.4540      #11.1060
2001          11.4523      #11.1044
2002          11.4525      #11.1049
2003          11.4548      #11.1073
2004          11.4531      #11.1038
2005          11.4534      #11.1062
              normalized   unnormalized

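That is a fairly consistent difference of around 0.3 cycles per limb
(roughly 11.54 vs 11.21 near size 1000, and 11.45 vs 11.11 near size
2000), with the unnormalized divisor being the faster one inside the
loop. On the other hand, the unnormalized case implies more work to do
outside of the loop.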
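
For reference, "normalized" here means that the divisor has its most
significant bit set. A minimal sketch of that distinction, assuming
64-bit limbs; the helper names is_normalized and normalization_shift
are made up for illustration and are not GMP functions:

```c
#include <stdint.h>

/* Assumption: one limb is 64 bits wide. */
#define LIMB_HIGHBIT (UINT64_C(1) << 63)

/* Returns 1 if the top bit of d is set, i.e. d is normalized. */
static int is_normalized(uint64_t d)
{
    return (d & LIMB_HIGHBIT) != 0;
}

/* Number of left shifts needed to set the top bit of d.
   Precondition: d != 0 (otherwise the loop would not terminate). */
static int normalization_shift(uint64_t d)
{
    int cnt = 0;
    while ((d & LIMB_HIGHBIT) == 0) {
        d <<= 1;
        cnt++;
    }
    return cnt;
}
```

With the two divisors from the benchmark, 0xabcdef0123456780 needs no
shift, while 0xbcdef0123456780 has four leading zero bits and needs a
shift of 4.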

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

