Alternative div_qr_1
Niels Möller
nisse at lysator.liu.se
Tue Jun 15 20:17:58 CEST 2010
Torbjorn Granlund <tg at gmplib.org> writes:
> I can only reproduce *some* data dependency, but not quite like that:
This is what I get on shell:
$ ./speed -s1000-1005,2000-2005 -C mpn_mod_1_1.0xabcdef0123456780 mpn_mod_1_1.0xbcdef0123456780
overhead 6.06 cycles, precision 10000 units of 3.75e-10 secs, CPU freq 2668.53 MHz
        mpn_mod_1_1.0xabcdef0123456780  mpn_mod_1_1.0xbcdef0123456780
1000    11.5400                         #11.2120
1001    11.5405                         #11.2128
1002    11.5449                         #11.2096
1003    11.5454                         #11.2104
1004    11.5418                         #11.2112
1005    11.5423                         #11.2080
2000    11.4540                         #11.1060
2001    11.4523                         #11.1044
2002    11.4525                         #11.1049
2003    11.4548                         #11.1073
2004    11.4531                         #11.1038
2005    11.4534                         #11.1062
        normalized                      unnormalized
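For reference, the normalized/unnormalized labels can be checked with a small sketch. It assumes 64-bit limbs (the limb size is not stated in the output above) and uses the usual convention that a divisor is normalized when its most significant limb bit is set. The helper names are made up for illustration and are not GMP functions.

```python
LIMB_BITS = 64  # assumption: 64-bit limbs, not stated in the speed output


def leading_zeros(d, bits=LIMB_BITS):
    """Number of zero bits above the most significant set bit of d."""
    if not 0 < d < (1 << bits):
        raise ValueError("divisor must be a nonzero single limb")
    return bits - d.bit_length()


def is_normalized(d, bits=LIMB_BITS):
    """A divisor is normalized when its most significant limb bit is set."""
    return leading_zeros(d, bits) == 0


# The two divisors passed to mpn_mod_1_1 on the speed command line.
for d in (0xabcdef0123456780, 0xbcdef0123456780):
    print(f"{d:#x}: {leading_zeros(d)} leading zero bits, "
          f"normalized={is_normalized(d)}")
```

Under that assumption the first divisor has its top bit set, while the second is one hex digit shorter and so has four leading zero bits.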
The difference is fairly consistent, around 0.3 cycles per limb in favor
of the unnormalized divisor. And on top of that, the unnormalized case
implies more work to do outside of the loop.
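The per-size gap can be recomputed from the figures above with a short sketch. The numbers are copied from the speed output; the variable names are only illustrative.

```python
from statistics import mean

# (size in limbs, normalized cycles/limb, unnormalized cycles/limb),
# copied from the speed output above.
rows = [
    (1000, 11.5400, 11.2120),
    (1001, 11.5405, 11.2128),
    (1002, 11.5449, 11.2096),
    (1003, 11.5454, 11.2104),
    (1004, 11.5418, 11.2112),
    (1005, 11.5423, 11.2080),
    (2000, 11.4540, 11.1060),
    (2001, 11.4523, 11.1044),
    (2002, 11.4525, 11.1049),
    (2003, 11.4548, 11.1073),
    (2004, 11.4531, 11.1038),
    (2005, 11.4534, 11.1062),
]

# Positive values mean the normalized divisor is slower per limb.
diffs = [norm - unnorm for _, norm, unnorm in rows]

for (size, _, _), diff in zip(rows, diffs):
    print(f"{size:5d}  {diff:.4f}")
print(f"min {min(diffs):.4f}  max {max(diffs):.4f}  mean {mean(diffs):.4f}")
```

Every difference lands between 0.32 and 0.35 cycles per limb, slightly larger for the 2000-limb sizes than for the 1000-limb ones.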
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.