div_qr_1n_pi1
Niels Möller
nisse at lysator.liu.se
Sun Jun 6 20:05:59 UTC 2021
nisse at lysator.liu.se (Niels Möller) writes:
> $ ./speed -p 1000000 -s 2-20 -C mpn_div_qr_1n_pi1.0x8765432108765432 mpn_div_qr_1n_pi1_1.0x8765432108765432 mpn_div_qr_1n_pi1_2.0x8765432108765432 mpn_div_qr_1n_pi1_3.0x8765432108765432 mpn_div_qr_1n_pi1_4.0x8765432108765432
> overhead 2.63 cycles, precision 1000000 units of 7.16e-10 secs, CPU freq 1396.05 MHz
> mpn_div_qr_1n_pi1.0x8765432108765432 mpn_div_qr_1n_pi1_1.0x8765432108765432 mpn_div_qr_1n_pi1_2.0x8765432108765432 mpn_div_qr_1n_pi1_3.0x8765432108765432 mpn_div_qr_1n_pi1_4.0x8765432108765432
>  2   4.4566  #3.9635   5.8490   5.4085   6.0087
>  3   4.4323  #4.2441   5.8115   5.1158   5.7832
>  4   4.5348  #4.3992   5.7306   5.0807   5.8611
>  5  #4.5534   4.6698   5.7493   4.9803   5.5605
>  6  #4.5653   4.8412   5.7497   4.9129   5.6516
>  7  #4.6069   5.1388   5.7388   4.8811   5.6110
>  8  #4.6202   5.5073   5.7423   4.9359   5.5695
>  9  #4.6433   5.7341   5.7537   4.9357   5.5407
> 10  #4.6436   5.9595   5.7400   4.9231   5.5428
> 11  #4.6698   6.1348   5.7449   4.9430   5.5237
> 12  #4.6395   6.2541   5.7378   4.9452   5.5239
> 13  #4.6905   6.3761   5.7373   4.9482   5.5085
> 14  #4.6700   6.4692   5.8173   4.9447   5.5006
> 15  #4.6643   6.5548   5.7426   4.9644   5.4958
> 16  #4.6809   6.6305   5.7439   4.9625   5.4924
> 17  #4.6800   6.6901   5.7418   4.9576   5.4760
> 18  #4.6903   6.7440   5.7436   4.9866   5.4840
> 19  #4.6886   6.7891   5.7366   4.9753   5.4872
> 20  #4.6818   6.8370   5.7405   4.9783   5.4820
>         asm method 1 method 2 method 3 method 4
And I don't quite trust these cycle numbers; they should probably be
twice as large, on the order of 10 cycles/limb for all variants. Less
than 5 cycles per limb is too good to be true, right?
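
To spell out the arithmetic behind that suspicion, here is a minimal sketch.
It assumes, purely hypothetically, that the real clock runs at twice the
1396.05 MHz that speed printed; the helper name rescale_cycles and the factor
of two are illustrative and not part of the speed program. Under that
assumption the 4.6818 cycles reported for the asm loop at size 20 correspond
to about 3.35 ns per limb, or roughly 9.4 cycles at the doubled clock.

```python
def rescale_cycles(reported_cycles, assumed_hz, actual_hz):
    """Convert a cycle count computed with assumed_hz into cycles at actual_hz."""
    seconds = reported_cycles / assumed_hz  # wall-clock time per limb
    return seconds * actual_hz


ASSUMED_HZ = 1396.05e6       # frequency printed by speed in the run above
ACTUAL_HZ = 2 * ASSUMED_HZ   # hypothetical: real clock is twice as fast

reported = 4.6818            # asm column, size 20, from the table above
print(f"{reported / ASSUMED_HZ * 1e9:.2f} ns/limb")
print(f"{rescale_cycles(reported, ASSUMED_HZ, ACTUAL_HZ):.2f} cycles/limb")
```

If the frequency detection is off by some other factor, the same conversion
applies with a different ACTUAL_HZ.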
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.