div_qr_1n_pi1

Niels Möller nisse at lysator.liu.se
Sun Jun 6 20:05:59 UTC 2021


nisse at lysator.liu.se (Niels Möller) writes:

> $ ./speed -p 1000000 -s 2-20 -C mpn_div_qr_1n_pi1.0x8765432108765432 mpn_div_qr_1n_pi1_1.0x8765432108765432 mpn_div_qr_1n_pi1_2.0x8765432108765432 mpn_div_qr_1n_pi1_3.0x8765432108765432 mpn_div_qr_1n_pi1_4.0x8765432108765432
> overhead 2.63 cycles, precision 1000000 units of 7.16e-10 secs, CPU freq 1396.05 MHz
>         mpn_div_qr_1n_pi1.0x8765432108765432 mpn_div_qr_1n_pi1_1.0x8765432108765432 mpn_div_qr_1n_pi1_2.0x8765432108765432 mpn_div_qr_1n_pi1_3.0x8765432108765432 mpn_div_qr_1n_pi1_4.0x8765432108765432
> 2              4.4566       #3.9635        5.8490        5.4085        6.0087
> 3              4.4323       #4.2441        5.8115        5.1158        5.7832
> 4              4.5348       #4.3992        5.7306        5.0807        5.8611
> 5             #4.5534        4.6698        5.7493        4.9803        5.5605
> 6             #4.5653        4.8412        5.7497        4.9129        5.6516
> 7             #4.6069        5.1388        5.7388        4.8811        5.6110
> 8             #4.6202        5.5073        5.7423        4.9359        5.5695
> 9             #4.6433        5.7341        5.7537        4.9357        5.5407
> 10            #4.6436        5.9595        5.7400        4.9231        5.5428
> 11            #4.6698        6.1348        5.7449        4.9430        5.5237
> 12            #4.6395        6.2541        5.7378        4.9452        5.5239
> 13            #4.6905        6.3761        5.7373        4.9482        5.5085
> 14            #4.6700        6.4692        5.8173        4.9447        5.5006
> 15            #4.6643        6.5548        5.7426        4.9644        5.4958
> 16            #4.6809        6.6305        5.7439        4.9625        5.4924
> 17            #4.6800        6.6901        5.7418        4.9576        5.4760
> 18            #4.6903        6.7440        5.7436        4.9866        5.4840
> 19            #4.6886        6.7891        5.7366        4.9753        5.4872
> 20            #4.6818        6.8370        5.7405        4.9783        5.4820
>               asm          method 1      method 2      method 3      method 4

And I don't quite trust these cycle numbers; they should probably be
twice as large, on the order of 10 cycles/limb for all variants. Less
than 5 cycles/limb is too good to be true, right?
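
One way to see how a factor of two could creep in: a minimal sketch, assuming the cycle figures are derived from elapsed time and the "CPU freq" value printed in the overhead line. If the core actually ran faster than that reported frequency, the true cycle counts would scale up proportionally. The helper names and the doubled frequency below are hypothetical, chosen only to illustrate the arithmetic; they are not part of the speed program and not a measured value.

```python
def cycles_to_ns(cycles, freq_mhz):
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1000.0 / freq_mhz


def rescale_cycles(cycles, assumed_mhz, actual_mhz):
    """Re-express a cycle count computed under assumed_mhz at actual_mhz."""
    return cycles * actual_mhz / assumed_mhz


# Values taken from the quoted output above.
reported_mhz = 1396.05  # "CPU freq" in the overhead line
asm_cycles = 4.6818     # asm column, size 20

# Hypothetical: the core really ran at twice the reported frequency.
hypothetical_mhz = 2 * reported_mhz

ns_per_limb = cycles_to_ns(asm_cycles, reported_mhz)
rescaled = rescale_cycles(asm_cycles, reported_mhz, hypothetical_mhz)

print(f"time per limb:   {ns_per_limb:.3f} ns")
print(f"rescaled cycles: {rescaled:.2f} cycles/limb")
```

Under that assumption the asm figure for size 20 comes out a little above 9 cycles/limb, which is in the "order of 10" range. Comparing the reported frequency against the machine's actual clock would confirm or rule this out.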

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list