Recent changes to mpn_get_str/mpn_set_str
Marco Bodrato
bodrato at mail.dm.unipi.it
Wed Feb 15 03:24:21 UTC 2017
Ciao,
On Tue, 14 February 2017 3:10 am, Torbjörn Granlund wrote:
> "Marco Bodrato" <bodrato at mail.dm.unipi.it> writes:
>> For some bases the current code is particularly slow for a single
>> limb... the attached one is faster in those cases...
>
> Which bases?
Current code on shell gives:
$ tune/speed.17265 -p10000000 -cs1-3 mpn_get_str.9 mpn_get_str.10 \
    mpn_get_str.11 mpn_get_str.12 mpn_get_str.191 mpn_get_str.192
overhead 5.84 cycles, precision 10000000 units of 2.86e-10 secs, CPU freq 3500.07 MHz
   mpn_get_str.9  get_str.10  get_str.11  get_str.12  get_str.191  get_str.192
1         803.60     #190.86      732.71      693.22       346.52       346.19
2         947.90      363.97      367.61      376.76      #237.93       237.96
3         506.40      508.26      549.63      565.51      #380.27       382.79
It seems that converting a single limb to base 11 costs more cycles than
converting three. It might be an error in the measuring code...
>> I quickly wrote a sort-of pow_1, but this part needs refinement.
>
> We have some pow_1 code of different levels of complexity; mpz/n_pow_ui.c
> is pretty hairy while mpn/generic/pow_1.c is simpler. I think the
> latter is perfectly suitable for get_str's needs (the mpz code checks
> for bases which are much smaller than the max limb value, then
> conditionally performs some limb steps).
My code uses the latter, but it needs the power of the base (not only
big_base), and it computes the sub-limb part locally.
To fully integrate this code with the _dc_ functions, I should probably
force the use of big_base blocks...
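
To make the distinction concrete, here is a rough sketch (not the code
from the patch; base_info, compute_big_base and pow_sublimb are made-up
names, and a 64-bit limb is assumed). big_base is the largest power of
the base that fits in one limb; the sub-limb part is a smaller power,
base^e with e below the digits-per-limb count:

```c
#include <stdint.h>

/* Hypothetical helper type, not part of GMP's API. */
typedef struct {
    uint64_t big_base;        /* base^chars_per_limb, fits in one limb */
    unsigned chars_per_limb;  /* digits covered by one big_base block  */
} base_info;

/* Largest power of `base` (base >= 2) that fits in a 64-bit limb. */
static base_info compute_big_base(unsigned base)
{
    base_info bi = { 1, 0 };
    /* Keep multiplying while the next product cannot overflow. */
    while (bi.big_base <= UINT64_MAX / base) {
        bi.big_base *= base;
        bi.chars_per_limb++;
    }
    return bi;
}

/* Sub-limb power base^e by square-and-multiply.
   Assumes e <= chars_per_limb, so the result fits in one limb. */
static uint64_t pow_sublimb(uint64_t base, unsigned e)
{
    uint64_t r = 1;
    while (e) {
        if (e & 1)
            r *= base;
        e >>= 1;
        if (e)               /* square only if more bits remain */
            base *= base;
    }
    return r;
}
```

With these assumptions, base 10 gives 19 digits per limb with
big_base = 10^19, and base 11 gives 18 digits per limb.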
Regards,
m
--
http://bodrato.it/papers/