Recent changes to mpn_get_str/mpn_set_str

Marco Bodrato bodrato at
Wed Feb 15 03:24:21 UTC 2017


On Tue, 14 February 2017 3:10 am, Torbjörn Granlund wrote:
> "Marco Bodrato" <bodrato at> writes:

>   For some bases the current code is particularly slow for a single
>   limb... the attached one is faster in those cases...
> Which bases?

Current code on shell gives:

$ tune/speed.17265 -p10000000 -cs1-3 mpn_get_str.9 mpn_get_str.10 \
    mpn_get_str.11 mpn_get_str.12 mpn_get_str.191 mpn_get_str.192
overhead 5.84 cycles, precision 10000000 units of 2.86e-10 secs, CPU freq 3500.07 MHz
 mpn_get_str.9 get_str.10 get_str.11 get_str.12 get_str.191 get_str.192
1       803.60   #190.86     732.71     693.22     346.52      346.19
2       947.90    363.97     367.61     376.76    #237.93      237.96
3       506.40    508.26     549.63     565.51    #380.27      382.79

It seems that converting a single limb to base 11 costs more cycles
(732.71) than converting three (549.63); bases 9 and 12 show the same
inversion. It might be an error in the measuring code...
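
As a point of comparison for those numbers, here is a minimal sketch of a
naive single-limb conversion: one division per output digit. It is not the
code in mpn/generic/get_str.c; the name limb_to_digits is made up for
illustration and a 64-bit limb is assumed. Like mpn_get_str, it emits raw
digit values rather than ASCII characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only (hypothetical helper, not part of GMP): write the
   digits of one 64-bit limb in `base`, most significant first, as raw
   digit values 0..base-1.  Returns the number of digits written.
   `buf` must have room for at least 64 bytes (worst case: base 2). */
static size_t limb_to_digits(uint64_t limb, unsigned base, unsigned char *buf)
{
    unsigned char tmp[64];
    size_t n = 0;

    assert(base >= 2 && base <= 256);

    /* Peel off the least significant digit with one division per digit. */
    do {
        tmp[n++] = (unsigned char)(limb % base);
        limb /= base;
    } while (limb != 0);

    /* Digits were produced least significant first; reverse them. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];

    return n;
}
```

With this approach a single limb in base 11 needs at most 18 divisions, so
the cost of one limb would be expected to stay below that of three.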

>   I quickly wrote a sort-of pow_1, but this part needs refinement.
> We have some pow_1 code of different levels of complexity; mpz/n_pow_ui.c
> is pretty hairy while mpn/generic/pow_1.c is simpler.  I think the
> latter is perfectly suitable for get_str's needs (the mpz code checks
> for bases which are much smaller than the max limb value, then
> conditionally performs some limb steps).

My code uses the latter, but it needs the power of the base (not only
big_base), and it computes the sub-limb part locally.
To fully integrate this code with the _dc_ functions, I should probably
force the use of big_base blocks...
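
To make the big_base terminology concrete, here is a small sketch of how
the largest power of a base fitting in one limb, and its exponent
(chars_per_limb in GMP's terms), could be computed. It assumes a 64-bit
limb and a base that is not a power of two; the struct and function names
are invented for this example, and GMP itself reads these values from a
precomputed table rather than computing them this way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container, not a GMP type. */
typedef struct {
    uint64_t big_base;        /* largest power of the base fitting in a limb */
    unsigned chars_per_limb;  /* exponent such that base^exponent == big_base */
} limb_base_info;

/* Illustrative helper (not GMP API), assuming a 64-bit limb. */
static limb_base_info limb_base_info_for(unsigned base)
{
    limb_base_info info;

    assert(base >= 2);
    info.big_base = base;
    info.chars_per_limb = 1;

    /* Multiply only while the next product still fits in 64 bits. */
    while (info.big_base <= UINT64_MAX / base) {
        info.big_base *= base;
        info.chars_per_limb++;
    }
    return info;
}
```

For base 10 this gives big_base = 10^19 and chars_per_limb = 19; for
base 11, chars_per_limb = 18. Any smaller power of the base (the sub-limb
part) also fits in a single limb.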


