mpn_sqrtrem{1,2} - patch for pure C implem

Torbjörn Granlund tg at
Tue Jun 6 23:40:06 UTC 2017

"Marco Bodrato" <bodrato at> writes:

  I got inspired by Adrien's code and by Paul's one, and I wrote another
  candidate replacement for mpn_sqrtrem{1,2}, working for both 32 and 64

  My goals:
   - new code should not be slower than the current :-)

A commend that!  :-)

   - get rid of all signed shifts,

Since such shifts are poorly defined in C, they should indeed be

   - get rid of all mp_limb_signed_t needs,
   - keep-on using the current invsqrttab (optimal IMO).
  I decided to use the iteration (for i an approximation of a^{-1/2})
   i' = i*(3-a*i*i)/2
  because it does not need handling signed numbers.
  This iteration nearly doubles the precision of the approximation.
  Some problems arise because of the "nearly"...
Would it help to use (say) 9-bit lookup table?  That might give a 17-bit
value the next step, etc.  Right, you write that later in your
message, sort of...

  invsqrttab allows us to start with an approximation of 8 bits and little
  more (I computed an error smaller than 1/355 < 2^-8.47).

Ah, 8.5 bits.  :-)

  A little perturbation on the first iteration
   i' = i*(3+c-a*i*i)/2 , with c = 3/(2*356^2)
  (the opposite sign wrt the current MAGIC constant...) give an error
  smaller than 1/150000 < 2^-17; the following iterations should be able to
  give at least the desired precision. (17 -> 33 -> 65 ...).
  For some computations, even for the 64-bits version, a 32x32->32
  multiplication is enough. I #define-d the gmp_uint_fast32_t type, but
  currently it is not really guaranteed "fast"...
Why do you want that?

I believe that would be an improvement on very few platforms.  Of
course, if it is used for 2-limb sqrt on 32-bit systems, it's a good

I didn't look at any detail on your code, but is has less magic magic
cnstants than my old code, which is good!

Please encrypt, key id 0xC8601622

More information about the gmp-devel mailing list