mpn_sqrtrem{1,2} - patch for pure C implem

Tue Jun 6 23:40:06 UTC 2017

"Marco Bodrato" <bodrato at mail.dm.unipi.it> writes:

  I got inspired by Adrien's code and by Paul's one, and I wrote another
  candidate replacement for mpn_sqrtrem{1,2}, working for both 32 and 64
  bits.

Nice!

  My goals:
   - new code should not be slower than the current :-)

A commend that!  :-)

   - get rid of all signed shifts,

Since such shifts are poorly defined in C, they should indeed be
avoided.

   - get rid of all mp_limb_signed_t needs,
   - keep-on using the current invsqrttab (optimal IMO).

  I decided to use the iteration (for i an approximation of a^{-1/2})
   i' = i*(3-a*i*i)/2
  because it does not need handling signed numbers.

  This iteration nearly doubles the precision of the approximation.
  Some problems arise because of the "nearly"...

Would it help to use (say) 9-bit lookup table?  That might give a 17-bit
value the next step, etc.  Right, you write that later in your
message, sort of...

  invsqrttab allows us to start with an approximation of 8 bits and little
  more (I computed an error smaller than 1/355 < 2^-8.47).

Ah, 8.5 bits.  :-)

  A little perturbation on the first iteration
   i' = i*(3+c-a*i*i)/2 , with c = 3/(2*356^2)
  (the opposite sign wrt the current MAGIC constant...) give an error
  smaller than 1/150000 < 2^-17; the following iterations should be able to
  give at least the desired precision. (17 -> 33 -> 65 ...).

  For some computations, even for the 64-bits version, a 32x32->32
  multiplication is enough. I #define-d the gmp_uint_fast32_t type, but
  currently it is not really guaranteed "fast"...

Why do you want that?

I believe that would be an improvement on very few platforms.  Of
course, if it is used for 2-limb sqrt on 32-bit systems, it's a good
improvement.

I didn't look at any detail on your code, but is has less magic magic
cnstants than my old code, which is good!

-- 
Torbjörn
Please encrypt, key id 0xC8601622