# mpn_sqrtrem{1,2} - patch for pure C implem

Torbjörn Granlund tg at gmplib.org
Tue Jun 6 23:40:06 UTC 2017

```"Marco Bodrato" <bodrato at mail.dm.unipi.it> writes:

I got inspired by Adrien's code and by Paul's one, and I wrote another
candidate replacement for mpn_sqrtrem{1,2}, working for both 32 and 64
bits.

Nice!

My goals:
- new code should not be slower than the current :-)

A commend that!  :-)

- get rid of all signed shifts,

Since such shifts are poorly defined in C, they should indeed be
avoided.

- get rid of all mp_limb_signed_t needs,
- keep-on using the current invsqrttab (optimal IMO).

I decided to use the iteration (for i an approximation of a^{-1/2})
i' = i*(3-a*i*i)/2
because it does not need handling signed numbers.

This iteration nearly doubles the precision of the approximation.
Some problems arise because of the "nearly"...

Would it help to use (say) 9-bit lookup table?  That might give a 17-bit
value the next step, etc.  Right, you write that later in your
message, sort of...

invsqrttab allows us to start with an approximation of 8 bits and little
more (I computed an error smaller than 1/355 < 2^-8.47).

Ah, 8.5 bits.  :-)

A little perturbation on the first iteration
i' = i*(3+c-a*i*i)/2 , with c = 3/(2*356^2)
(the opposite sign wrt the current MAGIC constant...) give an error
smaller than 1/150000 < 2^-17; the following iterations should be able to
give at least the desired precision. (17 -> 33 -> 65 ...).

For some computations, even for the 64-bits version, a 32x32->32
multiplication is enough. I #define-d the gmp_uint_fast32_t type, but
currently it is not really guaranteed "fast"...

Why do you want that?

I believe that would be an improvement on very few platforms.  Of
course, if it is used for 2-limb sqrt on 32-bit systems, it's a good
improvement.

I didn't look at any detail on your code, but is has less magic magic
cnstants than my old code, which is good!

--
Torbjörn