Square roots are taken using the “Karatsuba Square Root” algorithm by Paul Zimmermann (see References).

An input *n* is split into four parts of *k* bits each, so with
*b=2^k* we have *n = a3*b^3 + a2*b^2
+ a1*b + a0*. Part a3 must be “normalized” so that either the high or
second highest bit is set. In GMP, *k* is kept on a limb boundary and
the input is left shifted (by an even number of bits) to normalize.

The square root of the high two parts is taken, by recursive application of the algorithm (bottoming out in a one-limb Newton’s method),

s1,r1 = sqrtrem (a3*b + a2)

This is an approximation to the desired root and is extended by a division to
give *s*,*r*,

q,u = divrem (r1*b + a1, 2*s1) s = s1*b + q r = u*b + a0 - q^2

The normalization requirement on a3 means at this point *s* is
either correct or 1 too big. *r* is negative in the latter case, so

if r < 0 then r = r + 2*s - 1 s = s - 1

The algorithm is expressed in a divide and conquer form, but as noted in the paper it can also be viewed as a discrete variant of Newton’s method, or as a variation on the schoolboy method (no longer taught) for square roots two digits at a time.

If the remainder *r* is not required then usually only a few high limbs
of *r* and *u* need to be calculated to determine whether an
adjustment to *s* is required. This optimization is not currently
implemented.

In the Karatsuba multiplication range this algorithm is *O(1.5*M(N/2))*, where *M(n)* is the time to multiply two numbers
of *n* limbs. In the FFT multiplication range this grows to a bound of
*O(6*M(N/2))*. In practice a factor of about 1.5 to 1.8 is
found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.

The algorithm does all its calculations in integers and the resulting
`mpn_sqrtrem`

is used for both `mpz_sqrt`

and `mpf_sqrt`

.
The extended precision given by `mpf_sqrt_ui`

is obtained by
padding with zero limbs.