About sqrt_exact...

marco.bodrato at tutanota.com marco.bodrato at tutanota.com
Tue Jun 16 18:42:46 CEST 2026


Ciao Niels,
15 giu 2026, 21:57 da nisse at lysator.liu.se:

> Low n limbs of the output would be a square root of the input mod B^n,
> but since B is a power of two I'd expect that there are plenty of square
> roots, so that doesn't seem so helpful.
>
There are exactly 4. Let's say that the highest 2 bits can be 00, 01, 10 or 11.
This means that if you search the square roots mod 2*B^n, you will find exactly 2 that fit in n limbs, and they are one the negation of the other one.
If the given operand is a square and it has N bits, we know in advance that the root has exactly (N+1)>>1 bits. If we search the possible sqrt mod 2^[(N+1)/2 + 1], then only one of the 4 possible results has the correct size, and that's the root.

But, you are right, this approach does not give a bidirectional algorithm.
I started with a very simple optimization: an initial loop on a single limb, and...

[bodrato at shell ~/gmp-repo]$ /var/tmp/bodrato/gmp/hg/build/tune/speed -r -s 1-3000 -p 100000000 -f 2 mpn_sqrt mpn_bsqrtinvoverhead 0.000000001 secs, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.17 MHz
             mpn_sqrt  mpn_bsqrtinv
1         0.000000009       #0.9083
2        #0.000000024        1.4472
4         0.000000065       #0.8762
8         0.000000114       #0.9990
16        0.000000276       #0.8852
32       #0.000000441        1.3615
64       #0.000000857        2.0866
128      #0.000001938        3.1572
256      #0.000005202        3.7938
512      #0.000016213        3.7798
1024     #0.000052126        3.5139
2048     #0.000159041        3.1950

Now, for small sizes, bsqrtinv seems faster than sqrt. But computing the _inverse_ square root is not the same as computing the root. Btw, here is the loop I added:

      mp_limb_t t0, r0, y0 = *yp; /* in y0 the lowest limb of the operand */
      if ((y0 & 7) != 1)
        return 0; /* it's not a square! */
      r0 = 33 + ((y0 & 8) * 5 >> 2) - ((y0 & 16) >> 1); /* Initial value ... */      do {
        t0 = r0 * r0 * y0 >> 1; /* (r*r*y-1)/2 */        r0 -= r0 * t0;
      } while ((t0 & (GMP_NUMB_MAX >> (GMP_NUMB_BITS >> 1))) != 0);
      *rp = r0 & GMP_NUMB_MAX;

Ĝis,
m


More information about the gmp-devel mailing list