A significant fraction of non-squares can be quickly identified by checking whether the input is a quadratic residue modulo small integers.
mpz_perfect_square_p first tests the input mod 256, which means just
examining the low byte. Only 44 different values occur for squares mod 256,
so 82.8% of inputs can be immediately identified as non-squares.
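The low-byte test can be sketched as follows. This is only an illustration, not GMP's code: the names sq256, build_sq256 and maybe_square_low_byte are hypothetical, and the table is built at run time here rather than precomputed.

```c
#include <stdbool.h>
#include <stdint.h>

/* sq256[r] is true when r occurs as a square mod 256.  Illustrative
   table only, filled at run time by build_sq256(). */
static bool sq256[256];

/* Fill the table by squaring every residue.  Returns the number of
   distinct square residues found. */
int build_sq256(void)
{
    int count = 0;
    for (int r = 0; r < 256; r++)
        sq256[r] = false;
    for (unsigned x = 0; x < 256; x++) {
        unsigned r = (x * x) & 0xFF;
        if (!sq256[r]) {
            sq256[r] = true;
            count++;
        }
    }
    return count;
}

/* Returns false when n is certainly not a square, true when it might be.
   build_sq256() must have been called first. */
bool maybe_square_low_byte(uint64_t n)
{
    return sq256[n & 0xFF];
}
```

After build_sq256() has run, maybe_square_low_byte rejects an input by looking only at its low 8 bits. A true result means only that the input has not been ruled out.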
On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total of 99.25% of inputs identified as non-squares. On a 64-bit system 97 is tested too, for a total of 99.62%.
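The percentages can be reproduced with a short calculation. The sketch below assumes uniformly distributed inputs. Because the moduli are pairwise coprime, the fractions of residues that are squares simply multiply. The function names count_square_residues and rejected_percent are hypothetical.

```c
#include <assert.h>

/* Number of distinct values of x*x mod m, for 1 <= m <= 256. */
unsigned count_square_residues(unsigned m)
{
    unsigned char seen[256] = {0};
    unsigned count = 0;
    assert(m >= 1 && m <= 256);
    for (unsigned x = 0; x < m; x++) {
        unsigned r = (x * x) % m;
        if (!seen[r]) {
            seen[r] = 1;
            count++;
        }
    }
    return count;
}

/* Percentage of uniformly random inputs rejected when every modulus in
   mods[0..n-1] is tested.  The moduli are assumed pairwise coprime, so
   the pass fractions are independent and multiply. */
double rejected_percent(const unsigned *mods, int n)
{
    double pass = 1.0;
    for (int i = 0; i < n; i++)
        pass *= (double) count_square_residues(mods[i]) / mods[i];
    return 100.0 * (1.0 - pass);
}
```

Called with the moduli 256, 9, 5, 7, 13 and 17, this gives a value just above 99.25. Adding 97 gives a value just above 99.62.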
These moduli are chosen because they’re factors of 2^24-1 (or
2^48-1 for 64 bits), and such a remainder can be quickly taken just
using additions (see mpn_mod_34lsub1).
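The addition-only remainder can be illustrated with a sketch for 32-bit limbs, stored least significant first. It is not the mpn_mod_34lsub1 code itself. The function name fold_mod_2_24_minus_1 and the limb layout are assumptions made for the example. The idea is that 2^24 is congruent to 1 mod 2^24-1, so each 24-bit piece of the number can simply be added up.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns a value in the range 0 to 2^24-1 that is congruent, mod 2^24-1,
   to the number formed by limbs[0..n-1] (least significant limb first,
   32 bits per limb).

   Limb i has weight 2^(32*i), which is congruent to 2^(8*(i mod 3))
   since 2^24 == 1 (mod 2^24-1).  Each shifted limb is split into two
   24-bit pieces which are added to the accumulator.  Each step adds
   less than 2^25, so a 64-bit accumulator cannot overflow for any
   realistic n. */
uint64_t fold_mod_2_24_minus_1(const uint32_t *limbs, size_t n)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned shift = (unsigned) (i % 3) * 8;     /* 0, 8 or 16 */
        uint64_t x = (uint64_t) limbs[i] << shift;   /* below 2^48 */
        acc += (x & 0xFFFFFF) + (x >> 24);
    }
    /* Fold the accumulator itself until it fits in 24 bits. */
    while (acc >> 24)
        acc = (acc & 0xFFFFFF) + (acc >> 24);
    return acc;
}
```

Since 9, 5, 7, 13 and 17 all divide 2^24-1, reducing the returned value by any of them gives the same remainder as reducing the original number. Note the result may be 2^24-1 itself, which is congruent to zero.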
When nails are in use, moduli are instead selected by the gen-psqr.c
program and applied with mpn_mod_1. The same 2^24-1 or
2^48-1 could be done with nails using some extra bit shifts, but
this is not currently implemented.
In any case each modulus is applied to the mpn_mod_34lsub1 or mpn_mod_1
remainder and a table lookup identifies non-squares. By using a “modexact”
style calculation, and suitably permuted tables, just one multiply each is
required; see the code for details. Moduli are also combined to save
operations, so long as the lookup tables don’t become too big.
gen-psqr.c does all the pre-calculations.
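The table lookup idea can be sketched with one bitmask per modulus. This sketch deliberately uses a plain % reduction rather than the “modexact” calculation and permuted tables described above, and it recomputes each mask on every call where real code would use precomputed tables. The names residue_mask and maybe_square_from_remainder are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit r of the returned mask is set when r is a square mod m.
   Only valid for 1 <= m <= 32, so the mask fits in 32 bits. */
uint32_t residue_mask(unsigned m)
{
    uint32_t mask = 0;
    assert(m >= 1 && m <= 32);
    for (unsigned x = 0; x < m; x++)
        mask |= (uint32_t) 1 << ((x * x) % m);
    return mask;
}

/* r is any value congruent to the input mod 2^24-1.  Returns false when
   the input is certainly not a square, true when it might be.  Uses an
   ordinary % reduction per modulus, not a modexact-style multiply. */
bool maybe_square_from_remainder(uint64_t r)
{
    static const unsigned moduli[] = {9, 5, 7, 13, 17};
    for (int i = 0; i < 5; i++) {
        unsigned m = moduli[i];
        if (((residue_mask(m) >> (r % m)) & 1) == 0)
            return false;
    }
    return true;
}
```

For example, the mask for 9 has bits 0, 1, 4 and 7 set, those being the squares mod 9. A remainder of 2 is therefore rejected at the first modulus.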
A square root must still be taken for any value that passes these tests, to verify it’s really a square and not one of the small fraction of non-squares that get through (i.e. a pseudo-square to all the tested bases).
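The final verification can be sketched for single 64-bit values as follows. This is an assumption-laden illustration only: GMP works on multi-limb numbers with its own square root routines, whereas this uses a simple bit-by-bit integer square root, and the names isqrt_u64 and is_perfect_square_u64 are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Floor of the square root of n, by the classical bit-by-bit method. */
uint64_t isqrt_u64(uint64_t n)
{
    uint64_t root = 0;
    uint64_t bit = (uint64_t) 1 << 62;   /* largest power of 4 in range */
    while (bit > n)
        bit >>= 2;
    while (bit != 0) {
        if (n >= root + bit) {
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}

/* Exact test, intended for values that have already passed the residue
   tests: take the square root and check that squaring it gives n back. */
bool is_perfect_square_u64(uint64_t n)
{
    uint64_t r = isqrt_u64(n);
    return r * r == n;   /* r < 2^32, so r*r cannot overflow */
}
```

The residue tests only ever answer “certainly not a square” or “maybe”, so this step alone decides the “maybe” cases.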
Clearly more residue tests could be done; mpz_perfect_square_p only
uses a compact and efficient set. Big inputs would probably benefit from more
residue testing, while small inputs might be better off with less. The assumed
distribution of squares versus non-squares in the input would affect such
considerations.