Jacobi symbol *( a/b)*

Initially if either operand fits in a single limb, a reduction is done with
either `mpn_mod_1`

or `mpn_modexact_1_odd`

, followed by the binary
algorithm on a single limb. The binary algorithm is well suited to a single limb,
and the whole calculation in this case is quite efficient.

For inputs larger than `GCD_DC_THRESHOLD`

, `mpz_jacobi`

,
`mpz_legendre`

and `mpz_kronecker`

are computed via the HGCD (Half
GCD) function, as a generalization to Lehmer’s algorithm.

Most GCD algorithms reduce *a* and *b* by repeatedly computing the
quotient *q = floor(a/b)* and iteratively replacing

*a, b = b, a - q * b*

Different algorithms use different methods for calculating q, but the core algorithm is the same if we use Lehmer’s algorithm or HGCD.

At each step it is possible to compute if the reduction inverts the Jacobi
symbol based on the two least significant bits of `a` and `b`. For
more details see “Efficient computation of the Jacobi symbol” by
Möller (see References).

A small set of bits is thus used to track state

- current sign of result (1 bit)
- two least significant bits of
`a`and`b`(4 bits) - a pointer to which input is currently the denominator (1 bit)

In all the routines sign changes for the result are accumulated using fast bit twiddling which avoids conditional jumps.

The final result is calculated after verifying the inputs are coprime (GCD = 1)
by raising *(-1)^e*.

Much of the HGCD code is shared directly with the HGCD implementations, such
as the 2x2 matrix calculation, See Lehmer’s algorithm basecase and
`GCD_DC_THRESHOLD`

.

The asymptotic running time is *O(M(N)*log(N))*, where
*M(N)* is the time for multiplying two *N*-limb numbers.