15.3.3 Subquadratic GCD

For inputs larger than GCD_DC_THRESHOLD, GCD is computed via the HGCD (Half GCD) function, as a generalization to Lehmer’s algorithm.

Let the inputs a,b be of size N limbs each. Put S = floor(N/2) + 1. Then HGCD(a,b) returns a transformation matrix T with non-negative elements, and reduced numbers (c;d) = T^{-1} (a;b). The reduced numbers c,d must be larger than S limbs, while their difference abs(c-d) must fit in S limbs. The matrix elements will also be of size roughly N/2.

The HGCD base case uses Lehmer’s algorithm, but with the above stop condition that returns reduced numbers and the corresponding transformation matrix half-way through. For inputs larger than HGCD_THRESHOLD, HGCD is computed recursively, using the divide and conquer algorithm in “On Schönhage’s algorithm and subquadratic integer GCD computation” by Möller (see References). The recursive algorithm consists of these main steps.

GCD is then implemented as a loop around HGCD, similarly to Lehmer’s algorithm. Where Lehmer repeatedly chops off the top two limbs, calls mpn_hgcd2, and applies the resulting matrix to the full numbers, the sub-quadratic GCD chops off the most significant third of the limbs (the proportion is a tuning parameter, and 1/3 seems to be more efficient than, e.g., 1/2), calls mpn_hgcd, and applies the resulting matrix. Once the input numbers are reduced to size below GCD_DC_THRESHOLD, Lehmer’s algorithm is used for the rest of the work.

The asymptotic running time of both HGCD and GCD is O(M(N)*log(N)), where M(N) is the time for multiplying two N-limb numbers.