15.3.4 Extended GCD

The extended GCD function, or GCDEXT, calculates gcd(a,b) and also cofactors x and y satisfying a*x+b*y=gcd(a,b). All the algorithms used for plain GCD are extended to handle this case. The binary algorithm is used only for single-limb GCDEXT. Lehmer’s algorithm is used for sizes up to GCDEXT_DC_THRESHOLD. Above this threshold, GCDEXT is implemented as a loop around HGCD, but with more book-keeping to keep track of the cofactors. This gives the same asymptotic running time as for GCD and HGCD, O(M(N)*log(N)).

One difference to plain GCD is that while the inputs a and b are reduced as the algorithm proceeds, the cofactors x and y grow in size. This makes the tuning of the chopping-point more difficult. The current code chops off the most significant half of the inputs for the call to HGCD in the first iteration, and the most significant two thirds for the remaining calls. This strategy could surely be improved. Also the stop condition for the loop, where Lehmer’s algorithm is invoked once the inputs are reduced below GCDEXT_DC_THRESHOLD, could maybe be improved by taking into account the current size of the cofactors.