gcdext_1 questions

Fri Dec 4 10:23:10 CET 2009

       Niels,

> From: nisse at lysator.liu.se (Niels =?iso-8859-1?Q?M=F6ller?=)
> Date: Thu, 03 Dec 2009 23:22:47 +0100
> 
> Paul Zimmermann <Paul.Zimmermann at loria.fr> writes:
> 
> > If you consider instead the "binary recursive gcd" we defined with
> > Damien Stehlé, it removes 1.95 bit per iteration.
> 
> I haven't considered this for small sizes (at least not for some years,
> I don't remember if I looked into that when I implemented it the
> subquadratic algorithm for large integers, where my implementation was
> 10% slower than the left-to-right algorithm).
> 
> The basic iteration would be count_trailing_zeros, shift, table lookup
> to get an inverse (and an unlikely newton iteration if there are many
> bits), a multiply to get the quotient, and another multiply and shift to
> get the remainder.
> 
> But it's not clear to me what book-keeping and postprocessing is needed
> to get cofactors according to the required conventions.

in fact you can merge the two shifts. Assuming the first shift, say j,
was already done (thus both operands are odd):

1) table lookup to get the quotient (or another slow method if the unlikely
   case where the quotient is large)
2) a multiplication with accumulation (addmul_si, and addmul in rare cases)
3) count_trailing_zeros (starting from bit j) -> gives the new j
4) a right shift by old j + new j

Thus one iteration is equivalent to Stein's binary algorithm, except you
replace the subtraction by an addmul_si (and doing that you remove more bits).

Paul