_basecase or _sec? [

Sat May 4 22:54:39 CEST 2013

Torbjorn Granlund <tg at gmplib.org> writes:

>   I see. Makes some sense. To get a nice progress guarantee, one such
>   "large iteration" would need to apply the matrix, and then do a small
>   number of additional reductions.
>   
> Applying the matrix yes, but why additional reductions?  Would they
> handle some degenerate cases where we would else not make k progress?

Consider the left-to-right case (since that's what I'm most familiar
with), and say we chop off the 32 bits of each operand (and aim at
roughly 15 bit progress).

Then the best case we reduce those high limbs until they are reduced to
about 17 bits left, apply the resulting matrix, and we're done.

But it may happen that that the construction of that matrix terminates
early, e.g, with a much larger than 16 bits and b much smaller than 16
bits. Or both much larger than 16 bits, but top bits equal. So to get
close to 16 bits, those unlikely cases must be handled.

Both cases can be understood as corresponding to a quotient which is too
large to fit in a matrix of the desired size. Or thinking of it the
other way, computing a decent quotient approximation requires higher
precision than we have in the high bits we chopped off.

I think issues are similar for right-to-left.

>   It's possible, and it might be more efficient, to divide the cofactors
>   by 2^k (mod b) at the end, instead of one bit at a time.

> I think it is, iff we must maintain full-size operands for SC silence
> (as in your present algorithm).

Hmm. We might be able to use smaller cofactors initially, if the
divisions by two are delayed.

> Sure, but a <-- a + k b gives us <= one bit per iteration.

The currrent algorithm gives precisely one bit per iteration, using

  a <-- (a + k b) / 2

where k = -1 if a is odd, k = 0 if a is even (and the allgorithm is
arranged so that b is always odd, and b <= a whenever k != 0. I would
expect an algorithm using

  a <-- (a + k b) / 4

with cleverly chosen k to give a guaranteed progress of alpha n bits for
n iterations, with alpha somewhere in the range between 1 and 2. (we
don't get alpha = 2, since unlike plain binary, the numbers sometimes
grow in the high end). It's not obvious to me what the worst cases are.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.