_basecase or _sec? [

Thu May 2 22:55:23 CEST 2013

Torbjorn Granlund <tg at gmplib.org> writes:

> Hmm...can one use Newton for any a^(-1) mod b (when that exists)?  (I
> listed gcdext for the sake of inverses, and omitted gcd.)

I think Newton analogues exist only when b is a power, not in general.
And the most important case is prime b.

> I am not familiar with the Jebelean (or Möller!) criteria for gcd, but
> my intuition suggests that there exists a O(k)-bit matrix which reduces
> k bits from the operands.  And the same intuition suggests that it can
> be computed from O(k) high bits.  Am I wrong?

You need to have roughly *2k* bits of each operand to make k bits
progress. And you get into trouble at least in the case that those 2k
bits are equal (you need to subtract one operand from the other, but
which depends on the other bits).

Let me think aloud.

As a start, one should try to construct an algorithm which makes (at
least) two bits of progress per iteration, with all problematic special
cases handled in a side-channel silent manner. I looked into it briefly
before writing my current code, but I couldn't find anything easy.

And I'm afraid the table lookup even for just two bits of progress needs
5 bits from each operand, i.e., 1024 entries, which might leak
information through the cache instead.

Right-to-left algorithms will likely have easier bookkeeping than
left-to-right algorithms.

Instead of a lehmer-like algorithms which eliminates k bits of each
operand using a vector/matrix multiply, it might be better (smaller
table) to eliminate k bits from the larger operand only. But then we
need to keep track of which operand is largest. I think this simplest
appproach was the one I looked a bit into, and it didn't seem very
promising. I think one can always eliminate two bits, by setting

  m <-- min(a,b)
  M <-- max(a,b)
  a <-- (M + k m) / 4
  b <-- m

where k = ± 1, which ever is needed to make M + k m a multiple of 4. And
then some additional logic for the case of even numbers (if M = 0 (mod
4), just set k = 0, and if M = 2 (mod 4), set k = 2). But to go beyond
two bits, one would need something like k-ary gcd, which might introduce
false factors.

The k \in {0, 1, 2, -1} could be a conditional subtraction and addition

1.  Subtract: d <-- a - [a odd] b   (if a odd)
2.  Swap:     b <-- a, d <-- -d     (if underflow above, d < 0) 
3.  Add:      a <-- a + 2 b         (under certain conditions)
4.  Shift:    a <-- a / 4	    Always

Current code does (1) and (2) only (and a shift by one bit).

If one wants a single addmul_1, one could choose k \in {0,1,2,3}
instead, but one still *needs* to take sign (a-b) into account; it's no
use to add the larger operand to the smaller. Analysis of the guaranteed
progress after n iterations is going to get more complicated with k > 0.

For any variant, I think the iteration needs to take into account:

  Low bits of a and b (one can try to maintain b odd at all times, but
  one needs to handle the case of even a). Or high bits for a
  left-to-right algorithm, but I think that's going to be less
  convenient.

  Sign of (a - b).

The straight-forward way to determine the sign of (a - b) is to actually
compute the difference. The result of this subtraction is directly
useful for the "plain" binary algorithm, but a bit wasted if we try
something more sophisticated.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.