Wraparound multiplicaton vs mullo

Sun Oct 18 19:46:16 CEST 2009

David Harvey <dmharvey at cims.nyu.edu> writes:

> The idea is to choose M so that multiplication mod M is efficient,
> and so that division by M is efficient (i.e. I suppose linear time).

I see, Montgomery multiplication but generalized so that M is not
necessarily a power of two. Clever.

And then division by M really is *exact* division by M. So if M is not
a power of two, it would be preferable if the inverse mod a suitable
power of two has a simple structure.

But then again, if doing exact division by multiplication by an
inverse mod a power of two, that's again a mullow for computing the
quotient. But again, I guess you replace the mod 2^n used by mullow by
any other convenient modulo which large enough? Interesting, I wonder
if that can beat Jebelean's bidirectional algorithm.

On the topic of montgomery multiplication, I thought the most
promising idea for improvement was to do it bidirectionally as Paul
hinted on some months ago, referring to:

@Article{KaTa08,
  author = 	 {Marcelo E. Kaihara and Naofumi Takagi},
  title = 	 {Bipartite Modular Multiplication Method},
  journal = 	 ieeetc,
  year = 	 2008,
  volume = 	 57,
  number = 	 2,
  pages = 	 {157--164}
}

The idea was, for A, B and N of size n bits, compute the product C =
A*B, then eliminate n/2 of the bits from the left and n/2 from the
right using classical and hensel division by N (this obviously only
adds and subtracts suitable multiples of N), and the remaining n bits
will be A*B 2^{-n/2} mod N in canonical representation.

Regards,
/Niels