Toom-4.5 (aka Toom-5x4, Toom-6x3, Toom-7x2)

Niels Möller nisse at
Thu Oct 15 23:08:18 CEST 2009

Torbjorn Granlund <tg at> writes:

> I am working on mpn_bdiv_q_1_pi2, i.e. Hensel division by a one-limb
> number d using a two-limb inverse d^(-1) mod B^2.  It should run at 6 or
> 7 c/l on AMD chips.  The code is simple.


> I think mpn_bdiv_q_1_pi2 is the most sane approach for Toom's needs.

Sanity is a nice goal ;-)

And one could still use two calls to mpn_bdiv_dbm1c for the special
divisors where that works, which, if I got your numbers right, would be
one or two cycles faster per limb.
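
For readers unfamiliar with Hensel division by a single limb, here is a
rough sketch of the idea. It is not the mpn_bdiv_q_1_pi2 code discussed
above: it uses a one-limb inverse d^(-1) mod B rather than a two-limb
inverse mod B^2, it assumes B = 2^64 and an odd d, and it relies on the
unsigned __int128 extension of GCC and Clang. The names binvert_limb_sketch
and bdiv_q_1_sketch are made up for illustration and are not GMP functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Inverse of an odd limb d modulo 2^64, by Newton iteration.
   Hypothetical helper, not GMP's own routine. */
static uint64_t binvert_limb_sketch(uint64_t d)
{
    uint64_t inv = d;          /* d*d == 1 mod 8, so 3 bits are correct */
    for (int i = 0; i < 5; i++)
        inv *= 2 - d * inv;    /* each step doubles the correct bits */
    return inv;
}

/* Hensel quotient: writes n limbs to qp such that
   {qp,n} * d == {up,n} modulo 2^(64n).  Least significant limb first.
   d must be odd.  Hypothetical sketch with a one-limb inverse. */
static void bdiv_q_1_sketch(uint64_t *qp, const uint64_t *up,
                            size_t n, uint64_t d)
{
    uint64_t dinv = binvert_limb_sketch(d);
    uint64_t carry = 0;        /* amount to subtract from the next limb */

    for (size_t i = 0; i < n; i++) {
        uint64_t u = up[i];
        uint64_t borrow = u < carry;   /* did u - carry wrap around? */
        uint64_t q = (u - carry) * dinv;
        qp[i] = q;
        /* The low limb of q*d cancels (u - carry); the high limb, plus
           any borrow, is carried into the next position.  The sum
           cannot overflow because the high limb is at most 2^64 - 2. */
        carry = (uint64_t)(((unsigned __int128)q * d) >> 64) + borrow;
    }
}
```

When d divides the operand exactly, the Hensel quotient is the ordinary
quotient; for example, dividing the two-limb number B^2 - 1 by 3 gives
0x5555555555555555 in both limbs.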

