Toom-4.5 (aka Toom-5x4, Toom-6x3, Toom-7x2)

Niels Möller nisse at lysator.liu.se
Thu Oct 15 23:08:18 CEST 2009


Torbjorn Granlund <tg at gmplib.org> writes:

> I am working on mpn_bdiv_q_1_pi2, i.e. Hensel division by a one-limb
> number d using a two-limb inverse d^(-1) mod B^2.  It should run at 6 or
> 7 c/l on AMD chips.  The code is simple.

[...]

> I think mpn_bdiv_q_1_pi2 is the most sane approach for Toom's needs.

Sanity is a nice goal ;-)

And one could still use two calls to mpn_bdiv_dbm1c for the special
divisors where that works, which, if I got your numbers right, would be
one or two cycles faster per limb.
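
For readers unfamiliar with the term, here is a minimal sketch of Hensel
(2-adic) division by an odd one-limb divisor. It is not the proposed
mpn_bdiv_q_1_pi2 and not GMP's actual code: it uses a plain one-limb
inverse d^(-1) mod B rather than a two-limb one, assumes 64-bit limbs
(B = 2^64), and relies on the GCC/Clang `unsigned __int128` extension.
The names sketch_binvert_limb and sketch_bdiv_q_1 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: inverse of an odd limb d modulo B = 2^64.
   Newton iteration; each step doubles the number of correct low bits. */
static uint64_t sketch_binvert_limb(uint64_t d)
{
    uint64_t inv = d;            /* d*d == 1 (mod 8): correct to 3 bits */
    for (int i = 0; i < 5; i++)  /* 3 -> 6 -> 12 -> 24 -> 48 -> 96 bits */
        inv *= 2 - d * inv;
    return inv;
}

/* Hypothetical sketch: compute q with q * d == u (mod B^n), d odd.
   Works from the least significant limb upwards. */
static void sketch_bdiv_q_1(uint64_t *q, const uint64_t *u, size_t n,
                            uint64_t d)
{
    assert(d & 1);               /* Hensel division needs an odd divisor */
    uint64_t dinv = sketch_binvert_limb(d);
    uint64_t c = 0;              /* carry from the previous limb */

    for (size_t i = 0; i < n; i++) {
        uint64_t s = u[i] - c;           /* low limb of the remainder */
        uint64_t borrow = u[i] < c;      /* did the subtraction wrap? */
        uint64_t ql = s * dinv;          /* quotient limb: ql * d == s (mod B) */
        q[i] = ql;
        /* The low limb of ql * d cancels s exactly; the high limb plus the
           borrow is what must be subtracted from the next limb. */
        c = (uint64_t)(((unsigned __int128)ql * d) >> 64) + borrow;
    }
}
```

When d divides u exactly, q is the true quotient; otherwise q is only
the quotient modulo B^n. For example, dividing the two-limb number 1 by 3
gives the inverse of 3 modulo 2^128.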

/Niels

