Toom-4.5 (aka Toom-5x4, Toom-6x3, Toom-7x2)
nisse at lysator.liu.se
Thu Oct 15 23:08:18 CEST 2009
Torbjorn Granlund <tg at gmplib.org> writes:
> I am working on mpn_bdiv_q_1_pi2, i.e. Hensel division by a one-limb
> number d using a two-limb inverse d^(-1) mod B^2. It should run at 6 or
> 7 c/l on AMD chips. The code is simple.
> I think mpn_bdiv_q_1_pi2 is the most sane approach for Toom's needs.
Sanity is a nice goal ;-)
And one could still use two calls to mpn_bdiv_dbm1c for the special
divisors where that works, which, if I got your numbers right, would be
one or two cycles faster per limb.
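
For readers unfamiliar with Hensel division, here is a minimal sketch of exact division by an odd one-limb d, working from the least significant limb upwards. It is not the mpn_bdiv_q_1_pi2 code discussed above: it assumes 64-bit limbs, uses a one-limb inverse d^(-1) mod B rather than the two-limb inverse mod B^2, and relies on the GCC/Clang `unsigned __int128` extension. The names `binvert_limb` and `bdiv_q_1` are chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inverse of an odd limb d modulo B = 2^64, by Newton iteration.
   For odd d, d*d == 1 (mod 8), so d is its own inverse to 3 bits;
   each step doubles the number of correct bits (3, 6, 12, 24, 48, 96). */
static uint64_t binvert_limb(uint64_t d)
{
    uint64_t inv = d;
    for (int i = 0; i < 5; i++)
        inv *= 2 - d * inv;
    return inv;
}

/* Hensel (2-adic) division: compute Q = U * d^(-1) mod B^n.
   Limbs are stored least significant first. If d divides U exactly,
   Q is the true quotient U / d. Illustrative one-limb-inverse version. */
static void bdiv_q_1(uint64_t *qp, const uint64_t *up, size_t n, uint64_t d)
{
    assert(d & 1);                      /* d must be odd to be invertible */
    uint64_t di = binvert_limb(d);
    uint64_t c = 0;                     /* carry into the next limb */

    for (size_t i = 0; i < n; i++) {
        uint64_t u = up[i];
        uint64_t borrow = u < c;        /* borrow from subtracting c */
        uint64_t t = u - c;
        uint64_t q = t * di;            /* q*d == t (mod B) */
        qp[i] = q;
        /* The low limb of q*d cancels t; the high limb propagates.
           hi < d <= B-1, so hi + borrow cannot overflow. */
        uint64_t hi = (uint64_t)(((unsigned __int128)q * d) >> 64);
        c = hi + borrow;
    }
}
```

Each quotient limb here depends on the previous one through the multiply-high and the carry, which is the dependency chain that limits the cycles-per-limb figure in a loop of this shape.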