GMP 4.3 multiplication performance
Niels Möller
nisse at lysator.liu.se
Tue Jun 2 23:54:28 CEST 2009
bodrato at mail.dm.unipi.it writes:
> Toom43 exists too! I announced it on this list some months ago :-) but the
> list was silent that times:
> http://gmplib.org/list-archives/gmp-devel/2009-February/000818.html .
Cool. I missed/forgot that. I've spent a few hours on
toom_interpolate_6pts, using a sequence basically taken from your
paper but with slightly different signs, to avoid explicit negation
operators:
w0 = f(0),
w1 = f(1),
w2 = f(-1)
w3 = f(2)
w4 = f(-2)
w5 = limit at infinity of f(x) / x^5,
w1 -= w2
w3 -= w4
w3 /= 2
w4 += w3
w4 -= w0
w3 -= w1
w1 /= 2
w2 += w1
w2 -= w0
w3 /= 6
w4 -= 4 * w2
w3 -= 4 * w5
w1 -= w3
w4 /= 12
w2 -= w4
w3 -= w5
But I didn't waste too much time; I haven't spent the time needed to
test, debug and optimize my interpolation function.
One comment on your interpolation function: You don't need any scratch
space. But then for the operation
/* W2 =(W2 - W4)/3 - W0<<2 */
you use mpn_submul_1 with a constant multiplier of 4, which I imagine
is more costly than a shift. toom_interpolate_7pts has the same
problem, and that code does the opposite, it uses mpn_lshift to a
scratch area followed by mpn_sub_n.
I'm not sure what's fastest, but it's definitely annoying that you
can't shift without using a scratch area.
Regards,
/Niels
More information about the gmp-devel
mailing list