Timing Toom-8.5 (mpn level) [Was: Toom-8 testing (mpz level)]

bodrato at mail.dm.unipi.it bodrato at mail.dm.unipi.it
Mon Oct 26 11:41:33 CET 2009


Ciao Alberto,

> Alle 12:49, mercoledì 7 ottobre 2009, Paul Zimmermann ha scritto:

>> > From: Alberto Zanoni <zanoni at volterra.uniroma2.it>

>> > the paper itself, with all the details, on the web, but actually Marco
>> is
>> > working quite ahead of it, and will surely produce something better at
>> > mpn level, with many new ideas, so I think that the paper is actually
>> > already obsolete.

I'm testing the new Toom-8.5 code, it is perfectly working on my 32-bits
CPU, but it is still untested on 64-bits... so that a little bit more time
is needed before release. And yes, I'm confident that the new "early
recomposition" strategy makes it a quite fast code...

>> please could you update your graphics with the new FFT code at

I performed some timing tests, both with the development GMP code and with
the last release of MPIR. This should include also Zimmerman's FFT.

Results, graphs and some observations are available on my web pages:
http://bodrato.it/software/toom.html#TC8.5Time

In short terms (on my old laptop):
- GMP: Toom-8.5 is faster than Toom-4 from ~300 limbs ... and keeps on
being faster than FFT up to 12,000+ limbs
- MPIR: Toom-8.5 is faster than Toom-3 from ~208 limbs ... but the newer
FFT code wins earlier; anyway the cut-off point is above 4,000 limbs

Finer tuning of both libraries and Toom-8.5 recursion can give different
numbers, but the range where the balanced Toom-8 (obtained with a
truncated Toom-8.5) can be the preferred algorithm seems wide.

I hope I'll find the time to test correctness on 64-bits CPU, and release
the code early.

Regards,
Marco

PS: the big graph is very noisy... I know... a single CPU with a single
core, 512Mb of RAM, and Firefox+emacs+gcc+X+OOo+... are the main reasons.

-- 
http://bodrato.it/toom-cook/



More information about the gmp-devel mailing list