Toom-6.5 (Aka: Toom 6'n'half :-P )

bodrato at mail.dm.unipi.it bodrato at mail.dm.unipi.it
Tue Oct 13 21:35:14 CEST 2009


Dear Niels,

> Then evaluation in 3 (full list of points: 0, oo, ±1, ±2, ±3) might
> actually be faster. Since the resulting division by 15 can be done
> with bdiv_dbm1c for both 64 and 32 bit limb sizes.

I do not know if I want to invest the necessary time to rewrite a third
version of Toom-4.5... but evaluations in ±3 can be very useful for
Toom-8.5 (the next step). Because evaluations in ±8 give some overflow (on
32-bits) I'd prefer not to handle... I'll decide...

> I've previously said that I think quadruples like ±2 and ±1/2 or ±x
> and ±1/x are good, but now I'm not sure why I said that. It seems the
> symmetry +x, -x can be exploited in interpolation (after initial
> butterflies, we get a matrix with a regular structure, which makes
> early recombination possible. Is there any use of the other symmetry,
> x, 1/x, in the interpolation? I don't see that.

Yes there is a use of the x, 1/x symmetry, but you need more values
{x,1/x,y,1/y...} to fully exploit it.
I used it for the Toom-6.5, with points {0, ±¼, ±½, ±1, ±2, ±4, oo}. If
you want to look at the code, it is on my web-page, as usual:
http://bodrato.it/software/toom.html#TC6.5

I shall write a note explaining all the tricks, before coding Toom-8.5 too.

Notes about the new code:
- it shares the same toom-tools.h with Toom-4.5,
- it has a single entry point mpn_toom6_n_half_mul(pp,ap,an,bp,bn,scratch)
- it implements both Toom-6 and Toom-6.5, choice performed at runtime
- it implements different (un)balancements: 6:6, 7:6, 7:5, 8:5, 8:4, 9:4
- no _interpolate_12pts() function, (no 9:3, 10:3, 10:2, 11:2 foreseeable)

Interpolation was not refined, I simply wrote it so that it works, but on
my laptop (an old Centrino) it is faster than Toom-4 above 300 limbs and
faster than Toom42 above 400:200.
It may have problems with splitting, I'd not use it with bn < 100 limbs...
To be honest, the code was scarcely tested... but I look forward to
admiring a new spot of colour in Torbjorn's graph :-D One single colour
does, this time.

If someone has the time to test it, please let me know!

Best regards,
Marco

-- 
http://bodrato.it/papers/



More information about the gmp-devel mailing list