nisse at lysator.liu.se
Wed Jun 3 00:18:54 CEST 2009
I've reorganized tomm44_mul to avoid using TMP_ALLOC. The storage need
for multiplying two n-limb numbers is now 8 n / 3 + log-term.
The current mpn_toom44_mul_itch says 12 ceil(n/4) + GMP_NUMB_BITS. I
think this should be sufficient and somewhat conservative, but I
haven't analyzed the sub-linear term in detail. (For comparison, both
toom22 and toom33 need 2 n + log term).
The use of parts of the product area for temporaries may be somewhat
irregular, but I think it's a pretty good strategy to evaluate at two
points +x and -x at a time, and store these four (n+1)-limb numbers in
the product area. Then the scratch space is needed to store 4 of the
pointwise products (the other three are also stored in the product
area in due time), and for the recursive multiplications.
More information about the gmp-devel