4-way Toom-Cook and Centrinia

Josh Liu zliu2 at student.gsu.edu
Wed Aug 4 01:23:11 CEST 2004


   I came up with the below transformations for interpolating the 
product coefficients from the point products at points {0, 1/4, 1/2, 1, 
2, 4, inf} in that order. Does anyone have any better transformations 
for the same points? What about for different points? I would 
appreciate it if anyone can improve upon what I have. I plan to 
implement the 4-way Toom-Cook multiplication algorithm by the end of 
the week so any suggestions, especially before, say, Sunday, would be 
greatly appreciated.
   One can TeX the below file or simply skim through it and look at the 
crucial parts, as the formating is relatively transparent.
   On a different note, I would like some advice for someone working on 
a fast poly-algorithmic multiple precision arithmetic library. I 
already have a working library. The library uses Karatsuba-Ofman, 3-way 
Toom-Cook, and DWT based Schoenhage-Strassen multiplication, 
Burnikel-Ziegler and Newton division, and Euclid, binary, and Lehmer 
GCD. It is currently lacking in extended GCD, exact division, modular 
arithmetic, rational arithmetic, and floating point arithmetic. It has 
a goal of being a computer algebra system so there are many more 
features that GMP does not have, such as a Jenkins-Traub polynomial 
root finder, polynomial arithmetic (a Nussbaumer implementation is used 
for multiplication), random number generators, prime number sieves, 
factoring algorithms, and many many more features.
   What are the most important aspects of GMP that deserved the most 
attention during the development process? What made GMP so fast? Where 
do most bugs originate? What does the debug cycle look like? How long 
does it take, and what steps are taken, when a bug or invalidity is 
found (the latter takes up more of my development time than any other 
factor, almost combined)? Are there any desires for GMP to be 
educational? Are there any efforts taken to reduce the size of the 
code, both machine and source? If there is a desire to reduce code 
size, what steps were taken at which points? Was the development of GMP 
easy? What were the hardest parts in it's development? What 
satisfaction do the authors get from developing the library? What would 
be a good ranking, in relative order of importance, the following 
factors: speed, stability, validity of results, machine code size, code 
simplicity/ease of understanding, wealth of algorithms implemented, 
development time, ease of development, memory footprint, 1337|\|355, 
and personal knowledge attainment. I thank any response to the above 
questions as they would be a guide for the development of my own 
project.

P.S. I stole the basecase multiplication assembly code for the Pentium 
4 and the part of the Burnikel-Ziegler division in the 2x1 operand 
routine for odd sized operands directly from GMP. Is that okay? This is 
a MIT license project. If anyone objects to my using these two blatant 
copying of intellectual property, I would immediately revert to my own 
code. GMP has inspired many implementations of algorithms indirectly. 
Some functions share similarities with GMP code even though I wrote 
these functions from scratch.

P.S.S. GMP has inspired me almost as much as Knuth. Thank you for such 
great work.


---- Begin interpolation_transformations.tex ----

% interpolation_transformations.tex


$$
W = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	4096&	1024&	256&	64&	16&	4&	1\cr
	64&	32&	16&	8&	4&	2&	1\cr
	1&	1&	1&	1&	1&	1&	1\cr
	1&	2&	4&	8&	16&	32&	64\cr
	1&	4&	16&	64&	256&	1024&	4096\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$


$$
A_0 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	-4096&	1&	0&	0&	0&	0&	-1\cr
	-64&	0&	1&	0&	0&	0&	-1\cr
	-1&	0&	0&	1&	0&	0&	-1\cr
	-1&	0&	0&	0&	1&	0&	-64\cr
	-1&	0&	0&	0&	0&	1&	-4096\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_1 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1&	0&	0&	0&	-1&	0\cr
	0&	0&	1&	0&	-1&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	1&	0&	1&	0&	0\cr
	0&	1&	0&	0&	0&	1&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_2 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1&	0&	0&	0&	0&	0\cr
	0&	0&	1&	0&	0&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	0&	-16&	1&	0&	0\cr
	0&	0&	0&	-128&	0&	1&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_3 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1&	-20&	0&	0&	0&	0\cr
	0&	-1&	34&	0&	0&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	0&	0&	50&	-1&	0\cr
	0&	0&	0&	0&	-36&	1&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_4 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1/4&	0&	0&	0&	0&	0\cr
	0&	0&	1/8&	0&	0&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	0&	0&	1/8&	0&	0\cr
	0&	0&	0&	0&	0&	1/4&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_5 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1/105&	0&	0&	0&	0&	0\cr
	0&	0&	1/21&	0&	0&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	0&	0&	1/7&	0&	0\cr
	0&	0&	0&	0&	0&	1/63&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_6 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1&	0&	0&	0&	1&	0\cr
	0&	0&	1&	0&	1&	0&	0\cr
	0&	0&	0&	1&	1&	1&	0\cr
	0&	0&	-1&	0&	1&	0&	0\cr
	0&	-1&	0&	0&	0&	1&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_7 = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1/2&	0&	0&	0&	0&	0\cr
	0&	0&	1/2&	0&	0&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	0&	0&	1/2&	0&	0\cr
	0&	0&	0&	0&	0&	1/2&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_0 \cdot W = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1024&	256&	64&	16&	4&	0\cr
	0&	32&	16&	8&	4&	2&	0\cr
	0&	1&	1&	1&	1&	1&	0\cr
	0&	2&	4&	8&	16&	32&	0\cr
	0&	4&	16&	64&	256&	1024&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_1 \cdot A_0 \cdot W = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1020&	240&	0&	-240&	-1020&	0\cr
	0&	30&	12&	8&	-12&	-30&	0\cr
	0&	1&	1&	1&	1&	1&	0\cr
	0&	34&	20&	16&	20&	34&	0\cr
	0&	1028&	272&	128&	272&	1028&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$


$$
A_2 \cdot A_1 \cdot A_0 \cdot W = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1020&	240&	0&	-240&	-1020&	0\cr
	0&	30&	12&	8&	-12&	-30&	0\cr
	0&	1&	1&	1&	1&	1&	0\cr
	0&	18&	4&	0&	4&	18&	0\cr
	0&	900&	144&	0&	144&	900&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$



$$
A_3 \cdot A_2 \cdot A_1 \cdot A_0 \cdot W = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	420&	0&	0&	0&	-420&	0\cr
	0&	0&	168&	0&	-168&	0&	0\cr
	0&	1&	1&	1&	1&	1&	0\cr
	0&	0&	56&	0&	56&	0&	0\cr
	0&	252&	0&	0&	0&	252&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$



$$
A_4 \cdot A_3 \cdot A_2 \cdot A_1 \cdot A_0 \cdot W = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	105&	0&	0&	0&	-105&	0\cr
	0&	0&	21&	0&	-21	0&	0\cr
	0&	1&	1&	1&	1&	1&	0\cr
	0&	0&	7&	0&	7&	0&	0\cr
	0&	63&	0&	0&	0&	63&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$


$$
A_5 \cdot A_4 \cdot A_3 \cdot A_2 \cdot A_1 \cdot A_0 \cdot W = 
\pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1&	0&	0&	0&	-1&	0\cr
	0&	0&	1&	0&	-1&	0&	0\cr
	0&	1&	1&	1&	1&	1&	0\cr
	0&	0&	1&	0&	1&	0&	0\cr
	0&	1&	0&	0&	0&	1&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

$$
A_6 \cdot A_5 \cdot A_4 \cdot A_3 \cdot A_2 \cdot A_1 \cdot A_0 \cdot W 
= \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	2&	0&	0&	0&	0&	0\cr
	0&	0&	2&	0&	0&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	0&	0&	2&	0&	0\cr
	0&	0&	0&	0&	0&	2&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$


$$
A_7 \cdot A_6 \cdot A_5 \cdot A_4 \cdot A_3 \cdot A_2 \cdot A_1 \cdot 
A_0 \cdot W = \pmatrix {
	1&	0&	0&	0&	0&	0&	0\cr
	0&	1&	0&	0&	0&	0&	0\cr
	0&	0&	1&	0&	0&	0&	0\cr
	0&	0&	0&	1&	0&	0&	0\cr
	0&	0&	0&	0&	1&	0&	0\cr
	0&	0&	0&	0&	0&	1&	0\cr
	0&	0&	0&	0&	0&	0&	1\cr
}
$$

--- End interpolation_transformations.tex ----



More information about the gmp-devel mailing list