Nails or no nails?
Paul Zimmermann
Paul.Zimmermann at loria.fr
Tue Jun 19 18:09:17 CEST 2007
> Another reason is that some parts of the
> library, most notably FFT multiplication, does not work at all with
> nails and is therefore disabled.
I'm not sure this is correct. I remember I spent quite some time to "nailify"
the code in mpn/generic/mul_fft.c, to find out that the resulting code was
much slower with nails, but still working.
This inefficiency comes from a design choice in the modulo 2^n+1 routines
of the fft code, which assume n is a multiple of GMP_NUMB_BITS.
The fft algorithm requires n is divisible by a large power of two, say 2^k.
When GMP_NUMB_BITS is 64, we have a "free" factor of 2^6, thus we only need
the number of limbs to be divisible by 2^(k-6). But with 2 nail bits, i.e.,
GMP_NUMB_BITS = 62, we have only a factor 2^1, thus the number of limbs has
to be divisible by 2^(k-1), which gives a much larger n, and this explains
the inefficiency.
This limitation would be easy to get around, by implementing arithmetic
modulo 2^n+1 without the constraint that GMP_NUMB_BITS divides n.
(We only need addition, subtraction, and multiplication by 2^k.)
Maybe this would even speed up the non-nail case (for small sizes).
Paul Zimmermann
More information about the gmp-discuss
mailing list