Nails or no nails?

Paul Zimmermann Paul.Zimmermann at loria.fr
Tue Jun 19 18:09:17 CEST 2007


> Another reason is that some parts of the
> library, most notably FFT multiplication, does not work at all with
> nails and is therefore disabled.

I'm not sure this is correct. I remember I spent quite some time to "nailify"
the code in mpn/generic/mul_fft.c, to find out that the resulting code was
much slower with nails, but still working.

This inefficiency comes from a design choice in the modulo 2^n+1 routines
of the fft code, which assume n is a multiple of GMP_NUMB_BITS.

The fft algorithm requires n is divisible by a large power of two, say 2^k.
When GMP_NUMB_BITS is 64, we have a "free" factor of 2^6, thus we only need
the number of limbs to be divisible by 2^(k-6). But with 2 nail bits, i.e.,
GMP_NUMB_BITS = 62, we have only a factor 2^1, thus the number of limbs has
to be divisible by 2^(k-1), which gives a much larger n, and this explains
the inefficiency.

This limitation would be easy to get around, by implementing arithmetic
modulo 2^n+1 without the constraint that GMP_NUMB_BITS divides n.
(We only need addition, subtraction, and multiplication by 2^k.)

Maybe this would even speed up the non-nail case (for small sizes).

Paul Zimmermann



More information about the gmp-discuss mailing list