fixed transform length FFT
bodrato at mail.dm.unipi.it
bodrato at mail.dm.unipi.it
Fri Oct 23 10:14:03 CEST 2009
Hi Paul!
> The code is automatically generated from a master program which takes as
> input k for a transform length 2^k. You can see an example for k=6 at
Great! Such a complex code needs some automatism for the generation...
> Some timings on a 2.83Ghz Core 2:
>
> tarte% ./fft6 2048 10000
> Test 0
> mul took 8250ms # mpn_mul_n + wraparound
> mpn_mul_fft6 took 4322ms (ratio 0.523879) # fixed length 2^6 FFT SSA
>
> tarte% ./speed -s 2048 mpn_mul_n mpn_mul_fft.6
> overhead 0.000000002 secs, precision 10000 units of 3.53e-10 secs, CPU
> freq 2833.00 MHz
> mpn_mul_n mpn_mul_fft.6
> 2048 0.000812073 #0.000465751
> That code can also give an upper bound for the different Toom variants.
This will be very useful. How should we compare the timing exactly? Above
you compare fft6 with mul_n+wraparound, this comparison is the timing you
need to decide if a wider fft should recur to fft6 or to mpn_mul_n...
But fft6 is not computing the full product of the two operands... Should I
compare fft6 2048x2048 -> 2048+1 with toom-N 1025x1025 -> 2050?
Regards,
Marco
--
http://bodrato.it/
More information about the gmp-devel
mailing list