fixed transform length FFT
Paul.Zimmermann at loria.fr
Fri Oct 23 11:49:42 CEST 2009
> Date: Fri, 23 Oct 2009 10:14:03 +0200 (CEST)
> From: bodrato at mail.dm.unipi.it
> Hi Paul!
> > The code is automatically generated from a master program which takes as
> > input k for a transform length 2^k. You can see an example for k=6 at
> Great! Such a complex code needs some automatism for the generation...
> > Some timings on a 2.83Ghz Core 2:
> > tarte% ./fft6 2048 10000
> > Test 0
> > mul took 8250ms # mpn_mul_n + wraparound
> > mpn_mul_fft6 took 4322ms (ratio 0.523879) # fixed length 2^6 FFT SSA
> > tarte% ./speed -s 2048 mpn_mul_n mpn_mul_fft.6
> > overhead 0.000000002 secs, precision 10000 units of 3.53e-10 secs, CPU
> > freq 2833.00 MHz
> > mpn_mul_n mpn_mul_fft.6
> > 2048 0.000812073 #0.000465751
> > That code can also give an upper bound for the different Toom variants.
> This will be very useful. How should we compare the timing exactly? Above
> you compare fft6 with mul_n+wraparound, this comparison is the timing you
> need to decide if a wider fft should recur to fft6 or to mpn_mul_n...
> But fft6 is not computing the full product of the two operands... Should I
> compare fft6 2048x2048 -> 2048+1 with toom-N 1025x1025 -> 2050?
exactly (more precisely 1024x1024 -> 2048, since the +1 is only one bit).
More information about the gmp-devel