Performance analysis of Mul algorithms in the same range

Lingchuan Meng lingchuanmeng at gmail.com
Sun Jan 4 18:16:15 CET 2009


On Sun, Jan 4, 2009 at 11:49 AM, Torbjorn Granlund <tg at swox.com> wrote:
> "Lingchuan Meng" <lingchuanmeng at gmail.com> writes:
>
>  > Since I don't know what result you expect, I don't think I can provide
>  > much more assistance.  But perhaps a little debugger session would help
>  > you?
>
>  The expected result should show dramatic difference among the four
>  multiplication algorithms when forced to run in the same range.
>
> The differences can be quite small, in particular near the measured
> thresholds.
>
Here is what I got from 'speed':

lm433 at lm433-spiral:~/dev/gmp-4.2.4/tune$ ./speed -c -s1-6000 -f1.4
mpn_mul_basecase mpn_kara_mul_n mpn_toom3_mul_n mpn_mul_fft
overhead 6.09 cycles, precision 10000 units of 1.00e-09 secs, CPU freq
1000.00 MHz
        mpn_mul_basecase mpn_kara_mul_n mpn_toom3_mul_n   mpn_mul_fft
1              #15.19           n/a           n/a      33897.00
2              #25.34        177.44           n/a      34753.00
3              #66.85        218.06           n/a      35207.00
4             #117.61        234.18           n/a      36057.00
5             #154.17        356.34           n/a      36300.00
6             #194.85        371.38           n/a      36408.00
8             #318.15        510.90           n/a      37165.00
11            #562.11        801.14           n/a      38172.00
15            #957.67       1169.50           n/a      39764.00
20           #1628.00       1706.43       3993.33      42921.00
27            2877.50      #2788.25       4940.33      44417.00
37            5381.50      #4786.33       7118.00      49225.00
51           10032.00      #8355.00       9731.00      57193.00
71           19191.00     #13872.00      15020.00      63065.00
99           36995.00      24512.00     #23621.00      74937.00
138          71475.00      41472.00     #39747.00      89743.00
193         139090.00      71889.00     #67749.00     124434.00
270         276704.00     119767.00    #114043.00     159686.00
377         535982.00     210493.00    #187306.00     233501.00
527        1044070.00     356412.00    #302648.00     337150.00
737        2034903.00     606125.00    #511640.00     512465.00
1031       3981240.00    1022804.00     845393.00    #725831.00
1443       7786178.00    1781544.00    1402529.00    #941168.00
2020      16001000.00    2974489.00    2280418.00   #1500799.00
2827      12001000.00    5187712.00    3831443.00   #2073563.00
3957      20001000.00    8777098.00    6342518.00   #2640879.00
5539      40002000.00   15145853.00   10294469.00   #4864354.00

I expect the difference between timing results of running 'mpz_mul()'
while forcing using the low-level 'basecase' and 'karatsuba'
respectively would be similar to that shown in the 'speed' result. But
the results are actually identical as you'll find in the attachment.

How would you modify the gmp-mpara.h if you wanna force karatsuba to
run in [1, 5000]? Thank you.

>  I did compile in that value, and at least didn't get crashes. However,
>  the build did stop when the SQR thresholds got modified incorrectly
>  in the gmp-mpara.h due to violation of assertion.
>
> Good, your values apparently get used!
>
> --
> Torbjörn
>



-- 
Best regards,

Lingchuan Meng
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mul_cycles_karatsuba.png
Type: image/png
Size: 4041 bytes
Desc: not available
Url : http://gmplib.org/list-archives/gmp-discuss/attachments/20090104/9b1111ce/attachment-0002.png 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mul_cycles_basecase.png
Type: image/png
Size: 4014 bytes
Desc: not available
Url : http://gmplib.org/list-archives/gmp-discuss/attachments/20090104/9b1111ce/attachment-0003.png 


More information about the gmp-discuss mailing list