speed of unbalanced multiplication
bodrato at mail.dm.unipi.it
bodrato at mail.dm.unipi.it
Wed Feb 6 17:59:44 CET 2013
Ciao Paul!
Il Dom, 27 Gennaio 2013 10:09 am, Zimmermann Paul ha scritto:
> In the FTT range, multiplying n limbs by m limbs should not be more
> expensive then multiplying two numbers of (n+m)/2 limbs.
Of course. With current implementation, unbalanced multiplications need
some more memory and a few additions/subtractions, but this should not
give a measurable slow-down. The "matrix" obtained with
$ tune/speed -s 400000-800000 -t 100000 mpn_mul.800000 mpn_mul.900000
mpn_mul.1000000 mpn_mul.1100000 mpn_mul.1200000
shows that times are not as monotonic as desired, but "unbalancement" does
not really have an influence.
> In that case, we get a 6.5% slowdown with respect to that strategy.
I think that the culprit is the tune/speed program, but I'm not able to
correct it. I just tested the attached patch. After patching, the results
are:
$ tune/speed -s 800000-1000000 -t 100000 mpn_mul_n mpn_mul mpn_mul_bal
overhead 0.000000000 secs, precision 10000 units of 3.13e-11 secs, CPU
freq 31990.26 MHz
mpn_mul_n mpn_mul mpn_mul_bal
800000 0.646571000 0.682834000 #0.632961000
900000 #0.652178000 0.678564000 0.655979000
1000000 0.710674000 0.740998000 #0.702508000
As you can see mpn_mul_n and mpn_mul_bal are comparable, and mpn_mul is
always slower. All the three functions measure the time to multiply two
numbers of the same size. mpn_mul_bal and mpn_mul measure the same
function, but the first uses the same macro that tune/speed uses for
mpn_mul_n...
Regards,
m
--
http://bodrato.it/papers/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: speed.diff
Type: text/x-patch
Size: 1382 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20130206/2c58991a/attachment.bin>
More information about the gmp-devel
mailing list