speed of unbalanced multiplication
bodrato at mail.dm.unipi.it
Thu Feb 7 09:53:59 CET 2013
Ciao Paul,
I prepared a better patch for the tune/speed program (forget the
experimental one I sent yesterday); it is attached.
Before the patch, mpn_mul seems noticeably slower than mpn_mul_n:
$ tune/speed -s 800000 mpn_mul_n mpn_mul mpn_mul_n mpn_mul
overhead 0.000000002 secs, precision 10000 units of 3.12e-10 secs, CPU freq 3205.77 MHz
          mpn_mul_n      mpn_mul    mpn_mul_n      mpn_mul
800000  0.646153000  0.673501000 #0.643274000  0.686486000
After the patch, which only changes the way tune/speed allocates memory for
the operands, the results are comparable:
$ tune/speed -s 800000 mpn_mul_n mpn_mul mpn_mul_n mpn_mul
overhead 0.000000002 secs, precision 10000 units of 3.12e-10 secs, CPU freq 3200.20 MHz
          mpn_mul_n      mpn_mul    mpn_mul_n      mpn_mul
800000  0.644460000  0.649850000  0.634180000 #0.631246000
There is a side effect: to measure the speed of an unbalanced multiplication,
e.g. ###### x ##, you previously used
tune/speed -s ## mpn_mul.######
Now the roles of the two parameters are swapped, and you have to write
tune/speed -s ###### mpn_mul.##
The transposed version of the matrix of times that I suggested in the
previous message can now be obtained with the following:
$ tune/speed -s 800000-1200000 -t 100000 mpn_mul.400000 mpn_mul.500000 \
    mpn_mul.600000 mpn_mul.700000 mpn_mul.800000 mpn_mul_n
overhead 0.000000002 secs, precision 10000 units of 3.12e-10 secs, CPU freq 3200.23 MHz
         mul.400000 mul.500000 mul.600000 mul.700000 mul.800000  mpn_mul_n
 800000   #0.430677   0.433753   0.515757   0.535098   0.630629   0.645156
 900000   #0.431647   0.521545   0.532850   0.638642   0.644031   0.642488
1000000   #0.522817   0.527930   0.633221   0.646514   0.648290   0.708614
1100000   #0.516791   0.648199   0.640584   0.651306   0.681567   0.857438
1200000    0.647544  #0.640084   0.652030   0.675864   0.690255   0.950390
There are still problems of non-monotonicity (1200000 x 500000 is slightly
faster than both the more unbalanced 1200000 x 400000 and the less
unbalanced 1100000 x 500000), but at least we have isolated the issue.
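To make the non-monotonicity check repeatable, here is a minimal sketch in Python that scans the timing matrix above and reports every entry that was measured faster than a neighbour with one operand smaller. The timings are copied from the table (the mpn_mul_n column is left out); the function name find_inversions and the data layout are illustrative assumptions and are not part of tune/speed. Since these are single measurements, a small inversion may also be timing noise.

```python
# Timings in seconds, copied from the tune/speed output above.
# Rows: size of the larger operand (the -s value), in limbs.
# Columns: size of the smaller operand (the .<r> parameter), in limbs.
big_sizes = [800000, 900000, 1000000, 1100000, 1200000]
small_sizes = [400000, 500000, 600000, 700000, 800000]
times = [
    [0.430677, 0.433753, 0.515757, 0.535098, 0.630629],
    [0.431647, 0.521545, 0.532850, 0.638642, 0.644031],
    [0.522817, 0.527930, 0.633221, 0.646514, 0.648290],
    [0.516791, 0.648199, 0.640584, 0.651306, 0.681567],
    [0.647544, 0.640084, 0.652030, 0.675864, 0.690255],
]


def find_inversions(times, big_sizes, small_sizes):
    """Return (faster, slower) pairs of (big, small) operand sizes.

    A pair is reported when a product was timed faster than the
    neighbouring product whose smaller operand (previous column) or
    larger operand (previous row) is one step smaller.
    """
    inversions = []
    for i, big in enumerate(big_sizes):
        for j, small in enumerate(small_sizes):
            t = times[i][j]
            # Same larger operand, next-smaller second operand.
            if j > 0 and t < times[i][j - 1]:
                inversions.append(((big, small), (big, small_sizes[j - 1])))
            # Same second operand, next-smaller larger operand.
            if i > 0 and t < times[i - 1][j]:
                inversions.append(((big, small), (big_sizes[i - 1], small)))
    return inversions


if __name__ == "__main__":
    for faster, slower in find_inversions(times, big_sizes, small_sizes):
        print("%d x %d was faster than %d x %d" % (faster + slower))
```

Only adjacent sizes are compared, so the sketch reports local inversions; it does not try to judge whether a difference is statistically significant.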
If the other developers do not object to the changed meaning of the .<r>
parameter to mpn_mul, this patch can be applied to the main repo...
Opinions?
Best regards,
m
--
http://bodrato.it/software/combinatorics.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: speed.diff
Type: text/x-patch
Size: 2169 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20130207/97e8fd4c/attachment.bin>