Neon mul_basecase
Torbjorn Granlund
tg at gmplib.org
Mon Feb 25 09:38:25 CET 2013
Richard Henderson <rth at twiddle.net> writes:
I'm not 100% sure how to interpret the speed results, since they're
not really in the same units as before. But comparing to mpn_mul_2
they're not encouraging. Am I doing something wrong with the testing
or did the 1.64 cyc/limb I got for addmul_8 not really come over to
this test?
But at least it passes make check...
On systems where we cannot reach a cycle counter, and where the clock
frequency drops when the system is idle, tune/speed is not reliable.
Passing -p10000000 might help.
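Why a larger precision can help, as a rough sketch: judging from the
"precision 10000000 units of 1.00e-09 secs" line in the output below, the
value asks for each measurement to cover about 10 ms, which gives an idle
CPU time to ramp its clock up and averages out timer noise. The C fragment
below illustrates that general idea only. It is not the code in tune/speed,
and the names now_seconds, time_per_call and busy_work are made up for this
example.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Current monotonic time in seconds, read with clock_gettime. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: call fn(arg) in batches, doubling the batch size
   until a single batch runs for at least min_seconds, then return the
   average time per call for that batch.  Short batches are discarded,
   so the result comes from a run long enough to be measured reliably. */
double time_per_call(void (*fn)(void *), void *arg, double min_seconds)
{
    unsigned long reps = 1;

    assert(min_seconds > 0.0);
    for (;;) {
        double t0 = now_seconds();
        for (unsigned long i = 0; i < reps; i++)
            fn(arg);
        double elapsed = now_seconds() - t0;
        if (elapsed >= min_seconds)
            return elapsed / (double) reps;
        reps *= 2;
    }
}

/* Hypothetical workload standing in for the routine under test:
   bumps a counter through a volatile pointer so the call is not
   optimised away. */
void busy_work(void *arg)
{
    volatile unsigned long *counter = arg;
    *counter += 1;
}
```

Calling time_per_call(busy_work, &counter, 0.01) would correspond roughly to
a 10 ms measurement window. Converting the result to cycles still needs a
CPU frequency, which is the part that stays uncertain without a cycle
counter.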
parma$ tune/speed -p10000000 -C -s 1-40 mpn_mul_basecase mpn_sqr_basecase
clock_gettime is 1.000ns accurate
overhead 51.51 cycles, precision 10000000 units of 1.00e-09 secs, CPU freq 1694.10 MHz
mpn_mul_basecase mpn_sqr_basecase
1 249.1149 #10.9920
2 17.9868 #8.4933
3 18.6519 #8.4938
4 18.4860 #11.2413
5 21.5837 #12.9901
6 21.4839 #15.1557
7 24.4103 #15.9892
8 25.4805 #17.4870
9 29.2007 #17.7648
10 34.2352 #19.0867
11 35.1140 #19.6235
12 36.5579 #20.6528
13 39.5108 #21.4463
14 40.8981 #25.1966
15 43.8365 #26.2477
16 45.5297 #26.6692
17 48.7875 #27.6847
18 50.1309 #27.7579
19 53.3292 #28.9786
20 54.9094 #29.0294
21 58.3364 #30.7874
22 59.6392 #30.8415
23 62.9972 #33.1067
24 64.4941 #32.7251
25 68.0297 #34.8538
26 69.2953 #34.4750
27 72.7605 #37.2694
28 74.1959 #36.3672
29 77.8029 #39.0739
30 79.0416 #38.3379
31 82.5836 #41.6473
32 83.9695 #41.1573
33 87.6311 #43.5729
34 88.8488 #42.8273
35 92.4461 #46.3504
36 93.7906 #46.0767
37 97.4999 #48.3971
38 98.6904 #47.9918
39 102.3349 #51.2183
40 103.6475 #50.2492
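One way to read the table, under an assumption that is mine and not stated
in this message: if -C reports total cycles divided by the operand size n,
then for an n x n basecase multiply (n*n limb products) dividing by n once
more gives cycles per limb product. The helper name below is made up for
this illustration.

```c
#include <assert.h>

/* Assumption: cycles_per_limb is a table entry, i.e. total cycles for an
   n x n operation divided by n.  Dividing by n again yields cycles per
   limb product, since an n x n basecase multiply forms n*n products. */
double cycles_per_limb_product(double cycles_per_limb, int n)
{
    assert(n > 0);
    return cycles_per_limb / (double) n;
}
```

For the n = 40 row this gives 103.6475 / 40, about 2.59 cycles per limb
product for mpn_mul_basecase, and 50.2492 / 40, about 1.26 for
mpn_sqr_basecase. Whether that is directly comparable to the 1.64 cyc/limb
figure quoted above depends on how that figure was normalised.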
--
Torbjörn