Improvements to powerpc32 asm code

Mark Rodenkirch mrodenkirch@wi.rr.com
Sat, 31 May 2003 11:46:10 -0500


I see that one of the tasks is to improve mpn_add_n and mpn_sub_n 
(on powerpc32) to 3.25 cycles per limb.  I have made some changes and 
am in the process of testing them.  If someone else is already working 
on this, I will halt my effort.

To test the changes, I am running adds and subtracts on values from 1 
to 30 limbs for base 2 and base 10 numbers.  If there is a better way 
to test, I would like to know.
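One way this kind of check could be structured is sketched below in portable C. It is only an illustration, not the code I am actually testing: it assumes 32-bit limbs stored least significant first, and the names limb_t, addsub_fn, ref_add_n, ref_sub_n, random_limb and count_mismatches are all made up for this example rather than taken from GMP. The idea is to compare a candidate routine against a plain carry-propagating reference on random operands of every size up to some limit, mixing in all-zero and all-one limbs so that long carry chains get exercised.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t limb_t; /* assumption: 32-bit limbs, as on powerpc32 */

/* Hypothetical common signature for the routines under test. */
typedef limb_t (*addsub_fn)(limb_t *, const limb_t *, const limb_t *, size_t);

/* Reference add: rp = ap + bp over n limbs; returns the carry-out (0 or 1). */
static limb_t ref_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i];
        limb_t s = a + bp[i];   /* wraps modulo 2^32 */
        limb_t c1 = s < a;      /* carry out of a + b */
        limb_t r = s + carry;
        limb_t c2 = r < s;      /* carry out of adding the carry-in */
        rp[i] = r;
        carry = c1 | c2;
    }
    return carry;
}

/* Reference subtract: rp = ap - bp over n limbs; returns the borrow-out (0 or 1). */
static limb_t ref_sub_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i], b = bp[i];
        limb_t d = a - b;       /* wraps modulo 2^32 */
        limb_t b1 = a < b;      /* borrow out of a - b */
        limb_t b2 = d < borrow; /* borrow out of subtracting the borrow-in */
        rp[i] = d - borrow;
        borrow = b1 | b2;
    }
    return borrow;
}

/* Random limb, biased toward 0 and 0xFFFFFFFF to provoke long carry chains. */
static limb_t random_limb(void)
{
    switch (rand() % 4) {
    case 0:  return 0;
    case 1:  return 0xFFFFFFFFu;
    default: return ((limb_t)rand() << 17) ^ ((limb_t)rand() << 8) ^ (limb_t)rand();
    }
}

/* Compare candidate routines with the references for sizes 1..max_limbs.
   Returns the number of failed trials, or (size_t)-1 if allocation fails. */
static size_t count_mismatches(addsub_fn cand_add, addsub_fn cand_sub,
                               size_t max_limbs, int trials)
{
    assert(max_limbs > 0);
    limb_t *buf = malloc(4 * max_limbs * sizeof *buf);
    if (buf == NULL)
        return (size_t)-1;
    limb_t *a = buf, *b = a + max_limbs, *want = b + max_limbs, *got = want + max_limbs;
    size_t failures = 0;

    for (size_t n = 1; n <= max_limbs; n++) {
        for (int t = 0; t < trials; t++) {
            int bad = 0;
            for (size_t i = 0; i < n; i++) {
                a[i] = random_limb();
                b[i] = random_limb();
            }
            /* Addition: limbs and carry-out must match the reference. */
            limb_t carry = ref_add_n(want, a, b, n);
            bad |= cand_add(got, a, b, n) != carry;
            for (size_t i = 0; i < n; i++)
                bad |= got[i] != want[i];
            /* Round trip: (a + b) - b must give back a, with borrow == carry. */
            bad |= cand_sub(got, want, b, n) != carry;
            for (size_t i = 0; i < n; i++)
                bad |= got[i] != a[i];
            /* Subtraction: limbs and borrow-out must match the reference. */
            limb_t borrow = ref_sub_n(want, a, b, n);
            bad |= cand_sub(got, a, b, n) != borrow;
            for (size_t i = 0; i < n; i++)
                bad |= got[i] != want[i];
            failures += (size_t)bad;
        }
    }
    free(buf);
    return failures;
}
```

Passing the new routines as cand_add and cand_sub, for example count_mismatches(new_add, new_sub, 30, 1000) with new_add and new_sub standing in for the rewritten code, should return 0 if they agree with the reference. A harness like this does not cover overlapping source and destination operands, which would need separate cases.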

Here are the speed comparisons:

overhead 4.02 cycles, precision 1000 units of 1.00e-06 secs, CPU freq 500.00 MHz
             mpn_add_n mpn_add_n_new
1             21.1270      #10.0658
2             13.0883       #7.0432
3             10.0633       #5.7040
4              8.5514       #6.0408
5              7.6474       #6.0390
6              7.0422       #5.7012
7              6.6123       #5.4631
8              6.2918       #4.6541
9              6.0379       #4.8089
10             5.8375       #4.7285
15             5.2314       #4.2946
20             4.9345       #3.8258
200            4.1172       #3.3262
2000           4.0363       #3.2768

overhead 4.02 cycles, precision 1000 units of 1.00e-06 secs, CPU freq 500.00 MHz
             mpn_sub_n mpn_sub_n_new
1             22.1485      #11.0705
2             13.5794       #7.5462
3             10.4018       #6.3730
4              8.8061       #6.5436
5              7.8481       #6.2390
6              7.2081       #5.8714
7              6.7600       #5.6110
8              6.4195       #4.9052
9              6.1492       #4.9204
10             5.9390       #4.8325
15             5.3004       #4.3631
20             4.9828       #3.9280
200            4.1214       #3.3354
2000           4.0373       #3.2768

--Mark