Improvements to powerpc32 asm code
Kevin Ryde
user42@zip.com.au
Tue, 03 Jun 2003 11:33:58 +1000
Mark Rodenkirch <mrodenkirch@wi.rr.com> writes:
>
> Yes, that was -C. Here are the -CD results if you are interested:
>
> 1 (21.1270) (#10.0622)
> 2 5.0496 #4.0293
> 3 4.0124 #3.0227
> 4 #4.0166 8.0599

No, you need to apply -D over steps of 16 limbs or similar, especially
if the code is unrolled to a size like that and hence has special-case
finish-ups for the various sizes modulo the unroll.  See tune/README,
for example:
    ./speed -s 16-64 -t 16 -C -D mpn_add_n
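
To illustrate why the step matters, here is a rough sketch with a toy cost
model.  It is not GMP code and the numbers are not measurements.  The
unroll of 16, the overhead, the per-limb cost and the finish-up costs are
all made-up assumptions, and it assumes -C -D amounts to roughly
(t(n) - t(prev)) / (n - prev).  With a step of 1 the finish-up cost for
each size mod 16 pollutes the difference; with a step of 16 it cancels and
only the steady per-limb cost is left.

```python
UNROLL = 16  # assumed unroll factor, hypothetical


def modelled_cycles(n, overhead=20.0, per_limb=3.0):
    """Toy cost model (invented figures): fixed call overhead, a steady
    per-limb cost, plus a finish-up cost depending on n mod UNROLL."""
    leftover = n % UNROLL
    finish_up = 0.0 if leftover == 0 else 5.0 + 1.5 * leftover
    return overhead + per_limb * n + finish_up


def diff_per_limb(sizes):
    """Difference between successive sizes, per limb.  This is only an
    assumed approximation of what -C -D reports."""
    out = []
    for prev, n in zip(sizes, sizes[1:]):
        delta = modelled_cycles(n) - modelled_cycles(prev)
        out.append((n, delta / (n - prev)))
    return out


if __name__ == "__main__":
    # Step of 1 across an unroll boundary: finish-ups dominate.
    for n, d in diff_per_limb([14, 15, 16, 17, 18]):
        print("step 1   size %3d  %8.4f" % (n, d))
    # Step of 16: every size has the same (empty) finish-up.
    for n, d in diff_per_limb([16, 32, 48, 64]):
        print("step 16  size %3d  %8.4f" % (n, d))
```

In this model the step-16 differences all come out at the per-limb figure,
while the step-1 differences jump around and can even go negative where a
size lands exactly on the unroll.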