Some secondary asm T3,T4,T5 functions

David Miller davem at davemloft.net
Tue Apr 2 22:15:15 CEST 2013


From: Torbjorn Granlund <tg at gmplib.org>
Date: Tue, 02 Apr 2013 21:31:25 +0200

> David Miller <davem at davemloft.net> writes:
> 
>   Attached is a dive_1.asm that works for me on real hardware as
>   well as T4 timings from:
>   
>   tune/speed -p10000000 -s1-1000 -f1.1 -C mpn_divexact_1.3
>   
> Terrible speed, as expected on these machines for code that relies on
> mul *latency*.

Although it is several cycles faster than the existing code, which
gets about 31 c/l.
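
For readers following the latency remark, here is a minimal C sketch of the generic exact-division-by-one-limb recurrence. It is not the attached dive_1.asm. The names limb_inverse and divexact_by_limb are hypothetical, and the sketch assumes 64-bit limbs, an odd divisor, and a compiler that provides unsigned __int128 (GCC or Clang). Each quotient limb needs a low multiply by the inverse, and the next limb cannot start until the high half of q*d is known, so two dependent multiplies sit on the loop-carried path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inverse of an odd limb d modulo 2^64, by Newton iteration.
   Hypothetical helper name, for illustration only. */
uint64_t limb_inverse(uint64_t d)
{
    assert(d & 1);                 /* only odd d is invertible mod 2^64 */
    uint64_t inv = d;              /* d*d == 1 (mod 8): 3 correct low bits */
    for (int i = 0; i < 5; i++)
        inv *= 2 - d * inv;        /* each step doubles the correct bits */
    return inv;
}

/* Sketch: {rp,n} = {up,n} / d, assuming d is odd and divides {up,n}
   exactly.  Limbs are stored least significant first. */
void divexact_by_limb(uint64_t *rp, const uint64_t *up, size_t n, uint64_t d)
{
    uint64_t inv = limb_inverse(d);
    uint64_t c = 0;                /* carry limb from the previous step */

    for (size_t i = 0; i < n; i++) {
        uint64_t s = up[i];
        uint64_t l = s - c;        /* subtract the carry ... */
        uint64_t borrow = s < c;   /* ... and record the borrow */
        uint64_t q = l * inv;      /* multiply 1: low half, gives q limb */
        rp[i] = q;
        /* multiply 2: high half of q*d feeds the next iteration, so the
           loop runs at mul latency rather than mul throughput */
        c = (uint64_t)(((unsigned __int128)q * d) >> 64) + borrow;
    }
}
```

As a usage example, dividing the two-limb value {0xFFFFFFFFFFFFFFF9, 13} by 7 with divexact_by_limb(rp, up, 2, 7) yields {0xFFFFFFFFFFFFFFFF, 1}.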

