Some secondary asm T3,T4,T5 functions
davem at davemloft.net
Tue Apr 2 22:15:15 CEST 2013
From: Torbjorn Granlund <tg at gmplib.org>
Date: Tue, 02 Apr 2013 21:31:25 +0200
> David Miller <davem at davemloft.net> writes:
>
> > Attached is a dive_1.asm that works for me on real hardware as
> > well as T4 timings from:
> >
> >   tune/speed -p10000000 -s1-1000 -f1.1 -C mpn_divexact_1.3
>
> Terrible speed, as expected on these machines for code that relies on
> mul *latency*.
It is still several cycles faster than the existing code, though, which
gets about 31 c/l (cycles per limb).
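
For readers wondering why this kind of loop is bound by multiply latency, here is a minimal C sketch of the usual inverse-based scheme for exact division by a single limb. It is not the attached dive_1.asm and not GMP's source: the names limb_inverse and divexact_1_sketch are made up for illustration, it assumes 64-bit limbs and an odd divisor that divides the operand exactly, and it uses unsigned __int128, which is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: inverse of an odd d modulo 2^64 by Newton iteration. */
static uint64_t limb_inverse(uint64_t d)
{
    uint64_t inv = d;            /* correct to 3 bits, since d*d == 1 (mod 8) */
    for (int i = 0; i < 5; i++)  /* 3 -> 6 -> 12 -> 24 -> 48 -> 96 bits */
        inv *= 2 - d * inv;
    return inv;
}

/* Hypothetical sketch: dst[0..n-1] = src[0..n-1] / d, limbs stored least
   significant first. Assumes d is odd and divides the operand exactly. */
static void divexact_1_sketch(uint64_t *dst, const uint64_t *src,
                              size_t n, uint64_t d)
{
    uint64_t inv = limb_inverse(d);
    uint64_t c = 0;                   /* carry from the previous limb */

    for (size_t i = 0; i < n; i++) {
        uint64_t s = src[i];
        uint64_t borrow = s < c;      /* borrow out of s - c */
        uint64_t q = (s - c) * inv;   /* first multiply: quotient limb */
        dst[i] = q;

        /* Second multiply: high half of q * d feeds the next iteration. */
        uint64_t h = (uint64_t)(((unsigned __int128)q * d) >> 64);
        c = h + borrow;               /* cannot overflow, since h < d */
    }
}
```

Each quotient limb needs the carry from the previous one, and that carry comes out of the two multiplies in sequence, so iterations cannot overlap and the per-limb cost is roughly two multiply latencies plus the subtract.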