New T3/T4 code batch
Torbjorn Granlund
tg at gmplib.org
Thu Apr 4 02:40:58 CEST 2013
David Miller <davem at davemloft.net> writes:
> First mul_1, renamed again, now encoding the load scheduling. Only the
> 6c variant is new. Please time it. If it doesn't run at 3 c/l, then
> there are 2 simple things to try, indicated in a comment.
Looks exciting, I'll play around with this a little bit later.
I went ahead and pushed all completed T3/T4/T5 code, including a better
bdiv_dbm1 than the one I sent you. It would be interesting to see
timing data for both the naive variant I mailed and the well-scheduled,
checked-in variant.
I made sure to test the pushed version of all code, with unit testing.
Since a few of the pushed routines lack tests/devel/try.c support, their
unit testing will not be tight wrt spurious memory reads and writes.
Let me know, at some point, if `make check' still works...
I'll spend much less time on hacking sparc code now.
--
Torbjörn