Some arm cortex-a8 improvements
rth at twiddle.net
Wed Feb 1 13:25:43 CET 2012
Three patches herein. If there's a better way to submit patches,
please advise; I've never used hg before.
The first patch gives gcc control over ctz/clz. This matters
particularly for armv6t2 and later, which have rbit that can be
used for ctz.
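As a rough illustration of why rbit helps with ctz: counting trailing zeros of a word is the same as counting leading zeros of the bit-reversed word. The sketch below is not the code from the patch. The function names are hypothetical, and the bit reversal is written in portable C as a stand-in for the rbit instruction. A 32-bit `unsigned int` and a gcc-compatible compiler are assumed for the builtin comparison.

```c
#include <stdint.h>

/* Portable stand-in for ARM rbit: reverse the bit order of a 32-bit word. */
static uint32_t bit_reverse32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
    x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
    return (x >> 16) | (x << 16);
}

/* Count leading zeros with a plain loop; x must be nonzero. */
static int clz32(uint32_t x)
{
    int n = 0;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* ctz expressed as clz of the reversed word; x must be nonzero. */
static int ctz32_via_rbit(uint32_t x)
{
    return clz32(bit_reverse32(x));
}

/* The gcc/clang builtin, which leaves instruction selection to the
   compiler; x must be nonzero. */
static int ctz32_builtin(uint32_t x)
{
    return __builtin_ctz(x);
}
```

Both functions should agree for every nonzero input, for example `ctz32_via_rbit(8)` and `ctz32_builtin(8)` both give 3.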
The second patch improves multiplication a bit. I'm still playing
with addmul_2, but this is a start for addmul_1/mul_1. I couldn't
do better than the existing submul_1. Unfortunately the Xscale
machines in the gcc build farm are turned off, so I can't test to
see if I've regressed on that platform.
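For readers unfamiliar with these primitives, here is a plain C sketch of what mul_1 and addmul_1 compute. It is a reference for the semantics only, not the assembly in the patch. The 32-bit `limb_t` type and the `ref_` names are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumed 32-bit limb, as on 32-bit ARM */

/* rp[0..n-1] = up[0..n-1] * v; returns the high limb that did not fit. */
static limb_t ref_mul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)up[i] * v + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* rp[0..n-1] += up[0..n-1] * v; returns the carry-out limb. */
static limb_t ref_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* The worst case is (2^32-1)^2 + 2*(2^32-1) = 2^64-1,
           so the 64-bit sum cannot overflow. */
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}
```

The inner step is a 32x32 multiply plus two 32-bit additions into a 64-bit result, which is the operation the assembly loops have to schedule well.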
The third patch tidies up add_n/sub_n, and provides for the carry-in.
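To make the carry-in idea concrete, here is a plain C sketch of limb-wise addition and subtraction that take an incoming carry or borrow. It is a reference for the semantics, not the patch's assembly. The 32-bit `limb_t` type and the `ref_` names are assumptions made for the example. Passing 0 as the carry-in gives the behaviour of add_n/sub_n.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumed 32-bit limb, as on 32-bit ARM */

/* rp = up + vp + cy over n limbs; cy must be 0 or 1. Returns carry-out. */
static limb_t ref_add_nc(limb_t *rp, const limb_t *up, const limb_t *vp,
                         size_t n, limb_t cy)
{
    for (size_t i = 0; i < n; i++) {
        limb_t s  = up[i] + vp[i];
        limb_t c1 = s < up[i];      /* carry from up + vp */
        limb_t r  = s + cy;
        limb_t c2 = r < s;          /* carry from adding the carry-in */
        rp[i] = r;
        cy = c1 | c2;               /* at most one of the two can be set */
    }
    return cy;
}

/* rp = up - vp - cy over n limbs; cy must be 0 or 1. Returns borrow-out. */
static limb_t ref_sub_nc(limb_t *rp, const limb_t *up, const limb_t *vp,
                         size_t n, limb_t cy)
{
    for (size_t i = 0; i < n; i++) {
        limb_t d  = up[i] - vp[i];
        limb_t b1 = up[i] < vp[i];  /* borrow from up - vp */
        limb_t r  = d - cy;
        limb_t b2 = d < cy;         /* borrow from subtracting the borrow-in */
        rp[i] = r;
        cy = b1 | b2;
    }
    return cy;
}
```

A carry-in entry point lets a caller chain operations on adjacent blocks of limbs without a separate fix-up pass for the carry between blocks.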
Speed testing these is a bit touchy. There's no cycle counter
available in userspace, and Hz is depressingly low. So I've had
to bump the minimum iterations way up in order to get
semi-reliable results, which causes the speed testing to take
quite a long time.
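The general trick with a coarse clock can be sketched as follows: keep doubling the repetition count until one timed run spans enough clock ticks that the quantization error becomes small, then divide. This is not how GMP's own speed program is implemented. The function names and the use of the standard C `clock()` are assumptions made for the example.

```c
#include <time.h>

/* Trivial workload used only to demonstrate the timing loop. */
static volatile unsigned sink;
static void sample_work(void)
{
    sink++;
}

/* Double the repetition count until one timed run lasts at least
   min_ticks clock() ticks. Stores the final repetition count in
   *reps_out and returns the average number of ticks per call. */
static double clock_ticks_per_call(void (*work)(void), clock_t min_ticks,
                                   unsigned long *reps_out)
{
    unsigned long reps = 1;
    for (;;) {
        clock_t t0 = clock();
        for (unsigned long i = 0; i < reps; i++)
            work();
        clock_t dt = clock() - t0;
        if (dt >= min_ticks) {
            *reps_out = reps;
            return (double)dt / (double)reps;
        }
        reps *= 2;
    }
}
```

A call such as `clock_ticks_per_call(sample_work, 1000, &reps)` keeps running until the measured interval reaches 1000 ticks. The larger `min_ticks` is relative to the clock granularity, the more stable the result and the longer the measurement takes, which is the trade-off described above.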