Some ARM Cortex-A8 improvements

Richard Henderson rth at twiddle.net
Wed Feb 1 13:25:43 CET 2012


Three patches herein.  If there's a better way to submit patches,
please advise; I've never used hg before.

The first patch gives gcc control over ctz/clz.  This matters
particularly for armv6t2 and later, which have rbit that can be
used to implement ctz.
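
For readers unfamiliar with the rbit trick mentioned above: counting
trailing zeros of x is the same as counting leading zeros of x with its
bits reversed.  The following is only a portable C sketch of that
identity, not the patch itself.  The function names are hypothetical, and
the two helpers are software stand-ins for the rbit and clz instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Software stand-in for the ARM "rbit" instruction (assumption: 32-bit
   words): reverse the order of all 32 bits. */
static uint32_t bit_reverse32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
    x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
    return (x >> 16) | (x << 16);
}

/* Software stand-in for "clz".  x must be nonzero. */
static unsigned count_leading_zeros32(uint32_t x)
{
    unsigned n = 0;
    assert(x != 0);
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* ctz(x) == clz(rbit(x)) for nonzero x. */
static unsigned count_trailing_zeros32(uint32_t x)
{
    assert(x != 0);
    return count_leading_zeros32(bit_reverse32(x));
}
```

On hardware with both instructions, the two helpers would each be a
single instruction, so ctz costs two instructions in total.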

The second patch improves multiplication a bit.  I'm still playing
with addmul_2, but this is a start for addmul_1/mul_1.  I couldn't
do better than the existing submul_1.  Unfortunately the XScale
machines in the gcc build farm are turned off, so I can't test
whether I've regressed on that platform.
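
As a reminder of what the routines named above compute, here is a plain C
reference sketch.  It assumes 32-bit limbs.  The names limb_t, ref_mul_1
and ref_addmul_1 are hypothetical and only illustrate the semantics; the
patch itself is ARM assembly and looks nothing like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs */

/* {rp,n} = {up,n} * v; returns the high limb that did not fit. */
static limb_t ref_mul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)up[i] * v + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* {rp,n} += {up,n} * v; returns the carry-out limb. */
static limb_t ref_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        /* Cannot overflow 64 bits:
           (2^32-1)^2 + 2*(2^32-1) == 2^64 - 1. */
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}
```

submul_1 is the same loop with the product subtracted from {rp,n} and a
borrow returned instead of a carry.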

The third patch tidies up add_n/sub_n, and adds the carry-in
entry points (add_nc/sub_nc).
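
To illustrate what a carry-in entry point buys: it lets a caller add a
long operand in pieces, feeding the carry out of one piece into the
next.  A minimal C sketch of the semantics follows, assuming 32-bit
limbs.  The names limb_t, ref_add_nc and ref_add_n are hypothetical; this
is not the assembly from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs */

/* {rp,n} = {up,n} + {vp,n} + carry_in; returns the carry out (0 or 1). */
static limb_t ref_add_nc(limb_t *rp, const limb_t *up, const limb_t *vp,
                         size_t n, limb_t carry_in)
{
    limb_t carry = carry_in;
    assert(n > 0);
    assert(carry_in <= 1);
    for (size_t i = 0; i < n; i++) {
        limb_t s = up[i] + carry;      /* may wrap around */
        limb_t c1 = s < carry;         /* 1 if the first add wrapped */
        limb_t r = s + vp[i];          /* may wrap around */
        limb_t c2 = r < s;             /* 1 if the second add wrapped */
        rp[i] = r;
        carry = c1 | c2;               /* at most one of them can be set */
    }
    return carry;
}

/* The plain entry point is the carry-in one with carry_in == 0. */
static limb_t ref_add_n(limb_t *rp, const limb_t *up, const limb_t *vp,
                        size_t n)
{
    return ref_add_nc(rp, up, vp, n, 0);
}
```

Adding the low limbs with ref_add_n and then the high limbs with
ref_add_nc, passing along the returned carry, gives the same result as
one ref_add_n call over the whole operand.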

Speed testing these is a bit touchy.  There's no cycle counter
available in userspace, and Hz is depressingly low, so I've had
to bump the minimum iterations way up in order to get
semi-reliable results.  That makes the speed testing take quite
a long time.
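
The reason a coarse timer forces a high iteration count can be shown
with a small sketch: keep doubling the number of calls until the
measured interval is long compared to the timer resolution, then divide.
This is only an illustration in standard C using clock(); it is not how
GMP's own speed program is written, and seconds_per_call and sample_work
are hypothetical names.

```c
#include <assert.h>
#include <time.h>

/* Estimate the cost of one call to fn by repeating it until at least
   min_seconds of CPU time have elapsed.  A larger min_seconds gives a
   steadier figure on a coarse clock, at the price of a longer run. */
static double seconds_per_call(void (*fn)(void), double min_seconds)
{
    unsigned long iters = 1;
    assert(min_seconds > 0.0);
    for (;;) {
        clock_t start = clock();
        for (unsigned long i = 0; i < iters; i++)
            fn();
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (elapsed >= min_seconds)
            return elapsed / (double)iters;
        iters *= 2;   /* interval too short to trust; try twice as many */
    }
}

/* Trivial workload; volatile keeps the compiler from deleting it. */
static volatile unsigned sink;
static void sample_work(void)
{
    sink += 1;
}
```

For example, seconds_per_call(sample_work, 0.5) would run the workload
for at least half a second before reporting a per-call figure.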

Feedback welcome.


r~
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: zz
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20120201/69a10b86/attachment.ksh>