ARM public key benchmark
Torbjorn Granlund
tg at gmplib.org
Wed Apr 3 15:11:39 CEST 2013
nisse at lysator.liu.se (Niels Möller) writes:
> So it should be doable with the addmul_1 loop and two additional,
> non-recurrency, not instructions per limb, and then maybe some extra
> logic for the return value. One could aim for 4.25 c/l, I guess.
The code below seems to give correct results, but it still runs at
5.25 c/l. Maybe the scheduling can be improved; I just put the new mvn
instructions immediately before the umaal and str.
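For readers following the thread, here is a minimal C sketch of the idea under discussion: doing a submul_1 with an addmul_1-style loop plus two NOT operations per limb. It is not the ARM assembly referred to above, only an illustrative model. The names limb_t, ref_submul_1 and not_submul_1 are hypothetical, and 32-bit limbs are assumed. It relies on the identity r - x = ~(~r + x) modulo the limb-vector size; in this model the carry out of the add happens to equal the borrow out of the subtract.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* assumption: 32-bit limbs, as on a Cortex-A9 */

/* Reference: {rp,n} -= {up,n} * v, returns the borrow-out limb. */
limb_t ref_submul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)up[i] * v + cy;  /* cannot overflow 64 bits */
        limb_t lo = (limb_t)p;
        limb_t hi = (limb_t)(p >> 32);
        limb_t r = rp[i];
        rp[i] = r - lo;
        cy = hi + (r < lo);  /* hi == 2^32-1 implies lo == 0, so no wrap */
    }
    return cy;
}

/* Same operation via an addmul_1-style loop on complemented limbs. */
limb_t not_submul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t r = ~rp[i];                 /* first NOT (mvn in ARM terms) */
        /* umaal-like step: u*v + r + cy always fits in 64 bits */
        uint64_t t = (uint64_t)up[i] * v + r + cy;
        rp[i] = ~(limb_t)t;                /* second NOT, then store */
        cy = (limb_t)(t >> 32);
    }
    return cy;  /* carry out of the add equals borrow out of the subtract */
}
```

Neither NOT sits on the carry chain (cy flows only through the multiply-accumulate step), which matches the "non-recurrency" remark in the quoted text.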
The A9 is not a true OoO design; it wants manual scheduling.
I also suspect the autoincrement of ldr should be replaced by a discrete
pointer update.
--
Torbjörn