ARM public key benchmark

Torbjorn Granlund tg at gmplib.org
Wed Apr 3 15:11:39 CEST 2013


nisse at lysator.liu.se (Niels Möller) writes:

  nisse at lysator.liu.se (Niels Möller) writes:
  
  > So it should be doable with the addmul_1 loop and two additional
  > mvn (bitwise not) instructions per limb, neither of them on the
  > recurrency path, and then maybe some extra logic for the return
  > value. One could aim for 4.25 c/l, I guess.
  
  The below seems to give correct results, but it still runs at
  5.25 c/l. Maybe scheduling can be improved; I just put the new mvn
  instructions immediately preceding umaal and str.
  
The A9 is not a true out-of-order (OoO) design; it wants manual scheduling.

I also suspect the autoincrement of ldr should be replaced by a discrete
pointer update.
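
For readers following along, here is a minimal C sketch of the
complement idea in the quoted text. It assumes the routine under
discussion is a submul_1-style operation ({rp,n} -= {up,n} * v), which
the quoted text does not say explicitly. It relies on the identity
~(~r + x) == r - x (mod 2^32). The function names and the 32-bit limb
type are hypothetical; this is not GMP's code and says nothing about
scheduling or cycle counts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: {rp,n} -= {up,n} * v, returns the borrow limb. */
uint32_t sketch_ref_submul_1(uint32_t *rp, const uint32_t *up,
                             size_t n, uint32_t v)
{
    uint32_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)up[i] * v + cy;  /* product plus carry-in */
        uint32_t lo = (uint32_t)p;
        uint32_t hi = (uint32_t)(p >> 32);
        uint32_t r = rp[i];
        rp[i] = r - lo;
        cy = hi + (r < lo);                     /* cannot overflow */
    }
    return cy;
}

/* Hypothetical variant: an addmul_1-style loop with two NOTs per limb.
   The multiply-accumulate line mirrors what a single umaal computes
   (product + two 32-bit addends, which always fits in 64 bits). */
uint32_t sketch_not_submul_1(uint32_t *rp, const uint32_t *up,
                             size_t n, uint32_t v)
{
    uint32_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t nr = ~rp[i];                         /* first NOT, after the load */
        uint64_t t = (uint64_t)up[i] * v + nr + cy;   /* umaal-style step */
        rp[i] = ~(uint32_t)t;                         /* second NOT, before the store */
        cy = (uint32_t)(t >> 32);                     /* only the carry is recurrent */
    }
    return cy;
}
```

In this sketch the carry limb out of the complemented loop already
equals the borrow of the subtraction, and neither NOT sits on the
carry chain. Whether the assembly version needs extra logic for the
return value depends on details not shown here.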

-- 
Torbjörn

