[PATCH] Add optimized addmul_1 and submul_1 for IBM z13

Torbjörn Granlund tg at gmplib.org
Mon Mar 1 23:30:38 UTC 2021


Torbjörn Granlund <tg at gmplib.org> writes:

  I played a bit with an addmul_1 of my own, with some ideas from your
  code.  I don't plan to do more work on this.  Does this perform well on
  hardware?

I now realise that the instruction sequence of my example is essentially
the same as in your code, except with more unrolling.  That's a bit
surprising, as I wrote it by understanding the instruction set, starting
at the vac* insn which your code made me look at.  I did not actually
look at your code while writing my code.


-- 
Torbjörn
Please encrypt, key id 0xC8601622

