[PATCH 3 of 3] Add MIPS32R1 MADDU-based *mul_1.asm functions

Torbjörn Granlund tg at gmplib.org
Fri Sep 13 08:56:52 UTC 2019


info at mobile-stream.com writes:

  The code tries to keep the [accidental] property of MIPS-II counterparts:
  constant-time operation on 32x16 MDUs as found on e.g. 4KEc and some low-
  end MCUs. Even if that is unimportant, the performance cost is invisible.

What is MDU?

  It is faster on all tried MIPS32R1/R2/R5 CPUs (see the c/l table) and is
  expected to be fast with any pipelined MDU. So-called Area-Efficient MDU
  (optional on some MCUs) will run it *much* slower (~3x for addmul_1).

What is 3x slower than what?

  While functions look similar (especially mul_1 and addmul_1), they are
  kept separate due to corner-case (N=1,2,3) tweaks for P5600 without any
  ill effect on 4KEc or 24KEc at least.

I took a quick look at the code.  Do you use madd/msub for accumulation
here, while actual multiplication is done by multu?  As MIPS lacks
efficient multi-word addition/subtraction, that makes sense.

It has been a long time since I did any substantial work with MIPS, but
it would appear that, at least for addmul_1, madd could also be used for
the multiplication itself.  One should of course avoid creating a slow
recurrence path.
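
To make the suggestion concrete, here is a rough C model of what a
multiply-accumulate based addmul_1 inner step could look like.  It is
only a sketch and is not taken from the patch: the name sketch_addmul_1
is hypothetical, 32-bit limbs are assumed, and the 64-bit variable acc
merely stands in for the HI/LO register pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model (not the patch code) of a 32-bit addmul_1:
   {rp,n} += {up,n} * v, returning the carry-out limb.
   acc plays the role of the HI/LO accumulator pair. */
uint32_t sketch_addmul_1(uint32_t *rp, const uint32_t *up, size_t n, uint32_t v)
{
    uint32_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        /* Seed the accumulator with carry-in plus the destination limb. */
        uint64_t acc = (uint64_t)cy + rp[i];
        /* One unsigned multiply-accumulate step.  This cannot overflow:
           (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
        acc += (uint64_t)up[i] * v;
        rp[i] = (uint32_t)acc;        /* low half, i.e. LO */
        cy = (uint32_t)(acc >> 32);   /* high half, i.e. HI, the next carry-in */
    }
    return cy;
}
```

In this model the only loop-carried dependency is cy, which is the
recurrence path one would want to keep short in real assembly code.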

This assembly code contribution is substantial enough that the GNU
project needs paperwork from you and your employer.  Such paperwork
signs over copyright to the Free Software Foundation.  Are you willing
to sign such paperwork?  (You can remain publicly anonymous, but we need
a real name in private communication and on paperwork.  One of the major
GMP contributors used an alias, so this is something we are used to
handling.)


-- 
Torbjörn
Please encrypt, key id 0xC8601622

