GMP 4.3 multiplication performance

Wed Jun 3 12:11:14 CEST 2009

nisse at lysator.liu.se (Niels Möller) writes:

  bodrato at mail.dm.unipi.it writes:

  > I do not know exactly... but mpn_submul_1 does write on memory only once,
  > that's why I prefer it.

  C implementations of combination functions could use a strategy like
  in the current addsub: Use a small temporary area (a suitable fraction
  of the L1 cache, typically allocated on the stack), and do the
  combined operation blockwise with two (for sublsh_n) function calls
  per block.
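
The blockwise strategy described in the paragraph above can be sketched
roughly as follows for a sublsh_n-style operation, rp = up - (vp << cnt).
This is only an illustration under stated assumptions, not GMP's actual
code: limb_t, BLOCK_LIMBS, limb_lshift, limb_sub_nc and
sublsh_n_blockwise are hypothetical names, and the two helpers are
plain-C stand-ins modeled loosely on mpn_lshift and mpn_sub_n so that
the example compiles without GMP. The block size is an arbitrary guess
at "a suitable fraction of the L1 cache".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;          /* assumption: 64-bit limbs */

/* Assumption: 256 limbs = 2 KiB of stack, a small fraction of L1. */
#define BLOCK_LIMBS 256

/* Stand-in for a left shift: rp = sp << cnt, 0 < cnt < 64.
   Returns the bits shifted out of the top limb. */
static limb_t limb_lshift(limb_t *rp, const limb_t *sp, size_t n, unsigned cnt)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        rp[i] = (s << cnt) | carry;
        carry = s >> (64 - cnt);
    }
    return carry;
}

/* Stand-in for a subtraction with borrow-in: rp = up - vp - bin.
   Returns the borrow out (0 or 1). */
static limb_t limb_sub_nc(limb_t *rp, const limb_t *up, const limb_t *vp,
                          size_t n, limb_t bin)
{
    limb_t borrow = bin;
    for (size_t i = 0; i < n; i++) {
        limb_t u = up[i], v = vp[i];
        limb_t d = u - v;
        limb_t b1 = u < v;          /* borrow from u - v */
        limb_t r = d - borrow;
        limb_t b2 = d < borrow;     /* borrow from subtracting borrow-in */
        rp[i] = r;
        borrow = b1 | b2;
    }
    return borrow;
}

/* Blockwise rp = up - (vp << cnt) over n limbs, using a small stack
   buffer and two helper calls per block.  Returns the shifted-out bits
   plus the final borrow, i.e. the amount that overflowed past limb n. */
limb_t sublsh_n_blockwise(limb_t *rp, const limb_t *up, const limb_t *vp,
                          size_t n, unsigned cnt)
{
    limb_t tmp[BLOCK_LIMBS];
    limb_t shift_carry = 0, borrow = 0;

    assert(cnt > 0 && cnt < 64);
    while (n > 0) {
        size_t len = n < BLOCK_LIMBS ? n : BLOCK_LIMBS;

        /* Call 1: shift this block of vp into the temporary area. */
        limb_t out = limb_lshift(tmp, vp, len, cnt);
        tmp[0] |= shift_carry;      /* bits carried in from previous block */
        shift_carry = out;

        /* Call 2: subtract the shifted block, chaining the borrow. */
        borrow = limb_sub_nc(rp, up, tmp, len, borrow);

        rp += len; up += len; vp += len; n -= len;
    }
    return shift_carry + borrow;
}
```

Each block of vp is read once and the result is written once, while the
shifted intermediate stays in the small temporary buffer. For operands
that already fit in the L1 cache this buys nothing over two full-length
passes, which matches the remark in the quoted text below.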

  What are the costs and benefits of using the C implementation of
  addsub? (It naturally varies with size, and the benefit should be zero
  if the operands fit in the L1 cache.)

  It would make the code so much easier to read if most or all calls to
  the combination functions could be done unconditionally. 

Unfortunately, my experience is that they become too slow in C.  The C
mpn_addsub_n is not a success.  We have to live with the #ifdef mess.

-- 
Torbjörn