GMP 4.3 multiplication performance

Wed Jun 3 15:18:34 CEST 2009

nisse at lysator.liu.se (Niels Möller) writes:

  Torbjorn Granlund <tg at gmplib.org> writes:

  > That seems somewhat unorthodox.
  >
  > Why override HAVE_NATIVE_mpn_lshift with a new meaning (and what should
  > defining it to empty mean, that it exists or that it does not exist?)?
  >
  > To what should mpn_sub_lshift default?

  I don't know. How would you suggest that one writes code that wants to
  use mpn_sub_lshift, but falls back to either submul_1 or lshift + sub
  depending on the target machine? That's going to be the case for
  practically every use of sub_lshift, and a single #ifdef per call site
  is ugly enough.

My idea was to just choose between submul_1 and sublsh_n, ignoring the
slowdown of submul_1 compared to lshift+sub.

But if tuneup is trained to know which is faster, submul_1 or
lshift+sub, then one would need:

#if HAVE_NATIVE_mpn_sublsh_n
  mpn_sublsh_n (...);
#elif USE_SUBMUL_1_FOR_SUB_LSHIFT
  mpn_submul_1 (...);
#else
  mpn_lshift (...);
  mpn_sub (...);
#endif

Pretty, isn't it?  :-)

Wrapping that in a local mpn_sub_lshift might make things cleaner, with
the drawback that the varying scratch space needs become implicit.
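
For illustration, such a local wrapper might look roughly like the
sketch below.  It is self-contained and does not use GMP: limb_t,
toy_lshift, toy_sub_n, toy_submul_1, demo_sub_lshift and the
DEMO_USE_SUBMUL_1 switch are all made-up names, standing in for
mp_limb_t, mpn_lshift, an equal-length subtraction such as mpn_sub_n,
mpn_submul_1 and a tuneup-chosen macro.  A native sublsh_n branch would
come first in the #if chain; it is left out since there is nothing to
stand in for it here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type; GMP's mp_limb_t is usually wider. */
typedef uint32_t limb_t;

/* Hypothetical tuning switch; a real build would let tuneup decide. */
#ifndef DEMO_USE_SUBMUL_1
#define DEMO_USE_SUBMUL_1 0
#endif

/* Toy lshift: rp[] = up[] << cnt, with 0 < cnt < 32.
   Returns the bits shifted out of the top limb. */
limb_t toy_lshift(limb_t *rp, const limb_t *up, size_t n, unsigned cnt)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t u = up[i];
        rp[i] = (limb_t)(u << cnt) | carry;
        carry = u >> (32 - cnt);
    }
    return carry;
}

/* Toy equal-length subtract: rp[] = up[] - vp[].  Returns the borrow. */
limb_t toy_sub_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t d = (uint64_t)up[i] - vp[i] - borrow;
        rp[i] = (limb_t)d;
        borrow = (limb_t)(d >> 63);   /* 1 if the difference wrapped */
    }
    return borrow;
}

/* Toy submul_1: rp[] -= up[] * v.  Returns the limb borrowed from
   above the top of rp[]. */
limb_t toy_submul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)up[i] * v + borrow;   /* cannot overflow */
        limb_t lo = (limb_t)p;
        borrow = (limb_t)(p >> 32) + (rp[i] < lo);
        rp[i] -= lo;
    }
    return borrow;
}

/* Hypothetical wrapper: rp[] -= up[] << cnt, with 0 < cnt < 32.
   Returns the limb borrowed from above the top of rp[].
   tp[] is n limbs of scratch, touched only by the lshift+sub path. */
limb_t demo_sub_lshift(limb_t *rp, const limb_t *up, size_t n,
                       unsigned cnt, limb_t *tp)
{
#if DEMO_USE_SUBMUL_1
    (void)tp;   /* scratch not needed on this path */
    return toy_submul_1(rp, up, n, (limb_t)1 << cnt);
#else
    limb_t out = toy_lshift(tp, up, n, cnt);
    return out + toy_sub_n(rp, rp, tp, n);
#endif
}
```

Both paths should leave the same limbs in rp[] and return the same
value; building with -DDEMO_USE_SUBMUL_1=1 selects the submul_1 path.
Here the wrapper always takes a scratch pointer, which keeps the
scratch need visible at the call site, but at the price of requiring
scratch even when the chosen path does not use it.  That is just one
possible trade-off.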

  My thinking was that if mpn_submul_1 is the best way to compute
  sub_lshift on a particular machine, then mpn_submul_1 on that machine
  can be considered as a decent native implementation of *both* submul_1
  and sub_lshift.

OK.

-- 
Torbjörn