udiv_qrnnd_preinv and udiv_rnnd_preinv macros

Niels Möller nisse at lysator.liu.se
Sun Feb 27 12:25:11 CET 2011


I just committed a new macro, udiv_rnnd_preinv, which is
udiv_qrnnd_preinv with the q argument and the q adjustments omitted.
There are at least half a dozen potential users (but I haven't yet
committed changes to any of them).

A couple of questions:

Both udiv_qrnnd_preinv and udiv_rnnd_preinv have a special case for

  if (__builtin_constant_p (nl) && (nl) == 0)

The macro udiv_rnd_preinv (introduced by Torbjörn, I think) handles only
this special case and goes a bit further in optimizing it. Compared to
udiv_qrnnd_preinv, it does

  _r = ~(_qh + (nh)) * (d);

rather than

  _qh += (nh) + 1;
  _r = (nl) - _qh * (d);

and it also omits the final adjustment, if (UNLIKELY (_r >= (d))), since
that can't happen. Should these optimizations be copied to the nl ==
constant 0 special cases of udiv_qrnnd_preinv and udiv_rnnd_preinv? Then
udiv_rnd_preinv becomes redundant, at least when compiling with gcc, which
understands __builtin_constant_p, and maybe it could be removed?

Alternatively, one could remove these optimizations from
udiv_rnnd_preinv and let callers use udiv_rnd_preinv instead, where
appropriate. And similarly, if needed, introduce a udiv_qrnd and use
that at call sites where nl == constant 0.
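
The two ways of computing the remainder candidate for nl == 0 quoted
above give the same value, because in unsigned arithmetic ~x equals
-(x + 1). A minimal sketch of that identity follows; limb_t and both
function names are hypothetical stand-ins for illustration, not GMP
definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for mp_limb_t */

/* Two-step form: add nh + 1 to the quotient candidate, then subtract
   the product from nl, which is 0 in this special case. */
static limb_t r_two_step(limb_t qh, limb_t nh, limb_t d)
{
    qh += nh + 1;
    return (limb_t)0 - qh * d;
}

/* Complement form: ~(qh + nh) == -(qh + nh + 1) modulo 2^64, so a
   single multiplication yields the same remainder candidate. */
static limb_t r_complement(limb_t qh, limb_t nh, limb_t d)
{
    return ~(qh + nh) * d;
}
```

Both functions work modulo 2^64, so they agree for every input,
including inputs where qh + nh + 1 wraps around.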

Next issue: Masking logic for the hard-to-predict adjustment. For
udiv_rnnd_preinv it's simple to replace

  if (_r > _ql)
    _r += d;

by

  mask = - (mp_limb_t) (_r > _ql);
  _r += mask & d;

I've done similar things before with good results, so I'd be tempted to
do that change on faith, without even benchmarking ;-)

And a similar change could work also for udiv_qrnnd_preinv, where one
would need

  mask = - (mp_limb_t) (_r > _ql);
  _qh += mask;
  _r += mask & d;

(in the assembler version, that sequence corresponds to lea, cmp, cmovc,
sbb $0). Unless compilers are able to compile the current version to
branch-free code (which I doubt, in particular for udiv_qrnnd_preinv), I
think this change ought to help? mpn_divrem_1 or the (obsolete)
mpn_preinv_mod_1 are likely the simplest choices for benchmarking.
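
For readers who want to experiment, here is a rough sketch of the
masked adjustment in plain C. It assumes 32-bit limbs so that uint64_t
can stand in for umul_ppmm and add_ssaaaa. The names limb_t,
sketch_invert, sketch_rnnd and sketch_qrnnd are made up for this
example and are not the GMP macros. The divisor d must have its high
bit set, and nh must be less than d. When the mask fires, the quotient
candidate is decremented and d is added back to the remainder.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* hypothetical 32-bit limb */

/* Reciprocal of a normalized d: floor((2^64 - 1) / d) - 2^32. */
static limb_t sketch_invert(limb_t d)
{
    return (limb_t)(UINT64_MAX / d - ((uint64_t)1 << 32));
}

/* Quotient and remainder of (nh * 2^32 + nl) / d, branch-free first
   adjustment. Requires nh < d and the high bit of d set. */
static void sketch_qrnnd(limb_t *q, limb_t *r,
                         limb_t nh, limb_t nl, limb_t d, limb_t di)
{
    uint64_t p = (uint64_t)nh * di;               /* umul_ppmm */
    p += ((uint64_t)(nh + 1) << 32) + nl;         /* add_ssaaaa, mod 2^64 */
    limb_t qh = (limb_t)(p >> 32);
    limb_t ql = (limb_t)p;
    limb_t rr = nl - qh * d;                      /* mod 2^32 */
    limb_t mask = -(limb_t)(rr > ql);             /* all ones or zero */
    qh += mask;                                   /* q-- when mask set */
    rr += mask & d;                               /* r += d when mask set */
    if (rr >= d) {                                /* rare final adjustment */
        rr -= d;
        qh++;
    }
    *q = qh;
    *r = rr;
}

/* Remainder only: same steps with the quotient updates dropped. */
static limb_t sketch_rnnd(limb_t nh, limb_t nl, limb_t d, limb_t di)
{
    uint64_t p = (uint64_t)nh * di;
    p += ((uint64_t)(nh + 1) << 32) + nl;
    limb_t qh = (limb_t)(p >> 32);
    limb_t ql = (limb_t)p;
    limb_t rr = nl - qh * d;
    limb_t mask = -(limb_t)(rr > ql);
    rr += mask & d;
    if (rr >= d)
        rr -= d;
    return rr;
}
```

Comparing the results against (((uint64_t) nh << 32) | nl) / d and % d
for a range of inputs is a cheap sanity check before benchmarking.
Whether a compiler turns the mask into cmov or sbb style code is not
something this sketch can promise.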

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.


