mpn_sqrtrem{1,2} - rounding mode - erratum

Sun Mar 26 00:55:52 UTC 2017

This is essentially what I propose:

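unsigned int
isqrt (unsigned long a0)
{
  double y;
  /* Scale a0 down slightly so that y does not exceed the true root.  */
  __asm__ ("sqrtsd %1, %0" : "=x" (y) : "xm" (0.999999999999999 * a0));
  /* Truncate and add 1: the candidate is the root or one more.  */
  unsigned int r = 1 + (unsigned int) y;
  /* Step back by one if the candidate overshoots.  */
  return r - ((unsigned long) r * r > a0);
}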
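
For cross-checking a routine like the one above, a portable integer-only
reference might be handy.  This is only a sketch: the name isqrt_ref is
made up for illustration and is not an existing GMP function, and it uses
the classic bit-by-bit method rather than anything FP-based.

```c
#include <stdint.h>

/* Hypothetical reference: floor(sqrt(a)) by the bit-by-bit method.
   Slow, but uses integer operations only.  */
static uint32_t
isqrt_ref (uint64_t a)
{
  uint64_t root = 0;
  uint64_t bit = (uint64_t) 1 << 62;	/* largest power of 4 in 64 bits */

  /* Start at the largest power of 4 not exceeding a.  */
  while (bit > a)
    bit >>= 2;

  while (bit != 0)
    {
      if (a >= root + bit)
	{
	  a -= root + bit;
	  root = (root >> 1) + bit;
	}
      else
	root >>= 1;
      bit >>= 2;
    }
  return (uint32_t) root;
}
```

Comparing against such a reference near perfect squares and at both ends
of the input range is probably more telling than random inputs alone.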

If we decide to use FP for some subset of x86_64 systems, then we need
to check carefully if this is an improvement for each pipeline.  We have
these expensive insns on the critical path:

convert uint64 -> double
fpmul
sqrtsd
convert double -> uint32
imulq, or alternatively mull

I mess with plain 'int' instead of 'long' in order to allow for narrower
multiplication.  The expression (1 + (unsigned int) y) can't overflow
thanks to the magic constant 0.999999999999999.
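
The final adjustment step can be looked at in isolation.  The sketch below
assumes the truncated estimate e is either floor(sqrt(a0)) or one less.
The name correct_root, the 64-bit intermediate, and the explicit test for
r == 2^32 are additions of this sketch and not part of the proposal; they
are there so that the sketch does not depend on 1 + e fitting in 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: given e with floor(sqrt(a0)) - 1 <= e
   <= floor(sqrt(a0)), return floor(sqrt(a0)).  */
static uint32_t
correct_root (uint32_t e, uint64_t a0)
{
  uint64_t r = (uint64_t) e + 1;	/* 64-bit, so e + 1 cannot wrap */

  /* r * r fits in 64 bits whenever r < 2^32.  If r == 2^32 the square
     is 2^64, which exceeds every possible a0.  */
  int too_big = (r >> 32) != 0 || r * r > a0;

  return (uint32_t) (r - too_big);
}
```

With a 32-bit r, as in the proposal, the square always fits in 64 bits,
which is what permits the narrower multiplication; the wider r here only
serves to make the top-of-range case explicit.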

-- 
Torbjörn
Please encrypt, key id 0xC8601622