sqrt algorithm

Wed Jul 29 19:49:01 UTC 2015

nisse at lysator.liu.se (Niels Möller) writes:

  It could likely be made to work if the input is properly normalized,
  does the current code do that?

  I kind-of dislike having to normalize numbers up-front, but it might
  well be a good thing for sqrt. First, it provides better accuracy for a
  k+k split. And further, we get normalized divisors for all the division
  calls.

If normalising would make something superlinear faster, then normalising
locally (for operand sizes over a threashold) if not required by the
interface makes lot of sense.  After all, normalisation is O(n).

I'd suggest that you make ytour code require normalisation, then see if
it can be mad faster.  After that, I can write the normalisation code if
you think it is too ugly.  :-)

  And for small sizes, one ought to sqrlo rather than mulmod_bnm1 to
  exploit the cancellation. So the tuning would be between sqrlo and
  sqrmod_bnm1. But it seems we only have mullo, not sqrlo?

Squaring basecase code is a bit fiddly, perhaps therefore missing for
some variants.

(We need assembkly mullo for many CPUs too.)

-- 
Torbjörn
Please encrypt, key id 0xC8601622