tg at gmplib.org
Wed Jul 29 19:49:01 UTC 2015
nisse at lysator.liu.se (Niels Möller) writes:
It could likely be made to work if the input is properly normalized,
does the current code do that?
I kind-of dislike having to normalize numbers up-front, but it might
well be a good thing for sqrt. First, it provides better accuracy for a
k+k split. And further, we get normalized divisors for all the division
If normalising would make something superlinear faster, then normalising
locally (for operand sizes over a threashold) if not required by the
interface makes lot of sense. After all, normalisation is O(n).
I'd suggest that you make ytour code require normalisation, then see if
it can be mad faster. After that, I can write the normalisation code if
you think it is too ugly. :-)
And for small sizes, one ought to sqrlo rather than mulmod_bnm1 to
exploit the cancellation. So the tuning would be between sqrlo and
sqrmod_bnm1. But it seems we only have mullo, not sqrlo?
Squaring basecase code is a bit fiddly, perhaps therefore missing for
(We need assembkly mullo for many CPUs too.)
Please encrypt, key id 0xC8601622
More information about the gmp-devel