mpn_sqrtrem{1,2} - patch for pure C implem

Sun Apr 2 14:07:20 UTC 2017

Hi,

On Sun, 2 April 2017 12:52 pm, Adrien Prost-Boucle wrote:
> On my side, I do observe a little improvement (64-bit).

> Note that for each bit size, my test first generates 100 random values
> and then does sqrt repeatedly, changing value each time.

That strategy is probably better than using ordered operands.
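
A minimal sketch of that kind of test loop, for illustration only: it
is not Adrien's actual benchmark. The names fill_operands, isqrt64 and
run_sqrt_loop are hypothetical, and isqrt64 is a plain bit-by-bit
stand-in for the function being measured. The timing code that would
wrap run_sqrt_loop is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_OPERANDS 100

/* Small xorshift PRNG, so the operand set is reproducible. */
static uint64_t xorshift64(uint64_t *state)
{
  uint64_t x = *state;
  x ^= x << 13;
  x ^= x >> 7;
  x ^= x << 17;
  *state = x;
  return x;
}

/* Fill ops[] with n random values of exactly `bits` bits (top bit set). */
static void fill_operands(uint64_t *ops, size_t n, unsigned bits,
                          uint64_t seed)
{
  assert(bits >= 1 && bits <= 64 && seed != 0);
  for (size_t i = 0; i < n; i++) {
    uint64_t v = xorshift64(&seed) >> (64 - bits);
    ops[i] = v | ((uint64_t) 1 << (bits - 1));
  }
}

/* Stand-in for the routine under test: floor(sqrt(x)), bit by bit. */
static uint64_t isqrt64(uint64_t x)
{
  uint64_t r = 0;
  for (int b = 31; b >= 0; b--) {
    uint64_t t = r | ((uint64_t) 1 << b);
    if (t * t <= x)   /* t < 2^32, so t * t cannot overflow */
      r = t;
  }
  return r;
}

/* Call the routine `iters` times, moving to the next operand on every
   call. The returned checksum keeps the compiler from dropping the work. */
static uint64_t run_sqrt_loop(const uint64_t *ops, size_t n, size_t iters)
{
  uint64_t sum = 0;
  for (size_t i = 0; i < iters; i++)
    sum += isqrt64(ops[i % n]);
  return sum;
}
```

Changing the operand on every call keeps the branch predictor from
learning a single input, which is presumably why it gives more
realistic numbers than a sorted sequence.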

On Sat, 1 April 2017 9:02 pm, Adrien Prost-Boucle wrote:
> On Sat, 2017-04-01 at 18:15 +0200, Marco Bodrato wrote:
>> May I suggest to save one more operation with:
>>
>> invroot = invroot * (((CNST_LIMB(3) << (GMP_LIMB_BITS-2-9)) -
>>                      (a0 >> 27) * invroot * invroot));

> I was too focused on using as many significant bits as possible, and
> didn't realize it was not useful there.

I played with your code, reducing the precision, and obtained:

  gmp_uint_least32_t invroot, temp1, temp2;

  invroot = invsqrttab[(a0 >> GMP_LIMB_BITS - 9) - 0x80] + 0x100;
  temp1 = ((CNST_LIMB(3) << GMP_LIMB_BITS - 16) -
        ((a0 >> 32) * invroot * invroot)) >> 27;
  invroot *= temp1;
  root = (a0 >> 32) * invroot >> 30;
  temp2 = a0 - root * root >> 30;
  root += (mp_limb_t) temp2 * invroot >> 33;

I'm sure there are places where the precision can be reduced further, but
my goal was to use only 32x32 multiplications. With that, you basically
have the sequence for sqrtrem2 with ABI=32 :-)
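
To spell out what "32x32 multiplications" means here, a small
stand-alone sketch using the <stdint.h> types instead of GMP's own.
The helper names (mul32x32_wide, mul32x32_low, high_half_times) are
made up for the illustration and are not part of GMP.

```c
#include <stdint.h>

/* 32x32 -> 64: both operands fit in 32 bits, the full product is kept.
   This is the shape of `(mp_limb_t) temp2 * invroot` above. */
static uint64_t mul32x32_wide(uint32_t a, uint32_t b)
{
  return (uint64_t) a * b;
}

/* 32x32 -> 32: only the low half of the product survives, as in
   `invroot *= temp1` when invroot is a 32-bit variable. */
static uint32_t mul32x32_low(uint32_t a, uint32_t b)
{
  return (uint32_t) (a * b);
}

/* Same shape as `root = (a0 >> 32) * invroot >> 30`: the high half of
   a0 times a 32-bit value, then a shift. `*` binds tighter than `>>`,
   so the product is formed first and shifted afterwards. */
static uint64_t high_half_times(uint64_t a0, uint32_t invroot)
{
  return mul32x32_wide((uint32_t) (a0 >> 32), invroot) >> 30;
}
```

On a 32-bit ABI each of these maps to a single multiply instruction
(or one widening multiply), which is the point of keeping every factor
within 32 bits.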

-- 
http://bodrato.it/