mpn_sqrtrem1

Tue Dec 20 03:00:14 UTC 2016

Hi,

On Mon, December 19, 2016 6:21 pm, Adrien Prost-Boucle wrote:
> That said, the interesting part in my code is these functions:
> - sqrt32_inv()    for single 32-bit words
> - sqrt64_inv()    for single 64-bit words
> - sqrt64x2_inv()  for double 64-bit words

Is there a reason why you defined three different invsqrt8_ arrays?
Doesn't invsqrttab contain suitable values?
On the other hand, both sqrt64_ and sqrt64x2_ use invroot*invroot, so maybe
the table can store both the value and its square.
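
To make the idea concrete, here is a minimal sketch of a table whose entries
carry both the inverse square root and its square. Everything in it is an
assumption for illustration: the names invsqrt_entry, invsqrt_tab and
invsqrt_tab_init, the 8-bit index range [64, 256) and the 2^12 scaling are
hypothetical, and this is not the layout of GMP's invsqrttab nor of the
invsqrt8_ arrays in your code.

```c
#include <stdint.h>

/* Hypothetical entry: one per 8-bit index a in [64, 256). */
struct invsqrt_entry {
    uint16_t inv;     /* floor(2^12 / sqrt(a)), always in [256, 512] */
    uint32_t inv_sq;  /* inv * inv, stored so callers need not multiply */
};

static struct invsqrt_entry invsqrt_tab[192];

/* Fill the table with integer arithmetic only (no libm needed). */
static void invsqrt_tab_init(void)
{
    uint32_t r = 512;   /* exact value for a = 64: 2^12 / 8 */
    for (uint32_t a = 64; a < 256; a++) {
        /* Largest r with r^2 * a <= 2^24; r never grows as a grows. */
        while ((uint64_t)r * r * a > ((uint64_t)1 << 24))
            r--;
        invsqrt_tab[a - 64].inv = (uint16_t)r;
        invsqrt_tab[a - 64].inv_sq = r * r;
    }
}
```

In real code the table would more likely be a static const array generated
offline. The price of the extra field is a larger table, so whether it pays
off depends on cache behaviour and would have to be measured.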

> I noted that GMP fallback function umul_ppmm(), in longlong.h in GMP code,
> uses 4 multiplications where the Karatsuba method would only require 3,
> I was wondering whether optimization was possible...

Reducing the number of multiplications is possible... but I bet a
Karatsuba umul_ppmm() is not faster than the plain version (at least not
on current 64-bit CPUs ;-)
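
For comparison, a self-contained sketch of both variants for a 64x64 -> 128
bit product built from 32-bit halves. This is not the code in longlong.h:
the names umul_ppmm_plain and umul_ppmm_kara and the pointer-based interface
are hypothetical, and the Karatsuba variant uses the subtractive form, which
is only one of several possible formulations. It does show what the saved
multiplication costs in exchange: two absolute differences, a sign, and a
65-bit middle term.

```c
#include <stdint.h>

#define LOW32 UINT64_C(0xffffffff)

/* Schoolbook: four 32x32 -> 64 multiplications. */
static void umul_ppmm_plain(uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
    uint64_t ul = u & LOW32, uh = u >> 32;
    uint64_t vl = v & LOW32, vh = v >> 32;
    uint64_t x0 = ul * vl, x1 = ul * vh, x2 = uh * vl, x3 = uh * vh;

    x1 += x0 >> 32;                 /* cannot overflow */
    x1 += x2;                       /* may overflow ... */
    if (x1 < x2)
        x3 += UINT64_C(1) << 32;    /* ... so carry into the high word */
    *hi = x3 + (x1 >> 32);
    *lo = (x1 << 32) | (x0 & LOW32);
}

/* Karatsuba (subtractive form): three multiplications.
   uh*vl + ul*vh = x0 + x3 + (uh - ul)*(vl - vh) */
static void umul_ppmm_kara(uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
    uint64_t ul = u & LOW32, uh = u >> 32;
    uint64_t vl = v & LOW32, vh = v >> 32;
    uint64_t x0 = ul * vl, x3 = uh * vh;

    /* Third product on absolute differences, sign tracked separately. */
    int neg = (uh < ul) != (vl < vh);
    uint64_t du = uh < ul ? ul - uh : uh - ul;
    uint64_t dv = vl < vh ? vh - vl : vl - vh;
    uint64_t p = du * dv;

    /* Middle term is up to 65 bits: mid_hi (0 or 1) and mid_lo. */
    uint64_t mid_lo = x0 + x3;
    uint64_t mid_hi = mid_lo < x0;
    if (neg) {
        mid_hi -= (mid_lo < p);     /* borrow */
        mid_lo -= p;
    } else {
        mid_lo += p;
        mid_hi += (mid_lo < p);     /* carry */
    }

    /* product = x3 * 2^64 + mid * 2^32 + x0 */
    uint64_t l = x0 + (mid_lo << 32);
    uint64_t c = l < x0;
    *lo = l;
    *hi = x3 + (mid_lo >> 32) + (mid_hi << 32) + c;
}
```

Both functions return the same result. Which one is faster on a given CPU
can only be settled by timing them there.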

Regards,
m

-- 
http://bodrato.it/papers/