# [Paul Zimmermann <Paul.Zimmermann@loria.fr>] mpz_cbrtrem

**Jason Moxham
**
J.L.Moxham@maths.soton.ac.uk

*Mon, 4 Nov 2002 13:16:11 +0000*

On Monday 04 Nov 2002 12:54 pm, Paul Zimmermann wrote:
>* > [zimmerma@ecrouves ~/gmp]$ ./cbrtrem 50000 1
*>* > mpz_root took 29670ms
*>* > mpz_sqrtrem took 2270ms
*>* > mpz_cbrtrem took 1950ms
*>*
*>* strange , On my athlon mpz_cbrtrem is 50% slower than mpz_sqrtrem ?
*>*
*>* For the same number of limbs?
*
I didn't change you code , so it should be , used same compiler options a=
s gmp=20
lib was compiled with , although I replaced cputime() with clock() =20
>*
*>* > while being reasonably fast for one-limb operands:
*>* >
*>* > [zimmerma@ecrouves ~/gmp]$ ./cbrtrem 1 100000
*>* > mpz_root took 1830ms
*>* > mpz_sqrtrem took 110ms
*>* > mpz_cbrtrem took 2400ms
*>* >
*>* > The key function is mpz_add_bits: set the low bits of the destinat=
*ion
>* > to bits n0 to n1-1 from the operand.
*>* >
*>* > Paul
*>*
*>* I assume this is the karatsuba cube root , does it generilize to k-t=
*h
>* root ?
*>*
*>* Yes this is a recursive algorithm, with a direct computation of the
*>* remainder together with the cube root. It generalizes to k-th root, but
*>* when k increases the efficiency decreases with respect to first computi=
*ng
>* the k-th root, then the remainder by n - s^k.
*>*
*>* If this is the karatsuba cbrt-root then I was under the impression t=
*hat
>* it was not asymptotically optimal , although faster for pratical ranges=
* .
>*
*>* In fact my "karatsuba square root" is more generally a divide & conquer
*>* algorithm. Unlike divide & conquer division, it is asymptotically optim=
*al
>* (except perhaps for the multiplicative constant), since it is just a Ne=
*wton
>* iteration with incremental remainder computation.
*>*
*>* So Paul , if it does generalize to k-th root , do you plan to implem=
*ent
>* it for GMP ? , or does anyone else ?
*>*
*>* I first want to see if the cube-root implementation is fast enough.
*>* How does it compare to your generic implementation for k=3D3?
*>*
*
It faster by a couple of percent , so when optimized should be much faste=
r and=20
simpler than varible precision newton
I wish you had emailed me , that you were working on it , I would put off=
the=20
mpn-ification of my code until I had compaired it with you code to see if=
it=20
was worth the effort.
>* Paul
*