Paul Zimmermann Paul.Zimmermann at
Tue Sep 30 22:25:28 CEST 2003

	Dear gmp-developers,

I have a theoretical question about mpz_gcd. It uses Sorenson's
algorithm, with improvements of Weber. In short, the algorithm
is as follows:

# assume a, b are both odd of n limbs
# B = 2^32
while n > 0 do
1)   find two 1-limb numbers u, v such that B^2 divides u*a + v*b
2)   a <- (u*a + v*b) / B^2 # a has now n-1 limbs
3)   find q such that B divides b + q*a
4)   b <- (b + q*a) / B     # b has now n-1 limbs
     n <- n-1
end while

Steps 1) and 3) cost O(1) since they only need to consider the low
limbs from a and b. Step 2) can be performed by one mpn_mul_1 call
and one mpn_addmul_1 call, both of size n. Step 4) can be performed
by one mpn_addmul_1 call of size n too. Thus considering mpn_mul_1
has the same cost per limb c than mpn_addmul_1, we have 3*c*n to go
from n to n-1, which gives a total cost of about 3/2*n^2.

As a comparison, the basecase multiplication (mpn_mul_basecase) costs
c*n^2, thus we should have mpz_gcd ~ 3/2 * mpn_mul_basecase. However
experimental results show that the ratio is more about 2 than 3/2,
for example for n=5000 on my laptop:

1.5 * mpn_mul_basecase took 1005ms
mpz_gcd took 1370ms

Is there an explanation for this?


More information about the gmp-devel mailing list