speed of unbalanced multiplication

Zimmermann Paul Paul.Zimmermann at loria.fr
Thu Feb 7 09:22:10 CET 2013


       Marco,

> Date: Wed, 6 Feb 2013 17:59:44 +0100 (CET)
> From: bodrato at mail.dm.unipi.it
> 
> Ciao Paul!

Ciao!!!

> Of course. With the current implementation, unbalanced multiplications
> need some more memory and a few additions/subtractions, but this should
> not give a measurable slow-down. The "matrix" obtained with
> 
> $ tune/speed -s 400000-800000 -t 100000 mpn_mul.800000 mpn_mul.900000 mpn_mul.1000000 mpn_mul.1100000 mpn_mul.1200000
> 
> shows that the times are not as monotonic as desired, but the imbalance
> between the operand sizes does not really have an influence.

Indeed:

frite% ./speed -s 400000-800000 -t 100000 mpn_mul.800000 mpn_mul.900000 mpn_mul.1000000 mpn_mul.1100000 mpn_mul.1200000
overhead 0.000000002 secs, precision 10000 units of 3.33e-10 secs, CPU freq 3000.00 MHz
          mpn_mul.800000   mpn_mul.900000  mpn_mul.1000000  mpn_mul.1100000  mpn_mul.1200000
400000      #0.460029000      0.472029000      0.564035000      0.572035000      0.712044000
500000      #0.476029000      0.560035000      0.548034000      0.692043000      0.708044000
600000      #0.572036000      0.576036000      0.704044000      0.680042000      0.696044000
700000      #0.556035000      0.688043000      0.672042000      0.676042000      0.724046000
800000       0.712045000      0.700044000     #0.668042000      0.688043000      0.772048000
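
To make "not as monotonic as desired" concrete, here is a minimal Python sketch that lists every place in the matrix above where a larger size gives a smaller time. The timings are copied from my run (trailing zeros dropped); the names row_sizes, col_sizes, times and drops are only illustrative and are not part of tune/speed. I assume only that rows are the -s sizes and columns are the suffix given after "mpn_mul.".

```python
# Timings (seconds) copied from the speed run above, trailing zeros dropped.
row_sizes = [400000, 500000, 600000, 700000, 800000]       # values of -s
col_sizes = [800000, 900000, 1000000, 1100000, 1200000]    # suffix after "mpn_mul."
times = [
    [0.460029, 0.472029, 0.564035, 0.572035, 0.712044],
    [0.476029, 0.560035, 0.548034, 0.692043, 0.708044],
    [0.572036, 0.576036, 0.704044, 0.680042, 0.696044],
    [0.556035, 0.688043, 0.672042, 0.676042, 0.724046],
    [0.712045, 0.700044, 0.668042, 0.688043, 0.772048],
]


def drops(values):
    """Return every index i where values[i + 1] is smaller than values[i]."""
    return [i for i in range(len(values) - 1) if values[i + 1] < values[i]]


# Along a row: the -s size is fixed, the column size grows.
row_drops = [
    (row_sizes[r], col_sizes[i], col_sizes[i + 1])
    for r, row in enumerate(times)
    for i in drops(row)
]

# Along a column: the column size is fixed, the -s size grows.
col_drops = [
    (col_sizes[c], row_sizes[i], row_sizes[i + 1])
    for c, col in enumerate(zip(*times))
    for i in drops(col)
]

for fixed, smaller, larger in row_drops:
    print(f"-s {fixed}: time decreases from column {smaller} to column {larger}")
for fixed, smaller, larger in col_drops:
    print(f"column {fixed}: time decreases from -s {smaller} to -s {larger}")
```

The check only compares neighbouring entries, so it says nothing about how much of each decrease is measurement noise.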

> I think that the culprit is the tune/speed program, but I'm not able to
> correct it. I just tested the attached patch. After patching, the results
> are:
> 
> $ tune/speed -s 800000-1000000 -t 100000 mpn_mul_n mpn_mul mpn_mul_bal
> overhead 0.000000000 secs, precision 10000 units of 3.13e-11 secs, CPU freq 31990.26 MHz
>             mpn_mul_n       mpn_mul   mpn_mul_bal
> 800000    0.646571000   0.682834000  #0.632961000
> 900000   #0.652178000   0.678564000   0.655979000
> 1000000   0.710674000   0.740998000  #0.702508000

I can reproduce this on GMP 5.1.0 with your patch:

frite% ./speed -s 800000-1000000 -t 100000 mpn_mul_n mpn_mul mpn_mul_bal
overhead 0.000000002 secs, precision 10000 units of 3.33e-10 secs, CPU freq 3000.00 MHz
            mpn_mul_n       mpn_mul   mpn_mul_bal
800000    0.668041000   0.716045000  #0.664041000
900000   #0.652040000   0.696043000   0.656041000
1000000    0.724045000   0.748046000  #0.720045000
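
The gap between mpn_mul and mpn_mul_bal in the three rows above can be quantified with a few lines of Python. This is just arithmetic on the numbers from my run; the names results and relative_overhead are illustrative only.

```python
# Timings (seconds) copied from the patched speed run above.
results = {
    800000: {"mpn_mul_n": 0.668041, "mpn_mul": 0.716045, "mpn_mul_bal": 0.664041},
    900000: {"mpn_mul_n": 0.652040, "mpn_mul": 0.696043, "mpn_mul_bal": 0.656041},
    1000000: {"mpn_mul_n": 0.724045, "mpn_mul": 0.748046, "mpn_mul_bal": 0.720045},
}


def relative_overhead(slow, fast):
    """Return how much longer `slow` takes than `fast`, as a fraction of `fast`."""
    return (slow - fast) / fast


# Overhead of the mpn_mul measurement relative to mpn_mul_bal, per size.
overhead = {
    size: relative_overhead(t["mpn_mul"], t["mpn_mul_bal"])
    for size, t in results.items()
}

for size, value in overhead.items():
    print(f"{size}: mpn_mul measured {100 * value:.1f}% slower than mpn_mul_bal")
```

On these numbers the mpn_mul column is between about 4% and 8% slower than mpn_mul_bal, although both are reported to time the same function on operands of equal size.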

> As you can see, mpn_mul_n and mpn_mul_bal are comparable, and mpn_mul is
> always slower. All three functions measure the time to multiply two
> numbers of the same size. mpn_mul_bal and mpn_mul measure the same
> function, but the first uses the same macro that tune/speed uses for
> mpn_mul_n...

If the culprit is the macro used in speed, it should be fixed!

Paul

