mpn_sqrtrem{1,2}

Adrien Prost-Boucle adrien.prost-boucle at laposte.net
Thu Feb 16 21:45:43 UTC 2017


Hi all,

Before doing heavy (for me) asm development,
I first wanted to evaluate what the impact on the mpz_sqrt()
function itself could be of a 2-3x speedup in the functions mpn_sqrtrem{1,2}.

To avoid a compile-time dependency on libm,
I simply recompiled GMP with the symbols mpn_sqrtrem{1,2} exposed.
That way, these functions can be "replaced" at will, at run time,
by pre-loading shared libraries before executing test programs.

To evaluate the speedup on mpz_sqrt(),
I first generate one large random number,
then compute its square root a large number of times.

Here are the results, with vanilla GMP and with floating-point mpn_sqrtrem{1,2}.

10 times, size 10000000 bits .. vanilla 1.617 s / FP 1.611 s
1000000 times, 1024 bits ...... vanilla 0.807 s / FP 0.800 s
10000000 times, 128 bits ...... vanilla 1.044 s / FP 0.781 s
10000000 times, 100 bits ...... vanilla 1.373 s / FP 1.077 s

There is a noticeable speedup only for very short bit widths,
and even there the speedup is "only" 20-30%,
which is a bit disappointing given the 2x-3x speedup applied to mpn_sqrtrem{1,2}.

So clearly these functions are not the hot spot, contrary to what was assumed
at the beginning of the discussion about accelerating sqrt.

Regards,
Adrien


On Wed, 2017-02-01 at 18:55 +0100, Niels Möller wrote:
> Adrien Prost-Boucle <adrien.prost-boucle at laposte.net> writes:
> 
> > Maybe the availability of SSE / AVX / NEON etc instruction sets can be
> > checked at compilation time?
> 
> That's what configure (and its helper scripts) does.
> 
> With --enable-fat, we also have runtime detection on certain systems.
> 
> > The ASM version would be very easy to obtain:
> > compile sqrtrem1 and sqrtrem2 (an FP implementation) on the right
> > machine and keep the ASM.
> 
> Or if these are small and easy functions, write them by hand.
> 
> > There would be no dependency on libm.
> > How difficult would it be to add such checks in GMP code?
> 
> It's not that hard to add new assembly files. You need to know some
> assembly, of course. And be aware that gmp uses m4 to preprocess
> assembly source files.
> 
> Regards,
> /Niels
> 

