GMP 4.3 multiplication performance

bodrato at bodrato at
Wed Jun 3 09:51:10 CEST 2009


> nisse at (Niels Möller) writes:
>   One comment on your interpolation function: You don't need any scratch
>   space. But then for the operation
>     /* W2 =(W2 - W4)/3 - W0<<2 */
>   you use mpn_submul_1 with a constant multiplier of 4, which I imagine
>   is more costly than a shift. toom_interpolate_7pts has the same

I do not know exactly, but mpn_submul_1 writes to memory only once;
that is why I prefer it.
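
To make the trade-off concrete, here is a minimal standalone sketch in plain C. It is not GMP code: the ref_* helpers are hypothetical stand-ins written for this illustration, loosely modelled on mpn_submul_1, mpn_lshift and mpn_sub_n, and the 32-bit limb type is an assumption chosen so that products fit in uint64_t. It says nothing about cycle counts; it only shows that the submul_1 route updates the destination in a single pass, while the shift route needs n limbs of scratch space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs, no nails */

/* Stand-in for mpn_submul_1: {rp,n} -= {up,n} * v, returns the borrow limb. */
static limb_t ref_submul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)up[i] * v + cy;   /* cannot overflow 64 bits */
        limb_t lo = (limb_t)p;
        cy = (limb_t)(p >> 32) + (rp[i] < lo);   /* high limb plus borrow */
        rp[i] -= lo;                             /* the only store per limb */
    }
    return cy;
}

/* Stand-in for mpn_lshift: {rp,n} = {up,n} << s, returns the bits shifted out. */
static limb_t ref_lshift(limb_t *rp, const limb_t *up, size_t n, unsigned s)
{
    assert(s > 0 && s < 32);
    limb_t out = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t x = up[i];
        rp[i] = (limb_t)(x << s) | out;
        out = x >> (32 - s);
    }
    return out;
}

/* Stand-in for mpn_sub_n: {rp,n} = {ap,n} - {bp,n}, returns the borrow. */
static limb_t ref_sub_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t bw = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i], b = bp[i];
        limb_t d = a - b;
        limb_t next = (a < b) | (d < bw);
        rp[i] = d - bw;
        bw = next;
    }
    return bw;
}

/* Route A: W2 -= W0<<2 in one pass, no scratch. */
static limb_t sub_4x_submul(limb_t *w2, const limb_t *w0, size_t n)
{
    return ref_submul_1(w2, w0, n, 4);
}

/* Route B: shift W0 into scratch ws (n limbs), then subtract. */
static limb_t sub_4x_lshift(limb_t *w2, const limb_t *w0, size_t n, limb_t *ws)
{
    limb_t hi = ref_lshift(ws, w0, n, 2);
    return hi + ref_sub_n(w2, w2, ws, n);
}
```

Both routes leave the same limbs in w2 and return the same borrow; the difference is the extra scratch area and the second pass over memory in route B.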

> If using submul_1 removes the last scratch space needs, then perhaps use
> it unconditionally, instead of lshift.  Optionally, use sublsh_n if
> HAVE_NATIVE_mpn_sublsh_n and submul_1 otherwise.

#if HAVE_NATIVE_mpn_sublsh_n
#define DO_mpn_sublsh_n(dst,src,n,s) mpn_sublsh_n(dst,src,n,s)
#else
#define DO_mpn_sublsh_n(dst,src,n,s) mpn_submul_1(dst,src,n,CNST_LIMB(1) << (s))
#endif

If it is, can something like this be included in gmp-impl.h, so that we
can start using sublsh without the need to care about native or non-native
sublsh in generic code?
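
As an illustration of the idea, the following standalone sketch shows how generic code could call one wrapper name and let the preprocessor pick the implementation. Everything prefixed TOY_ or toy_ is hypothetical and invented for this example (it is not what gmp-impl.h contains), and the 32-bit limb type is an assumption made to keep the fallback simple.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs */

#ifndef TOY_HAVE_NATIVE_sublsh_n
#define TOY_HAVE_NATIVE_sublsh_n 0   /* pretend no native routine exists */
#endif

/* Portable fallback: {rp,n} -= {up,n} * v, returns the borrow limb. */
static limb_t toy_submul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)up[i] * v + cy;
        limb_t lo = (limb_t)p;
        cy = (limb_t)(p >> 32) + (rp[i] < lo);
        rp[i] -= lo;
    }
    return cy;
}

#if TOY_HAVE_NATIVE_sublsh_n
/* Would be provided by an assembly file on platforms that have it. */
limb_t toy_sublsh_n(limb_t *rp, const limb_t *up, size_t n, unsigned s);
#define TOY_DO_sublsh_n(dst, src, n, s) toy_sublsh_n(dst, src, n, s)
#else
#define TOY_DO_sublsh_n(dst, src, n, s) \
    toy_submul_1(dst, src, n, (limb_t)1 << (s))
#endif

/* Generic code: computes W2 -= W0<<2 without caring which path is used. */
static limb_t sub_shifted(limb_t *w2, const limb_t *w0, size_t n)
{
    return TOY_DO_sublsh_n(w2, w0, n, 2);
}
```

The caller only ever sees TOY_DO_sublsh_n, so adding a native routine later changes a single #if branch and no call sites.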

> Here is the full matrix of planned combination function.  The cycle


> 	addlsh1	(a + 2b)	addlsh	(a + (2^c)b)	2/1.5
> 	sublsh1	(a - 2b)	sublsh	(a - (2^c)b)	2.21/1.7

Both are really desired for Toom evaluation and interpolation.

> 	rsbrsh1	(a/2 - b)	rsbrsh	(a/(2^c) - b)	-/1.5

This, too, may have interesting applications, particularly for a little
program I'm (slowly) working on. I adopted a different strategy because
of the lack of such a function.

> 	rsh1add	(a + b)/2	rshadd	(a + b)/(2^c)	2.14/1.5
> 	rsh1sub	(a - b)/2	rshsub	(a - b)/(2^c)	2.14/1.5

rsh1(add|sub) are already used in both interpolation_5pts and _7pts!
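
For readers unfamiliar with the operation, here is a minimal reference sketch of a fused "add, then shift right by one" on limb arrays. It is a plain C illustration and not the GMP routine: the name ref_rsh1add_n, the 32-bit limb type, and the convention that the carry of the sum becomes the top bit of the result while the shifted-out low bit is returned are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs */

/* {rp,n} = ({ap,n} + {bp,n}) >> 1 in a single pass.
   The carry out of the addition is placed in the top bit of the result;
   the low bit that is shifted out is returned. */
static limb_t ref_rsh1add_n(limb_t *rp, const limb_t *ap, const limb_t *bp,
                            size_t n)
{
    limb_t cy = 0, low = 0, prev = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = (uint64_t)ap[i] + bp[i] + cy;
        limb_t lo = (limb_t)s;
        cy = (limb_t)(s >> 32);
        if (i == 0)
            low = lo & 1;                              /* bit shifted out */
        else
            rp[i - 1] = (prev >> 1) | (limb_t)(lo << 31);
        prev = lo;
    }
    if (n > 0)
        rp[n - 1] = (prev >> 1) | (limb_t)(cy << 31);  /* carry into top bit */
    return low;
}
```

A subtracting variant would follow the same pattern with a borrow chain in place of the carry chain.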

> 	addadd  (a + b + c)				-/2
> 	addsub  (a + b - c)				-/2

...addsub already exists with another meaning: it computes
(a,b) <- (a+b, a-b). Should I assume it will be removed?
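
For reference, the existing meaning mentioned above can be sketched as follows. This is a standalone illustration, not the GMP implementation: the name ref_addsub_n, the 32-bit limb type, and the return encoding (2*carry + borrow) are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs */

/* {sp,n} = {ap,n} + {bp,n} and {dp,n} = {ap,n} - {bp,n} in one pass.
   Returns 2*carry + borrow (encoding assumed for this sketch).
   Safe for the in-place case sp == ap, dp == bp. */
static unsigned ref_addsub_n(limb_t *sp, limb_t *dp,
                             const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t cy = 0, bw = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i], b = bp[i];          /* read both before storing */
        uint64_t s = (uint64_t)a + b + cy;
        uint64_t d = (uint64_t)a - b - bw;    /* wraps when negative */
        sp[i] = (limb_t)s;
        dp[i] = (limb_t)d;
        cy = (limb_t)(s >> 32);
        bw = (limb_t)(d >> 63);               /* top bit set means borrow */
    }
    return 2u * cy + bw;
}
```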

