GMP 4.3 multiplication performance
bodrato at mail.dm.unipi.it
Wed Jun 3 09:51:10 CEST 2009
Ciao!
> nisse at lysator.liu.se (Niels Möller) writes:
> One comment on your interpolation function: You don't need any scratch
> space. But then for the operation
> /* W2 =(W2 - W4)/3 - W0<<2 */
> you use mpn_submul_1 with a constant multiplier of 4, which I imagine
> is more costly than a shift. toom_interpolate_7pts has the same
I do not know exactly... but mpn_submul_1 writes to memory only once,
which is why I prefer it.
> If using submul_1 removes the last scratch space needs, then perhaps use
> it unconditionally, instead of lshift. Optionally, use sublsh_n if
> HAVE_NATIVE_mpn_sublsh_n and submul_1 otherwise.
#if HAVE_NATIVE_mpn_sublsh_n
#define DO_mpn_sublsh_n(dst,src,n,s) mpn_sublsh_n(dst,src,n,s)
#else
#define DO_mpn_sublsh_n(dst,src,n,s) mpn_submul_1(dst,src,n,CNST_LIMB(1) << (s))
#endif
Correct?
If it is, can something like this be included in gmp-impl.h, so that we
can start using sublsh without the need to care about native or non-native
sublsh in generic code?
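
The equivalence behind the fallback can be checked outside GMP. The sketch below is not GMP code: it assumes 32-bit limbs, and limb_t, ref_submul_1, ref_lshift, ref_sub_n and the two sublsh_* wrappers are hypothetical stand-ins written for illustration. It compares dst -= src * 2^s done in one multiply-and-subtract pass against the shift-into-scratch-then-subtract variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;            /* hypothetical stand-in for mp_limb_t */
#define LIMB_BITS 32

/* dst[0..n) -= src[0..n) * m; returns the limb that falls out of the top. */
static limb_t ref_submul_1(limb_t *dst, const limb_t *src, size_t n, limb_t m)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t prod = (uint64_t)src[i] * m + carry;
        limb_t lo = (limb_t)prod;
        limb_t old = dst[i];
        carry = (limb_t)(prod >> LIMB_BITS);
        dst[i] = old - lo;          /* the only store per limb */
        carry += (old < lo);        /* borrow from the subtraction */
    }
    return carry;
}

/* dst = src << s for 0 < s < LIMB_BITS; returns the bits shifted out. */
static limb_t ref_lshift(limb_t *dst, const limb_t *src, size_t n, unsigned s)
{
    limb_t out = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t v = src[i];
        dst[i] = (v << s) | out;
        out = v >> (LIMB_BITS - s);
    }
    return out;
}

/* dst = a - b; returns the final borrow (0 or 1). dst may alias a. */
static limb_t ref_sub_n(limb_t *dst, const limb_t *a, const limb_t *b, size_t n)
{
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t x = a[i], y = b[i];
        limb_t d = x - y;
        limb_t b1 = (x < y);
        limb_t b2 = (d < borrow);
        dst[i] = d - borrow;
        borrow = b1 | b2;
    }
    return borrow;
}

/* Variant 1: no scratch space, one pass, multiplier 2^s. */
static limb_t sublsh_via_submul(limb_t *dst, const limb_t *src, size_t n, unsigned s)
{
    assert(s > 0 && s < LIMB_BITS);
    return ref_submul_1(dst, src, n, (limb_t)1 << s);
}

/* Variant 2: shift into scratch, then subtract (two passes, n scratch limbs). */
static limb_t sublsh_via_scratch(limb_t *dst, const limb_t *src, size_t n,
                                 unsigned s, limb_t *scratch)
{
    assert(s > 0 && s < LIMB_BITS);
    limb_t hi = ref_lshift(scratch, src, n, s);
    return hi + ref_sub_n(dst, dst, scratch, n);
}
```

Both wrappers leave the same limbs in dst and return the same value; which one is faster on a given CPU is a separate question that this sketch does not answer.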
> Here is the full matrix of planned combination function. The cycle
Great!
> addlsh1 (a + 2b) addlsh (a + (2^c)b) 2/1.5
> sublsh1 (a - 2b) sublsh (a - (2^c)b) 2.21/1.7
Both are really desired for Toom evaluation and interpolation.
> rsbrsh1 (a/2 - b) rsbrsh (a/(2^c) - b) -/1.5
This, too, may have interesting applications, particularly for a little
program I'm (slowly) working on... I adopted a different strategy because
of the lack of such a function.
> rsh1add (a + b)/2 rshadd (a + b)/(2^c) 2.14/1.5
> rsh1sub (a - b)/2 rshsub (a - b)/(2^c) 2.14/1.5
rsh1(add|sub) are already used in both interpolation_5pts and _7pts!
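
For readers unfamiliar with the combined operation, here is a minimal sketch of (a + b)/2 computed in a single pass. It is not GMP's implementation: it assumes 32-bit limbs, limb_t and ref_rsh1add_n are hypothetical names, and the choices to shift the carry of the addition into the top bit and to return the bit shifted out are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;            /* hypothetical stand-in for mp_limb_t */
#define LIMB_BITS 32

/* dst = (a + b) >> 1 over n limbs, n > 0.
   The carry out of the addition becomes the top bit of dst, so no bit of
   the (n*LIMB_BITS + 1)-bit sum is lost except the lowest one, which is
   returned. dst may alias a or b. */
static limb_t ref_rsh1add_n(limb_t *dst, const limb_t *a, const limb_t *b, size_t n)
{
    assert(n > 0);
    uint64_t sum = (uint64_t)a[0] + b[0];
    limb_t prev = (limb_t)sum;                  /* limb waiting to be shifted */
    limb_t carry = (limb_t)(sum >> LIMB_BITS);
    limb_t out = prev & 1;                      /* bit shifted out at the bottom */

    for (size_t i = 1; i < n; i++) {
        sum = (uint64_t)a[i] + b[i] + carry;
        limb_t cur = (limb_t)sum;
        carry = (limb_t)(sum >> LIMB_BITS);
        /* low bit of the current limb becomes the high bit of the previous one */
        dst[i - 1] = (prev >> 1) | (cur << (LIMB_BITS - 1));
        prev = cur;
    }
    dst[n - 1] = (prev >> 1) | (carry << (LIMB_BITS - 1));
    return out;
}
```

Done as a separate add followed by a right shift, the same computation would read and write the operand twice and would need the carry handled by hand between the two calls.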
> addadd (a + b + c) -/2
> addsub (a + b - c) -/2
...addsub already exists with another meaning: it computes
(a,b) <- (a+b, a-b). Should I assume it will be removed?
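
To make the existing meaning concrete, here is a minimal sketch of the simultaneous sum and difference (a,b) <- (a+b, a-b). It is an illustration only: it assumes 32-bit limbs, limb_t and ref_addsub_n are hypothetical names, and packing the result as 2*carry + borrow is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;            /* hypothetical stand-in for mp_limb_t */
#define LIMB_BITS 32

/* sum = a + b and diff = a - b in one pass over n limbs.
   Both inputs are read before either output is written, so the call
   ref_addsub_n(a, b, a, b, n) performs (a,b) <- (a+b, a-b) in place.
   Returns 2*carry + borrow. */
static unsigned ref_addsub_n(limb_t *sum, limb_t *diff,
                             const limb_t *a, const limb_t *b, size_t n)
{
    limb_t carry = 0, borrow = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t x = a[i], y = b[i];
        uint64_t s = (uint64_t)x + y + carry;
        uint64_t d = (uint64_t)x - y - borrow;  /* wraps mod 2^64 on underflow */
        sum[i] = (limb_t)s;
        carry = (limb_t)(s >> LIMB_BITS);
        diff[i] = (limb_t)d;
        borrow = (limb_t)(d >> 63);             /* top bit set iff it underflowed */
    }
    return 2u * carry + borrow;
}
```

This differs from the (a + b - c) row in the quoted table, which takes three inputs and produces a single output.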
Regards,
Marco
--
http://bodrato.it/