mpn_add or mpn_add_n+MPN_COPY+MPN_INCR_U ?

Sat Jun 14 18:28:25 UTC 2014

bodrato at mail.dm.unipi.it writes:

> Looking at the code in mpn/generic/mul.c I've found lots of structures
> like the following:
>
> cy = mpn_add_n (prodp, prodp, ws, vn);
> MPN_COPY (prodp + vn, ws + vn, un);
> mpn_incr_u (prodp + vn, cy);
>
> that is logically equivalent to
>
> ASSERT_NOCARRY (mpn_add (prodp, ws, vn + un, prodp, vn));
>
> Unfortunately, this compact way to write the same operation (that should
> be optimal wrt read from/write to memory) is slower, at least on my amd64.
> Why?
> Probably because add_n, COPY and incr are asm-optimized, while mpn_add is
> not?

Looks like mpn_add is an inline function defined in terms of
__GMPN_AORS, which does carry propagation differently. With an inline
mpn_add, there's no good reason for it to have more overhead than
mpn_add_n + MPN_COPY + mpn_incr_u.
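
A minimal sketch of the comparison, in plain C rather than GMP itself:
uint64_t stands in for mp_limb_t, and sketch_add_n, sketch_incr,
sketch_add, form_three_steps and two_forms_agree are hypothetical names
made up for this illustration. sketch_add is only loosely modeled on the
"add the low part, propagate the carry, copy the rest" structure one
would expect from an inline mpn_add; it is not the gmp.h macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t limb_t;   /* assumption: stand-in for mp_limb_t */

/* rp[0..n) = ap[0..n) + bp[0..n); returns the carry out (0 or 1).
   rp may alias ap or bp, since each index is read before it is written. */
limb_t sketch_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i];
        limb_t s = a + bp[i];
        limb_t c1 = s < a;          /* carry from a + b */
        limb_t r = s + cy;
        limb_t c2 = r < s;          /* carry from adding the incoming carry */
        rp[i] = r;
        cy = c1 | c2;
    }
    return cy;
}

/* Add the single limb incr into p[0..n) in place; returns the carry out. */
limb_t sketch_incr(limb_t *p, size_t n, limb_t incr)
{
    for (size_t i = 0; i < n && incr != 0; i++) {
        p[i] += incr;
        incr = p[i] < incr;         /* wrapped around, so carry on */
    }
    return incr;
}

/* Three-step form: add_n on the low vn limbs, copy the high un limbs,
   then increment the copied part by the carry. */
limb_t form_three_steps(limb_t *prodp, const limb_t *ws, size_t vn, size_t un)
{
    limb_t cy = sketch_add_n(prodp, prodp, ws, vn);
    memcpy(prodp + vn, ws + vn, un * sizeof(limb_t));
    return sketch_incr(prodp + vn, un, cy);
}

/* Single-call form: rp[0..an) = ap[0..an) + bp[0..bn), with an >= bn. */
limb_t sketch_add(limb_t *rp, const limb_t *ap, size_t an,
                  const limb_t *bp, size_t bn)
{
    limb_t cy = sketch_add_n(rp, ap, bp, bn);
    size_t i = bn;
    /* Propagate the carry through the high limbs of ap while copying. */
    for (; i < an && cy; i++) {
        rp[i] = ap[i] + 1;
        cy = rp[i] == 0;
    }
    /* Once the carry has died out, the rest is a plain copy. */
    if (rp != ap)
        for (; i < an; i++)
            rp[i] = ap[i];
    return cy;
}

/* Runs both forms on the same input and reports whether they agree. */
int two_forms_agree(void)
{
    enum { VN = 2, UN = 3 };
    /* Low limbs chosen so the carry ripples into the high part. */
    const limb_t ws[VN + UN] = { UINT64_MAX, UINT64_MAX, UINT64_MAX, 7, 9 };
    limb_t a[VN + UN] = { 1, 0, 0, 0, 0 };
    limb_t b[VN + UN] = { 1, 0, 0xdead, 0xbeef, 0xcafe }; /* high limbs get overwritten */
    limb_t cy_a = form_three_steps(a, ws, VN, UN);
    limb_t cy_b = sketch_add(b, ws, VN + UN, b, VN);
    return cy_a == cy_b && memcmp(a, b, sizeof a) == 0;
}
```

two_forms_agree() returns 1 for this input, where the carry out of the
low vn limbs ripples one limb into the copied high part. This says
nothing about speed, only that the two ways of writing the operation
produce the same limbs.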

> I'm tempted to substitute some mpn_add_1 in the code with COPY+INCR, or to
> write a new macro NOCARRY_MPN_ADD...

To me, the right way seems to be to improve the existing macrology used
by mpn_add. 

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.