Efficient implementation of polynomial multiplication

Sat Mar 6 19:16:23 CET 2004

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday 06 March 2004 17:53, you wrote:
> Décio Luiz Gazzoni Filho <decio at decpp.net> writes:
> > I'm thinking either something that manipulates the mpz_t->_mp_d array
> > directly,
>
> Don't do that if you want to be compatible with future releases.

Considering how time critical the code is, I can afford to fix it if/when the 
interface changes.

> > or something using mpz_import/mpz_export...
>
> If the shifts aren't doing what you want then this would be the go.
>
> You might want to break out the profiler to check time is going where
> you think it is.

I'm pretty sure it's being used in the shifts. I was able to complete a task 
involving that code in 25 minutes today whereas the old code hadn't done 1/4 
or 1/3 of its job in half a day, after which I stopped it. And the shifts 
were the only thing I changed.

> >   X->_mp_size = deg*limbs+j;
>
> You may have to realloc X to ensure there's enough room at _mp_d.  And
> you definitely must ensure the high limb is non-zero.  See "Internals"
> in the manual if you really want to do this sort of thing.

That's why I had the following lines up there:
  mpz_init2(X,(deg+1)*b);
  mpz_init2(Y,(arg.deg+1*b);

Since b = limbs*mp_bits_per_limb, and j < limbs always holds, then this code 
will never write over unallocated memory. Since coeffs[deg] and arg[deg] are 
guaranteed to be non-zero, then I need to worry about a non-zero high limb 
only when storing those. That's why I took then out of the loop and didn't 
zero-pad _mp_d like I did with the others.

In fact when I asked about portability of the code, the main concern in my 
mind was about endianness. Do you see any flaws with the code regarding that?

Thanks a lot,

Décio
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFASk23FXvAfvngkOIRArouAJ43UsL2wv0U6hksvSUsEvUP4UnOewCePejE
8xXtS8i3oTXPUAfKvZU2nEs=
=CBz0
-----END PGP SIGNATURE-----