# Cancellation with hgcd / unbalanced mulmod_bnm1

Niels Möller nisse at lysator.liu.se
Fri Nov 11 16:21:52 CET 2011

```When applying a hgcd matrix, we compute two values of the form a b - c
d, where it is know a priori that the result is positive and that high
limbs cancel. So one can do the computation modulo some suitable number,
e.g., 2^n or 2^n - 1.

For use in hgcd_appr, typical sizes are that we compute a b, where a is
of size 4n, b is of size n, and the result (after the subtraction) is
known to fit in 3n limbs. What's the best strategy to do this?

One way is to select some k >= 3n, and do the computation mod m = B^k -
1. First reduce a mod m, then compute a b mod m using mpn_mulmod_bnm1.
The input sizes to mpn_mulmod_bnm1 is then 3n x n. How well suited is
current mpn_mulmod_bnm1 to these unbalanced operands?

Another way is to split a as a = a_3 B^{3n} + a_2 B^{2n} + a_1 B^n +
a_0, and compute mod m = B^{3n}, as

(a_1 B^n + a_0) * b  // mpn_mul, 2n x n
+ B^2 (a_2 b mod B^n)  // mpn_mullo, n x n

I imagine this is going to be the best strategy for small n (but then on
the other hand, small n may be irrelevant for divide and conquer
mpn_hgcd_appr.

Another option is to use the modulo m = B^{3n} - B^{2n}, and do the product
as

(a_1 B^n + a_0) * b        // mpn_mul, 2n x n
+ B^2 (a_2 b mod (B^n - 1))  // mpn_mulmod_bnm1, n x n

Will that be better or worse than an unbalanced 3n x n mpn_mulmod_bnm1?
In the fft range, it *should* be worse, since the mpn_mul will be done
as an mpn_mulmod_bnm1 of size 3n. But what about smaller sizes?

Any other tricks?

Regards,
/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

```