Next: , Previous: , Up: Multiplication Algorithms   [Index]

#### 15.1.1 Basecase Multiplication

Basecase NxM multiplication is a straightforward rectangular set of cross-products, the same as long multiplication done by hand and for that reason sometimes known as the schoolbook or grammar school method. This is an O(N*M) algorithm. See Knuth section 4.3.1 algorithm M (see References), and the mpn/generic/mul_basecase.c code.

Assembly implementations of `mpn_mul_basecase` are essentially the same as the generic C code, but have all the usual assembly tricks and obscurities introduced for speed.

A square can be done in roughly half the time of a multiply, by using the fact that the cross products above and below the diagonal are the same. A triangle of products below the diagonal is formed, doubled (left shift by one bit), and then the products on the diagonal added. This can be seen in mpn/generic/sqr_basecase.c. Again the assembly implementations take essentially the same approach.

```     u0  u1  u2  u3  u4
+---+---+---+---+---+
u0 | d |   |   |   |   |
+---+---+---+---+---+
u1 |   | d |   |   |   |
+---+---+---+---+---+
u2 |   |   | d |   |   |
+---+---+---+---+---+
u3 |   |   |   | d |   |
+---+---+---+---+---+
u4 |   |   |   |   | d |
+---+---+---+---+---+
```

In practice squaring isn’t a full 2x faster than multiplying, it’s usually around 1.5x. Less than 1.5x probably indicates `mpn_sqr_basecase` wants improving on that CPU.

On some CPUs `mpn_mul_basecase` can be faster than the generic C `mpn_sqr_basecase` on some small sizes. `SQR_BASECASE_THRESHOLD` is the size at which to use `mpn_sqr_basecase`, this will be zero if that routine should be used always.