multiplication of unequal sizes

Kevin Ryde user42@zip.com.au
Tue, 21 Jan 2003 09:33:27 +1000


Jason Moxham <J.L.Moxham@maths.soton.ac.uk> writes:
>
> so what splitting is optimal ?

Some investigation would be good.  Make a basic Karatsuba that just
calls the basecase for its recursions, then measure how it goes for
combinations of x and y sizes, and with the split point selected at
different places (between xsize/2 and ysize I guess).  Hopefully
there's a pattern to where it's faster than the basecase, and what
split is best.

> and are there any referances to papers on this ?

Torbjorn showed me this from sci.math a while back:


From: tzs@hardy.u.washington.edu (Tim Smith)
Newsgroups: sci.math
Subject: Multiplying large integers that aren't the same size (LONG!)
Date: 15 Sep 1993 03:47:34 GMT
Organization: University of Washington School of Law, Class of '95
Lines: 487
Message-ID: <27638m$cjh@news.u.washington.edu>
NNTP-Posting-Host: hardy.u.washington.edu

[This posting is kind of long, because it includes some tables whose
pattern I'm asking for help in figuring out]

Nearly every book that covers arbitrary precision integer arithmetic gives
the following algorithm as a neat way to turn the ordinary O(n^2)
algorithm for multiplying two n bit numbers into an O(n^1.58) algorithm:

        1. Let the numbers be A 2^k + B and C 2^k + D, where k is [n/2].

        2. Compute AC, BD, and either (A+B)(C+D) or (A-B)(D-C).

        3. These three products can be combined with simple shifting and
        adding to give the desired final product.  Thus, one multiplication
        of n bit numbers is replaced by three multiplications of n/2 bit
        numbers.

        4. Do this recursively until the numbers you are working with are
        small enough to make your O(n^2) algorithm win because of that
        overhead of shifting and adding.

This is fine when the two numbers have the same number of bits.  But what
if they do not?  I've been looking into this (because I've written an
arbitrary precision package in C++, which I plan to release into the
public domain as soon as I get a good answer to the questions I'm posting
here), and there are some really weird things going on here.

In what follows, let's assume we are working with digits, instead of bits,
in some base, and assume that all we are interested in is reducing the
number of one digit by one digit multiplies.  Let the two numbers be U and
V, where U has u digits, and V has v digits.

I've looked at basically two ways to multiply U and V.

        A. Pick some k < u.  Use the aforesaid method, where B and D are the
        lower k digits of U and V, and A and C are the upper u-k and v-k
        digits.  Thus, a multiply of a u digit by a v digit number is
        replaced by three multiples: a k by k, a u-k by v-k, and a
        max(k,u-k) by max(k,v-k) (for simplicity, I'm assuming that there
        is no carry when doing A+B and C+D).

        B. Pick some k >= u.  Do two multiplies: a u by k, and a u by v-k.

The problem is this: given u and v, what is the best value for k?

It's easy to write a program that builds up a table of the best k for each
u and v in some range, and I've done so, but have been unable to discern
what the heck is going on.

Here's an example of what the data looks like for u=40:

(41: 16 20 24)	 this means for v=41, best k is 16, 20, or 24
...		 "..." means the best k values are the same as the line above
(44: 16 20 24)
(45: 16 24)
...
(48: 16 24)
(49: 24)
...
(51: 24)
(52: 20 24)
(53: 24)
...
(55: 24)
(56: 24 32)
(57: 32)
...
(64: 32)
(65: 32 64)
...
(96: 32 64)
(97: 64)
...
(103: 64)
(104: 40 64)
(105: 41 64)
(106: 42 64)
(107: 43 64)
(108: 44 64)
(109: 45 64)
(110: 46 64)
(111: 47 64)
(112: 48 64)
(113: 49 64)
(114: 50 64)
(115: 51 64)
(116: 52 64)
(117: 53 64)
(118: 54 64)
(119: 55 64)
(120: 56 64)
(121: 57 64)
(122: 58 64)
(123: 59 64)
(124: 60 64)
(125: 61 64)
(126: 62 64)
(127: 63,64)
(128: 64)
(129: 64 65 128)

There appear to be some general patterns.  For example, there's something
going on related to powers of 2.  As u increases, whenever it passes a
power of 2, things get all messed up, and as it approaches the next power
of 2, things calm down.  In particular, it appears that when u is beyond
approximately the halfway point between two powers of two, then there is
always a value of k that is best that is a power of 2:  I believe that the
first power of 2 beyond u works, for instance, if that is less than v, and
half that works otherwise.

If we just look at those u and v for which none of the best k is a power
of two, here's what the data looks like, out to u=127 and v=150:

(This is in for form

        u x v: list of best k values

The list sometimes has commas and sometimes doesn't because of a bug
in my output code.  It doesn't mean anything.)

9x13:	 5	Note: two u's after 8 don't have power of 2 k.
10x13:	 5,6

17x25:	 9	Note: six this time fail to have k that is power of 2
17x26:	 10
18x25:	 9
18x26:	 10
19x25:	 9
19x26:	 10
19x27:	 11
20x25:	 12
20x26:	 10 12
20x27:	 12
21x25:	 9 12,13
21x26:	 10 13
22x25:	 12

33x42:	 17	Note: 14 this time.  It seems to be going up by powers of 2.
33x49:	 17	      Predict 30 next time?
33x50:	 18
33x51:	 19
33x52:	 20
34x49:	 17
34x50:	 18
34x51:	 19
34x52:	 20
35x49:	 17
35x50:	 18
35x51:	 19
35x52:	 20
36x49:	 18
36x50:	 18
36x51:	 20
36x52:	 20
37x49:	 17
37x50:	 18
37x51:	 19
37x52:	 20
37x53:	 21
37x54:	 22
38x49:	 18
38x50:	 18
38x51:	 19
38x52:	 20
38x53:	 22
38x54:	 22
39x49:	 24
39x50:	 24
39x51:	 19
39x52:	 20
39x53:	 23,24
39x54:	 23,24
39x55:	 23
40x49:	 24
40x50:	 24
40x51:	 24
40x52:	 20 24
40x53:	 24
40x54:	 24
40x55:	 24
41x49:	 17 24,25
41x50:	 25
41x51:	 25
41x52:	 20
41x53:	 21 25
42x49:	 24
42x50:	 18 24 26
42x51:	 26
42x52:	 20 26
42x53:	 21 26
43x49:	 24
43x50:	 24
43x51:	 19 24 27
43x52:	 20
44x49:	 24
44x50:	 24
44x51:	 24
45x49:	 24
45x50:	 24
46x49:	 24

65x82:	 33	Note: Yup!  30 u's after 64 don't have k as power of 2
65x83:	 33
65x84:	 33
65x85:	 33
65x97:	 33
65x98:	 34
65x99:	 35
65x100:	 36
65x101:	 65
65x102:	 65
65x103:	 65
65x104:	 65
65x105:	 65
66x83:	 34
66x84:	 34
66x85:	 34
66x97:	 33
66x98:	 34
66x99:	 35
66x100:	 36
66x101:	 37
66x102:	 38
66x103:	 39
66x104:	 40
67x85:	 35
67x97:	 33
67x98:	 34
67x99:	 35
67x100:	 36
67x101:	 37
67x102:	 38
67x103:	 39
67x104:	 40
68x85:	 36
68x97:	 34
68x98:	 34
68x99:	 36
68x100:	 36
68x101:	 37
68x102:	 38
68x103:	 39
68x104:	 40
69x97:	 33
69x98:	 34
69x99:	 35
69x100:	 36
69x101:	 37
69x102:	 38
69x103:	 39
69x104:	 40
69x105:	 41
70x97:	 34
70x98:	 34
70x99:	 35
70x100:	 36
70x101:	 38
70x102:	 38
70x103:	 39
70x104:	 40
70x105:	 41
71x97:	 33
71x98:	 35
71x99:	 35
71x100:	 36
71x101:	 39
71x102:	 39
71x103:	 39
71x104:	 40
71x105:	 41
72x97:	 33
72x98:	 36
72x99:	 36
72x100:	 36
72x101:	 40
72x102:	 40
72x103:	 40
72x104:	 40
72x105:	 41
73x97:	 33
73x98:	 34
73x99:	 36
73x100:	 36
73x101:	 37
73x102:	 40
73x103:	 40
73x104:	 40
73x105:	 41
73x106:	 42
73x107:	 43
74x97:	 34
74x98:	 34
74x99:	 36
74x100:	 36
74x101:	 37
74x102:	 38
74x103:	 40
74x104:	 40
74x105:	 42
74x106:	 42
74x107:	 43
74x108:	 44
75x97:	 33
75x98:	 35
75x99:	 35
75x100:	 36
75x101:	 37
75x102:	 38
75x103:	 39
75x104:	 40
75x105:	 43
75x106:	 43
75x107:	 43
75x108:	 44
76x97:	 48
76x98:	 36
76x99:	 36
76x100:	 36
76x101:	 38
76x102:	 38
76x103:	 40
76x104:	 40
76x105:	 44
76x106:	 44
76x107:	 44
76x108:	 44
77x97:	 48
77x98:	 48
77x99:	 48
77x100:	 48
77x101:	 37
77x102:	 38
77x103:	 39
77x104:	 40
77x105:	 45 48
77x106:	 45 48
77x107:	 45
77x108:	 45
77x109:	 45
77x110:	 46
78x97:	 48
78x98:	 48
78x99:	 48
78x100:	 48
78x101:	 38 48
78x102:	 38
78x103:	 39
78x104:	 40
78x105:	 48
78x106:	 46 48
78x107:	 46 48
78x108:	 46 48
78x109:	 46
78x110:	 46
79x97:	 48
79x98:	 48
79x99:	 48
79x100:	 48
79x101:	 48
79x102:	 48
79x103:	 39
79x104:	 40
79x105:	 48
79x106:	 48
79x107:	 47,48
79x108:	 48
79x109:	 47,48
79x110:	 47,48
79x111:	 47
80x97:	 48
80x98:	 48
80x99:	 48
80x100:	 48
80x101:	 48
80x102:	 48
80x103:	 48
80x104:	 40 48
80x105:	 48
80x106:	 48
80x107:	 48
80x108:	 48
80x109:	 48
80x110:	 48
80x111:	 48
81x90:	 48
81x97:	 33 48,49
81x98:	 49
81x99:	 49
81x100:	 49
81x101:	 49
81x102:	 40 49
81x103:	 40
81x104:	 40
81x105:	 41 49
81x106:	 49
82x97:	 48
82x98:	 34 48 50
82x99:	 50
82x100:	 50
82x101:	 50
82x102:	 50
82x103:	 40
82x104:	 40
82x105:	 41
82x106:	 42 50
83x97:	 48
83x98:	 48
83x99:	 35 48 51
83x100:	 36
83x101:	 51
83x102:	 51
83x103:	 51
83x104:	 40
83x105:	 41
83x106:	 42
83x107:	 43 51
84x97:	 48
84x98:	 48
84x99:	 48
84x100:	 36 48 52
84x101:	 52
84x102:	 52
84x103:	 52
84x104:	 40 52
84x105:	 52
84x106:	 42 52
84x107:	 52
85x97:	 48
85x98:	 48
85x99:	 48
85x100:	 48
85x101:	 37 48 53
85x102:	 38
85x103:	 40
85x104:	 40
85x105:	 41 53
85x106:	 42 53
86x97:	 48
86x98:	 48
86x99:	 48
86x100:	 48
86x101:	 48
86x102:	 38 48 54
86x103:	 40
86x104:	 40
87x97:	 48
87x98:	 48
87x99:	 48
87x100:	 48
87x101:	 48
87x102:	 48
87x103:	 39 48 55
87x104:	 40
88x97:	 48
88x98:	 48
88x99:	 48
88x100:	 48
88x101:	 48
88x102:	 48
88x103:	 48
89x97:	 48
89x98:	 48
89x99:	 48
89x100:	 48
89x101:	 48
90x97:	 48
90x98:	 48
90x99:	 48
90x100:	 48
90x101:	 48
91x97:	 48
91x98:	 48
91x99:	 48
91x100:	 48
92x97:	 48
92x98:	 48
92x99:	 48
93x97:	 48
93x98:	 48
94x97:	 48

Some things stand out in this.  For example, 2^n+1 x 2^n+1+2^(n-1) always
seems to be one of the cases that doesn't have a power of 2 value of k.

I'm sure someone must have worked this out before.  Can anyone give me a
reference to what the pattern here is (or tell me?).