Conversions from binary to a power-of-2 radix use a simple and fast
*O(N)* bit extraction algorithm.
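As a rough illustration of the bit extraction idea, the following sketch peels fixed-width groups of bits off a Python integer. The function name and the `DIGITS` table are hypothetical, and because each shift of a Python integer touches the whole number, this version does not itself reach *O(N)*; limb-based code would read the bits in place instead.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_pow2_radix(t, bits_per_digit):
    """Digits of non-negative t in radix 2**bits_per_digit, most significant first."""
    if t == 0:
        return "0"
    mask = (1 << bits_per_digit) - 1
    out = []
    while t:
        out.append(DIGITS[t & mask])   # low bits_per_digit bits form one digit
        t >>= bits_per_digit           # drop the bits just consumed
    return "".join(reversed(out))      # digits were produced least significant first
```

For example, `to_pow2_radix(255, 4)` gives the hexadecimal digits and `to_pow2_radix(8, 3)` the octal ones.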

Conversions from binary to other radices use one of two algorithms. Sizes
below `GET_STR_PRECOMPUTE_THRESHOLD` use a basic *O(N^2)* method.
Repeated divisions by *b^n* are made, where *b* is the radix and
*n* is the biggest power that fits in a limb. But instead of simply
using the remainder *r* from such divisions, an extra divide step is done
to give a fractional limb representing *r/b^n*. The digits of *r*
can then be extracted using multiplications by *b* rather than divisions.
Special case code is provided for decimal, allowing multiplications by 10 to
optimize to shifts and adds.
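The fractional-limb idea can be sketched as follows, with Python integers standing in for limbs. This is a minimal illustration, not GMP's actual code: the 64-bit limb size, the function names, and the choice to round the fraction up (so that truncated multiplications never fall short of a digit) are all assumptions of the sketch.

```python
LIMB_BITS = 64  # assumed limb size
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def limb_power(b):
    """Return (n, b**n) for the largest n with b**n < 2**LIMB_BITS."""
    n, big = 1, b
    while big * b < (1 << LIMB_BITS):
        big *= b
        n += 1
    return n, big

def chunk_digits(r, b, n, big):
    """Exactly n base-b digits of r (0 <= r < big == b**n), most significant first."""
    # Fixed-point approximation of r / b^n scaled by 2^LIMB_BITS, rounded up.
    frac = ((r << LIMB_BITS) + big - 1) // big
    mask = (1 << LIMB_BITS) - 1
    out = []
    for _ in range(n):
        frac *= b                              # multiply instead of divide
        out.append(DIGITS[frac >> LIMB_BITS])  # integer part is the next digit
        frac &= mask                           # keep only the fractional part
    return "".join(out)

def basecase_to_radix(t, b):
    """Quadratic conversion: repeated division by b**n, then digit extraction."""
    if t == 0:
        return "0"
    n, big = limb_power(b)
    chunks = []
    while t:
        t, r = divmod(t, big)                  # one division per limb-sized chunk
        chunks.append(chunk_digits(r, b, n, big))
    return "".join(reversed(chunks)).lstrip("0")
```

Each chunk costs one extra division to form the fraction, after which its *n* digits come out most significant first using only multiplications by *b*.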

Above `GET_STR_PRECOMPUTE_THRESHOLD` a sub-quadratic algorithm is used.
For an input *t*, powers *b^(n*2^i)* of the radix are
calculated, until a power between *t* and *sqrt(t)* is
reached. *t* is then divided by that largest power, giving a quotient
which is the digits above that power, and a remainder which is those below.
These two parts are in turn divided by the second highest power, and so on
recursively. When a piece has been divided down to less than
`GET_STR_DC_THRESHOLD` limbs, the basecase algorithm described above is
used.
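The recursion can be sketched in Python as below. It is an illustration under stated assumptions rather than GMP's implementation: the function names are hypothetical, `n` and `threshold_bits` are arbitrary stand-ins for the limb-sized power and for `GET_STR_DC_THRESHOLD`, and the basecase here is plain repeated division by *b*.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def basecase(t, b):
    """Stand-in quadratic conversion; returns "" for t == 0."""
    out = []
    while t:
        t, d = divmod(t, b)
        out.append(DIGITS[d])
    return "".join(reversed(out))

def radix_powers(t, b, n):
    """Pairs (e, b**e) with e = n*2^i, stopping once the power exceeds sqrt(t)."""
    powers = [(n, b ** n)]
    while powers[-1][1] ** 2 <= t:
        e, p = powers[-1]
        powers.append((2 * e, p * p))
    return powers

def split(t, b, powers, i, pad, threshold_bits):
    """Digits of t, zero-padded on the left to pad digits (pad == 0: no padding)."""
    if i < 0 or t.bit_length() < threshold_bits:
        return basecase(t, b).rjust(pad, "0")
    e, p = powers[i]
    if t < p:                       # this power is too big, try the next lower one
        return split(t, b, powers, i - 1, pad, threshold_bits)
    q, r = divmod(t, p)             # q: digits above the power, r: digits below
    high = split(q, b, powers, i - 1, max(pad - e, 0), threshold_bits)
    low = split(r, b, powers, i - 1, e, threshold_bits)  # always exactly e digits
    return high + low

def to_radix_dc(t, b, n=4, threshold_bits=128):
    if t == 0:
        return "0"
    powers = radix_powers(t, b, n)
    return split(t, b, powers, len(powers) - 1, 0, threshold_bits)
```

The remainder half must always be padded to the full width of the power it was divided by, since its leading zeros are real digits of the result; only the topmost quotient is left unpadded.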

The advantage of this algorithm is that big divisions can make use of the
sub-quadratic divide and conquer division (see Divide and Conquer Division),
and big divisions tend to have less overhead than lots of
separate single limb divisions anyway. But in any case the cost of
calculating the powers *b^(n*2^i)* must first be overcome.

`GET_STR_PRECOMPUTE_THRESHOLD` and `GET_STR_DC_THRESHOLD` represent
the same basic thing, the point where it becomes worth doing a big division to
cut the input in half. `GET_STR_PRECOMPUTE_THRESHOLD` includes the cost
of calculating the radix power required, whereas `GET_STR_DC_THRESHOLD`
assumes that’s already available, which is the case when recursing.

Since the base case produces digits from least to most significant but they
must be stored from most to least significant, it’s necessary to calculate in
advance how many digits there will be, or at least be sure not to
underestimate that.
For GMP the number of input bits is multiplied by `chars_per_bit_exactly`
from `mp_bases`, rounding up. The result is either correct or one too
big.
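A small sketch of that estimate, assuming the per-base factor is log(2)/log(*b*) and computing it in floating point in place of a table lookup (the function name is hypothetical, and the sketch is meant for radices that are not powers of 2):

```python
import math

def estimate_digits(t, b):
    """Estimated base-b digit count of t > 0: either exact or one too big."""
    chars_per_bit = math.log(2) / math.log(b)  # stand-in for chars_per_bit_exactly
    return math.ceil(t.bit_length() * chars_per_bit)
```

For instance, `10**50` and `10**50 - 1` both occupy 167 bits, so both get the estimate 51, which is exact for the former and one too big for the latter.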

Examining some of the high bits of the input could increase the chance of getting the exact number of digits, but an exact result every time would not be practical, since in general the difference between numbers 100… and 99… is only in the last few bits and the work to identify 99… might well be almost as much as a full conversion.

The *r/b^n* scheme described above for using multiplications to bring out
digits might be useful for more than a single limb. Some brief experiments
with it on the base case when recursing didn’t give a noticeable improvement,
but perhaps that was only due to the implementation. Something similar would
work for the sub-quadratic divisions too, though there would be the cost of
calculating a bigger radix power.

Another possible improvement for the sub-quadratic part would be to arrange
for radix powers that balanced the sizes of quotient and remainder produced,
i.e. the highest power would be a *b^(n*k)* approximately equal to
*sqrt(t)*, with *k* not restricted to being a power of 2. That ought to
smooth out a graph of times against sizes, but may or may not be a net
speedup.