Multi-limb inverse for mpn_divrem_1

Tue Sep 16 09:12:19 CEST 2003

Torbjorn Granlund <tege at swox.com> writes:
>
> The problem with division by a single-limb divisor is that the
> limb multiply instructions lie on the recurrency path.  Pipelined
> multiply units are largely wasted.

I wonder if it'd be possible to make the trial quotient correct most
of the time.  That way the addback could be taken off the dependent
chain, or rather it would become dependent only occasionally.

The idea would be to kick off the next iteration in parallel with
verifying the correctness of the present quotient limb (or limbs).

I haven't thought this through; it might be too messy, or improving
the quotient might cost too much.
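
To make the dependent chain concrete, here is a minimal sketch of
single-limb division using a precomputed inverse.  It is not the
attached divrem_1_two.c and not GMP's actual code: the names
invert_limb_sketch, div_step and divrem_1_sketch are made up for
illustration, the limbs are 32 bits so that plain uint64_t arithmetic
suffices, and the divisor is assumed to be normalized (high bit set).
The step is the usual reciprocal-based 2-by-1 division; the two
conditional adjustments at its end play roughly the role of the
addback discussed above, and the multiply in each step cannot start
until the previous step's remainder is known.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reciprocal of a normalized divisor d (top bit
   set), v = floor((2^64 - 1) / d) - 2^32.  Fits in 32 bits. */
static uint32_t invert_limb_sketch(uint32_t d)
{
    return (uint32_t)(UINT64_MAX / d - ((uint64_t)1 << 32));
}

/* One quotient limb: divide (r, u0) by d, given r < d on entry.
   Returns the quotient limb and leaves the new remainder in *r. */
static uint32_t div_step(uint32_t *r, uint32_t u0, uint32_t d, uint32_t v)
{
    uint32_t u1 = *r;

    /* The multiply below needs u1, i.e. the previous remainder.
       This is the recurrence that keeps the multiplier serialized. */
    uint64_t q  = (uint64_t)v * u1 + (((uint64_t)u1 << 32) | u0);
    uint32_t q1 = (uint32_t)(q >> 32) + 1;   /* trial quotient */
    uint32_t q0 = (uint32_t)q;
    uint32_t rem = u0 - q1 * d;              /* computed mod 2^32 */

    /* Adjustments: the first is taken unpredictably, the second only
       rarely.  Both sit on the path to the next iteration. */
    if (rem > q0) { q1--; rem += d; }
    if (rem >= d) { q1++; rem -= d; }

    *r = rem;
    return q1;
}

/* Divide {np, n} (least significant limb first) by normalized d.
   Writes n quotient limbs to qp and returns the remainder. */
uint32_t divrem_1_sketch(uint32_t *qp, const uint32_t *np, size_t n,
                         uint32_t d)
{
    uint32_t v = invert_limb_sketch(d);
    uint32_t r = 0;

    for (size_t i = n; i-- > 0; )
        qp[i] = div_step(&r, np[i], d, v);
    return r;
}
```

A multi-limb inverse would produce several quotient limbs per
multiply round, and the speculative variant would let the next round
start before the adjustments above are resolved; neither is shown
here.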

> Here is a proof-of-concept implementation for n = 2.

Below is some code I put together along those lines.  It's basically
the same, but uses more macros.  I think it's correct, but I don't
think I got around to checking which chips it suits.  ev6 might have
been the inspiration, and itanic might be helped too.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: divrem_1_two.c
Type: text/x-csrc
Size: 14581 bytes
Desc: not available
Url : /list-archives/attachments/20030916/f602fcff/divrem_1_two-0001.bin