div_qr_1 interface

Niels Möller nisse at lysator.liu.se
Thu Oct 17 19:49:12 CEST 2013

Torbjorn Granlund <tg at gmplib.org> writes:

> I am not too enthusiastic about struct return types for critical
> functions.  I expect this to be returned via a stack slot everywhere or
> almost everywhere.

As far as I understand, the most common ABIs for x86_64 and ARM (which
is pretty close to "almost everywhere"...) both return structs of this form
in two registers: %rax/%rdx, and %r0/%r1.

Consider the test compilation unit

  typedef struct {
    unsigned long q; unsigned long r;
  } qr_t;
  qr_t divrem (unsigned long u, unsigned long d)
  {
    qr_t res;
    res.q = u/d;
    res.r = u - res.q*d;
    return res;
  }

On x86_64 (and gnu/linux), gcc -c -O compiles this to

   0:	48 89 f8             	mov    %rdi,%rax
   3:	31 d2                	xor    %edx,%edx
   5:	48 f7 f6             	div    %rsi
   8:	48 89 fa             	mov    %rdi,%rdx
   b:	48 0f af f0          	imul   %rax,%rsi
   f:	48 29 f2             	sub    %rsi,%rdx
  12:	c3                   	retq   

Both inputs and outputs are passed in registers. The return address is
the only thing stored on the stack.
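
For comparison, here is a small sketch of the two interface styles under
discussion, side by side. divrem is the struct-returning function from the
test unit above; divrem_ptr is a hypothetical pointer-output variant, made
up here purely for illustration and not an existing GMP function. Unless
the call is inlined, the pointer variant has to write the remainder through
memory, whereas the struct variant can hand both values back directly when
the ABI allows it.

```c
#include <assert.h>

typedef struct {
  unsigned long q; unsigned long r;
} qr_t;

/* Struct-return style, as in the test unit above. */
qr_t
divrem (unsigned long u, unsigned long d)
{
  qr_t res;
  assert (d != 0);
  res.q = u / d;
  res.r = u - res.q * d;
  return res;
}

/* Hypothetical alternative: quotient as return value, remainder
   stored through a pointer supplied by the caller. */
unsigned long
divrem_ptr (unsigned long *rp, unsigned long u, unsigned long d)
{
  unsigned long q;
  assert (d != 0);
  q = u / d;
  *rp = u - q * d;
  return q;
}
```

Compiling both with gcc -c -O and comparing the objdump -d output is one
way to check what a given ABI actually does with each form.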

> I recall to have seen some code for that.  How fast does it run
> currently on the various CPUs?

Don't know yet.

> Code comment:
> I think we cannot afford to do a separate lshift of the dividend operand
> when the divisor is just a few limbs.  We need to do shifting on the
> fly, however irksome that might be.  An mpn_div_qr_1u_pi1 is called for.

I think we'll definitely want mpn_div_qr_1u_pi1 for the most common
platforms. I was thinking that maybe we could let it be an optional
function, with no C implementation, and resort to a separate mpn_lshift
if the function is missing.
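
A rough sketch of what that fallback could look like, as a toy model only.
It uses 32-bit limbs and plain 64-by-32-bit division in place of the
preinverted step, and every name in it (limb_t, toy_lshift, toy_div_qr_1n,
toy_div_qr_1u, and the HAVE_NATIVE_toy_div_qr_1u macro) is invented for
illustration; none of these are GMP functions or real configure macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* toy 32-bit limb, not GMP's mp_limb_t */

/* Shift {up, n} left by cnt bits (0 < cnt < 32) into {rp, n} and
   return the bits shifted out at the top. */
static limb_t
toy_lshift (limb_t *rp, const limb_t *up, size_t n, unsigned cnt)
{
  limb_t out = up[n - 1] >> (32 - cnt);
  size_t i;
  for (i = n - 1; i > 0; i--)
    rp[i] = (limb_t) ((up[i] << cnt) | (up[i - 1] >> (32 - cnt)));
  rp[0] = (limb_t) (up[0] << cnt);
  return out;
}

/* Divide r:{up, n} by a normalized d (top bit set), with r < d.
   Stores n quotient limbs at qp and returns the remainder. */
static limb_t
toy_div_qr_1n (limb_t *qp, const limb_t *up, size_t n, limb_t r, limb_t d)
{
  size_t i;
  for (i = n; i-- > 0;)
    {
      uint64_t num = ((uint64_t) r << 32) | up[i];
      qp[i] = (limb_t) (num / d);
      r = (limb_t) (num % d);
    }
  return r;
}

/* Fallback for an unnormalized d: one separate shift pass (using qp
   as scratch space), then one division pass over the shifted limbs. */
static limb_t
toy_div_qr_1u_fallback (limb_t *qp, const limb_t *up, size_t n, limb_t d)
{
  unsigned cnt = 0;
  limb_t r;
  assert (n > 0 && d != 0 && (d >> 31) == 0);
  while (((limb_t) (d << cnt) >> 31) == 0)
    cnt++;
  r = toy_lshift (qp, up, n, cnt);
  r = toy_div_qr_1n (qp, qp, n, r, (limb_t) (d << cnt));
  return r >> cnt;
}

/* Use a native routine if one is provided, else the fallback. */
#ifdef HAVE_NATIVE_toy_div_qr_1u
limb_t native_toy_div_qr_1u (limb_t *, const limb_t *, size_t, limb_t);
#define toy_div_qr_1u native_toy_div_qr_1u
#else
#define toy_div_qr_1u toy_div_qr_1u_fallback
#endif
```

The cost of the fallback is the extra pass over the operand in toy_lshift,
which is what a fused routine would avoid.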

But if needed, it's no big deal to extract a C mpn_div_qr_1u_pi1 from
divrem_1.c, with on-the-fly shifting. 
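
To make "on-the-fly shifting" concrete, here is a minimal sketch of the
idea. It is not the code in divrem_1.c and not a proposed implementation:
it assumes 32-bit limbs, uses ordinary 64-by-32-bit division as a stand-in
for the preinverted division step, and sketch_div_qr_1u is a made-up name.
Each shifted dividend limb is formed from two adjacent input limbs right
before it is divided, so no separate shift pass is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide {up, n} by an unnormalized d (nonzero, top bit clear).
   Stores n quotient limbs at qp and returns the remainder.
   Hypothetical helper for illustration, not a GMP function. */
static uint32_t
sketch_div_qr_1u (uint32_t *qp, const uint32_t *up, size_t n, uint32_t d)
{
  unsigned cnt = 0;
  uint32_t r, hi, lo;
  uint64_t num;
  size_t i;

  assert (n > 0 && d != 0 && (d >> 31) == 0);

  /* Normalize the divisor: shift until its top bit is set. */
  while (((uint32_t) (d << cnt) >> 31) == 0)
    cnt++;
  d <<= cnt;

  /* The bits shifted out of the top limb form the initial remainder;
     it is below 2^cnt and therefore below the normalized d. */
  hi = up[n - 1];
  r = hi >> (32 - cnt);

  for (i = n - 1; i > 0; i--)
    {
      lo = up[i - 1];
      /* Shifted limb i, assembled on the fly from limbs i and i-1. */
      num = ((uint64_t) r << 32)
            | (uint32_t) ((hi << cnt) | (lo >> (32 - cnt)));
      qp[i] = (uint32_t) (num / d);
      r = (uint32_t) (num % d);
      hi = lo;
    }

  /* Lowest limb: only zeros are shifted in from below. */
  num = ((uint64_t) r << 32) | (uint32_t) (hi << cnt);
  qp[0] = (uint32_t) (num / d);
  r = (uint32_t) (num % d);

  /* Undo the normalization of the remainder. */
  return r >> cnt;
}
```

The quotient is unchanged by the normalization, since dividend and divisor
are scaled by the same power of two; only the remainder needs the final
right shift.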


Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
