div_qr_1 interface

Niels Möller nisse at lysator.liu.se
Thu Oct 17 21:11:57 CEST 2013


nisse at lysator.liu.se (Niels Möller) writes:

> Consider the test compilation unit
>
>   typedef struct {
>     unsigned long q; unsigned long r;
>   } qr_t;
>   
>   qr_t divrem (unsigned long u, unsigned long d)
>   {
>     qr_t res;
>     res.q = u/d;
>     res.r = u - res.q*d;
>     return res;
>   }
>
> On x86_64 (and gnu/linux), gcc -c -O compiles this to
>
>    0:	48 89 f8             	mov    %rdi,%rax
>    3:	31 d2                	xor    %edx,%edx
>    5:	48 f7 f6             	div    %rsi
>    8:	48 89 fa             	mov    %rdi,%rdx
>    b:	48 0f af f0          	imul   %rax,%rsi
>    f:	48 29 f2             	sub    %rsi,%rdx
>   12:	c3                   	retq   

And I guess it is relevant to compare this to the same function,
reorganized to return the remainder and store the quotient via a pointer:

  unsigned long divrem (unsigned long u, unsigned long d,
                        unsigned long *qp)
  {
    unsigned long q;
    q = u/d;
    *qp = q;
  
    return u - q * d;
  }

which is compiled to

   0:	48 89 d1             	mov    %rdx,%rcx
   3:	48 89 f8             	mov    %rdi,%rax
   6:	31 d2                	xor    %edx,%edx
   8:	48 f7 f6             	div    %rsi
   b:	48 0f af f0          	imul   %rax,%rsi
   f:	48 89 01             	mov    %rax,(%rcx)
  12:	48 89 f8             	mov    %rdi,%rax
  15:	48 29 f0             	sub    %rsi,%rax
  18:	c3                   	retq   

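If the caller is going to store the quotient directly in memory
anyway, there's little difference: the pointer variant needs two more
instructions (a register move to save the pointer, and the store; 9
instructions and 25 bytes, against 7 instructions and 19 bytes), but
the caller then has no store of its own to do. And if the caller is
going to operate on the quotient, and needs it in a register, I think
struct return should be epsilon more efficient, since q and r come
back in %rax and %rdx with no round trip through memory.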
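
To make the two caller situations concrete, here is a rough sketch of
one caller of each kind. The names divrem_struct, divrem_ptr,
digit_sum_struct and div_each_ptr are made up for this illustration
(both functions above are called divrem), and the callers are only
examples, not code from GMP:

```c
#include <stddef.h>

typedef struct {
  unsigned long q; unsigned long r;
} qr_t;

/* Struct-return variant, renamed for this sketch. */
qr_t divrem_struct (unsigned long u, unsigned long d)
{
  qr_t res;
  res.q = u/d;
  res.r = u - res.q*d;
  return res;
}

/* Pointer variant, renamed for this sketch. */
unsigned long divrem_ptr (unsigned long u, unsigned long d,
                          unsigned long *qp)
{
  unsigned long q = u/d;
  *qp = q;
  return u - q*d;
}

/* Caller that keeps operating on the quotient: the sum of the
   decimal digits of u.  The quotient feeds the next iteration, so
   it is wanted in a register, not in memory. */
unsigned long digit_sum_struct (unsigned long u)
{
  unsigned long sum = 0;
  while (u > 0)
    {
      qr_t t = divrem_struct (u, 10);
      sum += t.r;
      u = t.q;
    }
  return sum;
}

/* Caller that wants each quotient in memory anyway: divide every
   element of up[0..n-1] by d, store the quotients in qp[], and
   return the sum of the remainders. */
unsigned long div_each_ptr (unsigned long *qp, const unsigned long *up,
                            size_t n, unsigned long d)
{
  unsigned long rsum = 0;
  for (size_t i = 0; i < n; i++)
    rsum += divrem_ptr (up[i], d, &qp[i]);
  return rsum;
}
```

In the first caller, the pointer variant would force the quotient
through a stack slot on every iteration, unless the compiler inlines
the call. In the second, the store has to happen somewhere, so it
matters little whether the callee or the caller does it.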

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list