div_qr_1 interface
Niels Möller
nisse at lysator.liu.se
Thu Oct 17 19:49:12 CEST 2013
Torbjorn Granlund <tg at gmplib.org> writes:
> I am not too enthusiastic about struct return types for critical
> functions. I expect this to be returned via a stack slot everywhere or
> almost everywhere.
As far as I understand, the most common ABIs for x86_64 and ARM (which
is pretty close to "almost everywhere"...) both return structs of this form
in two registers: %rax/%rdx on x86_64, and r0/r1 on ARM.
Consider the test compilation unit
  typedef struct {
    unsigned long q; unsigned long r;
  } qr_t;

  qr_t divrem (unsigned long u, unsigned long d)
  {
    qr_t res;
    res.q = u/d;
    res.r = u - res.q*d;
    return res;
  }
On x86_64 (and gnu/linux), gcc -c -O compiles this to
   0:   48 89 f8        mov    %rdi,%rax
   3:   31 d2           xor    %edx,%edx
   5:   48 f7 f6        div    %rsi
   8:   48 89 fa        mov    %rdi,%rdx
   b:   48 0f af f0     imul   %rax,%rsi
   f:   48 29 f2        sub    %rsi,%rdx
  12:   c3              retq
Both inputs and outputs are passed in registers. The return address is the
only thing stored on the stack.
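To make the comparison concrete, here is a minimal sketch that puts the
struct-returning function next to a pointer-based variant. The name
divrem_ptr is hypothetical, made up only for illustration. Compiling both
with gcc -O and looking at the objdump -d output is one way to check how
a given ABI handles each form.

```c
typedef struct {
  unsigned long q; unsigned long r;
} qr_t;

/* Struct-returning form, same as the test compilation unit. */
qr_t
divrem (unsigned long u, unsigned long d)
{
  qr_t res;
  res.q = u / d;
  res.r = u - res.q * d;
  return res;
}

/* Hypothetical pointer-based form: the quotient is the return value,
   the remainder is stored through rp, so it goes via memory. */
unsigned long
divrem_ptr (unsigned long *rp, unsigned long u, unsigned long d)
{
  unsigned long q = u / d;
  *rp = u - q * d;
  return q;
}
```

A caller writes either qr_t s = divrem (u, d); and reads s.q and s.r, or
q = divrem_ptr (&r, u, d); with r declared as a local.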
> I recall to have seen some code for that. How fast does it run
> currently on the various CPUs?
Don't know yet.
> Code comment:
>
> I think we cannot afford to do a separate lshift of the dividend operand
> when the divisor is just a few limbs. We need to do shifting on-the-fly,
> however irksome that might be. An mpn_div_qr_1u_pi1 is called-for.
I think we'll definitely want mpn_div_qr_1u_pi1 for the most common
platforms. I was thinking that maybe we could make it an optional
function, with no C implementation, and fall back to a separate
mpn_lshift if the function is missing.
But if needed, it's no big deal to extract a C mpn_div_qr_1u_pi1 from
divrem_1.c, with on-the-fly shifting.
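As a rough illustration of what on-the-fly shifting means here, the
sketch below divides an n-limb number by a single unnormalized limb and
forms each shifted dividend limb inside the loop, so no separate lshift
pass over the dividend is needed. This is not the GMP code: the name
div_qr_1u_sketch is hypothetical, limbs are assumed to be 64 bits, and
each division step uses the compiler's unsigned __int128 division (a
GCC/Clang extension) rather than a precomputed inverse as a real _pi1
function would.

```c
#include <stddef.h>
#include <stdint.h>

/* Divide {up, n} by d, store the n quotient limbs at qp, and return
   the remainder.  Assumes n >= 1, d != 0, and the high bit of d clear
   (the unnormalized case), so the shift count is in 1..63.  */
uint64_t
div_qr_1u_sketch (uint64_t *qp, const uint64_t *up, size_t n, uint64_t d)
{
  int cnt = __builtin_clzll (d);        /* normalization shift count */
  uint64_t n1 = up[n - 1];
  uint64_t r = n1 >> (64 - cnt);        /* top limb of the shifted dividend */
  size_t i;

  d <<= cnt;                            /* normalized divisor */

  for (i = n - 1; i > 0; i--)
    {
      uint64_t n0 = up[i - 1];
      /* Shifted dividend limb, assembled from two adjacent input limbs. */
      uint64_t limb = (n1 << cnt) | (n0 >> (64 - cnt));
      unsigned __int128 num = ((unsigned __int128) r << 64) | limb;
      qp[i] = (uint64_t) (num / d);     /* fits in one limb since r < d */
      r = (uint64_t) (num % d);
      n1 = n0;
    }

  /* Lowest limb: only zero bits are shifted in from below. */
  {
    unsigned __int128 num = ((unsigned __int128) r << 64) | (n1 << cnt);
    qp[0] = (uint64_t) (num / d);
    r = (uint64_t) (num % d);
  }

  return r >> cnt;                      /* undo the normalization shift */
}
```

The fallback variant would instead run a separate left-shift pass over
the dividend first and then a loop for a normalized divisor, which costs
an extra pass over the dividend limbs.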
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.