adrien.prost-boucle at laposte.net
Wed Mar 15 19:56:09 UTC 2017
> I miss a case: 32 bits; to fully evaluate the impact of the patch+FP on
> one-limb operands in the range 1..62.
Aren't 64-bit and 32-bit data handled identically by a single mpn_sqrtrem1 call on x86-64?
I don't get why we would see a difference.
Or do you mean we should add another test (alongside the tests for 1 and 2 limbs)
to check whether a 1-limb operand fits in 32 bits,
and then call a 32-bit-only function, which in the FP version would use the float data type with correction, or double with no correction?
Such a branch would be taken in an extremely small number of cases, while that code would still need to be maintained, etc.
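To make the "double with no correction" idea concrete, here is a rough sketch of what such a 32-bit-only branch could look like. The names fp_sqrt and sqrtrem32_double are made up for illustration and are not GMP functions, and the inline asm assumes GCC or Clang on x86-64. The reasoning, as far as I can tell: a 32-bit operand is exactly representable in a double, the root is below 2^16, and a non-square operand has a real square root at least about 2^-17 away from the next integer, far more than the rounding error of a correctly rounded double sqrt, so plain truncation should already give the floor.

```c
#include <stdint.h>
#if !defined(__x86_64__)
#include <math.h>   /* fallback only; may need -lm at link time */
#endif

/* Hypothetical helper, not GMP code: double-precision square root.
   On x86-64 with GCC/Clang this is a single SQRTSD instruction. */
static inline double fp_sqrt(double x)
{
#if defined(__x86_64__)
    double r;
    __asm__ ("sqrtsd %1, %0" : "=x" (r) : "xm" (x));
    return r;
#else
    return sqrt(x);
#endif
}

/* Hypothetical 32-bit-only branch: returns floor(sqrt(n)) and stores
   the remainder n - root^2 in *rem.  No correction step is applied. */
static uint32_t sqrtrem32_double(uint32_t *rem, uint32_t n)
{
    /* n converts to double exactly; truncation acts as floor here. */
    uint32_t s = (uint32_t) fp_sqrt((double) n);
    *rem = n - s * s;   /* s <= 65535, so s * s fits in 32 bits */
    return s;
}
```

With float instead of double, the operand may already be rounded when converted (24-bit significand), so the estimate could be off by one and a correction step would be needed.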
> Did you try also with ABI=32 (16,32,48, and 64 bits)?
No, not yet.
I didn't know 16-bit and 48-bit ABIs even existed... at least on an x86-64 machine?
> If you are ready for assembler, then let's go!
Not nearly ready xD
I'm not familiar with the way GMP handles its ASM code. And I'm below average at m4 xD
So, I was thinking of implementing only the FP sqrt call as single-instruction inline ASM.
I've been digging in the Intel instruction set manuals and GCC inline asm guides.
On one side, the Intel docs mention these instructions:
FSQRT "square root of floating-point numbers", in basic arith instruction set of x87 FPU
SQRTSS SSE, sqrt of single-precision
SQRTSD SSE2, sqrt of double-precision
On the other side, this is what objdump shows for the compiled standard sqrt*() functions:
sqrtf() (data type float) -> instruction SQRTSS, that's what I expected
sqrt() (data type double) -> instruction SQRTSD, that's what I expected
sqrtl() (data type long double) -> instruction FSQRT => what??
There are some more instructions around FSQRT... but not enough to do serious computing...
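For reference, a minimal sketch of how the two SSE instructions could be wrapped with GCC-style extended inline asm. The function names are hypothetical, and this assumes GCC or Clang on x86-64 (AT&T operand order, "x" constraint for an SSE register). On the sqrtl() case: long double on x86-64 is normally the 80-bit x87 extended format, which the SSE units cannot handle, so seeing FSQRT there is probably to be expected.

```c
/* Assumes GCC or Clang extended asm (AT&T syntax) on x86-64.
   The #else branches are only a fallback for other targets. */

/* SQRTSS: square root of a single-precision value (SSE). */
static inline float sqrtss_inline(float x)
{
#if defined(__x86_64__)
    float r;
    __asm__ ("sqrtss %1, %0" : "=x" (r) : "xm" (x));
    return r;
#else
    return __builtin_sqrtf(x);
#endif
}

/* SQRTSD: square root of a double-precision value (SSE2). */
static inline double sqrtsd_inline(double x)
{
#if defined(__x86_64__)
    double r;
    __asm__ ("sqrtsd %1, %0" : "=x" (r) : "xm" (x));
    return r;
#else
    return __builtin_sqrt(x);
#endif
}
```

The "xm" input constraint lets the compiler pass the operand either in an SSE register or directly from memory, both of which these instructions accept as a source.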
> If you are not... wasn't your C-only sqrtrem1 for ABI=32 almost ready?
There's some cleaning up to do around shifts of signed numbers, and to better match the current mpn_sqrtrem1 function.
That's minor, but I'll test it again from inside GMP to be sure.
Maybe not immediately... heavy work days.