[PATCH] mpn/generic/perfsqr: Improve alternate (currently disabled) test.
marco.bodrato at tutanota.com
Wed Mar 4 18:26:40 CET 2026
Ciao David,
4 Mar 2026, 14:11, from sparks05 at proton.me:
> On Wednesday, March 4th, 2026 at 06:18, marco.bodrato at tutanota.com <marco.bodrato at tutanota.com> wrote:
>
> But yes, I saw and approve of this code...
>
You said it's messy :-)
>> Moreover, before falling back to the square root computation, a limb is added back, if needed,
>> to restore parity.
>> usize += off;
>> up -= off;
>>
>
> Ah! I missed this part, which fixes the error I was thinking of.
>
> My apologies!
>
You mean: those lines deserve a comment.
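To make the parity step concrete, here is a minimal standalone sketch. It is not the code from the patch. It assumes that "off" is 1 exactly when an odd number of low zero limbs was skipped, so that adding one limb back leaves an even number removed and the discarded factor B^(2k) is a perfect square for any limb base B. The type limb_t and the function strip_even_zero_limbs are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;	/* hypothetical stand-in for mp_limb_t */

/* Skip low zero limbs of the nonzero operand {*upp, *usizep}, but make
   sure an even number of limbs is removed.  Returns that number.  */
static size_t
strip_even_zero_limbs (const limb_t **upp, size_t *usizep)
{
  const limb_t *up = *upp;
  size_t usize = *usizep;
  size_t removed = 0;

  assert (usize > 0);

  /* Skip low zero limbs, always keeping at least one limb.  */
  while (usize > 1 && up[0] == 0)
    {
      up++;
      usize--;
      removed++;
    }

  /* If an odd number of limbs was skipped, add one (zero) limb back,
     so that the removed power of the limb base has an even exponent.  */
  size_t odd = removed & 1;
  usize += odd;
  up -= odd;
  removed -= odd;

  *upp = up;
  *usizep = usize;
  return removed;
}
```

With the limbs {0, 0, 0, 5} (least significant first), three zero limbs are skipped, one is added back, and the remaining operand is {0, 5}.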
>> The current "plain C" implementation that we have in longlong.h
>> loops forever if the operand is zero.
>> https://gmplib.org/repo/gmp/file/tip/longlong.h#l2256
>>
> Good point. I was looking at all the asm implementations and
> forgot about that one! GCC's __builtin_ctz might do the same.
>
Is it really a good point?
A library using that implementation should also define COUNT_TRAILING_ZEROS_SLOW,
so that the alternative code you proposed can be used instead :-)
If we don't implement a test to define COUNT_TRAILING_ZEROS_SLOW,
we can replace the old (disabled) code with the new one
(which is also interesting on the _ctz side, because it uses one fewer branch),
but I'm not sure we should enable it.
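To illustrate the two points above, here is a generic sketch. It is neither the longlong.h code nor the alternative proposed in this thread. The first function is a hypothetical naive bit loop showing why a zero operand has to be excluded. For GCC's __builtin_ctz, the documentation says the result is undefined when the argument is 0. The second function shows one well-known way to test for even multiplicity of 2 without counting bits at all, which is the kind of code one might prefer when counting trailing zeros is slow. Both function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical naive loop, NOT the longlong.h implementation.
   For x == 0 the loop condition never becomes false, so the
   caller must exclude zero; the assert documents that.  */
static int
naive_ctz (uint64_t x)
{
  int cnt = 0;

  assert (x != 0);
  while ((x & 1) == 0)
    {
      x >>= 1;
      cnt++;
    }
  return cnt;
}

/* Return 1 if the lowest set bit of x is at an even position, i.e.
   if 2 divides x with even multiplicity.  No bit counting needed.
   Returns 0 for x == 0, where there is no lowest set bit.  */
static int
even_twos_multiplicity (uint64_t x)
{
  uint64_t lowbit = x & -x;	/* isolate the lowest set bit */

  /* 0x5555... has ones exactly at the even bit positions.  */
  return (lowbit & UINT64_C (0x5555555555555555)) != 0;
}
```

For any nonzero x, even_twos_multiplicity (x) agrees with (naive_ctz (x) & 1) == 0.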
> Thank you!
>
Here is another version of the code, with a few more comments
(and I changed the name "off" to "odd").
You can improve it :-D
Ĝis,
m
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perfsqr.diff
Type: text/x-patch
Size: 1916 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20260304/f6e60640/attachment.bin>