[PATCH] mpn/generic/perfsqr: Improve alternate (currently disabled) test.

marco.bodrato at tutanota.com marco.bodrato at tutanota.com
Wed Mar 4 18:26:40 CET 2026


Ciao David,

On 4 Mar 2026, 14:11, sparks05 at proton.me wrote:

> On Wednesday, March 4th, 2026 at 06:18, marco.bodrato at tutanota.com <marco.bodrato at tutanota.com> wrote:
>
> But yes, I saw and approve of this code...
>

You said it's messy :-)


>> Moreover, before falling back to the square root computation, a limb is added back, if needed,
>> to restore parity.
>>     usize += off;
>>     up -= off;
>>
>
> Ah!  I missed this part, which fixes the error I was thinking of.
>
> My apologies!
>

You mean: those lines deserve a comment.


>> The current "plain C" implementation that we have in longlong.h
>> loops forever if the operand is zero.
>> https://gmplib.org/repo/gmp/file/tip/longlong.h#l2256
>>
> Good point.  I was looking at all the asm implementations and
> forgot about that one!  GCC's __builtin_ctz might do the same.
>

Is it really a good point?
A library using that implementation should also define COUNT_TRAILING_ZEROS_SLOW,
so that the alternative code you proposed can be used instead :-)

If we don't implement a test to define COUNT_TRAILING_ZEROS_SLOW,
we can replace the old (disabled) code with the new one
(which is also interesting on the _ctz side, because it uses one fewer branch),
but I'm not sure we should enable it.
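
To make the zero-operand concern concrete, here is a minimal sketch. It is
not the code in longlong.h, only a simplified bit-by-bit loop with the same
kind of precondition. The names limb_t, LIMB_BITS, ctz_nonzero and
ctz_or_width are hypothetical and exist only for this illustration. Note
that GCC documents the result of __builtin_ctz as undefined for a zero
argument, so a caller that may pass zero needs a guard in either case.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a limb type; GMP's real type is mp_limb_t. */
typedef unsigned long limb_t;
#define LIMB_BITS ((int) (sizeof (limb_t) * CHAR_BIT))

/* Naive bit-by-bit count of trailing zeros.  Precondition: x != 0.
   Without the assert, x == 0 would never leave the loop, because
   no bit of x is ever set.  */
static int
ctz_nonzero (limb_t x)
{
  int count = 0;
  assert (x != 0);
  while ((x & 1) == 0)
    {
      x >>= 1;
      count++;
    }
  return count;
}

/* Guarded wrapper: gives zero a defined result (LIMB_BITS), at the
   cost of one extra comparison.  */
static int
ctz_or_width (limb_t x)
{
  return x != 0 ? ctz_nonzero (x) : LIMB_BITS;
}
```

A caller that cannot rule out a zero limb would use ctz_or_width, or test
for zero itself before calling the unguarded version.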


> Thank you!
>
Here is another version of the code, with a few more comments
(and I changed the name "off" to "odd").

You can improve it :-D

Ĝis,
m
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perfsqr.diff
Type: text/x-patch
Size: 1916 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20260304/f6e60640/attachment.bin>

