div_qr_1n_pi1

Niels Möller nisse at lysator.liu.se
Wed Jun 30 19:42:59 UTC 2021


nisse at lysator.liu.se (Niels Möller) writes:

> nisse at lysator.liu.se (Niels Möller) writes:
>
>> Your idea of conditionally adding the invariant d * B2 at the right
>> place is also interesting,
>
> I've tried it out. Works nicely, but no speedup on my machine. I'm
> attaching another patch. There are then 4 methods:
>
> method 1: Old loop around udiv_qrnnd_preinv.
>
> method 2: The clever code from 10 years ago, with the micro-optimization
>   I committed the other day.
>
> method 3: More or less the same as I posted a few days ago.
>
> method 4: Postpones the update u1 -= u2 d, moving it off the critical
>   recurrence chain. Instead, conditionally adds the constant B2 (B - d)
>   to the lower u limbs.
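
For readers without the patch at hand, the general shape of method 1 (a plain loop around a 2/1 division with a precomputed reciprocal) can be sketched roughly as below. This is only an illustration under stated assumptions, not GMP's code: it uses 32-bit limbs so that plain uint64_t arithmetic suffices, and the names limb_t, invert_limb_sketch, div_2by1_preinv_sketch and div_qr_1n_sketch are hypothetical stand-ins, not the identifiers used in the library or in the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 32-bit limbs, so B = 2^32. */
typedef uint32_t limb_t;

/* Reciprocal v = floor((B^2 - 1) / d) - B, for normalized d (high bit set). */
static limb_t invert_limb_sketch(limb_t d)
{
    assert(d >> 31);  /* d must be normalized */
    return (limb_t)(UINT64_MAX / d - ((uint64_t)1 << 32));
}

/* Divide the two-limb value (u1, u0) by d, where u1 < d.
   Returns the quotient limb and stores the remainder in *r. */
static limb_t div_2by1_preinv_sketch(limb_t *r, limb_t u1, limb_t u0,
                                     limb_t d, limb_t v)
{
    /* Candidate quotient: high limb of v*u1 + (u1, u0); cannot overflow
       64 bits when u1 < d. */
    uint64_t q = (uint64_t)v * u1 + (((uint64_t)u1 << 32) | u0);
    limb_t q1 = (limb_t)(q >> 32) + 1;      /* wraps mod B */
    limb_t q0 = (limb_t)q;
    limb_t rem = u0 - (limb_t)(q1 * d);     /* candidate remainder mod B */

    if (rem > q0) {        /* candidate quotient was one too large */
        q1--;
        rem += d;
    }
    if (rem >= d) {        /* rare second adjustment */
        q1++;
        rem -= d;
    }
    *r = rem;
    return q1;
}

/* Divide the n-limb number up[0..n-1] (least significant limb first) by d.
   Writes n quotient limbs to qp and returns the remainder. */
static limb_t div_qr_1n_sketch(limb_t *qp, const limb_t *up, size_t n,
                               limb_t d)
{
    limb_t v = invert_limb_sketch(d);
    limb_t r = 0;
    for (size_t i = n; i-- > 0;)
        qp[i] = div_2by1_preinv_sketch(&r, r, up[i], d, v);
    return r;
}
```

Each iteration needs the remainder from the previous one before it can start, which is the dependency chain the other methods try to shorten.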

I'm tempted to commit this code, i.e., the new variants (not enabled)
plus the tuneup changes, to see which variants are favorites on the
various test machines. That should give some guidance as to what's most
promising for assembly implementation.

What do you think?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

