div_qr_1n_pi1
Niels Möller
nisse at lysator.liu.se
Wed Jun 30 19:42:59 UTC 2021
nisse at lysator.liu.se (Niels Möller) writes:
> nisse at lysator.liu.se (Niels Möller) writes:
>
>> Your idea of conditionally adding the invariant d * B2 at the right
>> place is also interesting,
>
> I've tried it out. Works nicely, but no speedup on my machine. I'm
> attaching another patch. There are then 4 methods:
>
> method 1: Old loop around udiv_qrnnd_preinv.
>
> method 2: The clever code from 10 years ago, with the microoptimization
> I committed the other day.
>
> method 3: More or less the same as I posted a few days ago.
>
> method 4: Postpones the update u1 -= u2 d, moving it off the critical
> recurrence chain. Instead, conditionally adds in the constant B2 (B - d) to the
> lower u limbs.
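
For readers without the patch at hand, the kind of loop method 1 refers to
can be sketched roughly as below. This is not the GMP code: it assumes
32-bit limbs (B = 2^32) so that plain uint64_t arithmetic suffices, and the
names limb_t, reciprocal_sketch, div_2by1_preinv_sketch and
div_qr_1n_loop_sketch are hypothetical. The 2-by-1 step follows the
reciprocal-based division that udiv_qrnnd_preinv is built on, as far as I
understand it, and requires a normalized divisor (high bit set).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical 32-bit limb, B = 2^32 */

/* Reciprocal v = floor((B^2 - 1) / d) - B, for normalized d. */
static limb_t reciprocal_sketch(limb_t d)
{
    assert(d >> 31);                 /* d must have its high bit set */
    /* The quotient lies in [B, 2B), so truncation subtracts B. */
    return (limb_t)(UINT64_MAX / d);
}

/* Divide <u1, u0> by d, assuming u1 < d. Returns the quotient limb
   and stores the remainder in *r. */
static limb_t div_2by1_preinv_sketch(limb_t *r, limb_t u1, limb_t u0,
                                     limb_t d, limb_t v)
{
    /* <q1, q0> = v * u1 + <u1, u0>; this sum does not overflow. */
    uint64_t q = (uint64_t)v * u1 + (((uint64_t)u1 << 32) | u0);
    limb_t q1 = (limb_t)((limb_t)(q >> 32) + 1);   /* candidate, mod B */
    limb_t q0 = (limb_t)q;
    limb_t rem = (limb_t)(u0 - q1 * d);            /* mod B */

    if (rem > q0) {      /* candidate quotient was one too large */
        q1--;
        rem += d;
    }
    if (rem >= d) {      /* rare second adjustment */
        q1++;
        rem -= d;
    }
    *r = rem;
    return q1;
}

/* Method-1 style loop: divide {up, n} (least significant limb first)
   by d, write n quotient limbs to qp, return the remainder. */
static limb_t div_qr_1n_loop_sketch(limb_t *qp, const limb_t *up,
                                    size_t n, limb_t d)
{
    limb_t v = reciprocal_sketch(d);
    limb_t r = 0;
    for (size_t i = n; i-- > 0; )
        qp[i] = div_2by1_preinv_sketch(&r, r, up[i], d, v);
    return r;
}
```

Each iteration needs the remainder from the previous one, which is the
recurrence the other methods try to shorten.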
I'm tempted to commit this code, i.e., the new variants (not enabled)
plus the tuneup changes, to see which variants are favorites on the
various test machines. That should give some guidance as to what's most
promising for an assembly implementation.
What do you think?
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.