binvert_limb speedup on 64 bit machines with UHWtype
Marco Bodrato
bodrato at mail.dm.unipi.it
Tue Mar 1 10:59:38 CET 2022
Ciao,
On 2022-02-27 16:52, Marco Bodrato wrote:
> On 2022-02-25 17:06, John Gatrell wrote:
>> I tested using UHWtype in the macro for binvert_limb. On a 64 bit
>> machine one of my programs gained a 3% speedup. On a 32 bit machine,
>> there was no
> Should we use uint_fast8_t, uint_fast16_t, uint_fast32_t for the
> different levels, and let the compiler choose? :-D
I tried code with the uint_fast types, but the compiler does not seem to
pick the faster type: the 64-bit type is always used :-(
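A quick way to see which widths the fast types get on a given platform is
to look at their sizes. This is only a small sketch; the helper names are
made up for illustration, and the results depend on the C library headers
rather than on anything in GMP.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: width in bits of each "fast" type, as fixed by
   the platform's <stdint.h>.  The standard only guarantees a minimum
   width, so any of these may turn out wider than its name suggests. */
static size_t fast8_bits(void)  { return CHAR_BIT * sizeof(uint_fast8_t); }
static size_t fast16_bits(void) { return CHAR_BIT * sizeof(uint_fast16_t); }
static size_t fast32_bits(void) { return CHAR_BIT * sizeof(uint_fast32_t); }
```

If fast32_bits() reports 64, a Newton step written with uint_fast32_t is
carried out in the full limb width, so no half-width arithmetic is gained.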
You should also try storing the 32-bit result in the half type.
I mean: try replacing the following two lines in your code
    __inv = 2 * __hinv - __hinv * __hinv * __n;   /* 32 */ \
    __inv = 2 * __inv - __inv * __inv * __n;      /* 64 */ \
with
    __hinv = 2 * __hinv - __hinv * __hinv * __n;  /* 32 */ \
    __inv = 2 * (mp_limb_t)__hinv - (mp_limb_t)__hinv * __hinv * __n;  /* 64 */ \
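
To make the idea concrete outside the macro, here is a minimal standalone
sketch. It is not the GMP code: limb_t and half_t are hypothetical
stand-ins for mp_limb_t and UHWtype on a 64-bit machine, the function name
is invented, and the table lookup is replaced by the seed n itself (n*n is
1 mod 8 for odd n, so 3 bits are already correct). All Newton steps up to
32 bits keep their result in the half type; only the last one is widened.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for mp_limb_t and UHWtype on a 64-bit machine. */
typedef uint64_t limb_t;
typedef uint32_t half_t;

/* Inverse of an odd n modulo 2^64 by Newton iteration.
   Each step x -> 2*x - x*x*n doubles the number of correct low bits. */
static limb_t binvert_sketch(limb_t n)
{
    half_t h;

    assert((n & 1) == 1);

    h = (half_t)n;                              /*  3 bits */
    h = (half_t)(2 * h - h * h * n);            /*  6 bits */
    h = (half_t)(2 * h - h * h * n);            /* 12 bits */
    h = (half_t)(2 * h - h * h * n);            /* 24 bits */
    h = (half_t)(2 * h - h * h * n);            /* 32 bits (48 truncated) */

    /* Only the final step needs full-width products. */
    return 2 * (limb_t)h - (limb_t)h * h * n;   /* 64 bits */
}
```

Truncating to the half type is safe because each intermediate value is
only needed modulo 2^32; multiplying the result by n should give 1 modulo
2^64 for any odd n.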
Ĝis,
m