binvert_limb speedup on 64 bit machines with UHWtype

Marco Bodrato bodrato at mail.dm.unipi.it
Tue Mar 1 10:59:38 CET 2022


Hi,

On 2022-02-27 16:52, Marco Bodrato wrote:
> On 2022-02-25 17:06, John Gatrell wrote:
>> I tested using UHWtype in the macro for binvert_limb. On a 64 bit machine
>> one of my programs gained a 3% speedup. On a 32 bit machine, there was no

> Should we use uint_fast8_t, uint_fast16_t, uint_fast32_t for the
> different levels, and let the compiler choose? :-D

I tried code with the uint_fast types, but the compiler does not seem to
choose a faster type: the 64-bit type is always used :-(
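
A quick way to check which widths a given platform assigns to these types is
sketched below. The helper names are hypothetical and only for illustration.
The widths are fixed by the platform's <stdint.h> rather than picked per use
by the compiler; on glibc for x86-64, as far as I know, uint_fast16_t and
uint_fast32_t are both 64 bits wide, which would match the observation above.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helpers: width in bits of each "fast" type on this platform.
   The C standard only guarantees at least 8, 16 and 32 bits respectively. */
static unsigned fast8_bits(void)
{
    return (unsigned) (sizeof(uint_fast8_t) * CHAR_BIT);
}

static unsigned fast16_bits(void)
{
    return (unsigned) (sizeof(uint_fast16_t) * CHAR_BIT);
}

static unsigned fast32_bits(void)
{
    return (unsigned) (sizeof(uint_fast32_t) * CHAR_BIT);
}
```

Printing the three values with printf("%u") on the target machine shows
whether the "fast" types can differ from the full limb there at all.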

You should also try storing the 32-bit result in the half type.

I mean: try replacing the following two lines in your code
     __inv = 2 * __hinv - __hinv * __hinv * __n;  /* 32 */ \
     __inv = 2 * __inv - __inv * __inv * __n;     /* 64 */ \
with
     __hinv = 2 * __hinv - __hinv * __hinv * __n;  /* 32 */ \
     __inv = 2 * (mp_limb_t)__hinv - (mp_limb_t)__hinv * __hinv * __n; /* 64 */ \
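
To make the idea concrete, here is a minimal standalone sketch, not the actual
GMP macro. The names limb_t, half_t and binvert_sketch are hypothetical, with
uint64_t and uint32_t standing in for mp_limb_t and UHWtype. It also assumes a
platform where int is 32 bits, skips GMP's lookup table and starts from
inv = n, which is correct to 3 bits for any odd n. Each Newton step doubles
the number of correct low bits; every step up to 32 bits stays in the half
type, and only the last one is done on full limbs.

```c
#include <stdint.h>

typedef uint64_t limb_t;   /* stand-in for mp_limb_t (assumption) */
typedef uint32_t half_t;   /* stand-in for UHWtype (assumption)   */

/* Inverse of an odd n modulo 2^64, by Newton iteration. */
static limb_t binvert_sketch(limb_t n)
{
    half_t hn = (half_t) n;               /* low half of n                  */
    half_t hinv = hn;                     /* n * n == 1 mod 8: 3 bits       */

    hinv = 2 * hinv - hinv * hinv * hn;   /*  6 bits                        */
    hinv = 2 * hinv - hinv * hinv * hn;   /* 12 bits                        */
    hinv = 2 * hinv - hinv * hinv * hn;   /* 24 bits                        */
    hinv = 2 * hinv - hinv * hinv * hn;   /* 32 bits (48, cut to half type) */

    /* Only the final step needs full-limb arithmetic. */
    return 2 * (limb_t) hinv - (limb_t) hinv * hinv * n;   /* 64 bits */
}
```

For any odd n the product binvert_sketch(n) * n, computed in limb_t, should
be 1. Whether the half-type steps are actually faster depends on the compiler
and the CPU, so it has to be measured.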

Bye,
m

