C implementation of mod_1_1
Niels Möller
nisse at lysator.liu.se
Wed Mar 2 14:12:33 CET 2011
Torbjorn Granlund <tg at gmplib.org> writes:
> Your mod_1 family improvements have moved MOD_1N_TO_MOD_1_1_THRESHOLD
> and MOD_1U_TO_MOD_1_1_THRESHOLD down by a few notches, making the
> average values 4 and 3, respectively.
Nice, but which changes do you think do this? I've hacked on the x86 and
x86_64 assembler, and the udiv_(q?)rnnd_preinv macros.
The new algorithm in mod_1_1.c is, as far as I'm aware, not enabled
anywhere (MOD_1_1P_METHOD is always 1).
BTW, the mod_1_1 tuning needs some further updates. I think it should be
like this:
* Determine the best MOD_1_1P_METHOD (done by tuneup, but then never
  used for anything). Currently uses measurements on 10-limb inputs.
* Choose which mod_1_1p should be used. Always use the native version if
  it exists (but maybe measure it and display a warning if it's slower
  than the method 1 or method 2 C implementations). If there's no
  native implementation, select the best of method 1 and method 2.
  Record the selection as a function pointer.
* Use that pointer when tuning all the other mod_1_1-related thresholds.
Is that right?
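A rough sketch of the selection step, to make the idea concrete. This is
not tuneup code: limb_t, struct candidate, select_mod_1_1p and the
placeholder mod_ref are names made up for illustration, the real mod_1_1p
prototypes differ, and the timings are assumed to have been measured
elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; GMP's real limb type and mod_1_1p prototypes differ. */
typedef uint32_t limb_t;
typedef limb_t (*mod_1_1p_fn)(const limb_t *ap, size_t n, limb_t b);

/* Placeholder reduction (Horner's rule, most significant limb first).
   It only gives the function pointers something real to point at. */
static limb_t mod_ref(const limb_t *ap, size_t n, limb_t b)
{
    uint64_t r = 0;
    assert(b != 0);
    while (n-- > 0)
        r = ((r << 32) | ap[n]) % b;   /* r < b < 2^32, so this fits in 64 bits */
    return (limb_t) r;
}

/* Stand-ins for the two C methods; here both just call the placeholder. */
static limb_t mod_1_1p_method1(const limb_t *ap, size_t n, limb_t b)
{
    return mod_ref(ap, n, b);
}

static limb_t mod_1_1p_method2(const limb_t *ap, size_t n, limb_t b)
{
    return mod_ref(ap, n, b);
}

struct candidate {
    mod_1_1p_fn fn;
    double time;      /* measured cost, e.g. on 10-limb inputs */
    int is_native;    /* nonzero for an assembly implementation */
};

/* Prefer the native candidate if there is one; otherwise take the fastest
   C method. *warn_slow_native is set to 1 when a C method beat the native
   code, so the caller can print a warning, and to 0 otherwise. */
static mod_1_1p_fn select_mod_1_1p(const struct candidate *c, size_t count,
                                   int *warn_slow_native)
{
    const struct candidate *native = NULL;
    const struct candidate *best_c = NULL;
    size_t i;

    for (i = 0; i < count; i++) {
        if (c[i].is_native) {
            if (native == NULL)
                native = &c[i];
        } else if (best_c == NULL || c[i].time < best_c->time) {
            best_c = &c[i];
        }
    }

    *warn_slow_native = (native != NULL && best_c != NULL
                         && best_c->time < native->time);
    if (native != NULL)
        return native->fn;
    return best_c != NULL ? best_c->fn : NULL;
}
```

The returned pointer is what the later threshold measurements would call,
so they time the same mod_1_1p variant that ends up being used.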
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.