C implementation of mod_1_1

Niels Möller nisse at lysator.liu.se
Wed Mar 2 14:12:33 CET 2011

Torbjorn Granlund <tg at gmplib.org> writes:

> Your mod_1 family improvements have moved MOD_1N_TO_MOD_1_1_THRESHOLD
> and MOD_1U_TO_MOD_1_1_THRESHOLD down by a few notches, making the
> average values 4 and 3, respectively.

Nice, but which changes do you think do this? I've hacked on the x86 and
x86_64 assembler, and the udiv_(q?)rnnd_preinv macros.

The new algorithm in mod_1_1.c is, as far as I'm aware, not enabled on
anything (MOD_1_1P_METHOD always 1).

BTW, the mod_1_1 tuning needs some further updates. I think it should be
like this:

* Determine the best MOD_1_1P_METHOD (done by tuneup, but then never
  used for anything). tuneup currently bases this on measurements of
  10-limb inputs.

* Choose which mod_1_1p should be used. Always use the native version if
  it exists (but maybe measure it and display a warning if it's slower
  than the method 1 or method 2 C implementation). If there's no
  native implementation, select the better of method 1 and method 2.

  Record the selection as a function pointer.

* Use that pointer when tuning all the other mod_1_1-related thresholds.

Is that right?
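
A minimal sketch of the selection step proposed above, to make the
function-pointer idea concrete. Everything in it is an assumption made
for illustration: the names mod_1_fn, mod_1_method1, mod_1_method2,
time_candidate and select_mod_1 are hypothetical, the two "methods" are
simple stand-ins on 32-bit limbs rather than the algorithms in
mod_1_1.c, and the timing uses plain clock() rather than tuneup's
measurement code. The optional warning for a slow native version is
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature: remainder of the n-limb number ap[0..n-1]
   (least significant limb first, 32-bit limbs) divided by b. */
typedef uint32_t (*mod_1_fn)(const uint32_t *ap, size_t n, uint32_t b);

/* Stand-in for "method 1": shift in one limb at a time and reduce. */
static uint32_t mod_1_method1(const uint32_t *ap, size_t n, uint32_t b)
{
    uint64_t r = 0;
    assert(b != 0);
    for (size_t i = n; i-- > 0;)
        r = ((r << 32) | ap[i]) % b;   /* r < b < 2^32, so no overflow */
    return (uint32_t)r;
}

/* Stand-in for "method 2": Horner's rule with a precomputed 2^32 mod b. */
static uint32_t mod_1_method2(const uint32_t *ap, size_t n, uint32_t b)
{
    uint64_t r = 0;
    assert(b != 0);
    uint64_t bmodb = (UINT64_C(1) << 32) % b;
    for (size_t i = n; i-- > 0;)
        r = (r * bmodb + ap[i]) % b;   /* product + limb stays below 2^64 */
    return (uint32_t)r;
}

/* Time reps calls of one candidate; returns elapsed CPU seconds. */
static double time_candidate(mod_1_fn f, const uint32_t *ap, size_t n,
                             uint32_t b, unsigned reps)
{
    volatile uint32_t sink = 0;        /* keeps the calls from being elided */
    clock_t t0 = clock();
    for (unsigned i = 0; i < reps; i++)
        sink ^= f(ap, n, b);
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

/* Record the selection as a function pointer: a native version wins
   unconditionally if present (pass NULL when there is none); otherwise
   the faster of the two C candidates on the given input is returned. */
static mod_1_fn select_mod_1(mod_1_fn native, const uint32_t *ap, size_t n,
                             uint32_t b)
{
    if (native != NULL)
        return native;
    double t1 = time_candidate(mod_1_method1, ap, n, b, 100000);
    double t2 = time_candidate(mod_1_method2, ap, n, b, 100000);
    return t1 <= t2 ? mod_1_method1 : mod_1_method2;
}
```

The pointer returned by select_mod_1 would then be the function that
the later threshold measurements call, so that those thresholds are
tuned against the variant that will actually be used.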


Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
