taku2ro at gmail.com
Sat Oct 29 03:34:11 CEST 2011
I am using GMP for computing modular exponentiation by repeated
squaring (since the power are always a power of 2 and the modulus has
such a good property that a special way of computing remainder can be
used, I'm doing it by myself instead of using mpz_powm().) The modulus
is bigger than MUL_FFT_THRESHOLD and the time needed by multiplication
accounts for almost all of the running time of my application.
I have an Intel Core 2 Quad processor, and would like my application to
scale on it. Since I don't know how to introduce parallelism in the
repeated squaring part (if you do, please let me know), I am planning
to parallelize in the library level by OpenMP. I don't know much about
how the algorithms used are combined (it seems more complicated than the
manual describes), but firstly I added `#pragma omp for' line at the
beginning of the for loop in the base case of mpn_fft_mul_modF_K().
The running time almost halved, but since there are four cores, I would
like the application to run more faster.
Do you know other good places to put OpenMP directives?
Kentaro Yamamoto <call at taku2ro.dyndns.tv>
More information about the gmp-discuss