[PATCH v3 0/4] Add addmul_1, addmul_2, and mul_basecase for IBM z13 and later
Torbjörn Granlund
tg at gmplib.org
Fri Jun 23 11:05:56 CEST 2023
These improvements are now (finally!) in the GMP repo.
I have not run any timing tests, as I trust you to worry about the
performance.
A mistake we GMP developers have made in the past is counting cycles for
inner loops for quite large trip counts, and then accidentally adding
overhead as a side effect of beating down the cycle count. One should
never forget that most bignum computations probably use only moderately
large numbers, which means that decreasing overhead matters.
Running commands like
  tune/speed -p10000000 -C -s1-100 mpn_mul_basecase
  tune/speed -p10000000 -C -s1-100 mpn_addmul_1.0xcafecafecafecafe
are helpful.
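
The trade-off described above can be sketched with a toy cost model,
cycles(n) = overhead + per_limb * n. This is only an illustration, not a
measurement: the two variants, their names, and every cycle figure below
are hypothetical and do not describe any actual GMP routine.

```python
# Toy cost model: cycles(n) = overhead + per_limb * n.
# All figures are hypothetical, chosen only to illustrate the trade-off.

VARIANTS = {
    # Faster inner loop, bought with more setup/teardown work per call.
    "tight_loop": {"overhead": 30.0, "per_limb": 2.0},
    # Slower inner loop, but cheaper to enter and leave.
    "low_overhead": {"overhead": 10.0, "per_limb": 2.5},
}


def total_cycles(n, overhead, per_limb):
    """Modelled cycle count for one call on an n-limb operand."""
    return overhead + per_limb * n


def faster(n):
    """Name of the variant with the lower modelled cost at size n.

    On a tie, the variant listed first in VARIANTS is returned.
    """
    return min(VARIANTS, key=lambda name: total_cycles(n, **VARIANTS[name]))


if __name__ == "__main__":
    for n in (1, 5, 10, 40, 100, 1000):
        costs = {name: total_cycles(n, **p) for name, p in VARIANTS.items()}
        print(n, costs, faster(n))
```

With these made-up figures the variant with the slower inner loop wins
for every size below 40 limbs, which is why timing small sizes, such as
the 1-100 range in the commands above, matters as much as the
large-size cycle count.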
--
Torbjörn
Please encrypt, key id 0xC8601622