The new code is faster on most x86-64 machines; see gmplib.org/devel/asm.html. I suppose we should replace generic/mod_1_1.c? Have you looked into a mod_1_2 using the same ideas? Perhaps it will be tricky to make that as fast as possible without further restricting the divisor range? -- Torbjörn