Division by 9 (and other factors of 2^60 - 1)
Torbjorn Granlund
tg at gmplib.org
Tue Oct 13 10:08:43 CEST 2009
nisse at lysator.liu.se (Niels Möller) writes:
nisse at lysator.liu.se (Niels Möller) writes:
> Here's' some code for dividing using multiplication + Hensel-division
> by 2^60 - 1.
I've tried some timing now, running on shell.gmplib.org (currently a
core 2). The following variant,
Actually a core i7. (This is relevant since umul_ppmm alias the mul asm
instruction runs with a very different timing in core 2 and core i7.
The latter has a 10 cycle latency for the upper word, whereas the former
has a latency of just 8 cycles. On the other hand, i7 has better
throughput.)
(Niels: repentium behind shell is a core 2.)
Repeated divexact_by3 seems to be the most efficient way, so far, to
divide by 9 on x86_64. It might be fun to try to do two iterations of
bdiv_dbm1c as a single loop. That would be able to handle division by
9, 15 and 25.
Two-iteration bdiv_dbm1c is probably the way to go on AMD chips.
--
Torbjörn
More information about the gmp-devel
mailing list