Division by 9 (and other factors of 2^60 - 1)

Torbjorn Granlund tg at gmplib.org
Tue Oct 13 10:08:43 CEST 2009


nisse at lysator.liu.se (Niels Möller) writes:

  nisse at lysator.liu.se (Niels Möller) writes:
  
  > Here's' some code for dividing using multiplication + Hensel-division
  > by 2^60 - 1.
  
  I've tried some timing now, running on shell.gmplib.org (currently a
  core 2). The following variant,
  
Actually a core i7.  (This is relevant since umul_ppmm alias the mul asm
instruction runs with a very different timing in core 2 and core i7.
The latter has a 10 cycle latency for the upper word, whereas the former
has a latency of just 8 cycles.  On the other hand, i7 has better
throughput.)

(Niels: repentium behind shell is a core 2.)

  Repeated divexact_by3 seems to be the most efficient way, so far, to
  divide by 9 on x86_64. It might be fun to try to do two iterations of
  bdiv_dbm1c as a single loop. That would be able to handle division by
  9, 15 and 25.
  
Two-iteration bdiv_dbm1c is probably the way to go on AMD chips.

-- 
Torbjörn


More information about the gmp-devel mailing list