speed of mpq_cmp vs mpq_sub

Mon May 23 05:58:52 UTC 2016

[Re-sending, a paragraph was in the wrng place, making things
confusing.]

Marc Glisse <marc.glisse at inria.fr> writes:

  Note that after the multiplications by 3 and 5 (or 101 and 103), the
  highparts are different. Or even simpler if den2 = 2 * den1. So this
  would only apply in a few cases.

Right.  My proposal seems rather dense after some sleep.

Perhaps we could modify the code to:

 1. Check for equal denominators.

    The expected runtime for this assuming random operands is O(1).  The
    worst-case runtime is O(n) where n is the total size of the
    operands.  (When the optimisation triggers, the runtime is O(m)
    ... O(n) where m is the total size of the denominators.)

 2. Perform an initial cross-product just using the topmost bits of all
    operands.  Here we could either extract up to GMP_NUMB_BITS/2 bits
    and use plain *, or extract up to GMP_NUMB_BITS and rely on
    umul_ppmm.

    We can use the bit counting code already present to locate the bits
    which we need to extract.

    To make the tests as tight as possible we could use some simple
    interval arithmetic (as opposed to just applying a safety margin).

    This is O(1) but will need perhaps 20 cycles.

To avoid adding overhead for small operands, we could suppress the
trickery for expected products of (say) less than 3 limbs.

-- 
Torbjörn
Please encrypt, key id 0xC8601622