# speed of mpq_cmp vs mpq_sub

Marc Glisse marc.glisse at inria.fr
Mon May 23 07:44:14 UTC 2016

```On Mon, 23 May 2016, Torbjörn Granlund wrote:

> Marc Glisse <marc.glisse at inria.fr> writes:
>
>  Note that after the multiplications by 3 and 5 (or 101 and 103), the
>  highparts are different. Or even simpler if den2 = 2 * den1. So this
>  would only apply in a few cases.
>
> Right.  My proposal seems rather dense after some sleep.
>
> Perhaps we could modify the code to:
>
> 1. Check for equal denominators.
>
>    The expected runtime for this assuming random operands is O(1).  The
>    worst-case runtime is O(n) where n is the total size of the
>    operands.  (When the optimisation triggers, the runtime is O(m)
>    ... O(n) where m is the total size of the denominators.)
>
> 2. Perform an initial cross-product just using the topmost bits of all
>    operands.  Here we could either extract up to GMP_NUMB_BITS/2 bits
>    and use plain *, or extract up to GMP_NUMB_BITS and rely on
>    umul_ppmm.
>
>    We can use the bit counting code already present to locate the bits
>    which we need to extract.
>
>    To make the tests as tight as possible we could use some simple
>    interval arithmetic (as opposed to just applying a safety margin).
>
>    This is O(1) but will need perhaps 20 cycles.
>
> To avoid adding overhead for small operands, we could suppress the
> trickery for expected products of (say) less than 3 limbs.

Just checking: I don't think this will help when comparing p/q with
(2p+1)/2q. Anything that doesn't somehow involve the gcd of the
denominators will still require the full multiplications and be much
slower than a subtraction for that case. But that's probably inevitable,
since computing the gcd would likely be slower in many cases.

(I think in the program I posted the denominators are all powers of 2,
which helps for the gcd computation...)

Now your proposition may still help in other cases. The initial
cross-product / interval arithmetic thing sounds like it would handle
quite a few more cases than the current bitlength tests. In particular,
the *101 / *103 should become fast :-)

--
Marc Glisse
```