[Request for comments] Potential room for speedup when calculating divmod() of bases with many trailing 0 bits (in binary)

Marco Bodrato bodrato at mail.dm.unipi.it
Mon Nov 1 10:39:23 UTC 2021


Hi,

On 2020-09-21 17:30, Marco Bodrato wrote:
> On 2020-09-14 18:50, Shlomi Fish wrote:
>> I was able to improve upon mpz_mod here:

>> This is in case the number that one divides by is itself divisible by
>> a large power of 2.
> 
> There are many special forms for the divisors that can stimulate
> writing special code to speed up the modulus computation. The form
> k*2^n is one of them.

>> Is there interest in incorporating such a change to the core GMP 
>> library?

> By the way, if you want to play with this, try the below patch for
> GMP, recompile it, and test your "benchmark" again.

I slightly reworked the patch. Should we apply it?

It accelerates the functions mpz_?div_{qr,r} when the denominator has 
low zero limbs.

In the general case, the check should cost a single branch (does UNLIKELY
make it well predicted?), and only a couple of additional pointer and size
adjustments are needed: rp + n0 and dl - n0.

We never special-case such a simple form in the mpn layer, but it may
make sense to do so in the mpz layer. We already handle low zero limbs
in mpz_powm...

Comments?

diff -r a9440b272ec5 mpz/tdiv_qr.c
--- a/mpz/tdiv_qr.c     Sun Oct 31 14:59:02 2021 +0100
+++ b/mpz/tdiv_qr.c     Mon Nov 01 11:15:08 2021 +0100
@@ -36,7 +36,7 @@
  void
  mpz_tdiv_qr (mpz_ptr quot, mpz_ptr rem, mpz_srcptr num, mpz_srcptr den)
  {
-  mp_size_t ql;
+  mp_size_t ql, n0;
    mp_size_t ns, ds, nl, dl;
    mp_ptr np, dp, qp, rp;
    TMP_DECL;
@@ -95,7 +95,12 @@
        np = tp;
      }

-  mpn_tdiv_qr (qp, rp, 0L, np, nl, dp, dl);
+  for (n0 = 0; UNLIKELY (*dp == 0); ++dp)
+    {
+      rp [n0++] = *np++;
+      --nl;
+    }
+  mpn_tdiv_qr (qp, rp + n0, 0L, np, nl, dp, dl - n0);

    ql -=  qp[ql - 1] == 0;
    MPN_NORMALIZE (rp, dl);
diff -r a9440b272ec5 mpz/tdiv_r.c
--- a/mpz/tdiv_r.c      Sun Oct 31 14:59:02 2021 +0100
+++ b/mpz/tdiv_r.c      Mon Nov 01 11:15:08 2021 +0100
@@ -35,7 +35,7 @@
  void
  mpz_tdiv_r (mpz_ptr rem, mpz_srcptr num, mpz_srcptr den)
  {
-  mp_size_t ql;
+  mp_size_t ql, n0;
    mp_size_t ns, nl, dl;
    mp_ptr np, dp, qp, rp;
    TMP_DECL;
@@ -88,7 +88,12 @@
        np = tp;
      }

-  mpn_tdiv_qr (qp, rp, 0L, np, nl, dp, dl);
+  for (n0 = 0; UNLIKELY (*dp == 0); ++dp)
+    {
+      rp [n0++] = *np++;
+      --nl;
+    }
+  mpn_tdiv_qr (qp, rp + n0, 0L, np, nl, dp, dl - n0);

    MPN_NORMALIZE (rp, dl);

Bye,
m

-- 
http://bodrato.it/papers/

