[Request for comments] Potential room for speedup when calculating divmod() of bases with many trailing 0 bits (in binary)
Marco Bodrato
bodrato at mail.dm.unipi.it
Mon Nov 1 10:39:23 UTC 2021
Ciao,
Il 2020-09-21 17:30 Marco Bodrato ha scritto:
> Il 2020-09-14 18:50 Shlomi Fish ha scritto:
>> I was able to improve upon mpz_mod here:
>> This is in case the number that one divides by is itself divisible by
>> a
>> large power of 2.
>
> There are many special forms for the divisors that can stimulate
> writing special code to speed up the modulus computation. The form
> k*2^n is one of them.
>> Is there interest in incorporating such a change to the core GMP
>> library?
> By the way, if you want to play with this, try the below patch for
> GMP, recompile it, and test your "benchmark" again.
I slightly reworked the patch. Should we apply it?
It accelerates the functions mpz_?div_{qr,r} when the denominator has
low zero limbs.
In the general case, checking should cost a single (does UNLIKELY make
it well predicted?) branch, and just a couple of additional operations
on pointers are needed: rp + n0, and dl - n0.
We never specially handle such a simple special case with the mpn layer,
but it may make sense for the mpz layer. We already handle low zero
limbs in mpz_powm...
Comments?
diff -r a9440b272ec5 mpz/tdiv_qr.c
--- a/mpz/tdiv_qr.c Sun Oct 31 14:59:02 2021 +0100
+++ b/mpz/tdiv_qr.c Mon Nov 01 11:15:08 2021 +0100
@@ -36,7 +36,7 @@
void
mpz_tdiv_qr (mpz_ptr quot, mpz_ptr rem, mpz_srcptr num, mpz_srcptr den)
{
- mp_size_t ql;
+ mp_size_t ql, n0;
mp_size_t ns, ds, nl, dl;
mp_ptr np, dp, qp, rp;
TMP_DECL;
@@ -95,7 +95,12 @@
np = tp;
}
- mpn_tdiv_qr (qp, rp, 0L, np, nl, dp, dl);
+ for (n0 = 0; UNLIKELY (*dp == 0); ++dp)
+ {
+ rp [n0++] = *np++;
+ --nl;
+ }
+ mpn_tdiv_qr (qp, rp + n0, 0L, np, nl, dp, dl - n0);
ql -= qp[ql - 1] == 0;
MPN_NORMALIZE (rp, dl);
diff -r a9440b272ec5 mpz/tdiv_r.c
--- a/mpz/tdiv_r.c Sun Oct 31 14:59:02 2021 +0100
+++ b/mpz/tdiv_r.c Mon Nov 01 11:15:08 2021 +0100
@@ -35,7 +35,7 @@
void
mpz_tdiv_r (mpz_ptr rem, mpz_srcptr num, mpz_srcptr den)
{
- mp_size_t ql;
+ mp_size_t ql, n0;
mp_size_t ns, nl, dl;
mp_ptr np, dp, qp, rp;
TMP_DECL;
@@ -88,7 +88,12 @@
np = tp;
}
- mpn_tdiv_qr (qp, rp, 0L, np, nl, dp, dl);
+ for (n0 = 0; UNLIKELY (*dp == 0); ++dp)
+ {
+ rp [n0++] = *np++;
+ --nl;
+ }
+ mpn_tdiv_qr (qp, rp + n0, 0L, np, nl, dp, dl - n0);
MPN_NORMALIZE (rp, dl);
Ĝis,
m
--
http://bodrato.it/papers/
More information about the gmp-devel
mailing list