mpz_prevprime

Tue Mar 24 17:54:00 UTC 2020

On Tue, Mar 24, 2020 at 9:56 AM Marco Bodrato <bodrato at mail.dm.unipi.it>
wrote:

> Il 2020-03-24 01:10 Seth Troisi ha scritto:
> > (I'm don't think your ulong patch is needed but I can measure at a
>
> It's a small patch, but the speed-up it gives is probably too small to
> be worth adding.
>
> > My code doesn't use a sieve (and instead checks each number for
> > divisors similar to the old code) because the average gap for small
> > numbers is small 300000 / primepi(300000) = 11
>
> There is a small error in the code, it does not handle negative numbers
> correctly.
>
I'll write up a patch for add tests for negative numbers tonight

> For each divisor your code uses two operations, a product (prime*prime)
> and a remainder (t % prime). On many architectures computing t/prime and
> t%prime cost just one operation... that's why I'd use the ideas Torbjorn
> used in isprime() in dumbmp.c,
> https://gmplib.org/repo/gmp/rev/7524222ab7d4 currently in bootstrap.c .
>
> I propose a variation of your patch, you find it attached, on my
> computer it's faster.
>
Couple of small notes but otherwise this looks good

+/* NP_SMALL_LIMIT = prevprime (LAST_PRIME ^ 2) */
In the light of the day this can be bumped slightly to
prevprime(nextprime(LAST_PRIME)^2) - 1 = 316960

+  /* Technically this should be prev_prime(LAST_PRIME ^ 2) */
I'd remove this, It's covered by the comment above

+  if (q < prime)
+    return t;
+  if (r == 0)
+    break;

Can this be rearranged to

 +   if (r == 0)
 +      break;
>+   if (q <= prime)
 +      return t;

+  ASSERT (i < NUMBER_OF_PRIMES);
Should this be placed at the start of the loop or be?
+  ASSERT (i+1 < NUMBER_OF_PRIMES);

> > Because the measured threshold for this was much larger (~80 million),
> > it might actually make sense to use a sieve after some threshold
>
> > If you think that's a 1.5-3x speedup for inputs 16-30bits is important
> > I can try to dynamically generate a longer list of primes
>
> If writing the code is not too complex, it may be interesting to test if
> it is worth.
>
I'll try it out tonight

>
> > On Mon, Mar 23, 2020 at 2:38 PM Marco Bodrato
> > <bodrato at mail.dm.unipi.it> wrote:
>
> >>> It might also be useful to commit tune_nextprime_small which
> >>> dynamically selects a higher reps count for
> >>> mpz_nextprime input and helps increase precision.
> >>
> >> Uhm...
> >> j = 200000 / s->size;
>
> > This is a great point, I modified the code so It only changes the
> > outer loop count.
> > I included a screenshot showing how much this reduces variance over
> > multiple runs.
>
> Did you try tweaking with the -p parameter? It can be useful when a
> measure is "noisy".
>
Nope I had not, using -p3 seems to work just as well.
Thanks :)

>
> >> PS: a comment in the current code says
> >> /* Larger threshold is faster but takes ~O(n/ln(n)) memory.
> >> To be honest, the array sieve needs n/24 bytes, and the array
>
> > You are correct, let's change the comment by which I mean you :)
>
> Done.
>
> Ĝis,
> m
>
> --
> http://bodrato.it/papers/