bodrato at mail.dm.unipi.it
Tue Feb 11 22:13:01 UTC 2020
Il 2020-02-10 21:01 nisse at lysator.liu.se ha scritto:
> On my laptop, it gives a speedup of about 25% for larger sizes. Not
> how to tune for small sizes, but clearly the old code clamping the size
> of the prime table based on the bit size is better than doing nothing.
> The computation of all the moduli could be speed up by using ptab, and
> by using precomputed reciprocals. Not clear to me how much the speed of
> that part matters. I haven't run any profiling.
How often numbers pass the sieving if we use trial division up to
Approximately 1/(ln(primelimit)*e^g), where g is the Eulero-Mascheroni
Assume, we had primelimit = e^n and we increase it to e^(n+1).
The number of primes approximately was e^n/n, now it is e^(n+1)/(n+1).
1/(n*e^g) passed trial division, now only 1/((n+1)*e^g) pass it.
For the numbers that did not pass trial division, nothing changes.
For the numbers that did pass trial division, we add e^(n+1)/(n+1)-e^n/n
tests = e^n*(e-1-1/n)/(n+1). Can we assume a constant cost for each one
of this tests?
But for one every (n+1) of them, we can avoid any further test (they are
detected by the new trial divisions). Can we assume that the cost for
the additional test basically is the cost of Miller-Rabin? Let's call
this cost MR(x).
So, increasing primelimit from e^n to e^(n+1), we pay
e^n*(e-1-1/n)/(n+1) and we gain MR(x)/(n+1). It is worth doing if
primelimit (e^n) is comparable with the cost MR(x). But the cost of
Miller-Rabin is more than quadratic on the bit-size of the number x,
Maybe my estimates are wrong. If they are not, the limit for trial
division should be increased more than linearly on the size of the
numbers that are tested.
A possible error in my analysis: I assumed a "constant cost" for each
new prime. That's true when updating the residues, but each new number
in the initial loop on _tdiv_ui imply a cost that is linear on the
bit-size of the number x.
This observation gives again an almost linear relation between
primelimit and the bit-size of the number... and says that the speed of
the "update residues" loop is not so relevant, at least for large
numbers; but the speed of the initial moduli probably IS relevant.
More information about the gmp-devel