mpz_perfect_power companion

Mon Aug 19 17:39:21 UTC 2019

Marco pointed out that I incorrectly used 2 << 16 + 1, instead of 1 <<
16 +1.

Now I can see the difference! About 20% faster for sizes below 5000.
paste *.data | pr -t -e20 | tail -n +3 | awk '{ print $0, $4/$2 }'

On Sat, Aug 17, 2019 at 4:02 PM Seth Troisi <braintwo at gmail.com> wrote:

>
>
> On Fri, Aug 16, 2019 at 7:50 AM Marco Bodrato <bodrato at mail.dm.unipi.it>
> wrote:
>
>> Ciao,
>>
>> Il Mar, 26 Marzo 2019 10:09 pm, Seth Troisi ha scritto:
>> > Add companion for mpz_perfect_power_p that returns the exponent and
>> > optionally the root. Possible name mpz_rootexact.
>>
>> > I have experience using gmp and writing C++ code but not a lot of
>> > experience writing clean performant C code. It would be nice to confirm
>> > that the current maintainers have time to review (and help improve)
>> > patches that I'd be writing.
>>
>> Now, that some work started on gcd, I think that the release of a new GMP
>> version will not happen in short times. So yes, I think I'll have the time
>> to review your code!
>>
>> > I thought a good first step would be to add support for the two related
>> > functions (mpz_perfect_power_p, mpz_perfect_square_p) in tune/speed
>> which
>> > I already started on.
>>
>> I just pushed two optimisations that should impact on mpz_perfect_power_p:
>>
>> https://gmplib.org/repo/gmp/rev/042ea5bf2ed1
>> Changed brootinv.c to always exploit the fact that all exponentiations use
>> an even exponent, so that they can be computed as (x^2)^(e/2).
>>
>> https://gmplib.org/repo/gmp/rev/d863c8b31626
>> Some copying around of values is avoided with a flipflop that tracks where
>> the value is and should be... Moreover unneeded initialisations are
>> skipped for small exponents.
>>
>> I'll tried to measured the impact of adding these two patches but it
> seems to not
> make a visual difference (see attached gnuplot)
>
>
>> > https://github.com/sethtroisi/libgmp/pull/1/files
>> >
>> https://patch-diff.githubusercontent.com/raw/sethtroisi/libgmp/pull/1.patch
>>
>> I believe that you measure functions should be able to detect if the two
>> patches above are effective in speeding-up mpz_perfect_power_p.
>>
>> A couple of comments about them:
>>
>> in both SPEED_ROUTINE_MPN_PERFECT_{POWER,SQUARE} you use:
>> s->size * GMP_LIMB_BYTES
>> but I think you are using the wrong constant. You need to estimate the
>> exponent e such that p^e has a given number of limbs, right?
>>
>> The number of bits stored in each limb is GMP_NUMB_BITS. That's why I'd
>> suggest:
>> s->size * GMP_NUMB_BITS
>> instead.
>>
>> Yes that seems more correct. Updated (see
> https://github.com/sethtroisi/libgmp/pull/1)
>
>>
>> Maybe it's better to use mpz_ui_pow_ui and mpz_mul_ui, to avoid storing
>> small numbers in mpz_t variables, if not needed?
>>
>>
>> in SPEED_ROUTINE_MPN_PERFECT_POWER, maybe using
>> p = 2^16+1; r = 2^17 - 1; and log2(r) = 17, can be more clean than using
>> nextprime on fixed constants...
>>
>> This is a good idea, I've adapted it into my patch
>
>
>> For the same reason, I'd suggest the same r for
>> SPEED_ROUTINE_MPN_PERFECT_SQUARE too. But, why not simply squaring a
>> random number of half the size?
>>
>> After I made this change I occasionally see the timing measures a very
> fast value, I
> assume this is because some short circuiting logic allows for faster
> results, is this
> expected? something I should investigate? See 2nd attachment for an
> example.
>
>>
>> Of course in both cases you may need some special handling for small sizes
>> on CPUs with small bit sizes (i.e. when you get power <= 1).
>>
>> I don't understand this fully. I generally passing -s larger than ~200
> but it seems to work
> with any s (it might be measuring the timing of a slightly larger input
> and measuring
> the non-asymptotic behavior).
>
>>
>> In general, a couple of functions more that can be timed is always a nice
>> enhancement for the library...
>>
>>
>> Ĝis,
>> m
>>
>> --
>> http://bodrato.it/papers/
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perfect_combined_compare.png
Type: image/png
Size: 23504 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20190819/9f87bf15/attachment-0001.png>