[Request for comments] Potential room for speedup when calculating divmod() of bases with many trailing 0 bits (in binary)
Shlomi Fish
shlomif at gmail.com
Wed Sep 23 06:13:35 UTC 2020
Hi Marco!
On Wed, Sep 23, 2020 at 12:30 AM Marco Bodrato <bodrato at mail.dm.unipi.it> wrote:
>
> Ciao,
>
> Il 2020-09-22 14:18 Shlomi Fish ha scritto:
> > linking with that gnump made wo-shift faster than with-shift with the
> > original gnump:
>
> You mean that your implementation of the trick was not needed any more,
> right?
>
Yes. gnump with your patch applied running the normal mpz_fdiv_qr code
was faster than the unpatched gnump running my shift-divmod wrapping
code.
> > It is still slower than pypy3 running the python code:
>
> Well, maybe there is too much overhead in the benchmark you prepared. I
> glanced at it and saw a lot of _init/_clear, assert, a new type, a
> scratch_type... just for computing some divisions?
I was trying to implement a reusable API in the .h file. The .c
benchmark is somewhat cluttered due to the USE_SHIFT define.
> Or maybe the numbers are not big enough to let GMP really enter in his
> range of usefulness... (if most of the test involves e.g. 32-bits
> numbers, then GMP may be far slower than some other implementations).
>
I think most of them are above 32-bits (given i do `newval = val*val`
at every stage).
> You should try to write more concise benchmarks (someone also asked for
> more concise e-mails :-)
>
I'll see what I can do.
--
Shlomi Fish https://www.shlomifish.org/
Buddha has the Chuck Norris nature.
Please reply to list if it's a mailing list post - http://shlom.in/reply .
More information about the gmp-discuss
mailing list