Two 10-lines functions
bodrato at mail.dm.unipi.it
bodrato at mail.dm.unipi.it
Wed Feb 3 16:19:32 CET 2010
Ciao Niels,
> That multiplication should be less than 2^-9 of the total work (I get
Simple but effective estimation, I agree.
> enabling deeper recursion does not justify rounding up the size more
> than a few limbs (say, to a multiple of 8 or 16 or so). Beyond this,
> rounding up is justified by the need to get an appropriate size for the
> largest fft call.
But there are some artefact that I can not understand on timings of mpn_mul.
Yesterday, on shell, I did run:
$ tune/speed -s95000-260000 -t327 -Plargemul mpn_mul
I attach the graph I obtained.
There are steps, and they are expected... but the line between steps is
similar to a saw, and this sounds strange to me!
I thought the reason could be the non-monotonic behaviour of the function
mpn_fft_best_k.
I mean: if the value of k decreases, bnm1_next_size will round to a
smaller power-of-2, resulting in less recursions and reduced freedom for
the non-largest fft call.
I tested the very simple patch below. Results are a very little bit
smoother, but sometimes worst than without the patch (as you can see from
the attached graph). This is not the right direction.
Today I was going back in unanswered mail and I've found the _2expm1 code
and I was tempted to test it... but you are right also that direction
should not give measurable differences for this sizes.
Any other idea?
Timings for mpn_mul should be almost monotonic, right?
Regards,
Marco
--
http://bodrato.it/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: large_mul.png
Type: image/png
Size: 5500 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20100203/1aaca17e/attachment.png>
More information about the gmp-devel
mailing list