Two 10-lines functions

bodrato at mail.dm.unipi.it bodrato at mail.dm.unipi.it
Wed Feb 3 16:19:32 CET 2010


Ciao Niels,

> That multiplication should be less than 2^-9 of the total work (I get

Simple but effective estimation, I agree.

> enabling deeper recursion does not justify rounding up the size more
> than a few limbs (say, to a multiple of 8 or 16 or so). Beyond this,
> rounding up is justified by the need to get an appropriate size for the
> largest fft call.

But there are some artefact that I can not understand on timings of mpn_mul.
Yesterday, on shell, I did run:
$ tune/speed -s95000-260000 -t327 -Plargemul mpn_mul

I attach the graph I obtained.

There are steps, and they are expected... but the line between steps is
similar to a saw, and this sounds strange to me!

I thought the reason could be the non-monotonic behaviour of the function
mpn_fft_best_k.
I mean: if the value of k decreases, bnm1_next_size will round to a
smaller power-of-2, resulting in less recursions and reduced freedom for
the non-largest fft call.
I tested the very simple patch below. Results are a very little bit
smoother, but sometimes worst than without the patch (as you can see from
the attached graph). This is not the right direction.

Today I was going back in unanswered mail and I've found the _2expm1 code
and I was tempted to test it... but you are right also that direction
should not give measurable differences for this sizes.

Any other idea?
Timings for mpn_mul should be almost monotonic, right?

Regards,
Marco

-- 
http://bodrato.it/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: large_mul.png
Type: image/png
Size: 5500 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20100203/1aaca17e/attachment.png>


More information about the gmp-devel mailing list