speed of unbalanced multiplication
bodrato at mail.dm.unipi.it
Fri Feb 8 20:29:18 CET 2013
Ciao,
On Fri, 8 February 2013 11:42 am, Torbjorn Granlund wrote:
> bodrato at mail.dm.unipi.it writes:
> I agree, but ... the only difference I could see on my netbook is not
> memory alignment, but "position".
>
> Was this reproduced on any non-Linux system? Perhaps Linux somehow
> messes up caching and/or TLB for certain address ranges?
The timings I posted on this list were measured on shell, a FreeBSD system.
I tested my patch on my netbook (atom-linux) and on shell (K10-fbsd). On both,
mul results were worse than mul_n results before the patch, and equivalent
after it.
Removing the (now pushed) patch, on shell I obtain:
$ tune/speed -o addrs -s 800000 mpn_mul_n
mpn_mul_n
dst 801E00040 src 800E00040 801600040 (cf sp approx 7FFFFFFFC37C)
800000 0.646598000
$ tune/speed -o addrs -s 800000 mpn_mul
mpn_mul
dst 801E00040 src 802C00040 801600040 (cf sp approx 7FFFFFFFC36C)
800000 0.680945000
Different addresses, different speed.
With the patch, again on shell, I get:
$ tune/speed -o addrs -s 800000 mpn_mul_n
mpn_mul_n
dst 801E00040 src 800E00040 801600040 (cf sp approx 7FFFFFFFC37C)
800000 0.644599000
$ tune/speed -o addrs -s 800000 mpn_mul
mpn_mul
dst 801E00040 src 800E00040 801600040 (cf sp approx 7FFFFFFFC36C)
800000 0.645062000
Same addresses, same speed.
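
As a rough sanity check, the relative gap between mpn_mul and mpn_mul_n can be worked out from the four timings above. This is only a sketch: the helper name relative_gap is made up for illustration and is not part of tune/speed, and single runs like these say nothing about measurement noise.

```python
# Timings copied from the tune/speed runs above (size 800000, on shell).
BEFORE_PATCH = {"mpn_mul_n": 0.646598000, "mpn_mul": 0.680945000}
AFTER_PATCH = {"mpn_mul_n": 0.644599000, "mpn_mul": 0.645062000}


def relative_gap(timings):
    """Return how much slower mpn_mul is than mpn_mul_n, as a fraction.

    Hypothetical helper for illustration only.
    """
    return (timings["mpn_mul"] - timings["mpn_mul_n"]) / timings["mpn_mul_n"]


gap_before = relative_gap(BEFORE_PATCH)
gap_after = relative_gap(AFTER_PATCH)

print(f"without the patch: mpn_mul slower by {gap_before:.2%}")
print(f"with the patch:    mpn_mul slower by {gap_after:.2%}")
```

On these numbers the gap is a bit over 5% without the patch and well under 0.1% with it.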
--
http://bodrato.it/papers/