Iterative Cache Optimized Karatsuba-Ofman multiplication
user42 at zip.com.au
Tue Mar 16 23:22:11 CET 2004
Josh Liu <zliu2 at student.gsu.edu> writes:
> I'm currently working on an iterative cache optimized Karatsuba-Ofman
> multiplication algorithm.
You'll want toom3 here, if that's not already what you mean. With the
current thresholds, the karatsuba code only gets used on operands that
already fit in L1 cache on most systems.
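
As a back-of-envelope check of that claim, here is a minimal sketch in C.
Every number in it is an assumption for illustration: 64-bit limbs, a 32 KB
L1 data cache, and a working set of roughly 6n limbs for an n-limb
multiply. None of these are GMP's tuned thresholds or its actual scratch
requirements, and the function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical figures, for illustration only; not GMP's tuned values. */
#define LIMB_BYTES    8u             /* assumes 64-bit limbs */
#define L1_DATA_BYTES (32u * 1024u)  /* assumed L1 data cache size */

/* Rough working set, in bytes, of one n-limb by n-limb karatsuba multiply:
   two n-limb inputs, a 2n-limb product, and about 2n limbs of scratch. */
static size_t karatsuba_working_set(size_t n_limbs)
{
    return 6u * n_limbs * LIMB_BYTES;
}

/* Nonzero if that working set fits in the assumed L1 data cache. */
static int fits_in_l1(size_t n_limbs)
{
    return karatsuba_working_set(n_limbs) <= L1_DATA_BYTES;
}
```

Under these assumptions a 100-limb multiply touches about 4800 bytes, well
inside L1, while a 1000-limb multiply (48000 bytes) does not fit. The real
crossover depends on the tuned thresholds and on the cache of the machine
in question.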
I don't think we've even got an actual analysis of how good or bad the
present situation is with respect to caching.
The assembler code is, in theory, supposed to help in this sort of
area by running at L2 throughput when presented with L2 operands, but
I'm not sure that happens everywhere, and L2 is probably slower than L1
anyway, so localization would still be advantageous.