Iterative Cache Optimized Karatsuba-Ofman multiplication algorithm

Kevin Ryde user42 at
Tue Mar 16 23:22:11 CET 2004

Josh Liu <zliu2 at> writes:
> I'm currently working on an iterative cache optimized Karatsuba-Ofman
> multiplication algorithm.

You'll want toom3 here, if that's not already what you mean.  With the
current thresholds, the karatsuba code is used on operands that already
fit in L1 cache on most systems.
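
As a rough illustration of that point (a sketch, not GMP code), the
working set of one n-limb karatsuba multiply can be estimated and
compared against an L1 size.  Everything here is an assumption made
for the example: the 32-bit limb, the 16 KiB L1 data cache, the 2n
limbs of scratch space, and the helper names themselves.

```c
#include <stddef.h>

/* Hypothetical figures for illustration only; not taken from GMP. */
#define LIMB_BYTES        4u              /* assumed 32-bit limb */
#define ASSUMED_L1_BYTES  (16u * 1024u)   /* assumed L1 data cache size */

/* Rough working set, in bytes, of one n-limb by n-limb karatsuba
   multiply: two n-limb inputs, a 2n-limb product, and an assumed
   2n limbs of scratch space, for 6n limbs in total. */
static size_t kara_working_set_bytes(size_t n_limbs)
{
    size_t inputs  = 2 * n_limbs;
    size_t product = 2 * n_limbs;
    size_t scratch = 2 * n_limbs;   /* assumption, see above */
    return (inputs + product + scratch) * LIMB_BYTES;
}

/* Nonzero if the estimated working set fits in the assumed L1 size. */
static int fits_in_l1(size_t n_limbs)
{
    return kara_working_set_bytes(n_limbs) <= ASSUMED_L1_BYTES;
}
```

Under those assumptions the estimate is 24 bytes per limb of operand
size, so operands up to several hundred limbs stay within L1.  That is
only a back-of-envelope figure and says nothing about associativity or
what else is competing for the cache.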

I don't think we've even got an actual analysis of how good or bad the
present situation is with respect to caching.

The assembler code is, in theory, supposed to help in this sort of
area by running at L2 throughput when presented with L2 operands, but
I'm not sure that happens everywhere, and L2 is probably slower than
L1 anyway, so keeping the data localized is still advantageous.

More information about the gmp-devel mailing list