Overhead vs cache effectiveness
Wed, 27 Nov 2002 09:08:27 +1000
Torbjorn Granlund <email@example.com> writes:
> Experiments on Alpha ev6
> indicate that we can double the mpn_add_n speed for huge
> operands. Likewise, ev6 mpn_addmul_1 can become 20% faster
> for the basecase multiply range.
I guess the first thing is to look for that sort of thing, i.e. where
improvements could be made. No doubt there are some chips where the
code is already as good as it can be (old stuff like p5, or smart
prefetchers like ppc630, if the rumours about that chip are true).
Unfortunately I'm not sure speed.c does a very good job of measuring
routines running out of L2 or main memory.
Taking the difference in time between operations on, say, 256 kbytes
and 257 kbytes ought to give a plausible figure for the cost of the
extra data at that size, but maybe there'd be some way to flush L1 and
run out of L2 from the start, or to force a minimum number of
repetitions of a routine so it gets into a steady state, or something
like that.
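As a rough sketch of the "difference of two sizes" and "minimum
repetitions" ideas, something like the following might do. It is not
how speed.c works: add_n is a hypothetical plain C stand-in for
mpn_add_n, time_steady and marginal_cost_per_limb are names made up
here, and clock() is assumed to be fine-grained enough once reps is
large. It doesn't flush L1; it only relies on the operands being too
big to fit there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for mpn_add_n: rp = ap + bp over n limbs,
   returning the carry out of the top limb. */
uint64_t add_n(uint64_t *rp, const uint64_t *ap, const uint64_t *bp,
               size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = ap[i] + carry;
        uint64_t c1 = s < carry;      /* carry out of ap[i] + carry */
        uint64_t r = s + bp[i];
        carry = c1 | (r < s);         /* at most one of the two is set */
        rp[i] = r;
    }
    return carry;
}

/* CPU seconds per call on n-limb operands, measured over `reps` calls
   after `warmup` untimed calls intended to reach a steady state.
   Returns -1.0 if allocation fails. */
double time_steady(size_t n, int warmup, int reps)
{
    assert(n > 0 && warmup >= 0 && reps > 0);
    uint64_t *ap = malloc(n * sizeof *ap);
    uint64_t *bp = malloc(n * sizeof *bp);
    uint64_t *rp = malloc(n * sizeof *rp);
    double result = -1.0;

    if (ap && bp && rp) {
        for (size_t i = 0; i < n; i++) {
            ap[i] = (uint64_t)i * UINT64_C(0x9E3779B97F4A7C15);
            bp[i] = ~(uint64_t)i;
        }
        /* volatile sink discourages the compiler from dropping calls */
        volatile uint64_t sink = 0;
        for (int i = 0; i < warmup; i++)
            sink += add_n(rp, ap, bp, n);

        clock_t start = clock();
        for (int i = 0; i < reps; i++)
            sink += add_n(rp, ap, bp, n);
        clock_t end = clock();

        sink += rp[n - 1];
        result = (double)(end - start) / CLOCKS_PER_SEC / reps;
    }
    free(ap);
    free(bp);
    free(rp);
    return result;
}

/* Stores in *out the extra seconds per extra limb when going from n1
   to n2 limbs. Returns 0 on success, -1 if a measurement failed.
   The stored value can come out negative if timing noise dominates. */
int marginal_cost_per_limb(size_t n1, size_t n2, int warmup, int reps,
                           double *out)
{
    assert(n2 > n1);
    double t1 = time_steady(n1, warmup, reps);
    double t2 = time_steady(n2, warmup, reps);
    if (t1 < 0.0 || t2 < 0.0)
        return -1;
    *out = (t2 - t1) / (double)(n2 - n1);
    return 0;
}
```

With 8-byte limbs, 256 kbytes and 257 kbytes per operand would be
marginal_cost_per_limb(32768, 32896, 5, 200, &cost). The warmup and
reps values there are guesses, and since three operands are touched
the total footprint is three times the operand size, which matters
when picking sizes to sit in a particular cache level.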