Several Points ... dwt
user42 at zip.com.au
Sat May 1 00:02:22 CEST 2004
Josh Liu <zliu2 at student.gsu.edu> writes:
> On a side note, simple profiling indicates that low level functions,
> mainly the negation function, take up as much as 30% of the running
> time of the Schönhage-Strassen algorithm. Perhaps it is better to
> use prefetching and non-temporal writing in the implementation of the
> complement function, instead of the macro that is currently
Perhaps new code could combine it into another operation being done at
the same time (a shift or add, say). Or perhaps a separately maintained
sign bit could dispense with negating altogether.
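
As a rough illustration of combining the complement with an add, here is
a minimal sketch in plain C. It is not GMP code: limb_t, add_com_two_pass
and add_com_fused are hypothetical names, and limb_t merely stands in for
a 64-bit limb. The two-pass version writes a temporary and reads it back,
while the fused version touches each operand once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for a 64-bit limb */

/* Two passes: tmp = ~b, then r = a + tmp.  Returns the carry out. */
static limb_t add_com_two_pass(limb_t *r, const limb_t *a, const limb_t *b,
                               limb_t *tmp, size_t n)
{
    limb_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        tmp[i] = ~b[i];                 /* pass 1: complement */
    for (size_t i = 0; i < n; i++) {    /* pass 2: add with carry */
        limb_t s = a[i] + tmp[i];
        limb_t c1 = s < tmp[i];         /* carry from a + ~b */
        limb_t t = s + carry;
        limb_t c2 = t < carry;          /* carry from adding carry-in */
        r[i] = t;
        carry = c1 | c2;
    }
    return carry;
}

/* One pass: r = a + ~b, complementing each limb as it is loaded. */
static limb_t add_com_fused(limb_t *r, const limb_t *a, const limb_t *b,
                            size_t n)
{
    limb_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        limb_t nb = ~b[i];              /* complement folded into the add */
        limb_t s = a[i] + nb;
        limb_t c1 = s < nb;
        limb_t t = s + carry;
        limb_t c2 = t < carry;
        r[i] = t;
        carry = c1 | c2;
    }
    return carry;
}
```

Both functions compute the same result; the fused one simply avoids the
extra store and load of the temporary, which is where the saving would
come from if memory traffic is the limit.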
> or faster addition and copying operations are required.
The first step is always to find out how fast they're going now. If,
for instance, they're already running at L2 or main memory throughput
speed, then there's nothing more that can be done.
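
One way to get such a number is a small timing loop. The following is
only a sketch under assumptions: com_n and com_throughput are
hypothetical names, the limb type is taken to be 64 bits, and clock()
is a coarse timer, so the operand size and repeat count need to be
large enough for the elapsed time to be measurable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for a 64-bit limb */

/* Plain one's complement of n limbs, the operation being measured. */
static void com_n(limb_t *dst, const limb_t *src, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = ~src[i];
}

/* Returns bytes read plus bytes written per second of CPU time, or 0.0
   if the run was too short to time or memory could not be allocated. */
static double com_throughput(size_t n, unsigned reps)
{
    if (n == 0)
        return 0.0;
    limb_t *src = malloc(n * sizeof *src);
    limb_t *dst = malloc(n * sizeof *dst);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return 0.0;
    }
    for (size_t i = 0; i < n; i++)
        src[i] = (limb_t) i;

    volatile limb_t sink = 0;   /* keeps the compiler from dropping the loop */
    clock_t t0 = clock();
    for (unsigned r = 0; r < reps; r++) {
        com_n(dst, src, n);
        sink ^= dst[r % n];
    }
    clock_t t1 = clock();
    (void) sink;

    free(src);
    free(dst);

    double secs = (double) (t1 - t0) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return 0.0;
    return 2.0 * (double) n * sizeof(limb_t) * (double) reps / secs;
}
```

Running it at sizes that fit in L1, in L2, and in neither, and comparing
against a memcpy timed the same way, would show whether the routine is
already at the memory system's limit for each size.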