Athlon XP Single limb multiply

Kevin Ryde user42 at
Mon Dec 22 10:20:12 CET 2003

"Rickey Bowers Jr." <bit at> writes:
> Sorry I do not use those tools.  The change basically involves putting zero
> in ECX and moving that value into EBX - reducing the loop code to 14 bytes -
> instead of the 17 bytes currently used.

Are you assuming src==dst?  The normal mpn_mul_1 of course takes a
separate destination pointer, and that uses up all the registers.
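
For anyone following along, here is a rough C sketch of what a
single-limb multiply with a separate destination computes.  It is not
GMP's code: the name ref_mul_1 and the 32-bit limb type are assumptions
made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine, not GMP's code: multiply the n-limb
   number at src by the single limb v, store the low n limbs at dst,
   and return the carry-out limb.  Limbs are assumed to be 32 bits,
   least significant limb first. */
static uint32_t
ref_mul_1 (uint32_t *dst, const uint32_t *src, size_t n, uint32_t v)
{
  uint32_t carry = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* 32x32 -> 64 bit product plus incoming carry; cannot overflow. */
      uint64_t p = (uint64_t) src[i] * v + carry;
      dst[i] = (uint32_t) p;            /* low half goes to the result */
      carry = (uint32_t) (p >> 32);     /* high half carries onward */
    }
  return carry;
}
```

One plausible accounting of the register pressure: the loop keeps dst,
src, the counter, v and the carry live, and the x86 mul instruction
needs EAX and EDX, which comes to seven values on a 32-bit x86.  With
src==dst one pointer would drop out.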

Putting the loop counter in memory to free up a register for a zero
would be a possibility though.

> Should I not post any more MASM code?

We're always interested in good, fast, well-tested assembly code, for
any processor, but we have no means to actually run MASM syntax.

(Apart maybe from what recent gas can do, which I believe is only a
