Possible new T3-T5 mul_1

Torbjorn Granlund tg at gmplib.org
Tue Apr 2 21:59:17 CEST 2013

David Miller <davem at davemloft.net> writes:

  > main:	sethi	%hi(2800000000), %g5
  > 1:	mulx	%g1, %g1, %g1
  > 	mulx	%g1, %g1, %g1
  > 	mulx	%g1, %g1, %g1
  > 	mulx	%g1, %g1, %g1
  > 	brnz	%g5, 1b
  > 	 dec	%g5
  > 	retl
  > 	 nop
  This runs in 47.234 seconds.

I'd like to understand carry flag renaming,
which we will need for fast addmul_k.
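
As an aside on the quoted mulx measurement: the four mulx instructions form one dependent chain, so the run time can be turned into an approximate cycles-per-mulx figure once the clock rate is known. The post does not give the clock rate, so it is a parameter in this sketch, and all helper names are hypothetical. The sketch relies on two observations: 2800000000 is a multiple of 1024, so `sethi %hi(...)` loads it exactly, and `brnz` tests the counter before the `dec` in its delay slot, so a start value of n gives n + 1 passes.

```c
#include <stdint.h>

/* Value loaded by "sethi %hi(v), reg": bits 31..10 of v, low 10 bits zero. */
static uint64_t sethi_hi(uint64_t v)
{
    return v & 0xfffffc00u;
}

/* Number of times the loop body runs for a start value n.  brnz tests
   the counter first and the dec in its delay slot always executes, so
   the body runs n + 1 times. */
static uint64_t loop_passes(uint64_t n)
{
    return n + 1;
}

/* Average cycles per instruction of interest.  clock_hz is an
   assumption supplied by the caller; it is not stated in the post. */
static double cycles_per_insn(double seconds, double clock_hz,
                              uint64_t passes, unsigned insns_per_pass)
{
    return seconds * clock_hz / ((double) passes * insns_per_pass);
}
```

For the quoted loop one would call `cycles_per_insn(47.234, clock_hz, loop_passes(sethi_hi(2800000000)), 4)` with the clock rate of the machine that was measured.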

Do things like this run well, i.e., ideally at 5 cycles?
Some chips perform great register renaming, except that
the carry bit is really just one bit.

	.global	main
main:	save	%sp, -176, %sp
	sethi	%hi(2800000000), %g5
1:	addcc	%g7, %g7, %l0
	addxccc	%g7, %g7, %l1
	addxccc	%g7, %g7, %l2
	addxccc	%g7, %g7, %l3
	addcc	%g7, %g7, %l4
	addxccc	%g7, %g7, %l5
	addxccc	%g7, %g7, %l6
	addxccc	%g7, %g7, %l7
	brnz	%g5, 1b
	 dec	%g5
	ret
	 restore
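
To make the dependency structure of the loop concrete, here is a small C model of what one group of four adds computes. It assumes the usual semantics: addcc sets the carry and ignores any incoming carry, while addxccc adds the incoming 64-bit carry and sets a new one. The names `add_carry` and `chain4` are made up for this illustration. The loop body is two such groups back to back. Since the second addcc does not read the carry, the two groups depend on each other only through reuse of the single carry flag, which is exactly the dependency that carry renaming would remove.

```c
#include <stdint.h>

/* Return a + b + cin and store the carry out of bit 63 in *cout. */
static uint64_t add_carry(uint64_t a, uint64_t b, unsigned cin, unsigned *cout)
{
    uint64_t s = a + b;
    unsigned c1 = s < a;        /* carry from a + b */
    uint64_t r = s + cin;
    unsigned c2 = r < s;        /* carry from adding cin */
    *cout = c1 | c2;            /* at most one of the two can be set */
    return r;
}

/* One carry chain: addcc followed by three addxccc, all computing x + x.
   Starting with carry 0 models addcc ignoring the incoming carry. */
static void chain4(uint64_t x, uint64_t out[4])
{
    unsigned c = 0;
    for (int i = 0; i < 4; i++)
        out[i] = add_carry(x, x, c, &c);
}
```

Within one chain each add must wait for the carry of the previous one. Calling `chain4` twice, as the loop body effectively does, gives two chains that share no data, so how well they overlap in hardware depends on how the carry flag is handled.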

