Possible new T3-T5 mul_1

Wed Apr 3 01:05:05 CEST 2013

David Miller <davem at davemloft.net> writes:

  From: Torbjorn Granlund <tg at gmplib.org>
  Date: Tue, 02 Apr 2013 21:59:17 +0200

  > 	.global	main
  > main:	save	%sp, -176, %sp
  > 	sethi	%hi(2800000000), %g5
  > 1:	addcc	%g7, %g7, %l0
  > 	addxccc	%g7, %g7, %l1
  > 	addxccc	%g7, %g7, %l2
  > 	addxccc	%g7, %g7, %l3
  > 	addcc	%g7, %g7, %l4
  > 	addxccc	%g7, %g7, %l5
  > 	addxccc	%g7, %g7, %l6
  > 	addxccc	%g7, %g7, %l7
  > 	brnz	%g5, 1b
  > 	 dec	%g5
  > 	ret
  > 	 restore

  This runs in 4.922 seconds.

Good, so 5 cycles per loop iteration.  (Your system runs not at 2.8 GHz
as I assumed, but slightly faster.)
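
The arithmetic behind that reading can be sketched as below.  This is only
a back-of-the-envelope check, not part of the original test.  It assumes
that startup cost is negligible and that the loop body runs once for every
counter value from the sethi constant down to zero.  The helper names are
made up for illustration.

```python
# Rough check of the timing figures quoted above.
# Assumptions (not stated in the thread): startup cost is ignored, and
# the only inputs are the 4.922 s runtime and the sethi constant.

SETHI_MASK = ~0x3FF  # sethi %hi(x) keeps bits 31..10, clears the low 10


def loop_iterations(constant):
    """Times the loop body runs for a counter loaded with sethi %hi()."""
    start = constant & 0xFFFFFFFF & SETHI_MASK
    # brnz tests %g5 before the delay-slot dec takes effect, so the body
    # runs for counter values start, start-1, ..., 0: start + 1 times.
    return start + 1


def cycles_per_iteration(seconds, clock_hz, iterations):
    """Cycles per iteration implied by a runtime at an assumed clock."""
    return seconds * clock_hz / iterations


def implied_clock_hz(seconds, cycles, iterations):
    """Clock rate implied by a runtime at an assumed cycle count."""
    return cycles * iterations / seconds


iters = loop_iterations(2800000000)
at_assumed_clock = cycles_per_iteration(4.922, 2.8e9, iters)
clock_for_5_cycles = implied_clock_hz(4.922, 5, iters)

print("iterations:             ", iters)
print("cycles/iter at 2.8 GHz: ", round(at_assumed_clock, 3))
print("GHz if 5 cycles/iter:   ", round(clock_for_5_cycles / 1e9, 3))
```

At an assumed 2.8 GHz the loop would cost just under 5 cycles per
iteration.  Taking the count to be exactly 5, the same runtime implies a
clock of roughly 2.84 GHz, which is the "slightly faster" above.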

  I have to admit that I'm a bit surprised.

It is not really a high-performance pipeline, but it has some aspects of
high-performance pipelines.  Carry reg renaming has been around since at
least AMD K7.

I rescheduled addmul_2 and mul_2.  If I have not misunderstood this
pipeline, we should finally reach 3.5 c/l (cycles per limb) and 3 c/l,
respectively.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sparct34-aormul_2.asm
Type: application/octet-stream
Size: 5574 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20130403/51d13694/attachment.obj>
-------------- next part --------------

-- 
Torbjörn