[PATCH] 64-bit Popcount/Hweight for T3 and later

Torbjorn Granlund tg at gmplib.org
Mon Mar 25 20:24:02 CET 2013


David Miller <davem at davemloft.net> writes:

  Technically we could use this on some chips we don't distinguish on
  a fine enough granularity yet.  For example we can assume popc is
  available on T2 as well as UltraSPARC-IV.
  
  But for now, just T3 and later.
  
I suppose we should mention this as a comment in the code.

  I think that popc runs in the multiplier unit on T4, and thus has
  similar characteristics.  It fully pipelines but has a latency of
  12 cycles.
  
That's one deep pipeline!

  2013-03-22  David S. Miller  <davem at davemloft.net>
  
  	* mpn/sparc64/ultrasparct3/hamdist.asm: New file.
  	* mpn/sparc64/ultrasparct3/popcount.asm: New file.
  
The code is in.  Thanks for this contribution!  I also updated the
asm.html tables.  You have a lot of work to do before the T4 column is
filled in with optimal code...
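
For readers who have not looked at the two new files: they implement
mpn_popcount and mpn_hamdist, and the second is just the first applied
to the XOR of two operands.  Here is a rough C sketch of that
relationship.  It is not the committed assembly; the sketch_* names and
the use of uint64_t in place of mp_limb_t are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one 64-bit limb.  Portable stand-in for a
   hardware popc instruction: each pass clears the lowest set bit. */
unsigned sketch_popcount_limb(uint64_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Total number of set bits in the n-limb operand at p
   (hypothetical counterpart of mpn_popcount). */
uint64_t sketch_popcount(const uint64_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += sketch_popcount_limb(p[i]);
    return total;
}

/* Number of bit positions in which the n-limb operands a and b differ
   (hypothetical counterpart of mpn_hamdist): popcount of a XOR b. */
uint64_t sketch_hamdist(const uint64_t *a, const uint64_t *b, size_t n)
{
    assert((a != NULL && b != NULL) || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += sketch_popcount_limb(a[i] ^ b[i]);
    return total;
}
```

Since the per-limb counts do not depend on each other, a fully
pipelined popc can in principle be issued for successive limbs without
waiting for earlier results; only the running sum is serial.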

I actually wrote a v9 popcount a while back.  It is about 5 times as
large as yours, and I don't think it runs enough faster to be worth it.
I attached it anyway.
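
To give an idea of why a popcount without the popc instruction gets so
much larger, here is the well-known mask-and-add bit trick in C.  This
is only a sketch of the general technique and makes no claim about what
the attached file actually does; the function names are made up for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Population count of one 64-bit word using only shifts, masks, adds
   and one multiply (no popc instruction). */
unsigned swar_popcount64(uint64_t x)
{
    /* Each 2-bit field now holds the count of its two bits (0..2). */
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    /* Each 4-bit field now holds the count of its four bits (0..4). */
    x = (x & UINT64_C(0x3333333333333333))
        + ((x >> 2) & UINT64_C(0x3333333333333333));
    /* Each byte now holds the count of its eight bits (0..8). */
    x = (x + (x >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);
    /* The multiply sums all eight bytes into the top byte. */
    return (unsigned)((x * UINT64_C(0x0101010101010101)) >> 56);
}

/* Sum the per-word counts over an n-word operand. */
uint64_t swar_popcount_n(const uint64_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += swar_popcount64(p[i]);
    return total;
}
```

That is roughly a dozen dependent ALU operations per word where popc
needs a single instruction, which is where the extra code size goes.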

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sparc64-popcount.asm
Type: application/octet-stream
Size: 2232 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20130325/0ed45407/attachment.obj>
-------------- next part --------------

-- 
Torbjörn

