[PATCH] 64-bit Popcount/Hweight for T3 and later
Torbjorn Granlund
tg at gmplib.org
Mon Mar 25 20:24:02 CET 2013
David Miller <davem at davemloft.net> writes:
> Technically we could use this on some chips we don't distinguish on
> a fine enough granularity yet. For example we can assume popc is
> available on T2 as well as UltraSPARC-IV.
> But for now, just T3 and later.
I suppose we should mention this as a comment in the code.
> I think that popc runs in the multiplier unit on T4, and thus has
> similar characteristics. It fully pipelines but has a latency of
> 12 cycles.
That's one deep pipeline!
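
To make the latency remark concrete: in a popcount loop each limb's count is independent of the others, so a fully pipelined unit can have several counts in flight at once, and only the additions into the running total form a dependency chain. The following is a minimal C sketch of that shape, not the code from the patch. The function name popcount_limbs_4acc is hypothetical, the choice of four accumulators is arbitrary, and __builtin_popcountll assumes GCC or Clang; whether a compiler maps it to popc depends on the target flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not GMP's mpn_popcount): count the set bits in
   n 64-bit limbs.  The per-limb counts do not depend on each other, so
   a pipelined popcount unit can overlap them.  Four accumulators keep
   the add chains short; the number four is an arbitrary choice. */
uint64_t popcount_limbs_4acc(const uint64_t *up, size_t n)
{
    assert(up != NULL || n == 0);

    uint64_t c0 = 0, c1 = 0, c2 = 0, c3 = 0;
    size_t i = 0;

    /* Main loop: four independent counts per iteration. */
    for (; i + 4 <= n; i += 4) {
        c0 += (uint64_t)__builtin_popcountll(up[i]);
        c1 += (uint64_t)__builtin_popcountll(up[i + 1]);
        c2 += (uint64_t)__builtin_popcountll(up[i + 2]);
        c3 += (uint64_t)__builtin_popcountll(up[i + 3]);
    }

    /* Tail: up to three remaining limbs. */
    for (; i < n; i++)
        c0 += (uint64_t)__builtin_popcountll(up[i]);

    return c0 + c1 + c2 + c3;
}
```
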
> 2013-03-22 David S. Miller <davem at davemloft.net>
> * mpn/sparc64/ultrasparct3/hamdist.asm: New file.
> * mpn/sparc64/ultrasparct3/popcount.asm: New file.
The code is in. Thanks for this contribution! I also updated the
asm.html tables. You have a lot of work to do before the T4 column is
filled in with optimal code...
I actually wrote a v9 popcount a while back. It is about 5 times as
large as yours, and I don't think it runs enough faster to be worth it.
I attached it anyway.
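
For readers without the attachment, here is a rough C sketch of how a population count can be computed without a popc instruction, using the well-known bit-parallel (SWAR) reduction. This is not the attached assembly and makes no claim about how that code works; the names popcount64_swar and hamdist64_swar are hypothetical.

```c
#include <stdint.h>

/* Bit-parallel population count of one 64-bit word, no popc needed.
   Hypothetical illustration, not the attached sparc64-popcount.asm. */
unsigned popcount64_swar(uint64_t x)
{
    /* Each 2-bit field now holds the count of its two bits. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Sum adjacent 2-bit fields into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Sum adjacent 4-bit fields into bytes (each byte is at most 8). */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    /* The multiply adds all eight bytes into the top byte. */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}

/* Hamming distance is the population count of the XOR of the inputs. */
unsigned hamdist64_swar(uint64_t a, uint64_t b)
{
    return popcount64_swar(a ^ b);
}
```
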
Attachment: sparc64-popcount.asm (application/octet-stream, 2232 bytes)
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20130325/0ed45407/attachment.obj>
--
Torbjörn