sse2

Torbjorn Granlund tege@swox.com
08 Dec 2002 16:49:20 +0100


Sam Halliday <fommil@yahoo.ie> writes:

  just to let people know, in case no one checked already.. i managed to
  build gmp with gcc-3.2.1 on a P4 with the flags
  -O3 -march=pentium4 -fomit-frame-pointer -mmmx -mfpmath=sse -msse2
  to make use of the new sse2 instructions for the P4, and all the make
  checks worked out OK!
  
GMP doesn't use much floating point, at least not in the C
code.  Therefore, enabling sse2 floating point isn't going to
affect the speed of GMP measurably.
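
To illustrate the point, here is a minimal sketch of a
multi-limb addition with carry, the kind of loop that
dominates bignum arithmetic.  This is not GMP's actual code;
limb_t and add_limbs are names made up for this example, and
a 32-bit limb is assumed.  Everything in it is integer
arithmetic, so -mfpmath=sse gives the compiler nothing to
work on here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch (assumes 32-bit limbs). */
typedef uint32_t limb_t;

/* Add the n-limb numbers {ap,n} and {bp,n}, least significant limb
   first, store the n-limb sum at rp, and return the carry-out (0 or 1).
   Only integer operations are used. */
static limb_t add_limbs(limb_t *rp, const limb_t *ap, const limb_t *bp,
                        size_t n)
{
    limb_t carry = 0;

    assert(rp != NULL && ap != NULL && bp != NULL);

    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i];
        limb_t s = a + bp[i];   /* may wrap around */
        limb_t c1 = s < a;      /* 1 if a + b wrapped */
        limb_t r = s + carry;   /* may wrap around */
        limb_t c2 = r < s;      /* 1 if adding the carry wrapped */

        rp[i] = r;
        carry = c1 | c2;        /* at most one of c1, c2 is set */
    }
    return carry;
}
```

Calling add_limbs(r, a, b, 2) with a = {0xFFFFFFFF, 0xFFFFFFFF}
and b = {1, 0} leaves r = {0, 0} and returns a carry of 1.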

For the P4 we use sse2 integer operations in environments
that support them, but currently just on the mmx registers.
That code lives in mpn/x86/pentium4/sse2.  Using xmm
registers and getting actual speedups isn't simple, as
throughput generally halves, and latency sometimes
quadruples, with the xmm integer operations.
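
For readers who haven't seen one, here is a sketch of what
an sse2 integer operation on xmm registers looks like from C,
using the _mm_mul_epu32 intrinsic (the pmuludq instruction)
to form two 32x32->64 bit products at once.  This is not the
code in mpn/x86/pentium4/sse2; mul2_32x32 is a made-up name,
and the plain C fallback is only there so the example builds
where sse2 is not enabled.  It shows the operation only and
says nothing about speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: out[i] = a[i] * b[i] as full 64-bit products,
   for i = 0 and i = 1. */
static void mul2_32x32(const uint32_t a[2], const uint32_t b[2],
                       uint64_t out[2])
{
    assert(a != NULL && b != NULL && out != NULL);

#if defined(__SSE2__)
    /* _mm_set_epi32 takes its arguments from the highest 32-bit lane
       to the lowest, so a[0] lands in lane 0 and a[1] in lane 2.
       pmuludq multiplies lanes 0 and 2 of each operand. */
    __m128i va = _mm_set_epi32(0, (int)a[1], 0, (int)a[0]);
    __m128i vb = _mm_set_epi32(0, (int)b[1], 0, (int)b[0]);
    __m128i prod = _mm_mul_epu32(va, vb);

    /* Unaligned store of both 64-bit products. */
    _mm_storeu_si128((__m128i *)out, prod);
#else
    /* Portable fallback computing the same two products. */
    out[0] = (uint64_t)a[0] * b[0];
    out[1] = (uint64_t)a[1] * b[1];
#endif
}
```

With a = {0xFFFFFFFF, 3} and b = {0xFFFFFFFF, 5} this stores
0xFFFFFFFE00000001 and 15 in out.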

-- 
Torbjörn