gmpbench: Ultra40 Solaris10 Studio11

Jens Elkner jel+iws at iws.cs.uni-magdeburg.de
Sun Apr 8 22:20:48 CEST 2007


On Sun, Apr 08, 2007 at 01:06:09PM +0200, Torbjorn Granlund wrote:
> Jens Elkner <jel+iws at iws.cs.uni-magdeburg.de> writes:
> 
>   GMPbench results made on a Sun Ultra 40 (AMD Opteron 245, 2813 MHz)
>   using Solaris Express (SunOS 5.11 b55b) and Sun Studio11 compiler.
>   
> I have a fix for the incompatibility of lshift.asm and rshift.asm.

Sounds good (I don't like hacking around in third-party software)!

> The fix will be part of 4.2.2.  Now, there is another problem
> with using the Sun compiler, which is that it doesn't support the
> inline assembly of longlong.h.  That badly hurts your numbers.
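
To illustrate the cost mentioned above: longlong.h provides double-limb primitives such as a limb-by-limb multiply that returns both halves of the product. On x86-64 this is a single mul instruction when inline assembly is available; without it, the work has to be done in plain C from 32-bit halves. The following is only a minimal sketch of that kind of fallback, not GMP's actual code. The names mul_64x64 and mul_64x64_demo are hypothetical and were chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not GMP's macro): 64x64 -> 128-bit unsigned
   multiply in portable C, built from four 32x32 -> 64 partial products. */
static void mul_64x64(uint64_t u, uint64_t v, uint64_t *hi, uint64_t *lo)
{
    uint64_t ul = u & 0xffffffffu, uh = u >> 32;
    uint64_t vl = v & 0xffffffffu, vh = v >> 32;

    uint64_t x0 = ul * vl;   /* contributes to bits   0..63  */
    uint64_t x1 = ul * vh;   /* contributes to bits  32..95  */
    uint64_t x2 = uh * vl;   /* contributes to bits  32..95  */
    uint64_t x3 = uh * vh;   /* contributes to bits  64..127 */

    x1 += x0 >> 32;          /* cannot overflow */
    x1 += x2;                /* may wrap around */
    if (x1 < x2)             /* wrapped: carry 2^32 into the high product */
        x3 += (uint64_t)1 << 32;

    *hi = x3 + (x1 >> 32);
    *lo = (x1 << 32) | (x0 & 0xffffffffu);
}

/* Usage sketch: check one product whose 128-bit value is known exactly. */
static void mul_64x64_demo(void)
{
    uint64_t hi, lo;
    mul_64x64(UINT64_MAX, UINT64_MAX, &hi, &lo);
    /* (2^64 - 1)^2 = 2^128 - 2^65 + 1 */
    assert(hi == UINT64_MAX - 1 && lo == 1);
}
```

Four multiplies plus carry handling replace what a single instruction can do, which is one plausible reason the scores drop when the compiler cannot use the inline assembly.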

Would the following be an option:
"Support For SSE/SSE2 Integral Media Intrinsics

          This release supports intrinsic functions for SSE2
          128-bit XMM register integral media-instructions.
          Include the sunmedia_intrin.h header file in the source
          code and specify the -xbuiltin option to take advantage
          of these functions. Furthermore, these intrinsic
          functions require SSE2 support so specify options such
          as -xarch=sse2, -xarch=amd64, or -xtarget=opteron.

          Essentially, the compiler generates inline code for
          these intrinsic functions. This is easier than
          manipulating the instructions through assembly language
          and it can be optimized by the compiler.

          For more information about intrinsics, explanations for
          the function prototypes contained in the header files,
          and the data types used by these functions, see the
          'Intel C++ Intrinsics Reference' section of the
          Intel(R) C++ Compiler for Linux Systems manual.
"
The declarations look like:
extern __m128i _mm_loadl_epi64_64(long long const *p);
extern __m128i _mm_load_si128_64(long long const *p);
extern __m128i _mm_loadu_si128_64(long long const *p);
extern void _mm_storel_epi64_64(long long *p1, __m128i p2);
extern void _mm_store_si128_64(long long *p1, __m128i p2);
extern void _mm_storeu_si128_64(long long *p1, __m128i p2);


> With gcc, you could get about 9600 as the final score.

Yes - with gcc 3.4.3 (the standard one coming with Solaris):

***** GMPbench version 0.1 *****
Using CFLAGS = "-mtune=opteron -march=opteron -m64 -O4 -I/tmp/_root/usr/include" from your environment
Using CC = "gcc" from your environment
Using LIBS = "-L /tmp/_root/usr/lib/amd64 -lgmp" from your environment
Using compilation command: gcc -mtune=opteron -march=opteron -m64 -O4 -I/tmp/_root/usr/include foo.c -o foo -L /tmp/_root/usr/lib/amd64 -lgmp
Using gmp version: 4.2.1
Compiling benchmarks
Running benchmarks
  Category base
    Program multiply
      multiply 128 128
      GMPbench.base.multiply.128,128 result: 27032993
      multiply 512 512
      GMPbench.base.multiply.512,512 result: 6723658
      multiply 8192 8192
      GMPbench.base.multiply.8192,8192 result: 60139
      multiply 131072 131072
      GMPbench.base.multiply.131072,131072 result: 878
      multiply 2097152 2097152
      GMPbench.base.multiply.2097152,2097152 result: 29.6
    GMPbench.base.multiply result: 49056
    Program divide
      divide 8192 32
      GMPbench.base.divide.8192,32 result: 957592
      divide 8192 64
      GMPbench.base.divide.8192,64 result: 992176
      divide 8192 128
      GMPbench.base.divide.8192,128 result: 446274
      divide 8192 4096
      GMPbench.base.divide.8192,4096 result: 113325
      divide 8192 8064
      GMPbench.base.divide.8192,8064 result: 1681442
      divide 131072 8192
      GMPbench.base.divide.131072,8192 result: 2293
      divide 131072 65536
      GMPbench.base.divide.131072,65536 result: 1159
      divide 8388608 4194304
      GMPbench.base.divide.8388608,4194304 result: 2.88
    GMPbench.base.divide result: 29779
  GMPbench.base result: 38221
  Category app
    Program rsa
      rsa 512
      GMPbench.app.rsa.512 result: 13315
      rsa 1024
      GMPbench.app.rsa.1024 result: 2693
      rsa 2048
      GMPbench.app.rsa.2048 result: 424
    GMPbench.app.rsa result: 2477.3
  GMPbench.app result: 2477.3
GMPbench result: 9730.6
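
A note on reading the summary lines: the figures in this log are consistent with each combined score being the geometric mean of its two components (base from multiply and divide, the final result from base and app). This is an observation from the numbers above, not a statement about how the benchmark script is implemented. A minimal sketch of the check follows. The helper name is_geomean_of is hypothetical, and the comparison is done on squares so that no math library is needed.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if `reported` is approximately the
   geometric mean of a and b. Tests reported^2 against a*b; a relative
   error e in `reported` shows up as roughly 2*e in its square, hence
   the factor 2 on the tolerance. */
static bool is_geomean_of(double a, double b, double reported, double tol)
{
    double product = a * b;
    double diff = reported * reported - product;
    if (diff < 0)
        diff = -diff;
    return diff <= 2.0 * tol * product;
}
```

With a tolerance of 1e-4, is_geomean_of(49056, 29779, 38221, 1e-4) and is_geomean_of(38221, 2477.3, 9730.6, 1e-4) both hold for the values printed in the log.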

Regards,
jens.

BTW: I also tried the Studio12 compiler suite; no difference compared with Studio11.
--
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768

