[PATCH] Fix wrong code generation for AMD Fam 11h CPUs in 32-bit mode

Mon Mar 5 23:41:42 CET 2012

This fixes a CPU classification error in gmp-5.0.4 and gmp-5.0.3.

The error is that AMD Family 11h CPUs (found mostly in laptops)
are classified by config.guess as "k10", which causes gmp to
pass "-march=amdfam10" to gcc during compilation.

However, AMD Fam 11h is not based on the K10 core but on the older
Fam 0Fh core (K8 or K9).  In particular, Fam 11h lacks the ISA
extensions that K10 has relative to Fam 0Fh.  You can confirm this
by checking the AMD "BIOS and Kernel Developer's Guide" documents
for Fam 10h and 11h and comparing their "Major Changes Relative
to Family 0Fh processors" sections, inspecting /proc/cpuinfo on
Linux, or using "gcc -v -march=native" and observing the flags
passed to the "cc1" command.
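
For readers unfamiliar with the family numbering, here is a minimal
sketch (not part of the patch) of how the family number is derived
from the EAX value returned by CPUID leaf 1: the base family sits in
bits 11:8 and the extended family in bits 27:20, and the two are
added when the base family is 0Fh.  The function name display_family
is hypothetical, and the EAX values mentioned afterwards are
constructed from that field layout rather than read from real
hardware.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: compute the CPU family
   from the EAX value returned by CPUID leaf 1.
   Base family is EAX[11:8], extended family is EAX[27:20]; the
   extended field only contributes when the base family is 0xF. */
unsigned display_family(uint32_t eax)
{
    unsigned base = (eax >> 8) & 0xf;
    unsigned ext  = (eax >> 20) & 0xff;

    return base == 0xf ? base + ext : base;
}
```

With this decoding, an EAX of 0x00200F00 yields 17 (Fam 11h),
0x00100F00 yields 16 (Fam 10h), and 0x00000F00 yields 15 (Fam 0Fh).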

Compiling gmp with gcc -m32 -march=amdfam10 on a Fam 11h CPU
causes massive test suite failures:

- a few tests detect errors and bail:

/tmp/gmp-5.0.4/tests/refmpn.c:1207: GNU MP assertion failed: carry < divisor
/bin/sh: line 5: 22299 Aborted                 ${dir}$tst
FAIL: t-mod_1

/bin/sh: line 5: 22343 Aborted                 ${dir}$tst
FAIL: t-get_d

mpn_hgcd and hgcd_ref returned different values
op1=1BDDFF867272A9296AC493C251D7F46F09A5591FE
op2=2A68A916450A7DE006031068C5DDB0E5C
/bin/sh: line 5: 22837 Aborted                 ${dir}$tst
FAIL: t-hgcd

- many tests crash:

  n    =0x/bin/sh: line 5: 19615 Segmentation fault      ${dir}$tst
FAIL: t-count_zeros
/bin/sh: line 5: 22471 Segmentation fault      ${dir}$tst
FAIL: t-toom22
/bin/sh: line 5: 22494 Segmentation fault      ${dir}$tst
FAIL: t-toom32
/bin/sh: line 5: 22517 Segmentation fault      ${dir}$tst
FAIL: t-toom33
/bin/sh: line 5: 22540 Segmentation fault      ${dir}$tst
FAIL: t-toom42
/bin/sh: line 5: 22563 Segmentation fault      ${dir}$tst
FAIL: t-toom43
/bin/sh: line 5: 22586 Segmentation fault      ${dir}$tst
...

- and other tests hang in apparent infinite loops.

I've reproduced these failures with gcc 4.6.3, 4.5.3, and 4.4.6,
on 64-bit Linux with -m32, and on 32-bit Linux.  However, there
is nothing Linux-specific about the issue.

For some reason, the test suite does pass when compiled with
-m64 and -march=amdfam10, but since -m64 implies many other
code generation changes, I consider that a fluke.

gmp-5.0.2 works because in that release Fam 11h isn't recognized
at all, so gmp optimizes for a more generic x86 CPU.

The simplest fix is to change config.guess to classify Fam 11h
as "k8" rather than "k10".  That's what the patch below does.
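
The intended classification can be sketched as a standalone function.
This is only an illustration: amd_modelstr is a hypothetical name,
it covers just the three family values touched by or adjacent to the
change (16, 17, 18), and it returns NULL for everything else rather
than guessing what config.guess does for other families.

```c
#include <stddef.h>

/* Hypothetical mirror of the intended mapping, for illustration only.
   Takes the decimal family number and returns the model string gmp
   should use, or NULL for families this sketch does not cover. */
const char *amd_modelstr(unsigned family)
{
    switch (family) {
    case 16:            /* Fam 10h: K10 */
        return "k10";
    case 17:            /* Fam 11h: Fam 0Fh-derived core, so treat as k8 */
        return "k8";
    case 18:            /* Fam 12h (Llano): uses K10 core */
        return "k10";
    default:            /* not covered by this sketch */
        return NULL;
    }
}
```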

gcc -march=native shows that -march=k8-sse3 is preferred for this
CPU, but getting that automatically would require more changes in
configure.in.  Users can always override CFLAGS if they want to.

/Mikael

2012-03-05  Mikael Pettersson  <mikpe at it.uu.se>

	* config.guess: Correct classification of AMD Fam 11h CPUs.

--- gmp-5.0.4/config.guess.~1~	2012-02-10 11:23:05.000000000 +0100
+++ gmp-5.0.4/config.guess	2012-03-05 22:38:34.000000000 +0100
@@ -808,8 +808,8 @@ main ()
 	case 16:		/* K10 */
 	  cpu_64bit = 1, modelstr = "k10";
 	  break;
-	case 17:		/* AMD Internal, assume future K10 */
-	  cpu_64bit = 1, modelstr = "k10";
+	case 17:		/* Fam 11h, K9-like */
+	  cpu_64bit = 1, modelstr = "k8";
 	  break;
 	case 18:		/* Llano, uses K10 core */
 	  cpu_64bit = 1, modelstr = "k10";