Illegal subtraction in tmp-dive_1.s

Fri Apr 17 10:29:36 CEST 2009

Dennis Clarke <dclarke at blastwave.org> writes:

  > Dennis Clarke <dclarke at blastwave.org> writes:
  >
  >   Well, these lines scream endian issues :
  >
  >          want 0x12, 0x34, 0x56, 0x78,
  >          got  0x78, 0x56, 0x34, 0x12,
  >
  >   And I am curious how a compiler can produce that sort of result.
  >
  > Surely a buggy compiler might cause that, but it might be a GMP bug too.
  > I have no way of telling.

  I can now confirm that if I use GCC 4.3.3 as well as ld/as etc from
  recent binutils that the whole damn thing compiles fine and passes all
  tests.

Good.  (And I am glad you didn't use GCC 4.3.2 which miscompiles
mpn/generic/rootrem.c, incidentally!)

  That really bugs me because every compiler, the commercial grade Studio
  packages which are worth a ton of money, fail to build this code.

You should perhaps discuss this with Sun.

  Do I trust GCC and then also trust the test results ?

GCC is not flawless.  I've run into countless bugs in it over the years.
But bugs in GCC tend to get fixed quite quickly, while Sun's tools' bugs
and limitations stick.  (Since GMP now is used by GCC, GCC looks better
to me, presumably since the GCC team runs into their GMP related bugs
before they release.)

  Do I trust that the Sun Studio compilers are doing the "correct" thing in
  accordance with specs and standards and thus GMP is at fault?

  I can't tell.

The only way to know is to debug each problem.  Some (like the assembly
syntax thing) should be very easy.  Other things, like the apparent
endianness problem, might take a little more time, unless you're an
experienced bug catcher.
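
As a starting point for the byte-order symptom, here is a minimal C
sketch.  It is not taken from the GMP test suite, and the helper names
(store_be32, store_native32, host_is_little_endian, endian_demo) are
made up for illustration.  It only shows how the value 0x12345678 yields
the "want" byte sequence when stored explicitly most significant byte
first, and how copying the in-memory representation yields the reversed
"got" sequence on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v most significant byte first, independent of the host. */
static void store_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Copy the in-memory representation; the byte order depends on the host. */
static void store_native32(uint32_t v, unsigned char out[4])
{
    memcpy(out, &v, 4);
}

/* Returns 1 when the least significant byte is stored first. */
static int host_is_little_endian(void)
{
    unsigned char b[4];
    store_native32(UINT32_C(0x12345678), b);
    return b[0] == 0x78;
}

/* Hypothetical self-check: the explicit store always gives 12 34 56 78. */
static void endian_demo(void)
{
    static const unsigned char want[4] = { 0x12, 0x34, 0x56, 0x78 };
    unsigned char got[4];

    store_be32(UINT32_C(0x12345678), got);
    assert(memcmp(got, want, 4) == 0);
}
```

Calling endian_demo() from main should pass on any host, while
store_native32 gives 0x78, 0x56, 0x34, 0x12 on x86.  Code that assumes
one order where the other applies would show exactly the reversal
quoted above.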

I am sure my attitude "it is the compiler's fault" when I see GMP bug
reports annoys a few people.  I have had to eat my words on a few
occasions, but most of the time I have been right.

Now we have a case with a code base that passes on a very large set of
platforms (http://gmplib.org/devel/testmachines.shtml), admittedly mostly
using GCC, while Sun's compilers fail in all sorts of ways.  I'd say
that this looks like Sun's compilers suck.

GMP has triggered a baffling number of compiler bugs over the years.
I've probably wasted a year of my life isolating, reporting, and working
around compiler bugs.  I've found out that several "commercial grade"
(whatever that means) compilers are released with no or minimal testing.
GCC used to be largely untested before releases too, until the early
1990s, when somebody put together the first test suite for it.

Why does GMP appear to trigger so many compiler bugs?  First, it uses
integer arithmetic to its limits.  GMP expects every operation on
unsigned types to be well-defined (as mandated by the ISO C standard).
For example, a = b + 4711, cy = a < 4711, must generate carry-out from
the addition in cy.  Over-smart compiler writers might implement the
comparison by subtracting 4711 from a, and then check the sign bit of
the difference.  Unfortunately, this is not correct, since a large a
gives a difference with the top bit set although a >= 4711, and I've
explained why to compiler writers a number of times now.
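
To make the carry idiom concrete, here is a minimal C sketch.  The
function names (add_4711, sign_bit_shortcut, carry_demo) are invented
for this illustration and are not GMP code.  It also assumes that
unsigned long has no padding bits, so that its top bit sits at position
sizeof(unsigned long) * CHAR_BIT - 1.

```c
#include <assert.h>
#include <limits.h>

/* a = b + 4711 with carry-out; unsigned arithmetic wraps modulo 2^N. */
static unsigned long add_4711(unsigned long b, int *cy)
{
    unsigned long a = b + 4711UL;
    *cy = a < 4711UL;          /* 1 exactly when the addition wrapped */
    return a;
}

/* The flawed shortcut: subtract 4711 and look at the top bit. */
static int sign_bit_shortcut(unsigned long a)
{
    unsigned long diff = a - 4711UL;
    return (int)(diff >> (sizeof(unsigned long) * CHAR_BIT - 1));
}

/* Hypothetical self-check showing where the two disagree. */
static void carry_demo(void)
{
    int cy;
    unsigned long a;

    /* The addition wraps: a is 4710 and the carry is set. */
    a = add_4711(ULONG_MAX, &cy);
    assert(a == 4710UL && cy == 1);

    /* No wrap: a is ULONG_MAX, so the carry must be clear ... */
    a = add_4711(ULONG_MAX - 4711UL, &cy);
    assert(a == ULONG_MAX && cy == 0);

    /* ... yet the shortcut reports 1, because a - 4711 is still huge. */
    assert(sign_bit_shortcut(a) == 1);
}
```

Calling carry_demo() from main exercises both cases.  The shortcut only
agrees with the real comparison while a - 4711 stays below 2^(N-1),
which is why a compiler that uses it miscompiles the idiom.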

Most programs will work just fine even if these sorts of expressions are
miscompiled.

Besides expecting correct expression behaviour, GMP triggers bugs
because its test suite actually covers 100% of the code.  Most programs
have no test suite at all, or very basic test suites.

  I do know that if I set my CFLAGS thus :

  CFLAGS=-march=i486 -mno-mmx -mno-sse -m32 -pthreads

  everything works. every test passes.

  If I try -march=pentiumpro then it compiles fine but fails :

  make[4]: Entering directory
  `/export/home/dclarke/gmp-4.3.0-pentium_pro/tests/mpn'
  PASS: t-asmtype
  PASS: t-aors_1
  PASS: t-divrem_1
  PASS: t-fat
  PASS: t-get_d
  PASS: t-instrument
  PASS: t-iord_u
  PASS: t-mp_bases
  PASS: t-perfsqr
  PASS: t-scan
  FAIL: t-hgcd
  PASS: t-matrix22
  ==================================
  1 of 12 tests failed

You have a bunch of bugs to report to Sun!  ;-)

  > Note that GMP runs on both little-endian and big-endian machines.  And
  > it does run on x86-solaris, so it does not seem like some GMP stupidity
  > about assuming solaris means big endian.

  I do have a full build on Sparc 32-bit and Sparc 64-bit but only the
  libgmp tests run and none of the cxx tests can even be compiled. Again,
  Studio 11 is at work here.

I think our C++ experts have more to say about Sun's C++ compiler.  Marc
and others have fought with several versions of it, but found that
they all failed to compile GMP.  Marc, am I right here?

(There are some problems with GMP's configure in that it does not always
use the same ABI for C and C++.  That typically shows up as cxx category
check failures.)

  I have been digging into this all day. Really, it looks like it will be
  all night as well. :-(

Happy hacking!  :-)

  > I suspect you'd have a simpler life if you compiled using gcc instead of
  > that pesky "Studio" compiler.  At least, GNU software likes gcc.

  The next logical step after that would be to toss Solaris over my shoulder
  and just say "use Linux" because it is a GNU thing ? Sorry, all my
  production gear is running Solaris on every rev from Solaris 8 to snv_111.

GCC works fine under Solaris.

-- 
Torbjörn