GMP «Arithmetic without limitations» GMP testing status

This is an automatically generated page of GMP testing results. More recent results are at the top. Click on the dates to see the corresponding log file.

How to interpret the tables

Test results are for this repository: gmp-6.1

There are three tables, each with three report categories. The three tables are for unexpected failures, successes, and expected failures. The three report categories are build, check, and tuneup.

Build failures might mean that we have run into a GMP portability problem, or that the build system disk is full, or whatever.

Check failures are more alarming. Isolated failures are typically due to compiler bugs, but they might also indicate a system-dependent GMP bug. Occasionally there are massive failures, which mean that we introduced a GMP bug.

Build and check failures are marked using red.

Expected build or check failures mean that the systems have some bug which makes them unable to handle the GMP configurations used.

Tuneup failures are not uncommon and can typically be ignored; we therefore indicate them using yellow. A common reason for these failures is that the affected system was too loaded to measure execution times reliably. Another common cause is poor clock handling in the kernel.

A few years ago we added testing with clang, which resulted in lots of new failures. These are not caused by GMP bugs, but rather by clang bugs, or in some cases by compatibility problems between gcc and clang (the latter claims to be gcc and is thereby assumed to be compatible). It is in most cases not possible for the GMP team to resolve this situation. For now, avoid clang if you care about correctness, but if you use clang, by all means remember to run make check after the build! (Please see Table 3 for failure details.)

Spurious SIGKILL of 32-bit processes under the Linux kernel

2018-06-29: Debian 10 started to SIGKILL processes randomly when using the 32-bit ABI on a 64-bit install. The killed processes use minimal resources while there are plenty of available system resources. Presumably this is a kernel bug. Things worked correctly with kernel 4.14.0-3 but started failing after moving to 4.16.0-2.

2018-07-10 update: This seems to be a generic Linux kernel problem. After moving the Gentoo systems to 4.14.52, we see random SIGKILL of 32-bit processes there too.

2018-07-21 update: Debian 9 now also fails after their update from 4.9.0-6 to 4.9.0-7.

2018-07-30 update: Ubuntu 18.04 with the latest kernel now also fails. The poorly tested kernel patches are spreading!

There seems to be another problem with newer 64-bit kernels, which is that they crash midway through testing. So far we have seen this only with the Debian 4.17 series (but we have not explored which kernel has which bug). It might be the same bug as the one discussed above, i.e., a SIGSEGV is sent to some unsuspecting process; if a system process is hit, the system will crash or appear to have crashed.

What is the cause of these Linux problems? We don't know, but this is a time of fervent Meltdown and Spectre workarounds, so perhaps these problems are related to that?

These bugs affect all x86-64 CPUs in the GMP test environment (AMD and Intel, the latest as well as older ones).

(We have re-configured the GMP testing system to treat these systems as unstable. This makes the testing system rerun failing tests and report only reproducible failures. We are also reverting to older, correctly functioning kernels. As a result, the spurious errors should clear.)
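As an illustration only, the "rerun failing tests and report only reproducible failures" policy could be sketched as follows in Python. The GMP testing system itself is not shown on this page, so everything here is an assumption: the function name, the `run_test` callable, the number of reruns, and the rule that a single passing rerun is enough to classify a failure as spurious.

```python
from typing import Callable


def is_reproducible_failure(run_test: Callable[[], bool], reruns: int = 3) -> bool:
    """Return True only if the test fails on the first run and on every rerun.

    run_test is a hypothetical callable that returns True on success and
    False on failure. The default of 3 reruns is an arbitrary assumption.
    """
    if run_test():
        return False  # passed on the first attempt: nothing to report
    for _ in range(reruns):
        if run_test():
            return False  # passed on a rerun: treat the failure as spurious
    return True  # failed every time: report it as reproducible


if __name__ == "__main__":
    # A test that is killed once and then succeeds is not reported.
    outcomes = iter([False, True])
    print(is_reproducible_failure(lambda: next(outcomes)))
    # A test that always fails is reported.
    print(is_reproducible_failure(lambda: False))
```

With a policy like this, a process that is randomly killed on one run but succeeds on a rerun does not show up as a failure, while a genuine GMP or compiler bug that fails every time still does. The cost is extra test time on the hosts marked unstable.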

2018-09-23 update: We updated Gentoo/Xen on most systems, to better work around the Meltdown and Spectre hardware bugs. Now we get massive bogus GMP build/test failures on Intel systems. Why do we even bother trying to write quality software when we are fed with an ever increasing flood of crap? Torbjörn's time spent on GMP development over the past 10 months has been perhaps 20 hours, compared with perhaps 1000 hours spent putting out fires.

Table 1. Unexpected failures
config host ABI build check tuneup mini
TABLE EMPTY: No unexpected failures
Table generated: 2018-10-03 18:43:00 (UTC)

Table 2. Successes
config host ABI build check tuneup mini
Total: 0 hosts/configs (since the dawn of time)
Table generated: 2018-10-03 18:43:00 (UTC)

Table 3. Expected failures
config host ABI build check tuneup mini
Total: 0 hosts/configs (since the dawn of time)
Table generated: 2018-10-03 18:43:00 (UTC)

Table notes (not all failures mentioned here might currently appear above):

  1. The test failures for slmdeb64v8 and glmdeb64v8 when using clang are due to compiler bugs.
  2. Both mips64el and mips64eb using clang fail to build without build-time workarounds. These failures occur because clang, surprisingly, generates mips32r2 code by default on mips64r1 platforms. Defaulting to 32-bit code is odd; defaulting to r2 code on r1 hardware is simply silly.
  3. The test failures on mips64el and mips64eb using clang are due to clang compiler bugs.
  4. The test failures on mipsel-deb* and mipseb-deb* for the n32 ABI are due to qemu bugs. (Confirmed for qemu 2.5 through 2.10; only "user-mode" qemu is affected, full system emulation works fine.)
  5. The failures on armhf-wheezy using clang (3.0) are caused by compiler bugs. The compiler generates code which its built-in assembler cannot handle, but this only produces warnings. Unsurprisingly, the compiled binaries crash.
  6. The test failures on power7 using clang (3.5) are almost surely caused by clang compiler bugs. The failures on ppc64el and powerpc-jessie are the same bug.
  7. The build failures on using clang (3.5) are related to the need of a non-standard library.
  8. The test failures on panda using clang (3.5) are caused by a crash in random seeding. It is almost surely due to clang compiler bugs. The test failures on are the same bug.
  9. The build failures on arm64-debv9 using clang (3.6) are caused by compiler crashes.
  10. The build failures on ppc64-debv9 with mode32 using gcc (6.1) are caused by __int128 support having disappeared.
  11. The build failures on the many *deb32v7 systems are caused by sloppy command line parsing in clang. It looks as if the unusual x32 ABI fails, but what actually fails is a plain 32-bit build on a 32-bit system.


Q1: Why do you test on these host types and not my favourite host type?

A1: We test on the systems available to us.

Q2: Why do you test on several outdated processors?

A2: Testing on a broad range of systems improves portability and broadens the code coverage of the tests. See also A1.

Q3: There are vendor "test drives" where you could get access to more system types. Why don't you take advantage of that?

A3: While providing free access seems nice of the hardware manufacturers, our experience is that it is a lot of work for us to run tests in these environments. Also, we actually read the fine print of the usage agreements, and their terms are often totally unacceptable ("all your code belongs to us").