get_d_2exp test failures: clang -emit-llvm/-flto/-O4; icc -ipo/-fast

Sun Aug 18 17:05:59 CEST 2013

Masters of the bignum! I think I have found a bit of an issue in your
test-suite.

When building GMP with Clang and libLTO, t-get_d_2exp fails in the mpz and
mpf test dirs.
It seems to occur regardless of other optimizations or use of asm or
target; I have tested (with LLVM/Clang 3.3) everything from '-O0 -emit-llvm
-std=gnu99' to '-Ofast -flto -fomit-frame-pointer -fexceptions -fshort-wchar
-fshort-enums -fstrict-enums -funroll-loops -fvectorize -fslp-vectorize
-fslp-vectorize-aggressive -march=core-avx-i -mtune=core-avx-i -mllvm
-x86-use-vzeroupper -std=c11', and the results are always as seen here:
http://gist.github.com/raw/6261379.

After I isolated the failure to LTO, I confirmed this is also an issue when
built with Intel's link-time inter-procedural optimizer (icc has a few
other failures as well, but they're not relevant here):
http://gist.github.com/raw/6261386. Looking through the compiler lint (all
warnings: http://gist.github.com/raw/6261392) and the static analyzer, some
interesting things did crop up, but they didn't seem to have to do with
this issue. Here's the full analysis (base64, txz):
http://gist.github.com/raw/6261402.

After a bit more digging, I am of the completely inexpert opinion that this
is neither a compiler/optimizer bug nor a bug in GMP proper. It's a bug in
the definitions of the mpz_get_d_2exp and mpf_get_d_2exp functions
themselves, as inherited from libc. If I am reading the spec correctly,
strictly speaking, an IEEE754 'binary64' double with a value >= 2^53 cannot
be represented with sub-integer precision, so a multiplicand in the range of
0.5<=abs(d)<1 is ...problematic. This issue doesn't typically arise without
LTO, because the executable's code and the code from the library know
nothing about each other, and therefore the values are kept in an extended
precision format. But since the inter-procedural optimizers get to optimize
the code from both pools at once, they see that there is "no longer a need"
for this precision and optimize for efficiency.

The results produced are not wrong, mathematically or logically; they are
wrong only because the definition requires 0.5<=abs(d)<1, with 1 excluded.
But 1·2^n is obviously a tidier representation than 0.5·2^(n+1), and so we
get 1·2^n, prescribed rules be damned.

I don't know if this is a sign of something deeper amiss. I think not, but
if anyone wants to take a look, here is the intermediate temp code for the
mpz t-get_d_2exp: http://gist.github.com/6261423, and here is the actual
LLVM IR object/assembly: http://gist.github.com/6261433.
Perhaps someone who can better parse this stuff can shed some light.

Otherwise, here is my attempt at a 'light-touch' patch to correct for these
aberrant values within the test programs as they are, without having to
change the definition of the function or expand the accepted criteria, at
least for now.

Patch:
 http://gist.github.com/raw/6261452

Best,
G