get_d_2exp test failures: clang -emit-llvm/-flto/-O4; icc -ipo/-fast

Tue Aug 20 03:45:53 CEST 2013

On 2013-08-18 08:05:59 -0700, bathtubdev wrote:
> When building GMP with Clang and libLTO, t-get_d_2exp fails in the mpz and
> mpf test dirs.
> It seems to occur regardless of other optimizations or use of asm or
> target; I have tested with LLVM/Clang 3.3 everything from '-O0 -emit-llvm
> -std=gnu99' to '-Ofast -flto -fomit-frame-pointer -fexceptions -fshort-wchar
> -fshort-enums -fstrict-enums -funroll-loops -fvectorize -fslp-vectorize
> -fslp-vectorize-aggressive -march=core-avx-i -mtune=core-avx-i -mllvm
> -x86-use-vzeroupper -std=c11'; and the results are always as seen here:
> http://gist.github.com/raw/6261379.
> 
> After I isolated the failure to LTO, I confirmed this is also an issue when
> built with Intel's link-time inter-procedural optimizer (icc has a few
> other failures as well, but they're not relevant here):
> http://gist.github.com/raw/6261386. Looking through the compiler lint (all
> warnings: http://gist.github.com/raw/6261392) and the static analyzer, some
> interesting things did crop up, but they didn't seem to have to do with
> this issue. Here's the full analysis (base64, txz):
> http://gist.github.com/raw/6261402.
> 
> After a bit more digging, I am of the completely inexpert opinion that this
> is neither a compiler/optimizer bug nor a bug in GMP proper. It's a bug in
> the definitions of the mp[f][z]_get_d_2exp functions themselves, as
> inherited from libc. If I am reading the spec correctly, strictly speaking,
> an IEEE754 'binary64' double with a value >= 2^53 cannot be represented
> with sub-integer precision, so a multiplicand in the range of 0.5<=abs(d)<1
> is ...problematic.

No, I don't see any problem. The GMP manual says for mpz_get_d_2exp:

 -- Function: double mpz_get_d_2exp (signed long int *EXP, mpz_t OP)
     Convert OP to a `double', truncating if necessary (i.e. rounding
     towards zero), and returning the exponent separately.

i.e. the integer is truncated to the precision of the double.
So 2**54-1 (54 one bits) would be truncated to 2**54-2, which
has 53 significant bits. But according to the result of the test,
the memory representation of the "double" result is that of 1.0,
which is wrong: it is outside the range 0.5 <= abs(d) < 1.

Then this is either a bug in the compiler or a bug in GMP. In the
latter case, probably a bug in "mpn/get_d.c", which is a symlink,
necessarily to mpn/generic/get_d.c if I understand correctly (as
there are no other versions). The code then depends on some macros
defined at configure time, but your bug report is incomplete: there
is no way to tell which macros are defined. You need to provide
the config.log file.

Also, what is the architecture? 32 bits? 64 bits? According to
https://gist.github.com/bathtub/6261386/raw, it's x86_64, but
one can't really be sure.

> This issue doesn't typically arise without LTO, because the
> executable's code and the code from the library know nothing about
> each other, and therefore the values are kept in an extended
> precision format.

If the architecture is x86_64, then there's no extended precision
for the "double" C type since SSE is used (unless there's a bug in
the compiler). Anyway, there are two cases concerning get_d.c: either
it considers only the first 53 bits and everything is fine, or it
considers more bits (in case the format is not recognized), in
which case the exact 54-bit value 2**54-1 obtained in the
computation will be rounded to 2**54, hence an incorrect result
(the GMP 5.1.2 code seems wrong if FORMAT_RECOGNIZED is not
defined: where does the truncation occur???). But here LTO shouldn't
matter, unless it has an effect on the definition of the macros.
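
The difference between the two cases can be sketched as follows. This
is only an illustration, not the code of get_d.c: the function names
are hypothetical, and it assumes IEEE 754 binary64 doubles and the
default round-to-nearest mode for the integer-to-double conversion.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Case where more than 53 bits reach the conversion: the conversion
   itself rounds (to nearest by default), then the value is scaled
   by 2**-bits. */
static double scaled_without_truncation(uint64_t n, int bits)
{
    assert(bits > 0 && bits <= 64);
    return ldexp((double) n, -bits);
}

/* Case where only the first 53 bits are considered: the low-order
   bits are cleared first, so the conversion is exact. */
static double scaled_with_truncation(uint64_t n, int bits)
{
    assert(bits > 0 && bits <= 64);
    if (bits > 53) {
        int drop = bits - 53;
        n = (n >> drop) << drop;     /* round towards zero */
    }
    return ldexp((double) n, -bits);
}
```

With n = 2**54-1 and bits = 54, the first function returns exactly 1.0
(2**54-1 is a tie that rounds to 2**54), while the second returns
1 - 2**-53.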

> But since the inter-procedural optimizer gets to optimize the
> code from both pools at once, it sees that there is "no longer a
> need" for this precision and optimizes for efficiency.

No, in theory, LTO mustn't change the behavior, or there's a bug in
the compiler. On x86_64, everything should be fine. On 32-bit x86,
code may be affected by GCC bug 323 (probably not specific to GCC),
depending on the context. But this shouldn't matter (see above).

> These results that are produced are not wrong, mathematically or
> logically,

They are wrong because this is *not* what is documented.
And the test itself is correct.

-- 
Vincent Lefèvre <vincent at vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)