GMP 5.1.1: Valgrind reports incorrect read in __gmpn_copyd (called from __gmpz_mul_2exp)

Mon Mar 4 18:17:01 CET 2013

FAIL, I sent this only to Marco.

2013/2/23  <bodrato at mail.dm.unipi.it>:
> Ciao,
>
>> If the code wants to access memory as if it were allocated up to a
>> 16-byte-aligned boundary, why not allocate enough up to a
>> 16-byte-aligned boundary?
>
> The code only loads into a register (and ignores) some bytes in the same
> cache line the CPU has read from memory. It does not need to have them
> reserved "as if it were allocated". Allocating them would waste memory.
>
> By the way, is there a way to ask malloc for a 16-byte-aligned block?
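
To make the over-read concrete, here is a minimal sketch in C of the
address arithmetic involved. It assumes 64-byte cache lines and
4096-byte pages (typical for x86-64, but an assumption on my part), and
the helper names chunk_start, chunk_within_unit and overread_bytes are
made up for illustration; they are not GMP functions.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the aligned 16-byte chunk that contains addr. */
static uintptr_t chunk_start(uintptr_t addr)
{
    return addr & ~(uintptr_t) 15;
}

/* Nonzero if the whole aligned 16-byte chunk around addr lies inside a
   single unit of unit_size bytes (e.g. 64 for a cache line, 4096 for a
   page). Units are assumed to start at multiples of unit_size. */
static int chunk_within_unit(uintptr_t addr, uintptr_t unit_size)
{
    uintptr_t first = chunk_start(addr);
    uintptr_t last = first + 15;
    assert(unit_size != 0 && unit_size % 16 == 0);
    return first / unit_size == last / unit_size;
}

/* Number of bytes an aligned 16-byte load reads past a block whose
   (exclusive) end address is end. */
static unsigned overread_bytes(uintptr_t end)
{
    return (unsigned) ((16 - end % 16) % 16);
}
```

Since 16 divides both 64 and 4096, chunk_within_unit() is nonzero for
every address with those unit sizes, so such a load cannot fault on its
own. A memory checker that tracks individual bytes still sees the
overread_bytes() extra bytes as unallocated.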

Maybe GMP could have configure look for a function to align limb
arrays at multiples of 16 (or higher) on systems where SIMD wants to
operate on aligned data, to avoid a speed penalty from working on
partial data at loop start. When such a function is available, GMP
could as well ensure alignment of the end of the allocated memory. If
no alignment functions are available, there's not much we can do,
short of telling the memory checker to shut up about it. If the start
of each allocated block is 16-aligned, then making the end 16-aligned
would not waste memory - it would just avoid 8-byte gaps. There would
be some additional slack, however, in programs that frequently
alternate between GMP allocations and their own allocations of lesser
alignment. I don't know if this is a serious concern.
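
A rough sketch of what such allocation functions might look like, in C.
It assumes posix_memalign() is available; the names GMP_BLOCK_ALIGN,
round_up_block and aligned_block_* are hypothetical, not part of GMP.
The three function signatures follow what mp_set_memory_functions()
expects, so they could be registered without touching GMP itself.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GMP_BLOCK_ALIGN 16   /* hypothetical name, not a GMP macro */

/* Round size up to a multiple of GMP_BLOCK_ALIGN (at least one unit). */
static size_t round_up_block(size_t size)
{
    if (size == 0)
        size = 1;
    return (size + GMP_BLOCK_ALIGN - 1) / GMP_BLOCK_ALIGN * GMP_BLOCK_ALIGN;
}

/* Allocate a block whose start and end are both 16-byte aligned. */
static void *aligned_block_alloc(size_t size)
{
    void *p = NULL;
    if (posix_memalign(&p, GMP_BLOCK_ALIGN, round_up_block(size)) != 0)
        abort();   /* GMP's allocation functions must not return NULL */
    assert((uintptr_t) p % GMP_BLOCK_ALIGN == 0);
    return p;
}

/* realloc() does not preserve alignment, so allocate, copy and free. */
static void *aligned_block_realloc(void *old, size_t old_size, size_t new_size)
{
    void *p = aligned_block_alloc(new_size);
    if (old != NULL) {
        memcpy(p, old, old_size < new_size ? old_size : new_size);
        free(old);
    }
    return p;
}

static void aligned_block_free(void *p, size_t size)
{
    (void) size;   /* the size is not needed by free() */
    free(p);
}

/* Registration, assuming <gmp.h> is included and this runs before any
   other GMP call:
   mp_set_memory_functions(aligned_block_alloc, aligned_block_realloc,
                           aligned_block_free);                        */
```

With this, a request for 24 bytes gets 32, so a 16-byte load covering
the last limb stays inside the allocation. The copy in the realloc
replacement is a cost of the sketch, not something the approach requires.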

>> I.e., instead of telling the memory checker
>> to ignore those invalid accesses, actually making them valid
>
> You are right, this is not a very good strategy. The best one I can see is:
>  - patch Valgrind so that --partial-loads-ok=yes works also for SSE etc...;
>  - patch Valgrind so that this option can be selectively activated on a
> per-function basis;
>  - write a short list (for each ABI) of functions that need this option.

The advantage of this approach is that it detects faulty accesses
which aligning the end of allocated space would hide, e.g., writing
beyond what would have been the un-aligned end. The disadvantage is
that it is specific to one memory checker (Valgrind), and needs new
functionality added to it.

So far I think there are 4 options:

1. Just use --partial-loads-ok=yes. Easy, works with existing Valgrind
patch, nothing to change in GMP, but weakens the checks. Other memory
checkers may still complain.
2. --partial-loads-ok=yes on a per-function basis. Stricter checking
than 1., but needs to get changes into Valgrind. Other memory checkers
may still complain.
3. Align end of allocated memory. Needs aligned malloc, some changes
to GMP, and checks are weaker than 1. Works with all memory checkers,
though.
4. Align end of allocated memory, use client requests to tell Valgrind
what to check. Perfect strictness with Valgrind if desired, no false
positives with any checker, but needs aligned malloc and probably a
lot of scattered changes to GMP to add the client requests.
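
For option 4, one possible reading is sketched below in C: the padding
between the requested size and the aligned end is marked inaccessible
by default, and a routine that deliberately over-reads opens it up
around its SIMD loop. VALGRIND_MAKE_MEM_NOACCESS and
VALGRIND_MAKE_MEM_DEFINED are Memcheck client requests from
<valgrind/memcheck.h>; the helper names tail_padding, padding_forbid
and padding_allow, and the SKETCH_HAVE_MEMCHECK macro, are hypothetical.
The fallback macros are only there so the sketch builds without the
Valgrind headers.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define SKETCH_HAVE_MEMCHECK 1
#  endif
#endif

#ifndef SKETCH_HAVE_MEMCHECK
/* No Valgrind headers found: make the client requests no-ops. */
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len) ((void) (addr), (void) (len), 0)
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len)  ((void) (addr), (void) (len), 0)
#endif

/* Bytes between the requested size and the next 16-byte boundary. */
static size_t tail_padding(size_t size)
{
    return (16 - size % 16) % 16;
}

/* Tell Memcheck that the padding after the requested size is off
   limits, so stray reads and writes there are still reported. */
static void padding_forbid(void *block, size_t size)
{
    assert(block != NULL);
    (void) VALGRIND_MAKE_MEM_NOACCESS((char *) block + size,
                                      tail_padding(size));
}

/* Tell Memcheck that the padding may be read, e.g. right before a
   loop that loads whole aligned 16-byte chunks. */
static void padding_allow(void *block, size_t size)
{
    assert(block != NULL);
    (void) VALGRIND_MAKE_MEM_DEFINED((char *) block + size,
                                     tail_padding(size));
}
```

Outside Valgrind the client requests do nothing, so other checkers just
see an ordinary padded allocation. The scattered changes would be the
padding_allow()/padding_forbid() pairs around every over-reading
routine.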

Any other ideas? Note: Marco pointed out to me that the memory
alignment requirement of the last two can already be implemented by
using custom allocation methods. This is certainly true, but I think
it would be worthwhile to have the *default* build of GMP avoid
spurious Valgrind errors. If you are familiar with GMP, it's easy
enough to hack the code to make the error messages go away, but
considering how widely used GMP is now, it would be nice if users
didn't have to.

> In the meantime, there are some possible workarounds. I repeat a line you
> may have missed in my previous message.
>
> Try to "rm $(grep -rl fastsse mpn/x86_64)", before "./configure" .

I saw it but have not tried it yet. In fact, the only "invalid read"
errors I saw so far were all from copy[di](), so hiding the SSE
versions of those might perhaps be enough. I think it would be
worthwhile to avoid spurious Valgrind (or other memory checker) errors
with a plain vanilla GMP, though, as not everyone may be able or
willing to link against a modified GMP for debugging purposes.

Alex