bug in gmp_fprintf?

Vincent Lefevre vincent at vinc17.net
Tue Dec 1 16:07:50 UTC 2015


On 2015-12-01 16:12:53 +0100, Leif Leonhardy wrote:
> Vincent Lefevre wrote:
> > Note that my proposal would be backward compatible. Well, almost.
> > It is not clear what GMP should do if the C printf fails because of
> > the overflow on the return value (as mentioned above).
> 
> Well, probably pass the error code through.

Not exactly. In the C printf, there is no error code. A negative
value signals an error, but this value is unspecified. In GMP, the
value -1 (and only this value) signals a write error.

> And then it may not be obvious where the error comes from

In POSIX, one can also check errno, which gives more information about
the error:

  http://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html

But AFAIK, that's POSIX only.
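
As a rough sketch of such a check (the helper name print_line_checked is made up for illustration, and reading errno after a failed fprintf is a POSIX assumption, not something ISO C guarantees):

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: print S followed by a newline on FP.
   Returns the value from fprintf.  On failure, *ERR receives errno
   (under POSIX this can be e.g. EOVERFLOW when the result would
   exceed INT_MAX); on success, *ERR is set to 0. */
static int
print_line_checked (FILE *fp, const char *s, int *err)
{
  int ret;

  errno = 0;
  ret = fprintf (fp, "%s\n", s);
  /* Test for any negative value, not for -1 specifically. */
  *err = ret < 0 ? errno : 0;
  return ret;
}
```

The failure test is "ret < 0" because the C standard leaves the exact negative value unspecified.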

So, in case of error in the C printf, GMP could return -1.
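
A minimal sketch of that mapping, assuming a hypothetical wrapper (normalize_result and wrapped_fprintf are illustrative names, not GMP functions, and this is not how GMP is implemented):

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper, not part of GMP: collapse any negative value
   returned by the C library into the single value -1. */
static int
normalize_result (int ret)
{
  return ret < 0 ? -1 : ret;
}

/* Hypothetical wrapper around vfprintf that applies the mapping. */
static int
wrapped_fprintf (FILE *fp, const char *fmt, ...)
{
  va_list ap;
  int ret;

  va_start (ap, fmt);
  ret = vfprintf (fp, fmt, ap);
  va_end (ap);
  return normalize_result (ret);
}
```

With this, the caller only ever sees -1 on error, whatever negative value the C library chose.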

> (i.e., I'm not sure whether e.g. -2 is already used for other
> conditions).

The -2 I suggested was for gmp_*printf. It is not already used.
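
For illustration only, a caller could then tell the cases apart as follows. This assumes my proposal were adopted: only -1 is documented by GMP today, and the enum and function names here are made up.

```c
/* Hypothetical classification of a gmp_*printf return value under
   the proposal: -1 is the documented write error, -2 is the
   suggested (not existing) "more than INT_MAX characters" value. */
enum print_outcome
{
  PRINT_OK,
  PRINT_WRITE_ERROR,
  PRINT_TOO_LONG,
  PRINT_OTHER_ERROR
};

static enum print_outcome
classify_return (int ret)
{
  if (ret >= 0)
    return PRINT_OK;
  if (ret == -1)
    return PRINT_WRITE_ERROR;
  if (ret == -2)
    return PRINT_TOO_LONG;
  return PRINT_OTHER_ERROR;  /* any other negative value */
}
```

Code that only tests "ret < 0" keeps working unchanged, which is why the proposal is almost backward compatible.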

> But as I said, more importantly I wouldn't consider a successful
> *printf() call writing more than INT_MAX characters "some kind of
> error condition", hence not return any negative value, since most
> programs (if at all) check exactly the latter condition I think,
> with the exception of *snprintf() calls where it's more likely a
> non-negative return value is used as well.

A *printf() call writing more than INT_MAX characters is not
successful. I would say that it is undefined behavior in C99
(and probably in C11 too, but I don't have the final standard)
because the standard does not specify what happens in this case.
In POSIX, it is explicitly described as a failure (but with
well-defined behavior, so that the user can do something useful
with this).

> > The GMP documentation also says:
> > 
> >   All the functions can return -1 if the C library `printf' variant in
> >   use returns -1, but this shouldn't normally occur.
> > 
> > I wonder what is meant by that. In case of failure, the printf() of
> > the C library returns a negative value, not necessarily -1.
> 
> Yep, see above.

You are not answering. The GMP documentation says "if the C library
`printf' variant in use returns -1", while there is nothing in the C
standard about the return value -1 specifically; this makes no sense.

> >> returning 'long' for the whole family of *printf()s, e.g. with an 'l'
> >> suffix (or probably prefix).
> > 
> > As you noticed, a "long" may not be sufficient. On 32-bit machines, it
> > is typically of the same size as an int: 32 bits.
> 
> Sure.  So on systems where LONG_MAX isn't bigger than INT_MAX, either
> printing more than INT_MAX characters wouldn't be supported (the
> undocumented status quo even on 64-bit systems), or one would need 'long
> long' (and in addition functions with an 'll' suffix, say), or 'ssize_t'
> and SSIZE_MAX, but the latter again isn't necessarily wider.

For the current functions, it would be better to fix them and document
the behavior, as I've said. I don't see any good reason not to support
outputs of more than INT_MAX characters. Producing such an output is
not a clear user error, and undefined behavior is a bad choice.

> >> The original functions could become wrappers of those, returning just
> >> the number of characters written modulo INT_MAX (to be documented of
> >> course), and the usual [negative] value(s) in case of an error.  (I.e.,
> >> the return type would remain 'int' for these, such that "ordinary" code
> >> wouldn't need any changes.)
> > 
> > It is a very bad idea. For instance, if the function returns 17,
> > one wouldn't know whether the number of characters written is 17
> > or something much higher. So, the return value would not be truly
> > informative *even in common cases*. Sticking to INT_MAX would be
> > much better. Or a specific negative value as I suggested (-1 is
> > reserved for write errors in GMP, so that this is fine).
> 
> I wouldn't say (intentionally) writing more than INT_MAX characters is a
> common case

I've never said that!

> (otherwise the original bug would presumably have been
> reported earlier), and likewise it's IMHO rather unlikely to
> accidentally write exactly N*INT_MAX characters more than intended.
> 
> But while returning INT_MAX when >=INT_MAX characters have been written
> isn't much more informative than returning a modulus either, it's
> probably the better solution, because similar is more commonly used.

This solution is more informative than the modulus solution because
in the common cases, i.e. if the function writes fewer than INT_MAX
characters, one knows the number of characters written. For instance,
if one gets 17, then this means that 17 characters have been written.
With your modulus solution, if one gets 17, then one doesn't know
whether 17 characters have been written or INT_MAX+17, or more. There
isn't always enough context for the caller to decide. For instance,
the mpfr_*printf functions could be affected. Here's a part of the
current MPFR code (in vasprintf.c), where gmp_vasprintf is involved:

static int
sprntf_gmp (struct string_buffer *b, const char *fmt, va_list ap)
{
  int length;
  char *s;

  length = gmp_vasprintf (&s, fmt, ap);
  if (length > 0)
    buffer_cat (b, s, length);

  mpfr_free_str (s);
  return length;
}

With the modulus solution, this code is plainly wrong, and there is
no way to detect that length is not the true length. In other
words, gmp_vasprintf could not be used at all (well, possibly
except with the %n workaround, so that length would be used only
to test whether a write error occurred, i.e. whether it is -1).
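
To make the difference concrete, here is a small sketch of the two policies applied to a true character count. The function names are made up for illustration, and the modulus is taken to be INT_MAX, following the wording quoted above.

```c
#include <limits.h>

/* Saturating policy: a result below INT_MAX is the exact count;
   INT_MAX itself means "INT_MAX characters or more". */
static int
length_saturated (unsigned long long n)
{
  return n >= (unsigned long long) INT_MAX ? INT_MAX : (int) n;
}

/* Modulus policy: every result is ambiguous, since n and
   n + INT_MAX map to the same value. */
static int
length_modulo (unsigned long long n)
{
  return (int) (n % (unsigned long long) INT_MAX);
}
```

Both functions return 17 for a true count of 17, but for INT_MAX+17 the modulus version still returns 17, while the saturating version returns INT_MAX, which the caller can test for.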

-- 
Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

