bug in gmp_fprintf?
Leif Leonhardy
not.really at online.de
Tue Dec 1 15:12:53 UTC 2015
Vincent Lefevre wrote:
> On 2015-12-01 03:04:49 +0100, Leif Leonhardy wrote:
>> Vincent Lefevre wrote:
>>> On 2015-11-30 17:57:13 +0100, Torbjörn Granlund wrote:
>>>> I should add that a big problem with gmp_*printf remains:
>>>>
>>>> When printing more than INT_MAX characters, the return value makes
>>>> little sense.
>>
>> The existing functions should IMHO never return a negative value upon
>> success.
>
> Printing more than INT_MAX characters can be regarded as some kind
> of failure (distinct from write error).
>
>>> C's printf has the same problem, in particular for GNU libc:
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=5424
>>>
>>> (a bug I reported 8 years ago). This has been solved by returning
>>> a negative value and setting errno to EOVERFLOW so that it is not
>>> confused with a real error. However this is less a problem in the
>>> standard printf because it does not have to deal with high-precision
>>> numbers, so that in practice, there are workarounds to avoid this
>>> overflow if one really wants to.
>>>
>>>> In order to fix that, we need to change the return value of these
>>>> functions from int to e.g. long. But that's a change which is not 100%
>>>> source or binary compatible.
>>>
>>> Alternatively, you could decide to return -1 in case of true error
>>> and -2 in case of overflow on the return value (since you may not
>>> want to use errno). To get the number of characters written, the
>>> user could still use %n with an adequate length modifier, so that
>>> there is no loss of information.
>>
>> For backwards compatibility, I'd rather introduce /new/ functions
>
> Note that my proposal would be backward compatible. Well, almost.
> It is not clear what GMP should do if the C printf fails because of
> the overflow on the return value (as mentioned above).

Well, probably pass the error code through. But then it may not be
obvious where the error comes from (i.e., I'm not sure whether -2, say,
is already used for other conditions). More importantly, as I said, I
wouldn't consider a successful *printf() call that writes more than
INT_MAX characters "some kind of error condition", so it shouldn't
return a negative value. Most programs, if they check the return value
at all, only test for a negative result, I think. The exception is
*snprintf() calls, where a non-negative return value is more likely to
be used as well.

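To illustrate the concern, here is a minimal sketch. It assumes the
proposed convention (-1 for a write error, -2 for an overflowing
count); the enum and both function names are made up for this example
and are not part of GMP or the C library.

```c
/* Hypothetical status codes, for illustration only. */
enum print_status { PRINT_OK, PRINT_WRITE_ERROR, PRINT_COUNT_OVERFLOW };

/* What a caller aware of the proposed convention could do:
   -2 means "output written, but the count does not fit in an int". */
static enum print_status classify_proposed(int ret)
{
    if (ret >= 0)
        return PRINT_OK;
    if (ret == -2)
        return PRINT_COUNT_OVERFLOW;
    return PRINT_WRITE_ERROR;   /* -1 or any other negative value */
}

/* What most existing callers do (if they check at all):
   any negative value is treated as a failed call. */
static int typical_check_failed(int ret)
{
    return ret < 0;
}
```

With this convention, typical_check_failed(-2) reports a failure even
though all the output was written, which is exactly the case I would
like to avoid.
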
> The GMP documentation also says:
>
> All the functions can return -1 if the C library `printf' variant in
> use returns -1, but this shouldn't normally occur.
>
> I wonder what is meant by that. In case of failure, the printf() of
> the C library returns a negative value, not necessarily -1.

Yep, see above.

>> returning 'long' for the whole family of *printf()s, e.g. with an 'l'
>> suffix (or probably prefix).
>
> As you noticed, a "long" may not be sufficient. On 32-bit machines, it
> is typically of the same size as an int: 32 bits.

Sure. So on systems where LONG_MAX isn't bigger than INT_MAX, either
printing more than INT_MAX characters wouldn't be supported (the
undocumented status quo even on 64-bit systems), or one would need
'long long' (and in addition functions with an 'll' suffix, say), or
'ssize_t' and SSIZE_MAX, but the latter again isn't necessarily wider.

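As a rough sketch of that fallback, the wider type could be selected at
compile time. The names print_count_t and PRINT_COUNT_MAX are
hypothetical and only serve this example; nothing like them exists in
GMP.

```c
#include <limits.h>

/* Pick a return type that can hold more than INT_MAX characters.
   Use 'long' where it is wider than 'int', otherwise fall back to
   'long long' (C99). */
#if LONG_MAX > INT_MAX
typedef long print_count_t;
#define PRINT_COUNT_MAX LONG_MAX
#else
typedef long long print_count_t;
#define PRINT_COUNT_MAX LLONG_MAX
#endif
```

On a typical 32-bit system (and on 64-bit Windows) the second branch
would be taken, since 'long' is 32 bits there.
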
>> The original functions could become wrappers of those, returning just
>> the number of characters written modulo INT_MAX (to be documented of
>> course), and the usual [negative] value(s) in case of an error. (I.e.,
>> the return type would remain 'int' for these, such that "ordinary" code
>> wouldn't need any changes.)
>
> It is a very bad idea. For instance, if the function returns 17,
> one wouldn't know whether the number of characters written is 17
> or something much higher. So, the return value would not be truly
> informative *even in common cases*. Sticking to INT_MAX would be
> much better. Or a specific negative value as I suggested (-1 is
> reserved for write errors in GMP, so that this is fine).

I wouldn't say (intentionally) writing more than INT_MAX characters is a
common case (otherwise the original bug would presumably have been
reported earlier), and likewise it's IMHO rather unlikely that a program
accidentally writes exactly N*INT_MAX characters more than intended.

But while returning INT_MAX when >=INT_MAX characters have been written
isn't much more informative than returning the count modulo INT_MAX,
it's probably the better solution, because that convention is more
commonly used elsewhere.

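For comparison, a small sketch of the two ways a wrapper could map a
wide character count to 'int'. Both helper names are hypothetical, and
the count is assumed to be non-negative (errors would be handled
separately).

```c
#include <limits.h>

/* Saturate: anything >= INT_MAX is reported as INT_MAX. */
static int count_saturated(long long n)
{
    return n >= INT_MAX ? INT_MAX : (int) n;
}

/* Wrap around: report the count modulo INT_MAX. */
static int count_modulo(long long n)
{
    return (int) (n % INT_MAX);
}
```

For a true count of INT_MAX + 17, count_modulo() returns 17, the same
as for a count of 17, whereas count_saturated() returns INT_MAX, which
at least signals "INT_MAX or more".
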
-leif