Add mpz_inp_str to mini-gmp

Niels Möller nisse at lysator.liu.se
Sat Jul 9 20:21:46 UTC 2016


Austyn Krutsinger <akrutsinger at gmail.com> writes:

> As I have it now, the only code that is duplicated between my mpz_inp_str
> and mpz_set_str is the removal of any leading white spaces between in the
> string.

I think that duplication is fine. One could add a helper function to eat
whitespace, but I don't think that really makes things simpler.

I'd like to hear Torbjörn's opinion on using white space as terminator,
i.e., reading a file containing "12345678 " with base 8 will result in
an error, which I suspect differs from the behavior of mpz_inp_str in
the real gmmp.

Some more detailed comments further below.

> I have updated mini-gmp to support up to base 62 also.

Please do this as a separate patch.

> I'm really not sure of the optimal way to
> allocate then realloc memory for the input buffer. I suppose some kind of
> statistical analysis of the size of strings read in from file streams would
> be ideal, but since I don't have such statistics, I have arbitrarily chosen
> the sizes you see.

I think typical cases will be quite small. I'd suggest an initial size
of 100 characters, and then increasing the size by a factor of 3/2
at a time. Possibly with some arbitrary upper limit to fail in a
predictable way for a malicious input file.

> int
> mpz_inp_str (mpz_t x, FILE *stream, int base)
> {
>   int c;
>   size_t nread = 0;
>   size_t size = 1000;
>
>   if (stream == 0)

Either "!stream",  or "stream == NULL".

>     stream = stdin;
>
>   char *buf = (char *) gmp_xalloc (size);

The (char *) cast is unnecessary (unless we try to support compilation
as C++, do we?).

>   /* Skip leading whitespace */
>   do
>     {
>       c = getc (stream);
>     } while (isspace (c));
>
>   /* Read input until end of file or a space character */
>   while ((c != EOF) && (isspace (c) == 0))

I think  "!isspace (c)" is clearer.

In in the case c == EOF, I think one ought to also have an

  if (ferror (stream)) 
    {
      gmp_free (buf);
      return 0;
    }

Not sure how the real gmp handles i/o errors in mpz_inp_str, but
returning success seems dangerous.

>     {
>       if (nread >= size - 1)

This looks prone to off-by-one errors (even if I believe it is correct).
Maybe let size be the maximum number of digits, with an actual
allocation of size+1 to accomodate the terminating NUL character?

>         {
>           /* Increase input buffer size */
>           size_t old_size = size;
>           size += 100;

I think a multiplicative increase is appropriate. E.g., "size = 3*size/2;".

>           buf = (char *) gmp_xrealloc (buf, old_size, size);

You don't need to bother with passing old_size, mini-gmp is documented
to always pass 0 for this argument (see mini-gmp/README). BTW, the
restriction to base <= 36 is also documented there.

>         }
>
>         buf[nread++] = c;
>
>       c = getc (stream);
>     }
>
>   buf[nread] = '\0';  /* Ensure null terminated string */

"NUL terminated".

>   ungetc (c, stream);

Not sure if ungetc is safe when c == EOF?

>   c = mpz_set_str(x, buf, base);

Don't reuse the c variable, use a separate int res.

>   gmp_free (buf);
>   /* 0 on error else number of characters read excluding null-terminator */

Drop " excluding...", or change "null-" to "NUL ".

>   return (c == -1 ? 0 : nread);
> }

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list