Add mpz_inp_str to mini-gmp

Fri Jul 8 15:18:08 UTC 2016

On Fri, Jul 8, 2016 at 4:43 PM, Niels Möller <nisse at lysator.liu.se> wrote:

> nisse at lysator.liu.se (Niels Möller) writes:
>
> > tg at gmplib.org (Torbjörn Granlund) writes:
> >
> >> I haven't looked at mini-gmp's mpz_set_str, but I assume it is O(n^2),
> >> so not much is lost by using a trivial implementation like the one
> >> above.
> >
> > It tries to be O(n) when base is a power of two.
>
> And I should add that for non-powers-of-two, it's also a bit more
> sophisticated than what you sketch, for good or bad. If m is the largest
> integer for which bb = base^m fits in a limb, the inner loop collects m
> digits at a time, followed by an outerloop doing an mpn_mul_1 by bb and
> an mpn_add_1.
>
> So the O(n^2) constant is mostly independent of the base, and n is the
> limb size, not the size in digits.
>
> The interesting functions are mpn_set_str_bits and mpn_set_str_other.
>
> And to limit code-duplication between mpz_inp_str and mpz_set_str, one
> approach (which would work regardless of how the input termination
> question is solved) is to use a shared helper-function
> convert_to_digit_value, collect into a buffer, and then call mpn_set_str
> which is responsible for the above tricks.
>

As I have it now, the only code that is duplicated between my mpz_inp_str
and mpz_set_str is the removal of any leading white spaces between in the
string. Since mpz_set_str can be called from functions other than
mpz_inp_str that check is still going to have to remain. As you alluded to
earlier, with the ambiguity of the GMP documentation, we would still need
both mpz_inp_str and mpz_set_str to handle strings their own way until a
standard is described in documentation.

I have updated mini-gmp to support up to base 62 also. You can find the
attached patch for review. This patch also includes the below
implementation for mpz_inp_str. I'm really not sure of the optimal way to
allocate then realloc memory for the input buffer. I suppose some kind of
statistical analysis of the size of strings read in from file streams would
be ideal, but since I don't have such statistics, I have arbitrarily chosen
the sizes you see.

int
mpz_inp_str (mpz_t x, FILE *stream, int base)
{
  int c;
  size_t nread = 0;
  size_t size = 1000;

  if (stream == 0)
    stream = stdin;

  char *buf = (char *) gmp_xalloc (size);

  /* Skip leading whitespace */
  do
    {
      c = getc (stream);
    } while (isspace (c));

  /* Read input until end of file or a space character */
  while ((c != EOF) && (isspace (c) == 0))
    {
      if (nread >= size - 1)
        {
          /* Increase input buffer size */
          size_t old_size = size;
          size += 100;
          buf = (char *) gmp_xrealloc (buf, old_size, size);
        }

        buf[nread++] = c;

      c = getc (stream);
    }

  buf[nread] = '\0';  /* Ensure null terminated string */
  ungetc (c, stream);

  c = mpz_set_str(x, buf, base);
  gmp_free (buf);
  /* 0 on error else number of characters read excluding null-terminator */
  return (c == -1 ? 0 : nread);
}

--Austyn

> Regards,
> /Niels
>
> --
> Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
> Internet email is subject to wholesale government surveillance.
> _______________________________________________
> gmp-devel mailing list
> gmp-devel at gmplib.org
> https://gmplib.org/mailman/listinfo/gmp-devel
>

-- 

--Austyn
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Allow-up-to-base-62-and-added-mpz_inp_str.patch
Type: application/octet-stream
Size: 6388 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20160708/a00ab67d/attachment.obj>