String conversion anomalies
Jerry James
james at xemacs.org
Wed Apr 14 21:20:30 CEST 2004
First, the requested version information from my Pentium 4 box running
RedHat 9.
GMP version 4.1.2, as distributed with RedHat 9 (gmp-4.1.2-2)
GCC version 3.2.2, as distributed with RedHat 9 (gcc-3.2.2-5)
uname -a says:
Linux diannao.ittc.ku.edu 2.4.20-30.9 #1 Thu Feb 26 11:20:56 CST 2004 i686 i686 i386 GNU/Linux
The use of GMP within XEmacs is going quite well. I am still getting a
trickle of bug reports, though. Since most of them have to do with
converting strings to numbers, I packaged up a bunch of them into one
bug report. The basic problem is that in some cases, GMP does not
convert strings like strtol and strtod do. I have written a program
that I will attach here to illustrate the differences, and some actual
bugs. Some of the bugs may be documentation bugs, but at least one is a
library bug.
-------------- next part --------------
/* Test GMP's ability to read slightly nonstandard strings as numbers.
For each test, we place a default value into the variable before attempting
to read each string. This lets us detect when GMP leaves the old value
untouched (or partially untouched in the case of ratios). The default
values are:
mpz_t: 79
mpq_t: 79/2
mpf_t: 79.1
*/
#include <gmp.h>
#include <stdlib.h>

/* The *_set_str functions return 0 if the entire string is a valid number
   and -1 otherwise, so a nonzero status is reported as INVALID. */
#define TEST_MPZ_T(string) do { \
    int status; \
    mpz_set_ui (bignum, 79UL); \
    status = mpz_set_str (bignum, string, 0); \
    gmp_printf ("Convert \"%s\": integer = %d,%s mpz_t = %Zd\n", \
                string, atoi (string), status ? " INVALID" : "", bignum); \
  } while (0)

#define TEST_MPQ_T(string) do { \
    int status; \
    mpq_set_ui (ratio, 79UL, 2UL); \
    status = mpq_set_str (ratio, string, 0); \
    mpq_canonicalize (ratio); \
    gmp_printf ("Convert \"%s\":%s mpq_t = %Qd\n", string, \
                status ? " INVALID" : "", ratio); \
  } while (0)

#define TEST_MPF_T(string) do { \
    int status; \
    mpf_set_d (bigfloat, 79.1); \
    status = mpf_set_str (bigfloat, string, 0); \
    gmp_printf ("Convert \"%s\": float = %f,%s mpf_t = %Ff\n", \
                string, atof (string), status ? " INVALID" : "", bigfloat); \
  } while (0)

int main (void)
{
  mpz_t bignum;
  mpq_t ratio;
  mpf_t bigfloat;

  /* Bignum tests */
  mpz_init (bignum);
  TEST_MPZ_T ("");
  TEST_MPZ_T ("+10");
  TEST_MPZ_T ("54321 etc");
  mpz_clear (bignum);

  /* Ratio tests */
  mpq_init (ratio);
  TEST_MPQ_T ("");
  TEST_MPQ_T ("+3/4");
  TEST_MPQ_T ("3/+4");
  TEST_MPQ_T ("3/4 etc");
  TEST_MPQ_T ("3/-4");
  TEST_MPQ_T ("7/8/9");
  mpq_clear (ratio);

  /* Floating point tests */
  mpf_set_default_prec (256UL);
  mpf_init (bigfloat);
  TEST_MPF_T ("");
  TEST_MPF_T ("+123.456");
  TEST_MPF_T ("123.456 etc");
  TEST_MPF_T ("1.23456E02");
  TEST_MPF_T ("1.23456E+02");
  TEST_MPF_T ("12.34E+03");
  mpf_clear (bigfloat);

  return 0;
}
-------------- next part --------------
Each test puts a default number into the associated variable before
attempting to read from the string, so that I can tell when the value of
the variable was not changed. The defaults are 79 for mpz_t, 79/2 for
mpq_t, and 79.1 for mpf_t. The output I get from running this program
is (with line numbers inserted for the commentary below):
1 Convert "": integer = 0, INVALID mpz_t = 79
2 Convert "+10": integer = 10, INVALID mpz_t = 79
3 Convert "54321 etc": integer = 54321, INVALID mpz_t = 79
4 Convert "": INVALID mpq_t = 79
5 Convert "+3/4": INVALID mpq_t = 79/2
6 Convert "3/+4": INVALID mpq_t = 3/2
7 Convert "3/4 etc": INVALID mpq_t = 3/2
8 Convert "3/-4": mpq_t = -3/4
9 Convert "7/8/9": INVALID mpq_t = 7/2
10 Convert "": float = 0.000000, INVALID mpf_t = 79.100000
11 Convert "+123.456": float = 123.456000, INVALID mpf_t = 79.100000
12 Convert "123.456 etc": float = 123.456000, mpf_t = 123.456000
13 Convert "1.23456E02": float = 123.456000, mpf_t = 123.456000
14 Convert "1.23456E+02": float = 123.456000, mpf_t = 123.456000
15 Convert "12.34E+03": float = 12340.000000, mpf_t = 12340.000000
Commentary:
#1-#3: The documentation says that a zero return value means that all
characters of the string are valid for a number of the indicated type
and base. It does not indicate what happens when not all characters are
valid. The strtol function attempts to convert as much of the string as
it can; mpz_set_str appears to just give up and do nothing. Personally,
I find the strtol behavior much nicer, because in XEmacs, I often know
that a string starts with a number, and I want to convert that part of
the string into a number. I know the rest is nonnumeric, but I don't
care. This behavior of GMP means that I have to hunt down the last
numeric character, save the next character, put a null there, call
mpz_set_str, then put the old next character back again.
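The workaround described for #1-#3 can be sketched roughly as follows. This is only an illustration under some assumptions: copy_integer_prefix is a made-up helper name, it recognizes decimal digits only (not the 0x or 0 prefixes that base 0 would accept), it does not skip leading whitespace the way strtol does, and it copies the prefix into a separate buffer instead of temporarily overwriting a character in the original string. It also drops a leading plus sign, which covers the case in #2.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of GMP): copy the decimal integer at the
   start of SRC into DST, which has room for DSTLEN bytes.  A leading '+'
   is consumed but not copied; a leading '-' is copied.  Returns the number
   of characters of SRC consumed, or 0 if SRC does not start with an
   integer or the prefix does not fit.  On failure DST is set to "". */
size_t
copy_integer_prefix (const char *src, char *dst, size_t dstlen)
{
  size_t in = 0, out = 0, digits_start;

  if (dstlen == 0)
    return 0;
  dst[0] = '\0';

  if (src[in] == '+')
    in++;
  else if (src[in] == '-')
    {
      if (out + 1 >= dstlen)
        return 0;
      dst[out++] = src[in++];
    }

  digits_start = in;
  while (isdigit ((unsigned char) src[in]))
    {
      if (out + 1 >= dstlen)      /* always keep room for the terminator */
        {
          dst[0] = '\0';
          return 0;
        }
      dst[out++] = src[in++];
    }

  if (in == digits_start)         /* sign only, or no digits at all */
    {
      dst[0] = '\0';
      return 0;
    }

  dst[out] = '\0';
  return in;
}
```

A caller would then hand the buffer to GMP, for example `if (copy_integer_prefix (text, buf, sizeof buf)) mpz_set_str (bignum, buf, 10);`, and could use the returned length to continue scanning the rest of the string.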
#4: Same as #1 -- empty strings are invalid, unlike with strtol, which
treats them as 0.
#5: Same as #2 -- leading plus signs are apparently not allowed,
although I do not see that mentioned in the documentation anywhere.
This would be nice to have, because time strings often contain plus
signs. (I became aware of this behavior because a Gnus user reported
that he was getting "invalid time string" messages when using the GMP
bignum support.)
#6: Notice what happened here. GMP reported that the string was invalid
... AFTER changing the numerator! It should either leave the numerator
alone or (preferably) handle the + syntax.
#7: Same as #3.
#8: A minus sign on the denominator is accepted, which suggests that
the plus sign in #6 ought to be accepted as well.
#9: Again, the numerator was changed before GMP decided that the number
was invalid.
#10: Same as #1 and #4.
#11: Same as #2 and #5.
#12: This shows that mpf_set_str is willing to ignore trailing junk and
not report that the result is invalid, unlike mpz_set_str and
mpq_set_str.
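Since the three functions disagree about trailing junk, a caller who wants one consistent policy can check for it before calling GMP at all. The sketch below uses strtod's end pointer for that purpose. Both helper names are made up for illustration, and note the assumption involved: strtod's idea of a number (leading whitespace, a leading plus sign, and so on) is broader than what mpf_set_str accepted in the tests above, so this only locates where the number ends and does not guarantee that GMP will accept it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: length of the floating-point literal at the start
   of S as recognized by strtod, or 0 if S does not start with a number. */
size_t
float_prefix_len (const char *s)
{
  char *end;

  (void) strtod (s, &end);        /* only the end pointer is of interest */
  return (size_t) (end - s);
}

/* Hypothetical helper: nonzero if S starts with a number that is followed
   by anything other than the end of the string. */
int
has_trailing_junk (const char *s)
{
  size_t n = float_prefix_len (s);

  return n > 0 && s[n] != '\0';
}
```

With this, "123.456 etc" from #12 is flagged as having trailing junk, while "12.34E+03" from #15 is not.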
#13-#15: These all behave fine. Just once I got some kind of infinite
recursion inside XEmacs doing a conversion of this kind, but didn't
think to save the core file. Now I cannot reproduce it. :-(
Regards,
--
Jerry James
http://www.ittc.ku.edu/~james/