Base (2 to 62) limitation for set_str initialization and output

Sergio Martin eienburuu at gmail.com
Wed Dec 12 00:27:52 CET 2012


Hi all dear people discussing GMP,

I think I understand the rationale for limiting the base in which a number
can be initialized from, or output to, a string (or stdin/stdout, for that
matter). Correct me if I'm wrong: it is limited by the number of
alphanumeric symbols (counting upper and lower case separately) that can be
used as digits: 10 + 26 + 26 = 62.

While that makes perfect sense from that angle, I can't help but think of
the loss of space efficiency - in memory or secondary storage - for big
(really big) numbers. Large numbers stored in files, transmitted over a
network, or held in memory occupy less space if they are encoded in base
256. Each base 62 digit carries only log2(62), about 5.95, of the 8 bits in
a byte, so the maximum base 62 representation takes roughly 1.34 times as
many bytes as the same number in base 256.
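
To put a number on that overhead, here is a small sketch in Python (not
GMP; the helper name digits_needed and the test value are made up for
illustration). It counts the digits of one large integer in base 62 and in
base 256, assuming one byte per digit in both cases. The ratio should
approach 8 / log2(62), roughly 1.34.

```python
import math

def digits_needed(n, base):
    """Count the digits of the non-negative integer n in the given base."""
    if n == 0:
        return 1
    count = 0
    while n > 0:
        n //= base
        count += 1
    return count

# Illustrative test value: a number with exactly 100,000 bits.
n = 2 ** 100_000 - 1

base62_bytes = digits_needed(n, 62)    # one ASCII character per digit
base256_bytes = digits_needed(n, 256)  # one byte per digit
ratio = base62_bytes / base256_bytes

print(base62_bytes, base256_bytes, round(ratio, 3))
print("theoretical ratio:", round(8 / math.log2(62), 3))
```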

Note that I'm only talking about the set_str and output functions, and that
I know the internal management of the limbs is perfectly memory-efficient
(or tends to be, at least). Also, I know that one could write one's own
conversion program without much trouble, but I think it would be convenient
to have this implemented within GMP.

I've seen the mpz_out_raw(...) and mpz_inp_raw(...) functions, and they
seem to fit that purpose. However, they both require an mpz integer to have
been initialized already. If that initial value was set from a string under
the base 2..62 limitation, the problem would still persist.

While I don't have a personal need for it, I'm really interested in
handling really big numbers, and I couldn't help but think of the space I
could save if I could input unsigned integers stored as base 256 files.
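
To make clear what I mean by a base 256 representation, here is a minimal
sketch, again in Python rather than GMP. The function names to_base256 and
from_base256 are hypothetical, and the big-endian byte order is just an
assumption for the example: every byte is one digit, most significant
first.

```python
def to_base256(n):
    """Encode a non-negative integer as big-endian base-256 digits (bytes)."""
    # At least one byte, so that zero is written as a single 0x00 digit.
    length = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(length, "big")

def from_base256(data):
    """Decode big-endian base-256 digits (bytes) into an integer."""
    return int.from_bytes(data, "big")

value = 3 ** 1000              # illustrative large number
encoded = to_base256(value)    # what would be written to a file
decoded = from_base256(encoded)

print(len(encoded), decoded == value)
```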

Please let me know if this has been considered or discussed before, if it
is already implemented and I missed it when reading the manual, or if I'm
wrong on any point.

Thanks for your time,

Sergio Martin

