Base (2 to 62) limitation for set_str initialization and output

Wed Dec 12 15:25:35 CET 2012

>   While it makes perfect sense viewed from that angle, I can't help but
>   think of the space - memory or secondary storage - efficiency loss for
>   big (really big) numbers. Large numbers stored in files, transmitted by
>   network, or held in memory will occupy much less space if they are
>   encoded in base 256. Using the maximum base 62 representation would
>   occupy ~4-times more bytes in memory than the same number represented
>   with a base 256 would.
>
> I think you need to redo your maths about the space savings.  Using base
> 62 and then using a byte will take about 34% more memory than using a
> plain binary encoding.  (If you don't believe me, I suggest that you try
> a few examples, then muse about the maths.)
>
Indeed, I got that wrong. I should have been more careful with that.
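
The 34% figure is easy to check. The snippet below is only an illustrative sketch and not something from the thread; the helper name base62_digits is made up for the example. A base-62 digit carries log2(62) bits (just under 6) but takes a whole 8-bit byte when stored as a character, so the expected overhead is 8 / log2(62).

```python
import math

def base62_digits(n: int) -> int:
    """Count the base-62 digits of a non-negative integer n."""
    count = 1
    while n >= 62:
        n //= 62
        count += 1
    return count

# Theoretical overhead: 8 bits stored per digit vs. log2(62) bits of information.
ratio = 8 / math.log2(62)

# Empirical check on a number that needs exactly 4096 bytes in base 256.
n = (1 << (8 * 4096)) - 1
raw_bytes = (n.bit_length() + 7) // 8   # size as plain binary
text_bytes = base62_digits(n)           # size as one byte per base-62 digit

print(f"theoretical overhead: {ratio:.4f}")
print(f"measured overhead:    {text_bytes / raw_bytes:.4f}")
```

Both numbers should come out close to 1.34, nowhere near the factor of 4 I originally claimed.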

>   I've seen the mpz_out_raw(..), and mpz_inp_raw(...) functions, and they
>   seem to fit for that purpose. However, they both require an mpz integer
>   to have been already initialized. Had that initialization been done with
>   the base 2..62 limitation, the problem would still persist.
>
> I cannot follow your reasoning here.
>
> There are various functions that would allow you to read and write in
> any 2^t base, for positive integral t.
>
>
I wrote the previous email because the mpz_inp_raw() function fitted
perfectly for loading the file I created containing a raw binary number,
except for the fact that it needed additional information at the beginning.
I mistakenly thought that, for some reason, I HAD to use GMP's
mpz_init_set_str() to create that additional information, but it is just
as simple as adding the
size information myself with a simple script. By doing that, my original
problem is pretty much solved.
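
For anyone finding this in the archive, here is a minimal sketch of such a script. It assumes the layout described in the GMP manual for mpz_out_raw(): 4 bytes of size information followed by that many bytes of data, both most significant byte first. The function names add_raw_header and read_raw are hypothetical, the sketch only handles non-negative numbers, and read_raw is just a Python stand-in used to check the round trip, not GMP's own reader.

```python
import struct

def add_raw_header(magnitude: bytes) -> bytes:
    """Prefix big-endian magnitude bytes with a 4-byte big-endian byte count.

    Assumption: the reader expects a 4-byte size field followed by that
    many data bytes, most significant first. Non-negative values only.
    """
    stripped = magnitude.lstrip(b"\x00")  # drop leading zero bytes
    return struct.pack(">i", len(stripped)) + stripped

def read_raw(blob: bytes) -> int:
    """Python stand-in for the reading side, used only to verify the framing."""
    (size,) = struct.unpack(">i", blob[:4])
    return int.from_bytes(blob[4:4 + size], "big")

# Round-trip check on an arbitrary large value.
value = 2**100 + 12345
payload = value.to_bytes((value.bit_length() + 7) // 8, "big")
framed = add_raw_header(payload)
assert read_raw(framed) == value
```

In practice the payload would be the contents of the existing raw binary file, and the framed bytes would be written to a new file for mpz_inp_raw() to load.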

Thanks for the fast response. I can continue studying this now.

Sergio Martin