GMP and 64-bit systems
arndt at jjj.de
Sun Jun 1 11:30:17 CEST 2008
* librik at panix.com <librik at panix.com> [Jun 01. 2008 18:25]:
> Hi there. I'd like to make some comments spurred by the recent
> complaints about GMP's compatibility with 64-bit Windows.
> The main issue people have with GMP on Win64 is really a more general
> problem of non-standard C coding practice. In fact, this practice can
> be cleaned up systematically, which will improve the quality of GMP's
> code and thereby make it work better with a wider variety of 64-bit
> systems.
> The problem is: GMP's code implicitly assumes the LP64 model of
> 64-bit C types. This is not the only 64-bit model; LLP64 and ILP64
> are alternatives.
I'd think that LP64 is used in very many software projects with
the notable exception of software that is exclusively or mainly
developed on/for Windows. Can someone please comment on this?
What 64-bit archs (of any importance) are not using LP64?
> The solution is: Use the existing "mp_size_t" type consistently
> throughout the GMP source code and public interfaces to refer to a
> number of limbs, bytes, or bits. Then typedef "mp_size_t" to a 32-bit
> or 64-bit fundamental C type. This is already what's done with the
> "mp_limb_t" type and the "LONG_LONG_LIMB" & "GMP_SHORT_LIMB"
> preprocessor macros.
> mp_size_t is already used in many places in the GMP code for this
> purpose. But it's not everywhere yet -- many functions and structs
> still use a bare "long int". Until all the longs are eliminated,
> the code is not fully portable.
Making software portable is a good thing indeed, fully agreed.
Making it really, really fully portable
-- causes significant extra work
   (note that a char doesn't even need to be eight bits!),
-- makes it vastly more complex and can render testing
   a real challenge,
-- may lead to a waste of performance or features.
> What's the issue with 64-bit C types? Here's a quick background document
> for you, the "Aspen paper" describing the three models and why most Unix
> systems standardized on LP64.
> A summary for the impatient: there are three ways to extend C types to
> a world where pointers and size_t's are 64 bits wide.
> * LP64 defines "int" as 32 bits and "long" as 64 bits.
> * LLP64 defines "int" and "long" as 32 bits, and an additional type
> (usually called "long long") as 64 bits.
> * ILP64 defines "int" and "long" as 64 bits.
> ALL THREE ARE EQUALLY VALID APPROACHES.
I dare to disagree. Assuming that the type long is a machine word
and that, when used as an array index, it can index all addressable
memory is what I call a sane model.
Making long==int _and_ smaller than 64 bits (on a 64-bit arch) seems
to be a concession to the portability of very suboptimal code.
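
To make the three models concrete, here is a minimal C sketch that names the data model from the widths of int, long and a pointer. The function names (data_model_name, this_platform_model, check_model_table) are made up for this illustration and are not part of GMP.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Name the data model implied by the widths (in bits) of int, long
   and an object pointer. Hypothetical helper, for illustration only. */
static const char *data_model_name(size_t int_bits, size_t long_bits,
                                   size_t ptr_bits)
{
    if (ptr_bits == 64 && long_bits == 64 && int_bits == 32) return "LP64";
    if (ptr_bits == 64 && long_bits == 32 && int_bits == 32) return "LLP64";
    if (ptr_bits == 64 && long_bits == 64 && int_bits == 64) return "ILP64";
    if (ptr_bits == 32 && long_bits == 32 && int_bits == 32) return "ILP32";
    return "other";
}

/* Classify the platform this file is compiled for. */
static const char *this_platform_model(void)
{
    return data_model_name(sizeof(int) * CHAR_BIT,
                           sizeof(long) * CHAR_BIT,
                           sizeof(void *) * CHAR_BIT);
}

/* Check the table against the definitions quoted above. */
static void check_model_table(void)
{
    assert(strcmp(data_model_name(32, 64, 64), "LP64") == 0);
    assert(strcmp(data_model_name(32, 32, 64), "LLP64") == 0);
    assert(strcmp(data_model_name(64, 64, 64), "ILP64") == 0);
}
```

Printing this_platform_model() from a small main() should give "LP64" on typical 64-bit Unix compilers and "LLP64" on 64-bit Windows compilers, going by the definitions in the quoted summary.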
> The Aspen document looks at existing Unix source code and the parameters
> of POSIX standard function calls, and decides that LP64 is the best choice
> for portability based on these specific constraints.
> Similarly, the people who extended Microsoft Windows to 64 bits looked
> at their existing code base and the parameters of Windows API function
> calls, and chose LLP64 as the best model in that case.
> (Old Crays were ILP64 systems, but, in my experience, the main use of
> that model now is in 64-bit extensions of Fortran libraries.)
> Since I spend a lot of my work time cleaning up people's less-than-
> portable 64-bit source code, please allow me a short moment of stupid,
> unfair, irrational ranting:
> ** If you use the "long" type as an integer guaranteed to hold a pointer
> or a memory size, you are a BAD C PROGRAMMER. I don't care if it's
> what you're used to. It's NOT CORRECT. Please stop! **
Casting pointers to and from integer types is bad in the first place.
Still, a model where long cannot hold a full address is IMHO not sane.
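
For the cases where a pointer really has to pass through an integer, a small sketch of the two positions: uintptr_t from <stdint.h> (optional in C99, but provided by the common compilers) is the type meant for this, while whether long is wide enough depends on the data model. The helper names are hypothetical.

```c
#include <stdint.h>

/* Round-trip a pointer through uintptr_t and report whether it
   compares equal to the original afterwards. */
static int roundtrip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* pointer -> integer */
    void *back = (void *)bits;       /* integer -> pointer */
    return back == p;
}

/* 1 if long is at least as wide as a pointer on this platform
   (expected under LP64 and ILP64), 0 otherwise (expected under LLP64). */
static int long_holds_pointer(void)
{
    return sizeof(long) >= sizeof(void *);
}

/* Exercise roundtrip_ok() on the address of a local object. */
static int pointer_roundtrip_selftest(void)
{
    int local = 0;
    return roundtrip_ok(&local);
}
```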
> The only acceptable use of "long" is when you need a variable that's
> guaranteed to be longer than a "short", or when you have to talk to
> an operating system API function. Otherwise it is best to avoid it,
> because "it does not mean what you think it means."
With a sane model, long==generic-machine-word.
> A conclusion: any integer type intended to be 32 bits on 32-bit
> systems and 64 bits on 64-bit systems cannot be a basic C type. It
> must be a typedef type, whose identity is controlled by an #ifdef in
> some header file.
Yes, that's the price of full adherence to the standard. I
suggest sticking to LP64 (and saying so in the doc!), and, when it's
not there, bailing out with an error (or warning).
Or falling back to a safe but potentially slow code branch.
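
A rough sketch of both options, assuming a C99 preprocessor with SIZE_MAX available: the header checks whether long is as wide as size_t and otherwise switches a typedef to long long, with the bail-out variant left as a comment. The names demo_size_t, DEMO_LONG_IS_WORD, demo_length and demo_check are invented for this example; this is not how GMP defines mp_size_t.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#if ULONG_MAX >= SIZE_MAX
/* long is as wide as size_t (LP64, ILP64, ILP32): use it directly. */
typedef long demo_size_t;
#define DEMO_LONG_IS_WORD 1
#else
/* long is too narrow (e.g. LLP64). To bail out instead of falling back:
   #error "this code assumes that long is as wide as size_t" */
typedef long long demo_size_t;
#define DEMO_LONG_IS_WORD 0
#endif

/* Count the characters of a string using the typedef'd size type. */
static demo_size_t demo_length(const char *s)
{
    demo_size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Check that the chosen type is wide enough and behaves as expected. */
static void demo_check(void)
{
    assert(sizeof(demo_size_t) >= sizeof(size_t));
    assert(demo_length("limb") == 4);
}
```

Code that only ever uses demo_size_t compiles unchanged under either branch; DEMO_LONG_IS_WORD is there so that a fast path written for long can be selected where it is valid.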
> This isn't the only 64-bit limitation in GMP. As Torbjorn has pointed
> out, the _mp_size and _mp_alloc fields of the mpz_struct are currently
> ints and not mp_size_t's. Therefore, no mpz integer can be larger than
> 2^31 limbs, even on 64-bit computers. But making that change really
> would break backward binary compatibility. Still, I believe it should
> be done!
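
To put a number on that limit, a small arithmetic sketch: it computes the largest limb count a signed size field of a given width can express, and the corresponding size in bytes. It assumes the field is signed and that only its magnitude counts limbs; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest positive value of a signed field with the given width in bits,
   i.e. 2^(field_bits - 1) - 1. */
static uint64_t max_limbs(unsigned field_bits)
{
    assert(field_bits >= 2 && field_bits <= 64);
    return (UINT64_C(1) << (field_bits - 1)) - 1;
}

/* Size in bytes of the largest number such a field can describe.
   Only meaningful while the product fits in 64 bits (e.g. 32-bit field). */
static uint64_t max_bytes(unsigned field_bits, unsigned limb_bytes)
{
    return max_limbs(field_bits) * limb_bytes;
}
```

With a 32-bit field and 8-byte limbs, max_bytes(32, 8) is 17179869176, just under 16 GiB, however much memory the machine has.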
> Finally, I need to apologize to Torbjorn and other people on this list.
> He and I discussed the need for an mp_size_t rewrite many months ago.
> I promised to work on it. And then I got busy and overworked, and went
> radio silent, and never followed through. I had hoped that all these
> backward-incompatible changes could wait for GMP 5.0, but it seems as
> though a crisis has come to a head. If I had done the work earlier,
> perhaps the rancor of the last few days might have been averted. I
> dropped the ball, and this is my fault. I hope that whatever happens,
> the mp_size_t cleanup will proceed, which would answer most Win64
> people's objections.
> - David Librik
> librik at panix.com
P.S.: I was quite disappointed that the type long long was not used
for "two machine words" but stayed at 64 bits; this also reeks of a
concession to questionable code.