Arithmetic without limitations?

Fabrice Bellard fabrice at
Thu Feb 11 18:48:58 CET 2010

On 02/11/10 14:38, Torbjorn Granlund wrote:
> Paul Zimmermann<Paul.Zimmermann at>  writes:
>    >  My idea for GMP has long been to make "hierarchical locality" take care
>    >  of it all.  A row in the k-dimensional matrix would fit into L1
>    >  cache, a plane would fit into memory, further dimensions would live in
>    >  swap space (not explicit files).
>    I'm not sure this will work. Here is a concrete example, on a Core 2 with
>    16Gb of RAM and 4Gb of swap. I'm trying to multiply two numbers of 6e9
>    decimal digits, thus using about 2.5G of memory each.
>    With GMP 5.0.1, top says:
> [snip]
> The developments I was talking about are not in GMP 5.0.1.  The FFT
> there has poor locality (which is mainly a property of its large
> coefficient FFT).  Attempting to compute a large product with operands too
> large for main memory will just result in early retirement of the swap
> disk.  :-)
> Besides, one will need lots of swap space for computing with large
> numbers.  That's the natural way; You need to compute with a huge data
> set?  Configure a huge swap area!  4 Gb (which I take as 4 gibibyte) is
> not good for huge computations, and really strangely small for a machine
> with 16 gibibyte RAM.
 > Special explicit swap files in a general purpose library is not imho a
 > good design.  In a special purpose program, perfectly fine.

Relying on the OS swap is possible, but to get good performance you
will need to give the OS hints so that it prefetches from the disk,
because unless the OS uses very clever heuristics it won't be able to
prefetch the data correctly. The problem arises with the non-contiguous
accesses needed to compute a DFT (you need to do either a matrix
transposition or DFTs on matrix columns, both of which make
non-contiguous memory accesses).

Overall, it is probably as difficult to give hints to the OS as to
issue the corresponding disk I/Os directly!

In my case, where I used explicit disk I/Os, I found that doing raw
I/Os (O_DIRECT flag in the open() syscall) was very beneficial for
performance. It shows that the OS (= Linux) disk cache is far from
optimal in this particular case, where the I/O patterns are very regular
and where it makes no sense to cache the data for later use.
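
As an illustration of what raw I/O involves, here is a minimal sketch
for Linux. O_DIRECT generally requires the buffer address, the
transfer size and the file offset to be aligned; the 4096-byte
alignment below is an assumption that is usually sufficient, not a
guarantee. The helper names are hypothetical, and the fallback to
buffered I/O is there only because some filesystems reject O_DIRECT.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* no raw I/O available: stay buffered */
#endif

#define BLOCK_ALIGN 4096       /* assumed alignment, see the text above */

/* Allocate a buffer usable with O_DIRECT.
   nbytes must be a multiple of BLOCK_ALIGN.  Returns NULL on failure. */
static void *raw_alloc(size_t nbytes)
{
    void *p = NULL;

    assert(nbytes % BLOCK_ALIGN == 0);
    return posix_memalign(&p, BLOCK_ALIGN, nbytes) == 0 ? p : NULL;
}

/* Open path bypassing the page cache; if the filesystem rejects
   O_DIRECT with EINVAL, retry with ordinary buffered I/O. */
static int raw_open(const char *path, int flags)
{
    int fd = open(path, flags | O_DIRECT, 0600);

    if (fd < 0 && errno == EINVAL)
        fd = open(path, flags, 0600);
    return fd;
}

/* Write one aligned block at an aligned offset.  A short transfer is
   treated as an error.  Returns 0 on success, -1 otherwise. */
static int raw_write(int fd, const void *buf, size_t nbytes, off_t off)
{
    assert(nbytes % BLOCK_ALIGN == 0 && off % BLOCK_ALIGN == 0);
    return pwrite(fd, buf, nbytes, off) == (ssize_t)nbytes ? 0 : -1;
}

/* Read one aligned block at an aligned offset, same conventions. */
static int raw_read(int fd, void *buf, size_t nbytes, off_t off)
{
    assert(nbytes % BLOCK_ALIGN == 0 && off % BLOCK_ALIGN == 0);
    return pread(fd, buf, nbytes, off) == (ssize_t)nbytes ? 0 : -1;
}
```

A caller would obtain its buffers from raw_alloc(), open the file with
raw_open(path, O_CREAT | O_RDWR), and transfer whole blocks only.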

Another point is that it is very convenient to have one file per 
mpz/mpf/... on the disk in case you want to restart a huge computation 
from a known checkpoint.
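
A minimal sketch of the one-file-per-number idea follows. To stay
self-contained it uses a plain array of 64-bit limbs as a stand-in for
an mpz; with GMP itself, mpz_out_raw() and mpz_inp_raw() can serialize
an mpz_t to a FILE *. The ".ckpt" naming, the file layout (limb count
followed by the limbs, in host byte order) and the function names are
assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Save n limbs to "<name>.ckpt".  The data is written to a temporary
   file first and then renamed, so an interrupted save does not leave a
   truncated checkpoint behind.  Returns 0 on success, -1 on error. */
static int checkpoint_save(const char *name, const uint64_t *limbs,
                           uint64_t n)
{
    char tmp[256], final[256];
    int len;

    assert(limbs != NULL || n == 0);
    len = snprintf(final, sizeof final, "%s.ckpt", name);
    if (len < 0 || len >= (int)sizeof final)
        return -1;
    len = snprintf(tmp, sizeof tmp, "%s.ckpt.tmp", name);
    if (len < 0 || len >= (int)sizeof tmp)
        return -1;

    FILE *f = fopen(tmp, "wb");
    if (f == NULL)
        return -1;
    int ok = fwrite(&n, sizeof n, 1, f) == 1 &&
             fwrite(limbs, sizeof *limbs, (size_t)n, f) == (size_t)n;
    ok = (fclose(f) == 0) && ok;
    if (!ok || rename(tmp, final) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}

/* Load "<name>.ckpt".  Returns a malloc'ed limb array and stores the
   limb count in *n_out, or returns NULL on error. */
static uint64_t *checkpoint_load(const char *name, uint64_t *n_out)
{
    char final[256];
    uint64_t n = 0;
    uint64_t *limbs = NULL;
    int len = snprintf(final, sizeof final, "%s.ckpt", name);

    if (len < 0 || len >= (int)sizeof final)
        return NULL;
    FILE *f = fopen(final, "rb");
    if (f == NULL)
        return NULL;
    if (fread(&n, sizeof n, 1, f) == 1) {
        limbs = malloc((size_t)(n ? n : 1) * sizeof *limbs);
        if (limbs != NULL &&
            fread(limbs, sizeof *limbs, (size_t)n, f) != (size_t)n) {
            free(limbs);
            limbs = NULL;
        }
    }
    fclose(f);
    if (limbs != NULL)
        *n_out = n;
    return limbs;
}
```

Restarting then amounts to calling checkpoint_load() for each number
that was saved at the last known good step.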

 > (Perhaps
 > one could consider an optional interface in GMP where one makes
 > available explicit swap files, I haven't thought about that.)

I think it should be possible to disable the compilation of the "out of 
core" support because it won't be useful for many users. Having 
different functions for disk aware mpz/mpf/... is a good idea if you 
want to avoid modifying the existing code. It has the advantage of not 
adding extra tests for the disk case in the existing code.


