Arithmetic without limitations?

Joerg Arndt arndt at
Mon Feb 15 10:21:25 CET 2010

* Torbjorn Granlund <tg at> [Feb 15. 2010 09:35]:
> Joerg Arndt <arndt at> writes:
>   Also multi-threading is very advantageous
>   for speed, even for single core CPUs.  Not sure all or even any of
>   this has to be in GMP (notwithstanding the coolness factor).
> Why would one want multi-threading for single-core CPUs?

With double buffering, because the read/write thread mostly only needs
to trigger the I/O, which is largely done by hardware in _parallel_ with
the computation (thread).

fxtbook p.450:
If multi-threading is available, one can use a double buffer
technique: split the workspace into halves, one (CPU-intensive) thread
does the FFTs in one half and another (hard disk intensive) thread
reads and writes in the other half. This keeps the CPU busy during
much of the hard disk operations, avoiding the waits during disk
activity at least partly.
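
As a rough illustration of the scheme in the passage above, here is a
minimal sketch using std::async from C++11. It is not the code from the
fxtbook: the names Chunk, crunch and process_double_buffered are made up
for this example, the 'disk' is just a vector in memory, and squaring
the entries stands in for the FFT.

```cpp
#include <cstddef>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for one half of the workspace.
using Chunk = std::vector<double>;

// Stand-in for the CPU-intensive step (the FFT in the text).
void crunch(Chunk &buf)
{
    for (double &x : buf)  x *= x;
}

// Process all chunks of 'disk' (assumption: a vector plays the disk).
// While one buffer is crunched, the other one is written back
// and refilled by a second thread.
void process_double_buffered(std::vector<Chunk> &disk)
{
    if ( disk.empty() )  return;
    Chunk work = disk[0];  // first read, nothing to overlap with yet
    Chunk io;              // buffer owned by the I/O job
    for (std::size_t k=0; k<disk.size(); ++k)
    {
        // I/O job: write back result k-1, then prefetch chunk k+1
        std::future<void> job = std::async(std::launch::async,
            [&disk, &io, k]()
            {
                if ( k > 0 )  disk[k-1] = io;
                if ( k+1 < disk.size() )  io = disk[k+1];
            });
        crunch(work);         // compute thread uses the other buffer
        job.get();            // wait for the I/O before swapping
        std::swap(work, io);  // io holds result k, work holds chunk k+1
    }
    disk.back() = io;  // last write, nothing left to overlap with
}
```

Calling process_double_buffered() on three chunks {1,2}, {3,4}, {5,6}
leaves {1,4}, {9,16}, {25,36} behind. The two threads never touch the
same buffer between the launch of the job and the call to job.get().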

If I/O takes just as long as the FFTs (and CPU work for triggering
I/O can be neglected) then performance with multi-threading is
double that of single-thread performance, even with only one core.
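
The factor of two can be checked with a simple two-stage pipeline model.
This is an assumption for illustration, not a measurement: each chunk
costs t_io for its disk traffic and t_fft for its computation, and only
the I/O of the very first chunk and the computation of the very last
one cannot be overlapped. The function names are made up.

```cpp
#include <algorithm>

// Total time for n chunks when I/O and computation strictly alternate.
double time_single_thread(int n, double t_io, double t_fft)
{
    return n * (t_io + t_fft);
}

// Total time when the I/O for one buffer overlaps with the
// computation on the other (two-stage pipeline model).
double time_double_buffered(int n, double t_io, double t_fft)
{
    if ( n <= 0 )  return 0.0;
    return t_io + t_fft + (n-1) * std::max(t_io, t_fft);
}
```

With n=100 and t_io = t_fft = 1 the model gives 200 versus 101 time
units, a ratio of about 1.98 that approaches 2 as n grows. If one of the
two costs dominates, the gain is correspondingly smaller.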

> An efficient disk-based computation would need to do timely
> "prefetching" from the disk, asking for the data for the next
> computation step.  It needs to make sure to leave enough space to store
> that input in RAM, while still working on a different part of the
> computation.

Note that with double buffering only half of the available
RAM can be used for the FFT; the other half is
written back to (and then refilled from) disk.
This used to be a 'real' issue when machines had
much less RAM; today it is nothing to worry about.

> Likewise, it needs to write intermediate results to disk.
> parallel

> There is a suitable set of functions for this, aio_*.  We could manage
> these from the single computation thread
> Unfortunately, there seem to be no portable ways of doing what we need
> when dealing with swap space.  There is posix_madvise, but its
> MADV_WILLNEED is not powerful enough.

Not having platform independent threads (as part of the language
or at least in the standard C-libs) is indeed a painful issue.

I'd suggest offering threads (if at all) only for POSIX systems.

> -- 
> Torbjörn

Here is a 'do-nothing' test with the (C++) I/O-thread lib
I wrote in 1999:

#include "iothread.h"

#include <assert.h>
#include <iostream>
using namespace std;

int main()
{
    int fn=0;  // workspace size, set to zero so we don't really do anything
    double *fr=0, *fi=0;  // here be data
    int fd1=0,  fd2=0;   // file descriptors
    long off=0;   // offset in files

    iothread ioth;  // start the thread for the I/O jobs

    // these are used to signal that buffer1/2 is in RAM:
    condition buf1ok, buf2ok;

    // read files into buffer
    ioth.seekread(fd1,off,fr,fn); // read data into buffer 1
    ioth.sigcond(&buf1ok);        // ... and signal when done
    ioth.seekread(fd2,off,fi,fn); // then read buffer 2
    ioth.sigcond(&buf2ok);        // ... and signal when done

    ioth.waitcond(&buf1ok);  // wait until buf1 is in RAM
    // do NUMBER CRUNCHING on buffer 1
    // while buffer 2 is read in the background
    ioth.seekwrite(fd1,off,fr,fn); // write buffer 1 back to disk
    ioth.seekread(fd1,off,fr,fn);  // ... and read new data into it
    ioth.sigcond(&buf1ok);         // ... signal when done

    ioth.waitcond(&buf2ok);  // wait until buf2 is in RAM
    // do NUMBER CRUNCHING on buffer 2
    // while buffer 1 is written/read in the background

    // do NUMBER CRUNCHING on buffer 1
    // while buffer 2 is read in the background

    // do NUMBER CRUNCHING on buffer 2
    // while the other buffer is written in the background

    ioth.waitfinish();  // wait until all jobs are completed

    cout << "ciao." << endl;
    return 0;
}
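
For readers without iothread.h: the following is a minimal sketch of
how such a job thread could be put together with C++11 threads. It is
not the 1999 library. Flag and JobThread are hypothetical names that
only mimic the roles of 'condition' and 'iothread' above, and jobs are
plain std::function objects instead of seekread/seekwrite requests.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Reusable flag: wait() blocks until signal() was called, then resets.
class Flag
{
    std::mutex m;
    std::condition_variable cv;
    bool set = false;
public:
    void signal()
    {
        { std::lock_guard<std::mutex> l(m);  set = true; }
        cv.notify_all();
    }
    void wait()
    {
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [this]{ return set; });
        set = false;
    }
};

// Worker thread that runs the queued jobs strictly in order.
class JobThread
{
    std::mutex m;
    std::condition_variable cv;
    std::deque<std::function<void()>> jobs;
    bool done = false;
    std::thread worker;  // declared last: starts after the other members

    void run()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> l(m);
                cv.wait(l, [this]{ return done || !jobs.empty(); });
                if ( jobs.empty() )  return;  // done and queue drained
                job = std::move(jobs.front());
                jobs.pop_front();
            }
            job();  // run the job without holding the lock
        }
    }
public:
    JobThread() : worker(&JobThread::run, this) {}
    ~JobThread() { finish(); }

    // Append a job (for example a read or write request) to the queue.
    void submit(std::function<void()> job)
    {
        { std::lock_guard<std::mutex> l(m);  jobs.push_back(std::move(job)); }
        cv.notify_one();
    }
    // Signal f once all jobs submitted before this call are completed.
    void signal_when_reached(Flag &f) { submit([&f]{ f.signal(); }); }

    // Run all remaining jobs, then stop the worker thread.
    void finish()
    {
        { std::lock_guard<std::mutex> l(m);  done = true; }
        cv.notify_one();
        if ( worker.joinable() )  worker.join();
    }
};
```

Because the jobs run in submission order, a signal queued right after a
read request is raised only when that read has finished, which is the
pattern the test program relies on.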

More information about the gmp-devel mailing list