GMP and CUDA

Vincent Diepeveen diep at xs4all.nl
Sat Jun 13 01:24:16 CEST 2009


On Jun 13, 2009, at 1:01 AM, Morten Gulbrandsen wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Vincent Diepeveen wrote:
>> Cuda is mighty interesting Allan,
>> note that for what GMP is doing on paper ATI/AMD has faster hardware
>> and cheaper.
>> The bandwidth unlimited cuda card is the tesla, which is very very
>> expensive for what it delivers.
>>
>> Basically it is a 32 bits platform, no matter the idiotic claims there,
>> and there is no description from Nvidia which hardware instructions
>> they support. So support of anything on Tesla is a total fulltime job
>> as you first have to figure out how and what.
>
> Dear Vincent,
> Dear Allan,
> Dear Torbjorn,
>
>
>
> http://www.ibm.com/developerworks/power/cell/
>
> I think for the eight core cell processor you can get the SDK which
> has the fedora Linux C++ compiler.
>
>
> in the playstation3 there is the IBM cell processor,
>

By today's standards this is not such an interesting processor,
and I am putting that politely. The POWER6 is far more interesting.

At 3.17 GHz, with 6 cores available, single issue and single precision,
it is not that impressive.

The newer CELL2 (which is NOT in the playstations) is much better.

Yet even from a very optimistic viewpoint it delivers 75 Gflop,
if I calculate back from Roadrunner, which has nodes of 220 watt
each carrying 2 Cells, so roughly 150 watt.

That figure is for a single program for which all the CPUs (the x86
ones too) have been optimized, as the Top500 ranking is measured by
exactly 1 program.

Regrettably that program is not similar to the calculation GMP is doing.
The CPUs are 90%+ efficient on it.

Expecting 90% efficiency out of CELL2 for GMP is rather wishful thinking.

Documentation for IBM processors is very good, at least when I
checked for POWER6.
No complaints there.

The architecture, however, did not look well suited to GMP last time I
checked CELL. I only checked the paper specs, of course, and did not run
any tests, as I do not own such a million dollar box.

It is easy to send IBM questions, yet their email server seems broken.
In the past 15 years I never got an answer from them.

With respect to the CPU I agree with the remarks Prof. Aad v/d Steen
makes in his latest report, which you can download from his homepage.

GPUs are more interesting, of course. Suppose that you can use those
4 Tflop of single precision they deliver quite effectively for a
transform, with a few clever algorithmic tricks/enhancements.

Now that is very interesting.
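
To illustrate the kind of trick meant here (a rough sketch only, not
anything GMP or CUDA provides): split the big numbers into small digits,
so that a floating point FFT convolution stays accurate enough to round
every coefficient back to the exact integer. The names fft, to_digits and
fft_multiply and the 8-bit digit size are assumptions made for this
example. Python floats are double precision, so on single precision
hardware the digits would have to be smaller or the transform shorter.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # twiddle factor
        t = w * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def to_digits(x, base):
    """Little-endian digits of a non-negative integer x in the given base."""
    digits = []
    while x:
        x, r = divmod(x, base)
        digits.append(r)
    return digits or [0]

def fft_multiply(x, y, bits=8):
    """Multiply non-negative integers via a floating point FFT convolution.

    Small digits (8 bits here, an assumed value) keep the rounding error
    of the transform well below 0.5 for the sizes tried in this sketch.
    """
    base = 1 << bits
    a, b = to_digits(x, base), to_digits(y, base)
    n = 1
    while n < len(a) + len(b):
        n *= 2
    fa = fft(a + [0] * (n - len(a)))
    fb = fft(b + [0] * (n - len(b)))
    prod = fft([p * q for p, q in zip(fa, fb)], invert=True)
    result = 0
    for i, c in enumerate(prod):
        coeff = int(round(c.real / n))  # scale the inverse FFT, round to exact
        result += coeff << (bits * i)   # shifted add takes care of the carries
    return result
```

The smaller the digits, the longer the transform gets, so precision is
traded against transform length.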

I would argue that Roadrunner is such a fast machine thanks to CELL2
because manufacturing CPUs is a billion dollar business, and HPC is a
very specialized niche of the CPU market that is not interesting
for the big players.

Such factories cost many billions. In fact Moore's second law has
been described by Intel: each new generation of process technology,
which within 18 months gives us 2x more transistors, also requires a
factory that is 2x more expensive.
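
As a small worked illustration of that doubling (a sketch; the function
name project and the starting values of 1.0 are placeholders, not real
figures):

```python
def project(generations, transistors=1.0, fab_cost=1.0):
    """Scale both quantities by 2x per process generation."""
    factor = 2 ** generations
    return transistors * factor, fab_cost * factor

MONTHS_PER_GENERATION = 18  # the figure quoted in the text

for g in range(5):
    t, c = project(g)
    print(f"after {g * MONTHS_PER_GENERATION:3d} months: "
          f"{t:4.0f}x transistors, {c:4.0f}x factory cost")
```

Both columns grow at the same rate, so the factory bill keeps pace with
the transistor count.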

For a few hobbyists like me and a few university researchers it is then
not worth producing a 2 Tflop double precision CPU,
which by the way is peanuts to design with a small team.

The only thing they design is generic hardware, or hardware that does
something that benefits hundreds of millions of users.
So single precision hardware for graphics is here to stay.

Larrabee might be coming, yet that is a 700mm^2 cpu.

So it will eat massive power, and I would be amazed if it is under
$7500 apiece.

CELL was also designed to be a CPU like that. It was delayed for years,
and only now is the PS3 slowly gaining popularity thanks to more game
titles. The CPU is totally outdated by now. The fact that a CPU
delivering 75 gflop double precision still seems fast says more about
how little cheap specialized double precision gflop hardware there is
than it shows that CELL2 is a winner CPU. It is not.
It is the one-eyed in the land of the blind in HPC right now.

In companies GPUs have already taken over most calculations; they just
do not brag about it publicly.

In the meantime the next generation of GPUs will hands down double the
number of single precision gflops again, as it keeps scaling
and there are hundreds of millions of users for them.

So running well on GPUs is definitely a clever decision. Gambling on
a Cell processor right now is not.

Where can i buy the thing?

For 500 euro I can assemble a node for my second hand Quadrics network
at home, with a fast quad-core Phenom II 955 CPU and a few GPUs inside.
So it combines a bunch of functions. A year from now I
can upgrade the GPU or the CPU. How about CELL there?
Is there a CELL3 coming that's 600 gflops and 200 euro?

Vincent

> I think yellow dog and fedora run on the cell, which is basically the
> same architecture as the PPC.
>
>
> and the cell processor also runs on some IBM blade servers.
>
>
> For the nvidia CUDA architecture I don't know much.
>
> Sony and nvidia have compilers but I have so far only investigated the
> IBM SDK.  As long as the playstation 3 is on the market it could be
> interesting to access both the cell 8xCPU and the Nvidia GPU.
>
> I have seen one report about a Beowulf cluster of Sony PlayStation 3s.
>
> http://www.wired.com/techbiz/it/news/2007/10/ps3_supercomputer
>
>
> Why can't you give it a go with the IBM SDK and port or compile GMP to
> the Cell processor?
>
>
> sincerely yours,
>
> Morten
> 0x81802954
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (SunOS)
> Comment: For keyID and its URL see the OpenPGP message header
>
> iEYEARECAAYFAkoy3lwACgkQ9ymv2YGAKVT1rgCfUQ1YWQaGP0G1R8gkSdw0sKxD
> /jcAnR18uYUwQ9uwx4EDQKkAZby0QEgT
> =adZa
> -----END PGP SIGNATURE-----
>


