FW: GMP news

Sun Nov 23 12:26:05 CET 2008

"Rev. Chris Korda" <victimofleisure at gmail.com> writes:

  I'm doing time-escape fractals, and I'm finding that e.g. with 768 bits of
  precision, the calculations are painfully slow; approximately three orders
  of magnitude slower than equivalent calculations done with 64-bit FPU or
  SSE. Obviously this isn't a fair comparison, but still I'd like to shorten
  my deep zoom render times by at least an order of magnitude. I'm running on
  a distributed network of Pentium Core 2 processors, using GMP 4.1 with Brian
  Gladman's Pentium4 assembler support.

If you're staying with the several-year-old GMP 4.1 and using Pentium4
code on Core 2, then you should not expect great performance.

  Is it possible that I might get improved performance by writing extended
  precision routines in SSE? This would be a third option between hardware
  64-bit (useless for deep zooms but very fast) and GMP (unlimited zoom but
  brutal render times). The proposed routines would have a fixed precision
  that's a multiple of the machine's word size, e.g. 768 bits. I only need to
  support a small number of operations--two actually, square and add--and the
  square only has to operate on numbers between zero and one, which makes
  things much easier.
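
For concreteness, the fixed-precision scheme proposed above could look
roughly like the sketch below. It is an illustration only, not code from
this thread and not how GMP does it. The names fix768, fix_add, fix_sqr,
fix_selfcheck and LIMBS are hypothetical, the type unsigned __int128 is a
GCC/Clang extension on 64-bit targets rather than standard C, and the
square is a plain schoolbook product that makes no use of the symmetry a
real squaring routine would exploit.

```c
#include <assert.h>
#include <stdint.h>

#define LIMBS 12  /* assumption: 12 x 64 bits = 768 bits, as in the question */

/* A fraction in [0, 1), least significant limb first:
   value = sum over i of limb[i] * 2^(64 * (i - LIMBS)). */
typedef struct { uint64_t limb[LIMBS]; } fix768;

/* r = a + b. Returns the carry out of the top limb (1 if the sum is >= 1). */
static uint64_t fix_add(fix768 *r, const fix768 *a, const fix768 *b)
{
    unsigned __int128 acc = 0;
    for (int i = 0; i < LIMBS; i++) {
        acc += (unsigned __int128)a->limb[i] + b->limb[i];
        r->limb[i] = (uint64_t)acc;
        acc >>= 64;
    }
    return (uint64_t)acc;
}

/* r = a * a, truncated (not rounded) to LIMBS limbs. Schoolbook product. */
static void fix_sqr(fix768 *r, const fix768 *a)
{
    uint64_t full[2 * LIMBS] = {0};
    for (int i = 0; i < LIMBS; i++) {
        uint64_t carry = 0;
        for (int j = 0; j < LIMBS; j++) {
            /* 64x64->128 product plus two 64-bit addends cannot overflow 128 bits */
            unsigned __int128 t =
                (unsigned __int128)a->limb[i] * a->limb[j] + full[i + j] + carry;
            full[i + j] = (uint64_t)t;
            carry = (uint64_t)(t >> 64);
        }
        full[i + LIMBS] = carry;
    }
    /* The high half of the 2*LIMBS-limb product is the fraction a^2. */
    for (int i = 0; i < LIMBS; i++)
        r->limb[i] = full[i + LIMBS];
}

/* Small sanity check: 0.5 squared is 0.25, and 0.25 + 0.25 is 0.5. */
static void fix_selfcheck(void)
{
    fix768 half = {{0}}, quarter, sum;
    half.limb[LIMBS - 1] = 0x8000000000000000ULL;
    fix_sqr(&quarter, &half);
    assert(quarter.limb[LIMBS - 1] == 0x4000000000000000ULL);
    assert(fix_add(&sum, &quarter, &quarter) == 0);
    assert(sum.limb[LIMBS - 1] == 0x8000000000000000ULL);
}
```

Whether something like this beats a current GMP build is a separate
question; it would have to be measured against tuned assembly loops.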

First, make sure you're running a good 64-bit operating system such as
64-bit GNU/Linux or 64-bit FreeBSD, or perhaps 64-bit Slowaris.  You
should almost surely stay away from SSEx instructions, since they only
provide 32x32->64 bit multiplication, while you have 64x64->128 bit
multiplication in the integer register set.
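
To illustrate the difference in multiplier width, here is a small sketch
in C. The function names mul64_wide, mul64_from32 and mul64_check are made
up for this example, and unsigned __int128 is a GCC/Clang extension on
64-bit targets, not standard C. The first function is one full-width
multiply; the second builds the same 128-bit result from 32x32->64
products, which takes four multiplies plus the additions to combine them.

```c
#include <assert.h>
#include <stdint.h>

/* One 64x64->128 multiply. On x86-64, compilers typically turn this
   into a single integer multiply instruction. */
static void mul64_wide(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    unsigned __int128 p = (unsigned __int128)a * b;
    *lo = (uint64_t)p;
    *hi = (uint64_t)(p >> 64);
}

/* The same product assembled from four 32x32->64 partial products. */
static void mul64_from32(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;

    uint64_t p00 = a0 * b0;
    uint64_t p01 = a0 * b1;
    uint64_t p10 = a1 * b0;
    uint64_t p11 = a1 * b1;

    /* Middle column: at most three 32-bit values, so it fits in 64 bits. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;

    *lo = (mid << 32) | (uint32_t)p00;
    *hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

/* Checks that both routes give the same 128-bit product. */
static void mul64_check(uint64_t a, uint64_t b)
{
    uint64_t h1, l1, h2, l2;
    mul64_wide(a, b, &h1, &l1);
    mul64_from32(a, b, &h2, &l2);
    assert(h1 == h2 && l1 == l2);
}
```

The sketch uses scalar code for both variants, so it shows only the
operation count, not actual SSE throughput.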

My advice: 64-bit Unix, GMP 4.2 for now, and Jason Martin's Core 2
patches.

-- 
Torbjörn