Runs generic code version on VIA processors

Tue Jul 27 18:41:35 CEST 2010

Hi Agner,

I address specific issues below. As a broad answer though:

* I like the idea of switching to using CPUID feature bits. This is only for
x86, right? x86 probably accounts for a huge share of GMP installations, but
I want to be clear that CPUID feature bits are not present on POWER, Sun,
Alpha, Cray, mainframe, ...

* If this were to be considered, and perhaps implemented in GMP, it would be
quite an undertaking, yes? And require a huge amount of testing on a lot of
different hardware, yes? I'm actually not sure, so I'm asking.

* Might your research project include prototyping some of the code changes
that you are talking about and benchmarking the improvements? Many
improvements that come into GMP/MPFR come from knowledgeable users with an
expertise in an area, who submit code.

* If this were to be added to GMP's "todo lists", I think this would fit
into Bright Ideas.

On Tue, Jul 27, 2010 at 4:12 AM, Agner Fog <agner at agner.org> wrote:

> GMP version 5.0.1, x86
>
> The performance on VIA processors is poor because __gmpn_cpuvec_init in
> fat.c chooses the generic version of all functions, while the MMX and SSE2
> versions are only activated on Intel and AMD processors.
>
> The problem is that the CPU dispatching is based on vendor strings and CPU
> family and model numbers rather than on CPUID feature bits. This is
> fundamentally wrong for several reasons:
>
> * The __gmpn_cpuvec_init function assumes that all processors with family
> and model numbers bigger than the currently known processors support at
> least the same instruction sets as the ones we have. There is no guarantee
> that this will hold true in the future where low-power light-weight
> processors are becoming more popular. The only safe way to tell if a CPU
> supports a particular instruction set (e.g. SSE2) is to check the CPUID
> feature bits.
>

An interesting point, very logical. I'm wondering if you know of any cases
now where processors with bigger family/model do not support at least the
same instructions?
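
To make sure I understand what "check the CPUID feature bits" means in
practice, here is a minimal sketch. It assumes a GCC or Clang toolchain (for
the <cpuid.h> helper __get_cpuid) and uses the documented bit positions of
CPUID leaf 1. The struct cpu_features and detect_features names are
hypothetical and are not taken from fat.c. On non-x86 targets it simply
reports no features, which is the situation on POWER, Alpha and friends.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>              /* GCC/Clang helper: __get_cpuid() */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0        /* no CPUID instruction on this target */
#endif

/* Hypothetical feature record for illustration; not a GMP type. */
struct cpu_features {
    int mmx;
    int sse2;
    int sse3;
};

/* Fill *f from CPUID leaf 1 only. The vendor string and the
   family/model fields are deliberately never consulted. */
void detect_features(struct cpu_features *f)
{
    assert(f != NULL);
    f->mmx = f->sse2 = f->sse3 = 0;
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f->mmx  = (edx >> 23) & 1;  /* leaf 1, EDX bit 23: MMX  */
        f->sse2 = (edx >> 26) & 1;  /* leaf 1, EDX bit 26: SSE2 */
        f->sse3 = ecx & 1;          /* leaf 1, ECX bit 0:  SSE3 */
    }
#endif
}
```

If that is roughly the idea, the detection part looks small. The testing
effort I asked about above would be in everything that depends on it.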

> * You are making it difficult for new CPU vendors to enter the market when
> you put them at a disadvantage by giving them only the generic code path.
>

Depending on how difficult the GMP team is making it for new CPU vendors,
maybe new CPU vendors could provide the GMP team with loaner hardware?

>
> * Systems that use virtualization, emulation or FPGA softcores are gaining
> more use. You cannot make any assumptions about vendor strings and family
> numbers on such systems.

We can't? Does GMP not work inside VMware or Hyper-V?

>

> * You are making different code versions for different brands of processors
> with the same instruction set. The performance advantage you can gain by
> this is minimal at best. The disadvantages are that the fat binary becomes
> fatter and there are more versions to test and maintain.
>
> * The code needs to be updated every time there is a new processor on the
> market. Obviously, you don't have the resources for that. Much of the source
> code is from 2005 or earlier.
>

It doesn't. The code _might_ be updated every time there is a new prevalent
processor.

>
> * The time it takes from when you make a change in the source code until the
> updated code makes its way through the application software to the end user
> is at least one year, and more commonly two or more years. The specific
> processor you are optimizing for is likely to be obsolete at the time your
> code is running on the end user's computer.
>

This is an over-generalization. If SSE2 chips came out one year, then GMP
updated to use SSE2 the next year, those benefits are still being used 10
years later, on any chip that includes SSE2.

> I will therefore propose that the CPU dispatching system (i.e.
> __gmpn_cpuvec_init) should test only the CPUID feature bits (MMX, SSE2,
> SSE3, and so on) and not look at any vendor strings, family, or model
> numbers.
>
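
As I read the proposal, the selection step would look something like the
sketch below: a table ordered from most to least demanding, where each entry
lists the feature bits it requires and the first entry whose requirements are
all met wins. This is only an illustration under my own assumptions. The
FEAT_* flags, the table and select_impl are hypothetical names, the three
stub functions stand in for real routines, and none of this is how
__gmpn_cpuvec_init is actually written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature flags, filled in elsewhere from CPUID bits. */
enum { FEAT_MMX = 1u << 0, FEAT_SSE2 = 1u << 1 };

typedef const char *(*impl_fn)(void);

/* Stand-ins for the generic, MMX and SSE2 versions of one routine. */
static const char *impl_generic(void) { return "generic"; }
static const char *impl_mmx(void)     { return "mmx"; }
static const char *impl_sse2(void)    { return "sse2"; }

struct dispatch_entry {
    unsigned required;   /* every bit set here must be available */
    impl_fn  fn;
};

/* Ordered from most to least demanding; the last entry requires
   nothing, so it always matches. */
static const struct dispatch_entry table[] = {
    { FEAT_MMX | FEAT_SSE2, impl_sse2    },
    { FEAT_MMX,             impl_mmx     },
    { 0,                    impl_generic },
};

/* Pick an implementation from feature bits alone. No vendor string,
   family or model number is an input to this function. */
impl_fn select_impl(unsigned features)
{
    size_t n = sizeof table / sizeof table[0];
    assert(table[n - 1].required == 0);   /* catch-all must be last */
    for (size_t i = 0; i < n; i++) {
        if ((features & table[i].required) == table[i].required)
            return table[i].fn;
    }
    return impl_generic;                  /* not reached */
}
```

With this shape, a CPU that reports MMX and SSE2 gets the SSE2 version
whoever made it, and one that reports neither gets the generic version.
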
> You are saying at
> http://gmplib.org/list-archives/gmp-announce/2010-January/000024.html that
> there are VIA specific optimizations in version 5.0.0. Can you please tell
> me where they are? There doesn't seem to be support for it in
> __gmpn_cpuvec_init?
>
> The background for this bug report needs explanation:
>
> I am doing research on some improper behavior of Intel software that
> cripples performance on non-Intel processors. See my blog for details:
> http://www.agner.org/optimize/blog/read.php?i=49
>
> I have made a software tool that can change the CPUID vendor string on VIA
> processors (it is more difficult to do on Intel and AMD processors). I found
> that Mathematica runs faster on a VIA processor when the vendor string is
> changed to GenuineIntel or AuthenticAMD. This was due to two function
> libraries used by Mathematica, namely Intel Math Kernel Library (MKL) and
> GMP. I was surprised by this. It is difficult to blame Intel for improper
> practices when Gnu people are doing the same.
>
> The Mathematica package includes GMP.dll. I don't know how to tell which
> version it is.

-- 
Sam Rawlins