Runs generic code version on VIA processors
tg at gmplib.org
Sun Aug 8 15:43:33 CEST 2010
Agner Fog <agner at agner.org> writes:
GMP version 5.0.1, x86
The performance on VIA processors is poor because __gmpn_cpuvec_init
in fat.c chooses the generic version of all functions, while the MMX
and SSE2 versions are only activated on Intel and AMD processors.
Which fat.c? Are you talking about mpn/x86/fat/fat.c or
The problem is that the CPU dispatching is based on vendor strings and
CPU family and model numbers rather than on CPUID feature bits. This
is fundamentally wrong for several reasons:
* The __gmpn_cpuvec_init function assumes that all processors with
family and model numbers bigger than the currently known processors
support at least the same instruction sets as the ones we have.
I wasn't aware of this. Please be more specific.
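For readers unfamiliar with the identification data under discussion, here is a minimal sketch of how a vendor string and family/model numbers are decoded from raw CPUID register values. The function names are hypothetical and are not taken from GMP's fat.c. The extended-model rule follows Intel's documented convention, and AMD documents a slightly different condition.

```c
#include <stdint.h>

/* Store a 32-bit register value as 4 bytes, least significant first. */
static void put_le32(char *dst, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((v >> (8 * i)) & 0xff);
}

/* Build the 12-character vendor string from CPUID leaf 0.
   The registers are concatenated in the order EBX, EDX, ECX.
   out must have room for 13 bytes. */
void cpu_vendor_string(uint32_t ebx, uint32_t ecx, uint32_t edx, char out[13])
{
    put_le32(out + 0, ebx);
    put_le32(out + 4, edx);
    put_le32(out + 8, ecx);
    out[12] = '\0';
}

/* Display family from CPUID leaf 1 EAX: the extended family field
   is added only when the base family field is 15. */
unsigned cpu_display_family(uint32_t eax)
{
    unsigned family = (eax >> 8) & 0xf;
    if (family == 0xf)
        family += (eax >> 20) & 0xff;
    return family;
}

/* Display model from CPUID leaf 1 EAX (Intel convention): the extended
   model field supplies the high nibble when the base family is 6 or 15. */
unsigned cpu_display_model(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned model = (eax >> 4) & 0xf;
    if (base_family == 6 || base_family == 0xf)
        model += ((eax >> 16) & 0xf) << 4;
    return model;
}
```

A dispatcher keyed on these values compares the vendor string against names such as "GenuineIntel" or "AuthenticAMD" and then switches on family and model.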
* You are making it difficult for new CPU vendors to enter the market
when you put them at a disadvantage by giving them only the generic
Not really "generic", they will still get assembly loop support.
I actually doubt that falling back to the 'features' bits for
unrecognised processors will typically be better than assuming the
processor is similar to the last recognised processor in the same family.
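The fallback described here can be sketched as a table lookup that picks the newest recognised model not exceeding the reported one within the same family. This is only an illustration under assumptions: the table entries, the tuning labels, and the name pick_tuning are hypothetical and do not reproduce GMP's actual tables.

```c
#include <stddef.h>

/* One recognised processor and the code variant tuned for it.
   Entries and labels are illustrative only. */
struct known_cpu {
    unsigned family;
    unsigned model;
    const char *tuning;
};

static const struct known_cpu known_cpus[] = {
    { 6,  15, "core2"    },
    { 6,  26, "nehalem"  },
    { 15,  0, "pentium4" },
};

/* Return the tuning of the highest recognised model <= model in the
   given family, or "generic" when the family has no such entry. */
const char *pick_tuning(unsigned family, unsigned model)
{
    const struct known_cpu *best = NULL;
    size_t n = sizeof known_cpus / sizeof known_cpus[0];

    for (size_t i = 0; i < n; i++) {
        const struct known_cpu *k = &known_cpus[i];
        if (k->family != family || k->model > model)
            continue;               /* wrong family, or newer than this CPU */
        if (best == NULL || k->model > best->model)
            best = k;               /* keep the closest older-or-equal model */
    }
    return best != NULL ? best->tuning : "generic";
}
```

With this table, an unrecognised family 6 model 30 would be treated like model 26, while a family with no entries at all would get the generic code.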
* Systems that use virtualization, emulation or FPGA softcores are
gaining more use. You cannot make any assumptions about vendor strings
and family numbers on such systems.
We cannot? But can we then make any assumptions about anything CPUID
returns? Which specific information from CPUID will be invalid for such
* You are making different code versions for different brands of
processors with the same instruction set. The performance advantage
you can gain by this is minimal at best. The disadvantages are that
the fat binary becomes fatter and there are more versions to test and
This is absolutely false, and shows that you do not appreciate the sort
of optimisation being done in GMP.
The available instructions do not tell much about which instructions are
good to use in GMP, unfortunately.
* The code needs to be updated every time there is a new processor on
the market. Obviously, you don't have the resources for that. Much of
the source code is from 2005 or earlier.
Yes, as new processors come out, we need to make sure they are
recognised and that the existing code works well for them. Fortunately
for us, fundamentally new microarchitectures are rare.
(GMP usually recognises new processors long before they enter the
market, since the CPU manufacturers tell us about their plans and the
assigned CPUID numbers.)
OK, so much of GMP's sources are from 2005 and earlier. What is your
* The time it takes from when you make a change in the source code until
the updated code makes its way through the application software to the end
user is at least one year, and more commonly two or more years. The
specific processor you are optimizing for is likely to be obsolete at
the time your code is running on the end user's computer.
I think you exaggerate somewhat here to prove your point...
I will therefore propose that the CPU dispatching system
(i.e. __gmpn_cpuvec_init) should test only the CPUID feature bits
(MMX, SSE2, SSE3, and so on) and not look at any vendor strings,
family, or model numbers.
If we did that, it would make GMP's performance drop by a large factor.
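For comparison, the feature-bit test proposed in the quoted text would look roughly like the following sketch. The bit positions are the documented CPUID leaf 1 positions; the FEAT_ macros, function names, and variant labels are hypothetical. The sketch selects by instruction set only, so it cannot distinguish microarchitectures that share an instruction set, which is the objection raised in this reply.

```c
#include <stdint.h>

#define FEAT_MMX   (1u << 0)
#define FEAT_SSE   (1u << 1)
#define FEAT_SSE2  (1u << 2)
#define FEAT_SSE3  (1u << 3)

/* Translate the ECX and EDX outputs of CPUID leaf 1 into a feature mask. */
unsigned features_from_leaf1(uint32_t ecx, uint32_t edx)
{
    unsigned f = 0;
    if (edx & (1u << 23)) f |= FEAT_MMX;   /* EDX bit 23: MMX  */
    if (edx & (1u << 25)) f |= FEAT_SSE;   /* EDX bit 25: SSE  */
    if (edx & (1u << 26)) f |= FEAT_SSE2;  /* EDX bit 26: SSE2 */
    if (ecx & (1u << 0))  f |= FEAT_SSE3;  /* ECX bit 0:  SSE3 */
    return f;
}

/* Choose a code variant from the feature mask alone. */
const char *variant_for_features(unsigned f)
{
    if (f & FEAT_SSE2) return "sse2";
    if (f & FEAT_MMX)  return "mmx";
    return "generic";
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Query the running CPU using the GCC/Clang <cpuid.h> helper.
   Returns 0 (no features) if leaf 1 is not available. */
unsigned features_of_this_cpu(void)
{
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return features_from_leaf1(ecx, edx);
}
#endif
```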
You are saying at
that there are VIA specific optimizations in version 5.0.0. Can you
please tell me where they are? There doesn't seem to be support for it
We claim VIA *nano* optimisations. A "find gmp-5.0.0 -name nano" should
help you find some of the relevant code.
The fat support might be lagging somewhat. You need to be more specific
about your config if you want a more specific response.