[PATCH] Improve System z support and add some tuning
krebbel at linux.vnet.ibm.com
Fri Oct 7 09:48:05 CEST 2011
On 10/04/2011 12:53 AM, Torbjorn Granlund wrote:
> Awaiting paperwork, I committed basic s390x support to the mainline
> repository. I could do this work after I got an emulator up and running
> on a system here.
As David already told you, there are mainframe virtual machines available for developers:
> (2) Write some more critical assembly routines, at least submul_1 and
> invert_limb. (I assume the latter will beat division, that in fact
> division instructions should never ever be used, just like on
> x86_64. But I don't know the quotient time(dlgr)/time(mlgr), which
> basically is what determines this.)
I can help with this after finishing the paperwork. Unfortunately, this will take a while.
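To make the invert_limb idea concrete, here is a minimal sketch in portable C of
division by a precomputed inverse, scaled down to 32-bit limbs. It is not GMP's
code: the names invert_limb32 and div_preinv32 are hypothetical, and the one
division in invert_limb32 stands in for whatever setup a real invert_limb would
use. The point is only that, once the inverse is known, each division step costs
one widening multiply plus a few adjustments, which is why the ratio
time(dlgr)/time(mlgr) decides whether this pays off.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit analogue of invert_limb: for a normalized divisor d
   (top bit set), return floor((2^64 - 1) / d) - 2^32. */
static uint32_t invert_limb32(uint32_t d)
{
    assert(d & 0x80000000u);
    return (uint32_t)(UINT64_MAX / d - ((uint64_t)1 << 32));
}

/* Divide the two-limb value (u1,u0) by d using the precomputed inverse v.
   Requires d normalized and u1 < d, so the quotient fits in one limb.
   Uses one 32x32->64 multiply and no divide instruction. */
static uint32_t div_preinv32(uint32_t *rem, uint32_t u1, uint32_t u0,
                             uint32_t d, uint32_t v)
{
    assert(u1 < d);
    uint64_t q = (uint64_t)v * u1;           /* (q1,q0) = v * u1          */
    q += ((uint64_t)u1 << 32) | u0;          /* (q1,q0) += (u1,u0)        */
    uint32_t q1 = (uint32_t)(q >> 32) + 1;   /* candidate quotient        */
    uint32_t q0 = (uint32_t)q;
    uint32_t r = u0 - q1 * d;                /* remainder modulo 2^32     */
    if (r > q0) { q1--; r += d; }            /* candidate was one too big */
    if (r >= d) { q1++; r -= d; }            /* rare second correction    */
    *rem = r;
    return q1;
}
```

With 64-bit limbs the same steps would need a 64x64->128 multiply in place of
the uint64_t product.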
> (3) Improve inline assembly 32-bit support for processors with support
> for MLR/ALR/ALCR etc. I notice that you made an effort along these
> lines, but I am not sure it was done right. At least, the gcc on my
> Debian system does not define the predef your code relies on.
mlr/dlr have been available since z900 in both ESA and z/Architecture mode. GCC does not
define a macro for each CPU level. However, these instructions can be used for the -m31
-mzarch mode in longlong.h, since the __zarch__ macro is defined there and zarch requires
at least z900. So with my patch the instructions are used for the s390x ABI=32 build.
You are right, in fact they could be used for s390 -march=z900 as well. But since s390 is
only rarely used anymore, I think the ABI=32 s390x build should always be used for 32-bit
code, so I would like to focus on this one regarding optimizations.
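As a rough illustration of the __zarch__ guard, the sketch below selects a
widening multiply only when the compiler defines __zarch__ and otherwise falls
back to 16-bit halves. This is not the longlong.h code, which uses inline
assembly with mlr; the names umul_ppmm32, HAVE_WIDENING_MUL and
check_umul_ppmm32 are made up for the example, and whether the compiler actually
emits mlr for the C multiply is an assumption. Both branches compute the same
result, so it builds and runs on any target.

```c
#include <assert.h>
#include <stdint.h>

/* __zarch__ is defined by GCC under -mzarch, which requires at least z900,
   so a 32x32->64 multiply is available even with the 31-bit ABI (-m31). */
#if defined(__zarch__)
#define HAVE_WIDENING_MUL 1   /* hypothetical helper macro */
#else
#define HAVE_WIDENING_MUL 0
#endif

/* Compute the 64-bit product a*b as (hi, lo). */
static void umul_ppmm32(uint32_t *hi, uint32_t *lo, uint32_t a, uint32_t b)
{
#if HAVE_WIDENING_MUL
    /* Assumption: the compiler can map this onto a single widening multiply. */
    uint64_t p = (uint64_t)a * b;
    *hi = (uint32_t)(p >> 32);
    *lo = (uint32_t)p;
#else
    /* Generic fallback: schoolbook multiplication on 16-bit halves. */
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;
    uint32_t ll = al * bl, lh = al * bh, hl = ah * bl, hh = ah * bh;
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);
    *lo = (mid << 16) | (ll & 0xFFFFu);
    *hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
#endif
}

/* Usage: compare against the plain 64-bit product. */
static void check_umul_ppmm32(uint32_t a, uint32_t b)
{
    uint32_t hi, lo;
    umul_ppmm32(&hi, &lo, a, b);
    assert((((uint64_t)hi << 32) | lo) == (uint64_t)a * b);
}
```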
> (4) Should we perhaps use 64-bit limbs for the 31-bit ABI, when using a
> 64-bit processor? As far as I understand, this should work, and it
> would run much faster. (This would be akin to the N32 MIPS ABI and
> the HPPA 2.0N ABI.)
Yes, I also thought about this but haven't implemented it yet. GCC already (since 4.6.0)
uses 64-bit registers in 32-bit code when compiling with -m31 -mzarch. This requires the
kernel to save/restore the upper 32 bits of the 64-bit registers when doing signal
handling. Here is a link to the GCC patch for the gory details:
> (I noticed that the gcc compiler port leaves a lot to be desired; it
> generates poor code in ways that hurt GMP. Specifically, it generates
> division instructons for X/C for constants C. By adding a umulditi3
> pattern to gcc/config/s390.md, this would be fixed.)
Fixed in GCC mainline: http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00546.html
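For readers unfamiliar with the transformation: once a widening multiply is
available, the compiler can replace X/C by a multiply with a precomputed
reciprocal and a shift. A minimal sketch for C = 10 follows, scaled down to
32-bit operands so it stays portable; the function name div10 is only for
illustration. The 64-bit case works the same way but needs the high half of a
64x64->128 product, which is what the umulditi3 pattern mentioned above would
provide.

```c
#include <assert.h>
#include <stdint.h>

/* x / 10 without a divide instruction.
   0xCCCCCCCD = ceil(2^35 / 10); the product needs 64 bits, and shifting
   right by 35 yields floor(x / 10) for every 32-bit x. */
static uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Usage: the result agrees with the ordinary division operator. */
static void check_div10(uint32_t x)
{
    assert(div10(x) == x / 10u);
}
```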