[PATCH] Improve System z support and add some tuning

Andreas Krebbel krebbel at linux.vnet.ibm.com
Fri Oct 7 09:48:05 CEST 2011

Hi Torbjorn,

On 10/04/2011 12:53 AM, Torbjorn Granlund wrote:
> Awaiting paperwork, I committed basic s390x support to the mainline
> repository.  I could do this work after I got an emulator up and running
> on a system here.

Great! Thanks!

As David already told you, there are mainframe virtual machines available for developers:

> (2) Write some more critical assembly routines, at least submul_1 and
>     invert_limb.  (I assume the latter will beat division, that in fact
>     division instructions should never ever be used, just like on
>     x86_64.  But I don't know the quotient time(dlgr)/time(mlgr), which
>     basically is what determines this.)

I can help with this after finishing the paperwork. This unfortunately will take a while.
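
To make the invert_limb idea concrete, here is a minimal sketch of division by a
precomputed reciprocal. It uses 32-bit limbs so that plain 64-bit C arithmetic can stand
in for the widening multiply. The names invert_limb32 and udiv_qrnnd_preinv32 are made up
for this illustration and this is not GMP's actual code. It assumes d is normalized (high
bit set) and u1 < d.

```c
#include <assert.h>
#include <stdint.h>

/* Reciprocal of a normalized 32-bit divisor d (high bit set):
   v = floor((2^64 - 1) / d) - 2^32.  Computed once per divisor. */
static uint32_t invert_limb32(uint32_t d)
{
    assert(d & 0x80000000u);
    return (uint32_t)(UINT64_MAX / d - ((uint64_t)1 << 32));
}

/* Divide the two-limb value (u1,u0) by d using the reciprocal v.
   Requires u1 < d so that the quotient fits in one limb.
   Returns the quotient and stores the remainder in *r. */
static uint32_t udiv_qrnnd_preinv32(uint32_t *r, uint32_t u1, uint32_t u0,
                                    uint32_t d, uint32_t v)
{
    assert(u1 < d);
    /* Quotient estimate: one widening multiply plus a two-limb add. */
    uint64_t q = (uint64_t)v * u1 + (((uint64_t)u1 << 32) | u0);
    uint32_t q1 = (uint32_t)(q >> 32) + 1;
    uint32_t q0 = (uint32_t)q;
    /* Candidate remainder, computed modulo 2^32. */
    uint32_t rem = u0 - q1 * d;
    if (rem > q0) {   /* estimate was one too large */
        q1--;
        rem += d;
    }
    if (rem >= d) {   /* rare final adjustment */
        q1++;
        rem -= d;
    }
    *r = rem;
    return q1;
}
```

After the one-time inversion, each quotient limb costs one widening multiply plus a few
adds and compares, which is why the time(dlgr)/time(mlgr) ratio decides whether this pays off.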

> (3) Improve inline assembly 32-bit support for processors with support
>     for MLR/ALR/ALCR etc.  I notice that you made an effort along these
>     lines, but I am not sure it was done right.  At least, the gcc on my
>     Debian system does not define the predef your code relies on.

mlr/dlr have been available since z900 in both ESA and z/Architecture mode. GCC does not
define a macro for each CPU level. However, the instructions can be used for the -m31 -mzarch
mode in longlong.h, since the __zarch__ macro is defined there and zarch requires at least
z900. So with my patch the instructions are used for the s390x ABI=32 build.
You are right, in fact they could be used for s390 -march=z900 as well. But since s390 is
only rarely used anymore, I think the ABI=32 s390x build should always be used for 32-bit
code, so I would like to focus on this one regarding optimizations.
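
A minimal sketch of the predefine check described above, assuming GCC's usual s390 macros
(__s390__, __s390x__, __zarch__). The enum and the function names build_mode and
can_use_mlr are hypothetical and only illustrate the decision. The real longlong.h does
this with preprocessor conditionals around inline assembly.

```c
/* Build flavours that can be told apart from GCC's predefined macros. */
enum s390_mode { NOT_S390, S390_ESA, S390_31BIT_ZARCH, S390X_64BIT };

static enum s390_mode build_mode(void)
{
#if defined(__s390x__)
    return S390X_64BIT;          /* 64-bit ABI */
#elif defined(__s390__) && defined(__zarch__)
    return S390_31BIT_ZARCH;     /* -m31 -mzarch, i.e. s390x ABI=32 */
#elif defined(__s390__)
    return S390_ESA;             /* -m31 without -mzarch */
#else
    return NOT_S390;
#endif
}

/* mlr/dlr need z900 or later.  __zarch__ guarantees that; a plain ESA
   build might also target z900 (-march=z900), but no predefined macro
   says so, so this sketch stays conservative there. */
static int can_use_mlr(void)
{
    enum s390_mode m = build_mode();
    return m == S390_31BIT_ZARCH || m == S390X_64BIT;
}
```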

> (4) Should we perhaps use 64-bit limbs for the 31-bit ABI, when using a
>     64-bit processor?  As far as I understand, this should work, and it
>     would run much faster.  (This would be akin to the N32 MIPS ABI and
>     the HPPA 2.0N ABI.)

Yes, I also thought about this but haven't implemented it yet. GCC has (since 4.6.0) been
using 64-bit registers in 32-bit code when compiling with -m31 -mzarch. This requires the
kernel to save/restore the upper 32 bits of the 64-bit registers when doing signal
handling. Here is a link to the GCC patch for the gory details:


> (I noticed that the gcc compiler port leaves a lot to be desired; it
> generates poor code in ways that hurt GMP.  Specifically, it generates
> division instructions for X/C for constants C.  By adding a umulditi3
> pattern to gcc/config/s390.md, this would be fixed.)

Fixed in GCC mainline: http://gcc.gnu.org/ml/gcc-patches/2011-10/msg00546.html
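
For context, this is the kind of transformation a 64x64->128 widening multiply makes
possible: X/C for a constant C becomes a multiply by a precomputed reciprocal plus a
shift. A minimal sketch for C = 10, using the unsigned __int128 extension of GCC and Clang
to stand in for the widening multiply. The function name div10 is made up here.

```c
#include <stdint.h>

/* x / 10 without a division instruction.
   0xCCCCCCCCCCCCCCCD = ceil(2^67 / 10), so the quotient is the 128-bit
   product shifted right by 67 bits (take the high word, then 3 more). */
static uint64_t div10(uint64_t x)
{
    unsigned __int128 p = (unsigned __int128)x * 0xCCCCCCCCCCCCCCCDull;
    return (uint64_t)(p >> 64) >> 3;
}
```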


