mpn_cnd_add_n

Niels Möller nisse at lysator.liu.se
Sun Mar 10 12:39:43 CET 2013


Torbjorn Granlund <tg at gmplib.org> writes:

> OK with me, but either test on powerpc64 before checking in, or keep an
> eye on nightbuild regressions and fix any problem there.

I tested it now, and darn, it did break on powerpc. Register r6 was
hardcoded in a couple of places where the symbolic name "n" should have
been used. With the following additional changes (complete file in the
tree ~nisse/hack/gmp on shell) it appears to work again.

It would be good if someone who actually understands powerpc assembly
could have a look before I check it in. And maybe it would be better to
replace the hardcoded r7 used in the loop with a symbolic name?
Something like

  define(`ulimb', `r7')	C Overlap with n input argument

Regards,
/Niels

--- a/mpn/powerpc64/mode64/aorscnd_n.asm        Sun Mar 10 10:00:12 2013 +0100
+++ b/mpn/powerpc64/mode64/aorscnd_n.asm        Sun Mar 10 12:30:31 2013 +0100
@@ -64,11 +64,11 @@
        subfic  cnd, cnd, 0
        subfe   cnd, cnd, cnd
 
-       rldicl. r0, r6, 0,62    C r0 = n & 3, set cr0
+       rldicl. r0, n, 0,62     C r0 = n & 3, set cr0
        cmpdi   cr6, r0, 2
-       addi    r6, r6, 3       C compute count...
-       srdi    r6, r6, 2       C ...for ctr
-       mtctr   r6              C copy count into ctr
+       addi    n, n, 3         C compute count...
+       srdi    n, n, 2         C ...for ctr
+       mtctr   n               C copy count into ctr
        beq     cr0, L(b00)
        blt     cr6, L(b01)
        beq     cr6, L(b10)
@@ -122,7 +122,7 @@
        b       L(ret)
 
 L(b00):        CLRCB                   C clear/set cy
-L(go): ld      r6, 0(up)       C load s1 limb
+L(go): ld      r7, 0(up)       C load s1 limb
        ld      r27, 0(vp)      C load s2 limb
        ld      r8, 8(up)       C load s1 limb
        ld      r9, 8(vp)       C load s2 limb
@@ -139,8 +139,8 @@
        addi    up, up, 32
        addi    vp, vp, 32
 
-L(top):        ADDSUBC r28, r27, r6
-       ld      r6, 0(up)       C load s1 limb
+L(top):        ADDSUBC r28, r27, r7
+       ld      r7, 0(up)       C load s1 limb
        ld      r27, 0(vp)      C load s2 limb
        ADDSUBC r29, r9, r8
        ld      r8, 8(up)       C load s1 limb
@@ -164,7 +164,7 @@
        and     r0, r0, cnd
        bdnz    L(top)          C decrement ctr and loop back
 
-L(end):        ADDSUBC r28, r27, r6
+L(end):        ADDSUBC r28, r27, r7
        ADDSUBC r29, r9, r8
        ADDSUBC r30, r11, r10
        ADDSUBC r31, r0, r12
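
For readers who do not speak powerpc assembly, the arithmetic in the
hunks above can be sketched in C. This is only a reading aid under
stated assumptions, not the code in the tree: the function names are
hypothetical, limbs are assumed to be 64 bits, only the "add" flavour
of ADDSUBC is modelled, and the loop handles one limb per iteration
where the assembly handles four.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; they mirror the patch's arithmetic only. */

/* rldicl. r0, n, 0,62 : keep the two low bits, r0 = n & 3 */
static uint64_t low_residue(uint64_t n) { return n & 3; }

/* addi n, n, 3 ; srdi n, n, 2 : ctr = ceil(n / 4), four limbs per iteration */
static uint64_t ctr_count(uint64_t n) { return (n + 3) >> 2; }

/* subfic cnd, cnd, 0 ; subfe cnd, cnd, cnd : all ones if cnd != 0, else 0 */
static uint64_t cnd_mask(uint64_t cnd) { return (uint64_t)0 - (uint64_t)(cnd != 0); }

/* Portable reference for the add case: rp = up + (cnd ? vp : 0),
   returning the carry out of the top limb. */
static uint64_t ref_cnd_add_n(uint64_t cnd, uint64_t *rp, const uint64_t *up,
                              const uint64_t *vp, size_t n)
{
    uint64_t mask = cnd_mask(cnd);
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t v = vp[i] & mask;   /* corresponds to: and r0, r0, cnd */
        uint64_t s = up[i] + v;
        uint64_t c1 = s < v;         /* carry from up[i] + v */
        uint64_t r = s + cy;
        uint64_t c2 = r < s;         /* carry from adding the incoming carry */
        rp[i] = r;
        cy = c1 | c2;                /* at most one of c1, c2 can be set */
    }
    return cy;
}
```

Since n is consumed by mtctr before the loop starts, its register is
free afterwards, which is why the loop can reuse r7 for an s1 limb.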



-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

