68000 issue in longlong.h

Niels Möller nisse at lysator.liu.se
Thu Mar 4 08:26:46 UTC 2021


"selco at t-online.de" <selco at t-online.de> writes:

>>I disabled the asm mult16x16 to hunt the bug and with the generic
>> version it ran well. So the problem had to be the asm.
>>Here I compared the generated asm: http://franke.ms/cex/z/oG53bK and
>> you see that only one register is used instead of two, since the
>> modification is not recognized.
>>
>>/* here --> */             "=d" (__umul_tmp1),
>>
>>to
>>
>>/* here --> */             "=&d" (__umul_tmp1),
>>
>>does the magic.

To be clear, the meaning of this &, according to the docs, is to tell
gcc that the output register assigned to __umul_tmp1 must not overlap
any of the inputs. If I read it correctly, __umul_tmp1 is %2 in the asm
template, and the b input is %5. I've forgotten most of what I knew
about 68k assembly, but it looks to me like %5 is used twice, and %2 is
used in between, which could be a problem if they're assigned the same
register. But I'm not sure how that would interact with
"%2" ((USItype)(a)), which, if I get it right, forces this input to be
allocated in the same register as the __umul_tmp1 output.

The sqr_basecase function has a couple of calls of the form
umul_ppmm(rp[11], lpl, ul, ul), where a and b are the same value, so my
best guess is that we get all *three* of a, b, __umul_tmp1 allocated in
the same register.
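
The aliasing guessed at above can be illustrated with a toy model in
plain C. This is only a sketch, not the longlong.h template: the names
toy_umul_ppmm, swap16, square_distinct and square_aliased are
hypothetical, and pointers stand in for registers so that two operands
can be made to share storage, as a plain "=d" constraint would permit.

```c
#include <stdint.h>

/* Exchange the two 16-bit halves of a 32-bit word (like 68k "swap"). */
static uint32_t swap16(uint32_t x)
{
    return (x << 16) | (x >> 16);
}

/*
 * Toy 32x32 -> 64 multiply built from 16x16 products (hypothetical,
 * not the real asm).  t1 stands for the tied operand: it holds a on
 * entry and is then used as scratch.  b stands for the second input,
 * which is read both before and after t1 is modified.
 */
static uint64_t toy_umul_ppmm(uint32_t *t1, const uint32_t *b)
{
    uint32_t a_lo = *t1 & 0xffff;
    uint32_t b_lo = *b & 0xffff;          /* first read of b */
    uint32_t b_hi = *b >> 16;
    uint32_t lo   = a_lo * b_lo;
    uint32_t mid1 = a_lo * b_hi;

    *t1 = swap16(*t1);                    /* t1 is written here */
    uint32_t a_hi = *t1 & 0xffff;
    uint32_t hi   = a_hi * b_hi;
    uint32_t mid2 = a_hi * (*b & 0xffff); /* second read of b */

    return ((uint64_t)hi << 32) + (((uint64_t)mid1 + mid2) << 16) + lo;
}

/* a and b in separate "registers": what an earlyclobber would enforce. */
static uint64_t square_distinct(uint32_t u)
{
    uint32_t t1 = u, b = u;
    return toy_umul_ppmm(&t1, &b);
}

/* a, b and the scratch all in one "register": the suspected bad case. */
static uint64_t square_aliased(uint32_t u)
{
    uint32_t r = u;
    return toy_umul_ppmm(&r, &r);
}
```

In this model, square_distinct(u) equals (uint64_t)u * u, while
square_aliased(u) generally does not, because the second read of b
sees the already swapped value. Whether the real template fails in
the same way would still need to be confirmed from the generated code.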

If you could show the generated code (after gcc's register allocation)
*and* point out precisely where things go wrong, that would be helpful.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

