div2 and nails
Torbjorn Granlund
tg at swox.com
Thu Dec 11 19:11:30 CET 2003
nisse at lysator.liu.se (Niels Möller) writes:
What's the proper definition of div2 with nails? In particular, how
should the bits be laid out in the input to div2? Each input number is
two words. For clarity, I'll assume that we have 32-bit words and 2bit
nails, generalizations should be easy. Then bits could be distributed
in at least these three ways:
high word low word
A. |xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|
64 significant bits, i.e. all bits in the input are used, just like
in the non-nail case.
B. |00xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|00xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|
60 significant bits, each input word contains 30 bits, and 2 nail
bits which must be zero.
C. |0000xxxxxxxxxxxxxxxxxxxxxxxxxxxx|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|
60 significant bits. The least significant word gets 32 bits, and
the most significant word gets 28 bits, and 4 nail bits which are
zero.
Alternatively, the "must be zero" part could be replaced with "the
nail bits are ignored".
A and C have the drawback that one may need bits from more than two
limbs to assemble one div2 input word. B has the drawback that the
layout is incompatible with that used by sub_ddmmss, which is
blissfully ignorant of any nail usage.
So what's the Right Thing to do?
We intensionally don't define how all internal interfaces
work when GMP_NAIL_BITS != 0. Some functions will want to
use the spare bits cleverly, other functions will want to
keep them at zero.
For div2 we might choose to let it work exactly the same
with and without nail. That's how div2 in the current
mpn/generic/gcdext.c works, I think. But I suspect that it
could be made to run faster if a special limb version would
be used, with the high bits clear at function entry, i.e.,
proposal B above. As you have observed, sub_ddmmss can not
be used in this case.
--
Torbjörn
More information about the gmp-devel
mailing list