div2 and nails

Thu Dec 11 19:11:30 CET 2003

nisse at lysator.liu.se (Niels Möller) writes:

  What's the proper definition of div2 with nails? In particular, how
  should the bits be laid out in the input to div2? Each input number is
  two words. For clarity, I'll assume that we have 32-bit words and 2bit
  nails, generalizations should be easy. Then bits could be distributed
  in at least these three ways:

      high word                        low word
  A. |xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|

     64 significant bits, i.e. all bits in the input are used, just like
     in the non-nail case.

  B. |00xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|00xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|

     60 significant bits, each input word contains 30 bits, and 2 nail
     bits which must be zero.

  C. |0000xxxxxxxxxxxxxxxxxxxxxxxxxxxx|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx|

     60 significant bits. The least significant word gets 32 bits, and
     the most significant word gets 28 bits, and 4 nail bits which are
     zero.

  Alternatively, the "must be zero" part could be replaced with "the
  nail bits are ignored".

  A and C have the drawback that one may need bits from more than two
  limbs to assemble one div2 input word. B has the drawback that the
  layout is incompatible with that used by sub_ddmmss, which is
  blissfully ignorant of any nail usage.

  So what's the Right Thing to do?

We intensionally don't define how all internal interfaces
work when GMP_NAIL_BITS != 0.  Some functions will want to
use the spare bits cleverly, other functions will want to
keep them at zero.

For div2 we might choose to let it work exactly the same
with and without nail.  That's how div2 in the current
mpn/generic/gcdext.c works, I think.  But I suspect that it
could be made to run faster if a special limb version would
be used, with the high bits clear at function entry, i.e.,
proposal B above.  As you have observed, sub_ddmmss can not
be used in this case.

-- 
Torbjörn