binvert_limb speedup on 64 bit machines with UHWtype

Torbjörn Granlund tg at gmplib.org
Sun Feb 27 15:42:53 CET 2022


John Gatrell <gatrelljm at gmail.com> writes:

  I think you missed why the 0x7F is unnecessary. If you start with 8 bits
  and divide by 2 then the top bit must become zero. gcc does this itself and
  suppresses the 0x7F. So this idea will help other compilers start with
  8-bits to achieve the same. The same trick can be used in hand-crafted
  assembler implementations.

You misread the code; n is a full limb, not an 8-bit value, so n/2 still
has bits set above the low 7.  Masking by 0x7F is absolutely necessary!

Replacing the bitwise AND masking with a cast to "unsigned char" is not
portable.  (But swapping the division and the mask operation, adjusting
the mask accordingly, is semantically equivalent to the original code.
As I said in my previous reply, this might help some compiler.)
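
A minimal sketch of the variants discussed above, for illustration only.
It is not the GMP source: limb_t is an assumed stand-in for a 64-bit
mp_limb_t, and all function names are hypothetical.  It shows the index
computed by dividing and then masking with 0x7F, the swapped form that
masks with 0xFF first, the "unsigned char" cast variant (which matches
only where CHAR_BIT is 8), and what is left if the mask is dropped.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: a 64-bit limb, standing in for GMP's mp_limb_t. */
typedef uint64_t limb_t;

/* Divide the full limb first, then keep the low 7 bits. */
static inline unsigned index_div_then_mask(limb_t n)
{
    return (unsigned) ((n / 2) & 0x7F);
}

/* Swapped order: keep the low 8 bits first, then divide.
   Both forms select bits 1..7 of n, so they always agree. */
static inline unsigned index_mask_then_div(limb_t n)
{
    return (unsigned) ((n & 0xFF) / 2);
}

/* Cast variant: the conversion reduces n modulo 2^CHAR_BIT, so this
   equals the forms above only on platforms where CHAR_BIT == 8. */
static inline unsigned index_cast_then_div(limb_t n)
{
    return (unsigned) ((unsigned char) n / 2);
}

/* No mask at all: the high bits of the limb survive the division,
   so the result can be far outside the range 0..127. */
static inline limb_t index_unmasked(limb_t n)
{
    return n / 2;
}

/* Returns 1 when the two portable forms give the same index. */
static inline int indices_agree(limb_t n)
{
    return index_div_then_mask(n) == index_mask_then_div(n);
}
```

For n = 0x101, the masked forms give index 0, while the unmasked
division gives 0x80, which is out of range for a 128-entry table.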

-- 
Torbjörn
Please encrypt, key id 0xC8601622

