Basecase assembly optimisation project

Torbjorn Granlund tg at gmplib.org
Wed Oct 2 19:14:24 CEST 2013


Ondřej Bílka <neleai at seznam.cz> writes:

  I am writing a tool that might be useful, a simple optimizer of assembly
  routines. You need to write a benchmark that measures performance and
  prints elapsed time and assembly file. Currently it has two optimization
  patterns, first is enclosing block of code without control flow
  instructions in
  
  # EVOLVE_START
  code
  # EVOLVE_END
  
  and evolver will try different schedulings to find optimal one.
  
  Second one is when you choose between different instructions, where you 
  write alternatives by following form.
  
  cmp $1, %rax #| cmp $1, %al
  
  A sequences can be done in following way
  
  cmp $1, %rax; ja .foo #| test %eax, %eax; jne .foo
  
  sounds interesting?
  
It sounds a lot like David Harvey's and my tool, which we call the
"loopmixer".  We never wrote it neatly enough for public release, but
we've used it for the last 4 years or so for tweaking the assembly code
of GMP.  (There is a slightly cryptic message in many asm files in GMP
about this.)

We don't handle alternatives currently, except with a loop around the
loopmixer.  One could think of several classes, some known to the tool,
others not-always-valid only by source file annotation.  Examples of the
former would be "xor rax,rax" vs "xor eax,eax" and "mov reg,reg" vs lea
(reg),reg.

For x86-64, CPUs are affected more or less by which register is used,
which could be understood by the tool.  (E.g., rbp and r13 lack he
shorter 0-offset address for; rsp and r12 require an extra 0x24 when
used as base ptrs; r8-r15 require REX prefix.)

Taking all this into account might through us out in the land of
combinatorial explosion.

Keeping software pipeline depth down would also be useful.  Our tool
doesn't understand that.  When one give instruction choices, manual
preferences and automatic preferences would be useful.  Automatic
preferences could be to use a short insn, manual could be to prefer mov
over lea to make code more readable.  (Not that mov is much more
readable, but to avoid a random mix of he two in similar contexts.)

-- 
Torbjörn


More information about the gmp-devel mailing list