shoup at cs.nyu.edu
Wed Mar 23 22:07:02 UTC 2016
Thanks! The compile farm may be a good resource. I had access to it in the past. Maybe I still do.
Marc Glisse <marc.glisse at inria.fr> wrote:
>On Wed, 23 Mar 2016, Victor Shoup wrote:
>> This may be a bit off topic, but I figure the people on this list
>> might know something about this.
>> In some code I've been developing lately (NTL related, of course),
>> I've been making more use of the __uint128_t type that is available
>> on gcc (and its clang and icc clones). It's all ifdef'd properly, so I
>> only use it when it actually works.
>> Anyway, I find that on x86-64 machines and recent gcc's, the compiler
>> does a pretty good job of code generation...much better than I recall
>> some years ago. However, I was wondering about the 64-bit ARM
>> machine. I don't have access to such a machine, but I tried some code
>> out at https://gcc.godbolt.org (which is a very convenient site, by
>> the way).
>> I was somewhat surprised that the code generated there by gcc-4.8 for
>> 64-bit ARM was terrible: a 64x64->128 mul gets mapped to
>> a generic 128x128->128 function call.
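
For reference, a minimal test case for the 64x64->128 multiply discussed above might look like the following sketch. The function name `mul_64x64_128` is hypothetical and not taken from NTL, and the `__SIZEOF_INT128__` guard is only one possible way to do the ifdef. On AArch64 one would hope to see this compile to a `mul`/`umulh` pair rather than a library call (presumably libgcc's `__multi3` in the gcc-4.8 case).

```c
#include <stdint.h>

#if defined(__SIZEOF_INT128__)
/* Full 64x64->128 product, returned as high and low 64-bit halves.
   Hypothetical helper for illustration only. */
static void mul_64x64_128(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    /* Widen one operand first so the multiply is done in 128 bits. */
    __uint128_t p = (__uint128_t)a * b;
    *lo = (uint64_t)p;
    *hi = (uint64_t)(p >> 64);
}
#endif
```

Compiling this with `-O2 -S` for the target in question shows whether the multiply is expanded inline or turned into a call.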
>You realize ARM64 barely existed at the time of gcc-4.8? If gcc-5, or
>better yet a snapshot of gcc-6, still generates suboptimal code, please
>report to https://gcc.gnu.org/bugzilla/ with a testcase, and the asm you
>would like gcc to generate instead.
>> So I'm starting to question whether relying on __uint128_t is such a
>> good idea.
>> Maybe it would be better for me to isolate all of that code so that I
>> can just drop in appropriate assembly (as in GMP's longlong.h),
>> as an alternative.
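
One way to isolate the code as described above, sketched under assumptions: a single entry point that uses `__uint128_t` where the compiler provides it and otherwise falls back to portable 32-bit-limb arithmetic. The names `umul_wide`, `umul_wide_portable`, and `HAVE_UINT128` are hypothetical; this is not how NTL or GMP's longlong.h is actually written, and an inline-assembly variant could be added as a further `#if` branch in the same place.

```c
#include <stdint.h>

/* Hypothetical feature macro. gcc and clang define __SIZEOF_INT128__
   on targets where __uint128_t is available. */
#if defined(__SIZEOF_INT128__)
#define HAVE_UINT128 1
#endif

/* Portable fallback: schoolbook multiply on 32-bit halves. */
static void umul_wide_portable(uint64_t a, uint64_t b,
                               uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint64_t p1 = a_lo * b_hi;   /* weight 2^32 */
    uint64_t p2 = a_hi * b_lo;   /* weight 2^32 */
    uint64_t p3 = a_hi * b_hi;   /* weight 2^64 */

    /* Sum of the 2^32 column; at most 3*(2^32-1), so it fits in 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xffffffffu) + (p2 & 0xffffffffu);

    *lo = (mid << 32) | (p0 & 0xffffffffu);
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Single entry point; callers never mention __uint128_t directly. */
static inline void umul_wide(uint64_t a, uint64_t b,
                             uint64_t *hi, uint64_t *lo)
{
#if defined(HAVE_UINT128)
    __uint128_t p = (__uint128_t)a * b;
    *lo = (uint64_t)p;
    *hi = (uint64_t)(p >> 64);
#else
    umul_wide_portable(a, b, hi, lo);
#endif
}
```

With this layout, the choice between the compiler's 128-bit type, assembly, and the portable path is confined to one function, so it can be changed per platform without touching the callers.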
>It is always a compromise...
>> I could also ask gcc people what their plans for future optimizations
>> in this area are, but I don't know who or where to ask.
>You could ask on gcc at gcc.gnu.org, but reporting bugs when you see
>suboptimal code generated seems much more likely to get you answers.
>By showing constructive interest, it may also spark further optimizations.
>If this is for the development of free software, the GCC compile farm
>includes some aarch64 machines on which you could experiment.