[patrick.pelissier@gmail.com: Fwd: Optimizing 1/x]
Ashod Nakashian
saghmos at xter.net
Tue Sep 13 23:30:42 CEST 2005
Paul Zimmermann wrote:
> Well, I assume then that the benefits of such a method does *at least*
> pay-off. The next question would be how slower is the next fastest
> non-patented algorithm compared to HP's? Also, although I'm anything but
> a lawyer, I'd be surprised if law forbids using a patented algorithm for
> educational/non-profit purposes. So I assume that one could argue that
> making patch (with the said HP algorithm) available would benefit many
> and the burden of following the patent laws would be on the user, not
> the provider of the code.
>
> Sorry, the HP patent only concerns division (y/x), not inversion (1/x).
> For division, I expect the next fastest non-patented algorithm to be
> about 10% to 20% slower. Also, it will use more memory.
Ok, I undertand that it's a division algorithm, but I though you had
also experimented with inversion, no? If you have, I'm interested in
knowing if there are substantial speed improvements over the current
approach of creating an mpn from the the ui value of 1, and then
dividing it with the divisor. May be even theoretically, you can give me
some advise as if it is worth implementing an inversion algorithm. I
expect to find some benefits from such an approach (a specialized path
for inversion) but in practice there might be some hidden overhead that
trashes the benefits, that's why I need advice before I dive in.
10-20% is not an astronomical difference. 10-20% variations are not
unusual between CPU models and optimization parammeters. It's probably
not even worth chasing a license. But how much more memory? Also 10-20%
or much more?
>
> Have you considered making contributions with your implementation to GMP
> or seperate from GMP in form of patches to GMP?
>
> When it will be finished, I'll submit my implementation to GMP.
>
Great! Looking forward to it. Let me know if I can contribute with some
of my idle cycles, mmm, time to the same cause. I can assembly-optimize
pretty well. I'm actually currently playing with the AMD-64 port of the
assembly modules. Will post a patch when my efforts start paying off.
> Out of curiosity, how much faster is the HP algorithm compared to the
> current GMP division implementation. I assume it's sub linear (or is it
> not?), but in practice is it close to log2(n) + C? Its complexity is
> probably a function in the multiplication algorithm's complexity.
>
> -Ash
>
> The current GMP division uses a recursive algorithm for large operands,
> with complexity O(M(n) log(n)), whereas Newton's division has complexity
> O(M(n)), so we can expect a more-than-constant speedup once Newton's division
> is in GMP.
>
A substantial gain, I'd say. So basically we will be left with the Mul
complexity, which floats at about (IIRC) O(n^1.125) for numbers with
larger-than-fft-threshold bits.
> Paul
>
>
More information about the gmp-discuss
mailing list