udiv_qr_3by2 vs divappr
Niels Möller
nisse at lysator.liu.se
Mon Aug 27 19:46:30 UTC 2018
paul zimmermann <Paul.Zimmermann at inria.fr> writes:
> page 1: the division instruction is now much faster than before on modern
> processors
According to https://gmplib.org/~tege/x86-timing.pdf, they're still an
order of magnitute slower than multiplication. E.g 86 vs 3 cycles on
Intel skylake. And in addition, multiplication is piplined, division
likely isn't.
Do you have any suggestion for better wording than "vastly slower" ?
> page 2: sigificant -> significant
Fixed.
> page 3, line 2 of Section 3, remove "The output"
Fixed.
> page 3, line 5 of Section 3: <d_1, d_0 0> -> <d_1, d_0, 0>
I think <d_1, d_0> is right.
> line 6: correesponds -> corresponds
Fixed.
> page 3, The algorithm: <d_1, d_0 0> -> <d_1, d_0, 0>
Should be <d_1, d_0>, as above.
> step 3: u_2 -> u_1, u_1 -> u_0
Fixed.
> page 4, line 5: d_1 - d_0 -> d_0 - d_1
Fixed.
> page 4, line 7: inputes -> inputs
Fixed.
> page 4, line 13: the the -> the
Fixed.
> page 4, line 20: u_1 -> u_0
Fixed.
> page 4, line 21: The last terms -> term
Fixed.
> page 4, line 27: for the upper bound, I agree you can subtract the maximal
> value of the u_0 term in the proof of Theorem 3 from [4], but this gives
> \beta-1, not \beta.
This is a bit subtle, maybe I should explicitly repeat the proof form
[4] with needed modifications. My argument is that the proof in [4] is
of the form
\tilde r = ... + u0 < ... + \beta <= c = max (\beta^2 - D, q_0 \beta)
But if u_0 < \beta is the only thing that lets us conclude that \tilde r
< c, with strict inequality, then there's an off-by-one error here.
Now there are multiple bounds involved; to get \tilde r = c we'd also
need something like u_1, u_0 and q_0 maximal and d_0 = 0.
> page 4: "it follows that r_1 + d_1 + 1 > \beta". I do not agree, take for
> example u_1 = u_0 = 0, d_1 = 128, d_0 = 1 with \beta = 256. You get r_1 = 127,
> and r_1 + d_1 + 1 = 256 = \beta. But r_1 + d_1 + 1 >= \beta is enough for the
> rest of the proof.
Good catch, your example seems to be correct.
> page 4, line -2: on line 8 -> on line 9
Fixed.
> page 5, line 6: \beta r_1 + d_0 -> \beta r'_1 + d_0
Fixed.
> For the case R' < 0, you did not prove the lower bound for r'_1 <= d_1 - 2,
> and did not prove the upper bound for r'_1 >= d_1 - 1.
I probably thought they were "trivial", but I have to double check, and
say something about it.
> Case R' >= 0: if the -\beta term is replaced by -(\beta-1), then all I can get
> is r_1 <= \beta - d_1 - 1.
That is problematic, for this case, we really need that r_1 + (d_1 + 1)
doesn't overflow. I think we need to strengthen the proof to get a
-\beta term, and possibly single out any problematic special cases.
> I wonder whether we can get rid of that term, or always subtract -1.
> In the latter case (assuming the proof with the p_0 > 0 extra term is correct),
> the only problem would be when p_0 = 0.
I have looked at that, so far without success. If I remember correctly,
always subtracting -1 can be treated in the analysis as extending the
single-word range for r_0 to 0 <= r_0 <= \beta. But that breaks the
analysis of the R' < 0 case. We have
{r_1, r_0} >= q_0 \beta
We want to conclude r_1 >= q_0, but that no longer holds if we allow r_0
= \beta. To make it work, we'd need to streighten the lower bound
R' >= q_0 \beta - \beta^2
Hmm... but from another look at the proof from [4], seems to give us
strict inequality
R' > q_0 \beta - \beta^2
with no extra effort. So maybe we can do that.
So we really should redo the proof from [4] in the current setting, and
prove the bounds we need. Which seems to be
R' < max (\beta^2 - D, q_0 \beta) - \beta
R' >= - D
R' > q_0 \beta - \beta^2
While the current arguments don't rule out equality.
Thanks for the careful review! Updated version (with all the things that
were easy to fix) at
https://www.lysator.liu.se/~nisse/misc/schoolbook-divappr.pdf
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list