d.harvey at unsw.edu.au
Wed Oct 5 02:49:00 CEST 2011
On Oct 3, 2011, at 11:23 PM, Niels Möller wrote:
> I've merged David's mulmid implementation (based on the June 9 revision
> in the shell:~dmharvey/projects/gmp/mulmid repository). As far as I
> understood, the paper work is in order for that contribution?
> I've had a quick look through the code, and I've spent a few hours
> reviewing changes and writing reasonably detailed ChangeLog entries.
> Some questions and (mostly minor) comments:
> * mpn_toom42_mulmid_itch: Should this be a macro, real function, or
> inline function? Currently it's a macro, but with a lower-case name
> like a function.
It should probably be a macro, I had it formatted the same way as mpn_toom22_mul_itch().
> * There's code in try.c (which I'm not very familiar with) to test the
> code, but it is not exercised by a regular make check.
Yes. I thought the idea of try.c is to run more exhaustive tests on the code, whereas make check just tries a few corner cases that have been known to cause problems on some systems.
> * In the testing code, the functions refmpn_toom42_mulmid and
> refmpn_mulmid are defined, as wrappers around
> refmpn_mulmid_basecase, but as far as I have seen, they're unused.
Hmmmm. I'm not quite sure what I was thinking when I did that.
Now when I look at again, it seems we should only really need the general case refmpn_mulmid(). I wrote a comment there:
/* FIXME: this could be made faster by using refmpn_mul and then subtracting
off products near the middle product region boundary */
If we have enough trust in refmpn_mul, and if refmpn_mul is "asymptotically fast", then that is the simplest solution.
The only use I can see for the unused functions would be to have matching interfaces, e.g. refmpn_toom42_mulmid accepts a scratch parameter. Otherwise I think it's safe to remove them.
> * In try.c, there's one change related to the handling of
> SIZE_CEIL_HALF, which is not obviously related to the new mulmid
> code. I havn't tried to figure out if this is correct or what the
> problem was.
Previously, SIZE_CEIL_HALF was used only for testing mpn_sqrtrem(), as a measure of the size of the output. But for mulmid, it sometimes uses SIZE_CEIL_HALF as the size of one of the inputs. The new code handles that.
> * In the assembly files, I think the instruction
> shr $1, %al C restore carry
> can be replaced by
> shr %al C restore carry
> for a bit more compact object code.
Hmmm. Next time I will keep that in mind, I didn't realise the object code is different. I'm slightly worried that such a change might affect the speed, these processors can be quite fickle. For now I suggest you change all the occurrences that don't occur in the inner loops. Unless you have time to check the inner loop performance, I suggest leaving the ones in the inner loops for now (perhaps add a comment to this effect).
> * I seem to remember that an earlier incarnation of the mulmid
> implementation used a mullo function which returned two limbs more
> than the current mullo. Is that obsolete now?
I wrote an implementation of bdiv_q that used that sort of mullo, maybe that's what you are remembering. But it was never part of the mulmid code.
> * There's currently no code using mulmid. Did you have an
> implementation of invert.c or binvert.c using mulmid? If so, what's
> the status?
Status is that I haven't thought about it for a while! I did write a rough invert_appr(), but it was mainly for profiling purposes, I don't think I ever worked through all the error estimates carefully. It's not a difficult algorithm, it's probably easier to start from scratch than to figure out exactly how that code worked (or failed to work).
Definitely I agree invert_appr() is one of the first things we should try as an application of the mulmid code. I believe Paul Zimmermann is interested in this too.
More information about the gmp-devel