mulmid

Wed Oct 5 02:49:00 CEST 2011

On Oct 3, 2011, at 11:23 PM, Niels Möller wrote:

> Hello,
> 
> I've merged David's mulmid implementation (based on the June 9 revision
> in the shell:~dmharvey/projects/gmp/mulmid repository). As far as I
> understood, the paper work is in order for that contribution?

Yes.

> I've had a quick look through the code, and I've spent a few hours
> reviewing changes and writing reasonably detailed ChangeLog entries.

Thanks.

> Some questions and (mostly minor) comments:
> 
>  * mpn_toom42_mulmid_itch: Should this be a macro, real function, or
>    inline function? Currently it's a macro, but with a lower-case name
>    like a function.

It should probably be a macro, I had it formatted the same way as mpn_toom22_mul_itch().

>  * There's code in try.c (which I'm not very familiar with) to test the
>    code, but it is not exercised by a regular make check.

Yes. I thought the idea of try.c is to run more exhaustive tests on the code, whereas make check just tries a few corner cases that have been known to cause problems on some systems.

>  * In the testing code, the functions refmpn_toom42_mulmid and
>    refmpn_mulmid are defined, as wrappers around
>    refmpn_mulmid_basecase, but as far as I have seen, they're unused.

Hmmmm. I'm not quite sure what I was thinking when I did that.

Now when I look at again, it seems we should only really need the general case refmpn_mulmid(). I wrote a comment there:

/* FIXME: this could be made faster by using refmpn_mul and then subtracting
   off products near the middle product region boundary */

If we have enough trust in refmpn_mul, and if refmpn_mul is "asymptotically fast", then that is the simplest solution.

The only use I can see for the unused functions would be to have matching interfaces, e.g. refmpn_toom42_mulmid accepts a scratch parameter. Otherwise I think it's safe to remove them.

>  * In try.c, there's one change related to the handling of
>    SIZE_CEIL_HALF, which is not obviously related to the new mulmid
>    code. I havn't tried to figure out if this is correct or what the
>    problem was.

Previously, SIZE_CEIL_HALF was used only for testing mpn_sqrtrem(), as a measure of the size of the output. But for mulmid, it sometimes uses SIZE_CEIL_HALF as the size of one of the inputs. The new code handles that.

>  * In the assembly files, I think the instruction
> 
> 	shr	$1, %al		C restore carry
> 
>    can be replaced by
> 
>        shr	%al		C restore carry
> 
>    for a bit more compact object code.

Hmmm. Next time I will keep that in mind, I didn't realise the object code is different. I'm slightly worried that such a change might affect the speed, these processors can be quite fickle. For now I suggest you change all the occurrences that don't occur in the inner loops. Unless you have time to check the inner loop performance, I suggest leaving the ones in the inner loops for now (perhaps add a comment to this effect).

>  * I seem to remember that an earlier incarnation of the mulmid
>    implementation used a mullo function which returned two limbs more
>    than the current mullo. Is that obsolete now?

I wrote an implementation of bdiv_q that used that sort of mullo, maybe that's what you are remembering. But it was never part of the mulmid code.

>  * There's currently no code using mulmid. Did you have an
>    implementation of invert.c or binvert.c using mulmid? If so, what's
>    the status?

Status is that I haven't thought about it for a while! I did write a rough invert_appr(), but it was mainly for profiling purposes, I don't think I ever worked through all the error estimates carefully. It's not a difficult algorithm, it's probably easier to start from scratch than to figure out exactly how that code worked (or failed to work).

Definitely I agree invert_appr() is one of the first things we should try as an application of the mulmid code. I believe Paul Zimmermann is interested in this too.

david