mulmid

Wed Oct 5 11:26:10 CEST 2011

David Harvey <d.harvey at unsw.edu.au> writes:

> Yes. I thought the idea of try.c is to run more exhaustive tests on
> the code, whereas make check just tries a few corner cases that have
> been known to cause problems on some systems.

I think make check (and the test programs it runs) is intended to run
tests as extensive as possible in a reasonable time. Some rather subtle
bugs have been found by running, e.g.,

  while GMP_CHECK_RANDOMIZE=0 ./t-foo ; do true ; done

overnight. Also, make check is what the nightly builds run, I think.

As I understand try.c, it's intended to detect writes outside of the
intended areas, and not primarily to detect other miscomputations. Some
of the make check tests also use marker limbs before and after certain
areas to detect overwrites, but try.c does that a lot more
systematically.
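
To make the marker-limb idea concrete, here is a minimal sketch. It is
not the code from try.c or from any of the make check programs; limb_t,
GUARD_LIMBS, GUARD_VALUE, run_with_guards and the two copy routines are
made-up names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t limb_t;      /* stand-in for mp_limb_t (assumption) */
#define GUARD_LIMBS 2         /* hypothetical number of markers per side */
#define GUARD_VALUE ((limb_t) 0xDEADBEEFCAFEF00DULL)

/* Routine under test: expected to write exactly n limbs at rp. */
typedef void (*limb_fn) (limb_t *rp, const limb_t *ap, size_t n);

/* Run f on an output area fenced by marker limbs on both sides.
   Returns 1 if all markers survive, 0 if f wrote outside {rp, n},
   and -1 if the allocation failed. */
int
run_with_guards (limb_fn f, const limb_t *ap, size_t n)
{
  size_t total = n + 2 * GUARD_LIMBS;
  limb_t *buf;
  size_t i;
  int ok = 1;

  assert (f != NULL);
  buf = malloc (total * sizeof (limb_t));
  if (buf == NULL)
    return -1;

  /* Fill everything with the marker, so stale data can't hide a bug. */
  for (i = 0; i < total; i++)
    buf[i] = GUARD_VALUE;

  f (buf + GUARD_LIMBS, ap, n);

  /* Check the markers below and above the output area. */
  for (i = 0; i < GUARD_LIMBS; i++)
    if (buf[i] != GUARD_VALUE || buf[GUARD_LIMBS + n + i] != GUARD_VALUE)
      ok = 0;

  free (buf);
  return ok;
}

/* Well-behaved example: copies n limbs. */
void
copy_ok (limb_t *rp, const limb_t *ap, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    rp[i] = ap[i];
}

/* Buggy example: writes one limb past the output area. */
void
copy_overrun (limb_t *rp, const limb_t *ap, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    rp[i] = ap[i];
  rp[n] = 0;                  /* stray write, lands on a marker */
}
```

With this, run_with_guards (copy_ok, a, n) reports success, while
copy_overrun is caught because it clobbers the first marker above the
output area. A stray write that happens to store GUARD_VALUE itself
would go unnoticed, which is one reason a more systematic tool is
still useful.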

> /* FIXME: this could be made faster by using refmpn_mul and then subtracting
>    off products near the middle product region boundary */
>
> If we have enough trust in refmpn_mul, and if refmpn_mul is
> "asymptotically fast", then that is the simplest solution.

Sounds a bit too tricky to get right. My gut feeling is that one
shouldn't do tricky things in the refmpn functions. But on the other
hand, it may be good to have a reference implementation which is
radically different from the one under test.
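
As an illustration of what a "no tricks" reference could look like,
here is a rough sketch that sums the products directly from the
definition. It is not refmpn_mulmid. It assumes the convention, as I
understand the mulmid code, that the middle product of {up,un} and
{vp,vn} (with un >= vn >= 1) is the sum of up[i]*vp[j]*B^(i+j-vn+1)
over vn-1 <= i+j <= un-1, stored in un-vn+3 limbs. The name ref_mulmid
and the 32-bit limb_t are made up to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;      /* toy 32-bit limb (assumption) */

/* Add x into rp at limb position k, propagating the carry upwards. */
static void
add_limb_at (limb_t *rp, size_t rn, size_t k, limb_t x)
{
  uint64_t carry = x;
  while (carry != 0 && k < rn)
    {
      uint64_t t = (uint64_t) rp[k] + carry;
      rp[k] = (limb_t) t;
      carry = t >> 32;
      k++;
    }
}

/* Quadratic-time middle product straight from the assumed definition.
   Writes un - vn + 3 limbs at rp.  Requires un >= vn >= 1, and rp
   must not overlap the inputs. */
void
ref_mulmid (limb_t *rp, const limb_t *up, size_t un,
            const limb_t *vp, size_t vn)
{
  size_t rn, i, j;

  assert (un >= vn && vn >= 1);
  rn = un - vn + 3;

  for (i = 0; i < rn; i++)
    rp[i] = 0;

  for (i = 0; i < un; i++)
    for (j = 0; j < vn; j++)
      if (i + j >= vn - 1 && i + j <= un - 1)
        {
          /* This product lies on a diagonal inside the middle region. */
          uint64_t prod = (uint64_t) up[i] * vp[j];
          size_t k = i + j - (vn - 1);
          add_limb_at (rp, rn, k, (limb_t) prod);
          add_limb_at (rp, rn, k + 1, (limb_t) (prod >> 32));
        }
}
```

For example, with up = {1, 2, 3} and vp = {4, 5}, only the diagonals
i+j = 1 and i+j = 2 contribute, giving 1*5 + 2*4 = 13 and
2*5 + 3*4 = 22, so the result limbs are {13, 22, 0, 0}. With vn = 1
the middle product coincides with the full product, which gives an
easy sanity check. Something this slow is obviously a poor fit for
large operands, which is the tradeoff the FIXME is about.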

>>    can be replaced by
>> 
>>        shr	%al		C restore carry
>> 
>>    for a bit more compact object code.
>
> Hmmm. Next time I will keep that in mind, I didn't realise the object
> code is different. I'm slightly worried that such a change might
> affect the speed, these processors can be quite fickle.

I agree it should be left untouched until someone has the time to really
benchmark changes.

Speaking of assembler files, do you recall what the bottlenecks are
for the various functions? Carry propagation latency, or instruction
decoding, or something else?

>>  * I seem to remember that an earlier incarnation of the mulmid
>>    implementation used a mullo function which returned two limbs more
>>    than the current mullo. Is that obsolete now?
>
> I wrote an implementation of bdiv_q that used that sort of mullo,
> maybe that's what you are remembering. But it was never part of the
> mulmid code.

Could be that.

> Definitely I agree invert_appr() is one of the first things we should
> try as an application of the mulmid code. I believe Paul Zimmermann is
> interested in this too.

I imagine any code doing Newton iteration and any code using wraparound
tricks is a candidate for using mulmid. Maybe invert_appr is the
simplest of those cases?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.