two possible coding porjects
   
    stephen-nuchia@utulsa.edu
     
    stephen-nuchia@utulsa.edu
       
    Fri, 25 Jul 2003 19:48:10 -0500 (CDT)
    
    
  
Some applications, e.g. Pollard's Rho algorithm, do one heck
of a lot of a^2 mod n for constant n and varying a.  I have
in mind a memory-intensive optimization for this case.
Make one shifted copy of n for each bit position in a limb.
Pad each copy with as many zero limbs as n has to start with.
Now you can get any shift of n up to n^2 just by fudging
the pointer and size fields.
Since modular squaring is the core of modular power, this
should do the trick there as well.  All that would be needed
is a way to determine that the preparation is worthwhile.
Actually, once the data field is prepared it should accellerate
just about any division-related operation against n.
I may or may not take the time to fool with this -- I have a lot
of other code and papers to write over the next few months --
but if I get sufficiently irritated at the speed of the existing
code (and nobody tells me it's wasted effort) I will give it a shot.
Is there perhaps a more subtle optimization that obviates this one?
Second issue:  I have a paper I'm shopping around that improves
upon the Miller-Rabin primality test in some interesting ways.
I am doing a reference implementation of my version but as a
happy user of gmp I'd like to offer you guys a chance to
integrate it early.  Let me know if you would be willing to
agree to keep it confidential until publication is arranged.
Wanna scoop Wolfram? :-)
OK, third issue.  I'm rather terrified by the way memory is
managed in the internals code I've read.  Would you consider
a mass migration to a scheme along the following lines?
allocated==0 => single-limb datum in ptr field (union)
allocated>0 => reallocable storage of size <allocated>
allocated<0 => specially-managed storage of size |allocated|
this would at the very least admit error reporting when
non-reallocable storage overflows.  A good set of macros
and some kind of static usage checker a la lint would make
this scheme just about as robust as you can get in C.
So long, and thanks for all the fish!
-steve