[Gmp-commit] /var/hg/www: 4 new changesets
mercurial at gmplib.org
Mon Nov 14 20:08:10 UTC 2016
details: /var/hg/www/rev/261828c41a8e
changeset: 226:261828c41a8e
user: Torbjorn Granlund <tg at gmplib.org>
date: Mon Nov 14 21:04:40 2016 +0100
description:
Remove long obsolete file.
details: /var/hg/www/rev/9677b537fe2a
changeset: 227:9677b537fe2a
user: Torbjorn Granlund <tg at gmplib.org>
date: Mon Nov 14 21:05:44 2016 +0100
description:
Update for new test result file location.
details: /var/hg/www/rev/fc3a5be0bfd9
changeset: 228:fc3a5be0bfd9
user: Torbjorn Granlund <tg at gmplib.org>
date: Mon Nov 14 21:07:25 2016 +0100
description:
New file.
details: /var/hg/www/rev/fde1697b36e3
changeset: 229:fde1697b36e3
user: Torbjorn Granlund <tg at gmplib.org>
date: Mon Nov 14 21:08:01 2016 +0100
description:
Many updates.
diffstat:
devel/GMPng.html | 195 +++--
devel/index.html | 11 +-
devel/mini-gmp-status.html | 170 +++++
devel/new.css | 6 +-
devel/repo-usage.html | 2 +-
devel/testmachines.shtml | 17 -
devel/testsystems.html | 1285 +++++++++++++++++++++----------------------
gmpbench.html | 104 +-
index.html | 18 +-
new.css | 4 +-
pi-with-gmp.html | 6 +-
robots.txt | 2 +-
security.html | 6 +-
13 files changed, 998 insertions(+), 828 deletions(-)
diffs (truncated from 3204 to 300 lines):
diff -r aad0a84b55d6 -r fde1697b36e3 devel/GMPng.html
--- a/devel/GMPng.html Tue Sep 06 14:32:30 2016 +0200
+++ b/devel/GMPng.html Mon Nov 14 21:08:01 2016 +0100
@@ -13,13 +13,13 @@
<body>
<div id="top">
- <table style="width:100%; background-color:#e8e8e8;">
+ <table style="width:100%; background-color:#303030;">
<tr>
<td style="text-align:left;">
<svg width="180px" height="60px" version="1.1"
viewBox="0 0 1500 500"
xmlns="http://www.w3.org/2000/svg">
- <rect x="0" y="0" width="1500" height="540" fill="#e8e8e8" />
+ <rect x="0" y="0" width="1500" height="540" fill="#303030" />
<text x="0" y="440" fill="#e00000" font-size="540" font-family="arial" font-weight="bold">
GMP
</text>
@@ -30,7 +30,7 @@
</td>
<td style="text-align:center;">
<span style="font-size:200%;">Itemised plans for GMP </span> <br>
- <span style="font-size:75%;">Last modified: 2016-09-06 </span>
+ <span style="font-size:75%;">Last modified: 2016-11-12 </span>
</td>
</tr>
</table>
@@ -70,49 +70,49 @@
<h2> Unbalanced multiplication </h2>
- <p style="color:#a00000"> Handle very unbalanced multiplication by "transforming" the smaller
- operand, then multiply using toomX2, toomX3, using the transformed value.
- Similarly in FFT range. </p>
+ <p style="color:#a00000"> Handle very unbalanced multiplication by
+ "transforming" the smaller operand, then multiply using toomX2, toomX3, using
+ the transformed value. Similarly in FFT range. </p>
<h2> Multiply with FFT </h2>
- <p style="color:#a00000"> Merge <a href="https://gmplib.org/repo/gcd-nisse/" style="color:
- rgb(160,0,0)">Niels' small-primes FFT code</a>. Make sure it is memory
- efficient. </p>
+ <p style="color:#a00000"> Merge <a href="https://gmplib.org/repo/gcd-nisse/"
+ style="color: rgb(160,0,0)">Niels' small-primes FFT code</a>. Make sure it
+ is memory efficient. </p>
- <p style="color:#a00000"> Merge new mul_fft.c. [Stalls on SFLC analysis.] Alternatively, improve
- the present code ourselves, or perhaps reimplement the algorithm from
- scratch. </p>
+ <p style="color:#a00000"> Merge new mul_fft.c. [Stalls on SFLC analysis.]
+ Alternatively, improve the present code ourselves, or perhaps reimplement the
+ algorithm from scratch. </p>
- <p style="color:#a00000"> Extend SS FFT table more intelligently, taking 'goodness' into account.
- (For both plain and fat builds.) </p>
+ <p style="color:#a00000"> Extend SS FFT table more intelligently, taking
+ 'goodness' into account. (For both plain and fat builds.) </p>
<p style="color:#a00000"> Implement the
<span style="white-space: nowrap; font-size:larger">
√<span style="text-decoration:overline;"> 2 </span>
</span>
- trick: 2<sup>3n/4</sup>−2<sup>n/4</sup> is a square
- root of 2 mod (2<sup>n</sup>+1). This allows for smaller coefficients. </p>
+ trick: 2<sup>3n/4</sup>−2<sup>n/4</sup> is a square root of 2 mod
+ (2<sup>n</sup>+1). This allows for smaller coefficients. </p>
<h2> Short products </h2>
<p style="color:#00a000;"> Merge David's mpn_mulmid code,
- <span style="color:#a00000;"> use it at least for
- mpn_binvert.</span></p>
+ <span style="color:#a00000;"> use it at least for mpn_binvert.</span></p>
<h2> Division m,n </h2>
<p style="color:#808000"> Implement van der Hoeven's MU generalisation. </p>
- <p style="color:#808000"> Perfect algorithm selection for nn-limb by dn-limb division. </p>
+ <p style="color:#808000"> Perfect algorithm selection for nn-limb by dn-limb
+ division. </p>
<p style="color:#a00000"> Add pi/preinv variants for all mu functions. [2h] </p>
- <p style="color:#a00000"> We still use mpn_tdiv_qr or even mpn_divrem in many files. Replace by
- current division functions: </p>
+ <p style="color:#a00000"> We still use mpn_tdiv_qr or even mpn_divrem in many
+ files. Replace by current division functions: </p>
<blockquote>
<table>
<tr> <th> mpn <th> mpz <th> mpf </tr>
@@ -142,7 +142,8 @@
</table>
</blockquote>
- <p style="color:#808000"> These files also use mpn_tdiv_qr, but just for nails: </p>
+ <p style="color:#808000"> These files also use mpn_tdiv_qr, but just for
+ nails: </p>
<blockquote style="color:#808000">
mpz/cdiv_q_ui.c
mpz/cdiv_qr_ui.c
@@ -161,14 +162,14 @@
<h2> Divide-by-fewlimb and modulo-by-fewlimb </h2>
- <p style="color:#a00000"> Complete set of bdiv_1, bdiv_2 and div_1, div_2 functions, using 1-limb,
- 2-limb inverses. </p>
+ <p style="color:#a00000"> Complete set of bdiv_1, bdiv_2 and div_1, div_2
+ functions, using 1-limb, 2-limb inverses. </p>
- <p style="color:#a00000"> Define plain 2/1-division based mod_1 loops (norm, unorm) as separate
- abstractions, since this is something all processors will need. This loop
- will in almost all cases be used just for very small n, since mod_1s-family
- functions will take over very quickly. Selection mechanisms between the mod
- functions could still be in C. </p>
+ <p style="color:#a00000"> Define plain 2/1-division based mod_1 loops (norm,
+ unorm) as separate abstractions, since this is something all processors will
+ need. This loop will in almost all cases be used just for very small n,
+ since mod_1s-family functions will take over very quickly. Selection
+ mechanisms between the mod functions could still be in C. </p>
<h2> mpz_remove, mpn_remove </h2>
@@ -178,40 +179,44 @@
<h2> mpn_redc vs mpn_bdiv </h2>
- <p style="color:#a00000"> Finish unification of bdiv/redc. The internal (sorry!) repo
- ~hg/gmp-proj/gmp-bdiv has a good start. </p>
+ <p style="color:#a00000"> Finish unification of bdiv/redc. The internal
+ (sorry!) repo ~hg/gmp-proj/gmp-bdiv has a good start. </p>
<h2> mpn_broot, mpn_binvroot, mpn_bsqrt, mpn_binvsqrt </h2>
- <p style="color:#a00000"> Implement wrap-around trick for the <i></i>-adic root code. </p>
+ <p style="color:#a00000"> Implement wrap-around trick for the <i></i>-adic
+ root code. </p>
<h2> Exact powers </h2>
- <p style="color:#a00000"> Add companion for mpz_perfect_power_p that returns the exponent and
- optionally the root. Possible name mpz_rootexact. Marco suggested some
- alternative functions, which could test a number and extract its nth power.
- (Sept/Oct 2012). </p>
+ <p style="color:#a00000"> Add companion for mpz_perfect_power_p that returns
+ the exponent and optionally the root. Possible name mpz_rootexact. Marco
+ suggested some alternative functions, which could test a number and extract
+ its nth power. (Sept/Oct 2012). </p>
<h2> Exact division </h2>
- <p style="color:#a00000"> We have an ugly mpn_divexact using Jebelean's bidirectional algorithm.
- Clean it up, and probably permit even divisors. </p>
+ <p style="color:#a00000"> We have an ugly mpn_divexact using Jebelean's
+ bidirectional algorithm. Clean it up, and probably permit even
+ divisors. </p>
<h2> GCD </h2>
- <p style="color:#808000"> Use mulmod_bnm1 for cancelling operand updates. [Partly done] </p>
+ <p style="color:#808000"> Use mulmod_bnm1 for cancelling operand updates.
+ [Partly done] </p>
<p> Unlike most basic functions in GMP, GCD depends on C code and thus on
compilers. Furthermore, extended GCD lacks proper basecase code, and instead
invoke the heavy machinery. </p>
- <p style="color:#a00000"> Replace gcd_1 by gcd_11, accepting two limbs. Implement gcd_1 in C for
- compatibility. </p>
+ <p style="color:#a00000"> Replace gcd_1 by gcd_11, accepting two limbs.
+ Implement gcd_1 in C for compatibility. </p>
- <p style="color:#a00000"> Implement gcd_22 in C, also in assembly using compiler generation. </p>
+ <p style="color:#a00000"> Implement gcd_22 in C, also in assembly using
+ compiler generation. </p>
<p style="color:#a00000"> Implement gcdext_22. </p>
@@ -245,45 +250,63 @@
Calls that are declared with attribute "internal" will be the fastest, since
then the GOT pointer can be assumed correct. </p>
- <p style="color:#a00000"> Fix calls to strictly internal routines, using the "internal" attribute,
- e.g. allowing compiler suppression of GOT setup code. <br> Fix references to
- strictly internal variables (e.g., alloc func pointers). </p>
-
- <p style="color:#a00000"> Fix calls to remaining mpn routines (perhaps using hidden+alias). </p>
+ <p style="color:#a00000"> Fix calls to strictly internal routines, using the
+ "internal" attribute, e.g. allowing compiler suppression of GOT setup
+ code. <br> Fix references to strictly internal variables (e.g., alloc func
+ pointers). </p>
- <p style="color:#a00000"> Fix calls via gmpn_cpuvec (x86/fat/fat_entry.asm, x86_64/fat/fat_entry.asm).
- The table itself needs a GOT and each entry points to a PLT entry... </p>
+ <p style="color:#a00000"> Fix calls to remaining mpn routines (perhaps using
+ hidden+alias). </p>
- <p style="color:#a00000"> Use <code>restrict</code> keyword in most <code>mpn</code> functions. </p>
+ <p style="color:#a00000"> Fix calls via gmpn_cpuvec (x86/fat/fat_entry.asm,
+ x86_64/fat/fat_entry.asm). The table itself needs a GOT and each entry
+ points to a PLT entry... </p>
- <p style="color:#a00000"> Use <code>visibility hidden</code> for <b>all</b> symbols used
- internally, then make documented symbols visible with alias. See star
- hacker <a href="http://www.airs.com/blog/archives/307" style="color:
+ <p style="color:#a00000"> Use <code>restrict</code> keyword in
+ most <code>mpn</code> functions. </p>
+
+ <p style="color:#a00000"> Use <code>visibility hidden</code> for <b>all</b>
+ symbols used internally, then make documented symbols visible with alias.
+ See star hacker <a href="http://www.airs.com/blog/archives/307" style="color:
rgb(160,0,0)">Ian Lance Taylor's explanations</a>. </p>
+ <blockquote>
+ <ul>
+ <li> What happens if we use __attribute__ visibility("hidden") on a
+ platform which does not support symbol visibility? ("Nothing" is OK.)
+ <li> Do we need to mark a symbol as hidden for each use? Clearly, for C/C++
+ the compiler needs to know about the hidden attribute in order to use the
+ optimal relocs. But how about assembly files which merely <i>use</i> a
+ symbol, does ".hidden foo" change anything wrt foo?
+ <li> Fat binaries contain generated symbols, i.e., __gmpn_addlsh2_n_core2,
+ __gmpn_mod_1_1p_x86_64. We need to generate corresponding "hidden" decls.
+ <li>
+ </ul>
+ </blockquote>
<h2> Configuru </h2>
- <p style="color:#a00000"> Improve C++ configure, see comment in configure.in. In particular,
- enforce selected ABI also for C++ (through CXXFLAGS). </p>
-
- <p style="color:#a00000"> Add more fat routines, as well as more fat thresholds. Ideally, any
- function with exist in a top-level <code>mpn/cpufam</code> directory and is
- also provided specially for at least one (relevant) sub-cpu, should be
- fat. </p>
+ <p style="color:#a00000"> Improve C++ configure, see comment in configure.in.
+ In particular, enforce selected ABI also for C++ (through CXXFLAGS). </p>
- <p style="color:#a00000"> Consider making fat the default. This could allow us to make more fine
- grained tuning, without trying to invent ever stranger configure CPU names. </p>
+ <p style="color:#a00000"> Add more fat routines, as well as more fat
+ thresholds. Ideally, any function with exist in a
+ top-level <code>mpn/cpufam</code> directory and is also provided specially
+ for at least one (relevant) sub-cpu, should be fat. </p>
- <p style="color:#a00000"> Make GMP_CPU_TYPE fat CPU selection standard for a fat build (but perhaps
- rename it to something more specific, GMP_FAT_CPU_TYPE_SELECT). Motive:
- Testability. </p>
+ <p style="color:#a00000"> Consider making fat the default. This could allow
+ us to make more fine grained tuning, without trying to invent ever stranger
+ configure CPU names. </p>
- <p style="color:#a00000"> Improve thresholds handling for fat builds, with the goal of making a fat
- lib as fast as a slim lib. </p>
+ <p style="color:#a00000"> Make GMP_CPU_TYPE fat CPU selection standard for a
+ fat build (but perhaps rename it to something more specific,
+ GMP_FAT_CPU_TYPE_SELECT). Motive: Testability. </p>
- <p style="color:#a00000"> Consider moving the exact CPU recognition from the configure triplet to
- some separate options (perhaps "--with-cpu"). </p>
+ <p style="color:#a00000"> Improve thresholds handling for fat builds, with
+ the goal of making a fat lib as fast as a slim lib. </p>
+
+ <p style="color:#a00000"> Consider moving the exact CPU recognition from the
+ configure triplet to some separate options (perhaps "--with-cpu"). </p>
<h2> C++ interface </h2>
@@ -301,19 +324,20 @@
<p style="color:#a00000"> Commit lots of new assembly code for div_2_*. </p>
- <p style="color:#808000"> We have lots of great multiply primitives like mul_1, mul_2, addmul_1,
- addmul_2, but development lags for O(n<sup>2</sup>) functions like
- mul_basecase, sqr_basecase, mullo_basecase, redc_1, redc_2, mulmod_bnm1,
- sqrmod_bnm1. Perhaps we could define the primitive functions in such a way
- that we could automatically assemble reasonably good O(n<sup>2</sup>)
- functions? </p>
+ <p style="color:#808000"> We have lots of great multiply primitives like
+ mul_1, mul_2, addmul_1, addmul_2, but development lags for O(n<sup>2</sup>)
+ functions like mul_basecase, sqr_basecase, mullo_basecase, redc_1, redc_2,
+ mulmod_bnm1, sqrmod_bnm1. Perhaps we could define the primitive functions in
+ such a way that we could automatically assemble reasonably good
+ O(n<sup>2</sup>) functions? </p>
<h2> Test environment </h2>
- <p style="color:#a00000"> Implement artificially small limbs, ASL™. Using C++ overload
- <code>mp_limb_t</code>, and let it be any size smaller-than <code>long long</code>.
- This will allow for better testing. [Work-in-progress] </p>
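
Note on the "square root of 2" item in the devel/GMPng.html hunk above: the identity it states, that 2^(3n/4) - 2^(n/4) squares to 2 mod (2^n + 1), can be checked numerically. The sketch below is only an illustration and is not taken from GMP. The function names are hypothetical, and it assumes n is a positive multiple of 4 so that both exponents are integers.

```python
def sqrt2_mod_fermat(n):
    """Return 2^(3n/4) - 2^(n/4) reduced mod 2^n + 1.

    Hypothetical helper for illustration; assumes n is a positive
    multiple of 4 so that 3n/4 and n/4 are integers.
    """
    if n <= 0 or n % 4 != 0:
        raise ValueError("n must be a positive multiple of 4")
    modulus = (1 << n) + 1
    return ((1 << (3 * n // 4)) - (1 << (n // 4))) % modulus


def is_sqrt_of_2(n):
    """Check that the candidate root squares to 2 modulo 2^n + 1."""
    modulus = (1 << n) + 1
    root = sqrt2_mod_fermat(n)
    return pow(root, 2, modulus) == 2 % modulus


if __name__ == "__main__":
    # Try a few sizes; each check should succeed.
    for n in (4, 8, 16, 64, 256, 1024):
        assert is_sqrt_of_2(n), n
```

The check works because (2^(3n/4) - 2^(n/4))^2 = 2^(3n/2) - 2^(n+1) + 2^(n/2), and substituting 2^n = -1 mod (2^n + 1) leaves -2^(n/2) + 2 + 2^(n/2) = 2.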