division

Jason Moxham J.L.Moxham@maths.soton.ac.uk
Wed, 6 Nov 2002 20:41:01 +0000


--------------Boundary-00=_DG9615A6KF86G07HXJVF
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: quoted-printable


Attached are some times for varible precision newton iteration for calcul=
ation=20
quotient and/or remainder

the numbers are speedup's over GMP4.1 and GMP4.1+div-patch
the left hand column is the number of limbs in the numerator
and the top row is the number of limbs in the denominator as a fraction o=
f the=20
number of limbs in the numerator

It's still generic inverse k'th root preliminary code , but I doubt that =
the=20
speedup's will change much.

The runtime is proportional to the quotient , so for large n and small q =
there=20
is a large speed up (upto 5000x) ,though I thought gmp allready had code =
for=20
this special case?

I havent yet testing using this for barrett type division , but it should=
 be=20
fairly good.

How does this compair to yours , Karl ?

jason

--------------Boundary-00=_DG9615A6KF86G07HXJVF
Content-Type: text/plain;
  charset="us-ascii";
  name="tdiv_qr_times"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="tdiv_qr_times"

for mpz_tdiv_qr
N limbs	0.067	0.133	0.200	0.267	0.333	0.400	0.467	0.533	0.600	0.667	0.733	0.800	0.867	0.933	1.000	

20	0.110	0.167	0.212	0.226	0.230	0.217	0.217	0.228	0.273	0.250	0.283	0.256	0.263	0.219	0.200	
30	0.054	0.218	0.230	0.224	0.241	0.247	0.247	0.250	0.288	0.317	0.304	0.298	0.293	0.265	0.222	
45	0.138	0.229	0.240	0.248	0.260	0.258	0.277	0.283	0.326	0.318	0.338	0.349	0.365	0.310	0.231	
67	0.217	0.240	0.245	0.259	0.265	0.270	0.287	0.315	0.342	0.371	0.400	0.409	0.397	0.377	0.321	
100	0.209	0.226	0.242	0.254	0.271	0.296	0.320	0.357	0.391	0.410	0.446	0.463	0.480	0.449	0.355	
150	0.195	0.207	0.230	0.263	0.290	0.319	0.356	0.404	0.445	0.455	0.512	0.541	0.562	0.561	0.394	
225	0.174	0.200	0.243	0.278	0.327	0.343	0.387	0.432	0.487	0.504	0.547	0.590	0.637	0.646	0.421	
337	0.167	0.211	0.264	0.306	0.342	0.375	0.404	0.471	0.021	0.541	0.595	0.629	0.683	0.720	0.532	
505	0.190	0.224	0.279	0.316	0.367	0.405	0.434	0.508	0.541	0.554	0.607	0.653	0.722	0.795	0.585	
757	0.171	0.242	0.303	0.340	0.383	0.445	0.464	0.538	0.545	0.576	0.642	0.693	0.745	0.806	0.667	
1135	0.182	0.261	0.322	0.376	0.410	0.499	0.481	0.550	0.586	0.602	0.645	0.694	0.767	0.846	0.767	
1702	0.198	0.283	0.340	0.410	0.426	0.547	0.515	0.572	0.613	0.598	0.677	0.710	0.778	0.875	0.827	
2553	0.216	0.299	0.351	0.458	0.443	0.603	0.551	0.583	0.610	0.635	0.696	0.711	0.796	0.868	0.958	
3829	0.221	0.310	0.363	0.492	0.459	0.680	0.583	0.590	0.619	0.643	0.700	0.745	0.803	0.884	1.180	
5743	0.237	0.331	0.378	0.548	0.473	0.773	0.627	0.614	0.636	0.661	0.704	0.756	0.812	0.890	1.098	
8614	0.246	0.354	0.388	0.613	0.473	0.883	0.676	0.620	0.655	0.653	0.715	0.753	0.821	0.885	1.010	
12921	0.251	0.365	0.396	0.699	0.472	1.007	0.737	0.629	0.647	0.667	0.731	0.763	0.821	0.892	0.967	
19381	0.292	0.407	0.423	0.779	0.499	1.202	0.851	0.641	0.664	0.684	0.727	0.765	0.827	0.894	0.937	
29071	0.310	0.451	0.451	0.943	0.516	1.808	1.151	0.682	0.697	0.673	0.739	0.778	0.833	0.905	1.092	
43606	0.325	0.507	0.453	1.396	0.550	2.377	1.345	0.699	0.739	0.696	0.766	0.764	0.840	0.901	0.927	
65409	0.356	0.582	0.498	1.942	0.627	2.955	1.679	0.744	0.789	0.769	0.794	0.787	0.830	0.901	0.856	
98113	0.382	0.732	0.578	2.439	0.654	8.811	4.991	0.846	0.793	0.781	0.827	0.826	0.841	0.904	0.869	
147169	0.409	

--------------Boundary-00=_DG9615A6KF86G07HXJVF
Content-Type: text/plain;
  charset="us-ascii";
  name="tdiv_q_times"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="tdiv_q_times"

for mpz_tdiv_q
N limbs	0.067	0.133	0.200	0.267	0.333	0.400	0.467	0.533	0.600	0.667	0.733	0.800	0.867	0.933	1.000	

20	0.091	0.169	0.246	0.245	0.269	0.265	0.245	0.255	0.349	0.317	0.342	0.273	0.300	0.296	0.238	
30	0.058	0.241	0.260	0.267	0.268	0.271	0.292	0.310	0.352	0.396	0.395	0.389	0.406	0.308	0.238	
45	0.151	0.254	0.281	0.286	0.256	0.304	0.319	0.360	0.413	0.431	0.444	0.468	0.486	0.419	0.318	
67	0.232	0.268	0.280	0.301	0.319	0.333	0.361	0.406	0.464	0.505	0.560	0.590	0.562	0.556	0.381	
100	0.220	0.255	0.278	0.302	0.334	0.372	0.413	0.471	0.552	0.494	0.716	0.756	0.803	0.762	0.500	
150	0.210	0.238	0.272	0.321	0.356	0.410	0.468	0.550	0.663	0.709	0.830	1.033	1.098	1.122	0.545	
225	0.193	0.230	0.288	0.338	0.409	0.589	0.513	0.603	0.743	0.812	1.004	1.216	1.562	1.615	0.762	
337	0.180	0.244	0.314	0.370	0.432	0.490	0.535	0.660	0.781	0.877	1.147	1.393	1.840	2.287	1.190	
505	0.176	0.263	0.336	0.024	0.464	0.544	0.585	0.740	0.840	0.924	1.188	1.491	2.112	3.309	1.364	
757	0.192	0.286	1.073	0.413	0.482	0.588	0.621	0.774	0.867	0.968	1.280	1.580	2.346	3.880	2.273	
1135	0.208	0.308	0.389	0.465	0.514	0.659	0.651	0.792	0.931	1.017	1.311	1.638	2.446	4.514	3.682	
1702	0.234	0.333	0.412	0.511	0.524	0.720	0.702	0.838	0.969	1.022	1.380	1.699	2.507	4.771	5.391	
2553	0.243	0.354	0.427	0.570	0.558	0.799	0.755	0.850	0.961	1.065	1.416	1.698	2.598	4.919	7.593	
3829	0.251	0.368	0.438	0.614	0.580	0.905	0.800	0.869	0.983	1.091	1.425	1.771	2.677	5.194	24.567	
5743	0.270	0.394	0.461	0.688	0.597	1.022	0.857	0.904	1.004	1.111	1.442	1.802	2.694	5.186	37.903	
8614	0.284	0.421	0.474	0.769	0.605	1.169	0.927	0.914	1.040	1.110	1.466	1.796	2.732	5.188	49.303	
12921	0.286	0.435	0.481	0.880	0.599	1.333	1.008	0.926	1.027	1.131	1.487	1.814	2.737	5.251	69.857	
19381	0.338	0.485	0.519	0.978	0.634	1.599	1.167	0.940	1.052	1.155	1.487	1.825	2.747	5.261	111.314	
29071	0.360	0.543	0.557	1.197	0.666	2.431	1.576	1.005	1.105	1.138	1.509	1.848	2.766	5.276	196.410	
43606	0.380	0.616	0.567	1.811	0.708	3.256	1.873	1.026	1.184	1.169	1.568	1.822	2.789	5.328	438.579	
65409	0.424	0.725	0.624	2.566	0.799	4.051	2.397	1.100	1.259	1.263	1.604	1.853	2.764	5.315	582.538	
98113	0.458	0.899	0.725	3.208	0.834	12.037	7.330	1.261	1.248	1.281	1.662	1.902	2.768	5.272	941.333	
147169	0.489	1.142	0.773	7.986	0.906	

--------------Boundary-00=_DG9615A6KF86G07HXJVF
Content-Type: text/plain;
  charset="us-ascii";
  name="tdiv_qr_times_div_patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="tdiv_qr_times_div_patch"

for mpz_tdiv_qr
N limbs	0.067	0.133	0.200	0.267	0.333	0.400	0.467	0.533	0.600	0.667	0.733	0.800	0.867	0.933	1.000	

20	0.085	0.152	0.219	0.210	0.230	0.220	0.200	0.214	0.278	0.250	0.261	0.244	0.256	0.242	0.154	
30	0.054	0.213	0.227	0.230	0.238	0.244	0.231	0.247	0.284	0.290	0.304	0.286	0.310	0.250	0.222	
45	0.138	0.227	0.238	0.240	0.248	0.256	0.263	0.290	0.312	0.314	0.333	0.328	0.340	0.302	0.259	
67	0.420	0.244	0.245	0.254	0.266	0.270	0.284	0.311	0.342	0.368	0.389	0.404	0.391	0.358	0.286	
100	0.207	0.227	0.239	0.257	0.270	0.294	0.317	0.353	0.383	0.408	0.443	0.537	0.471	0.431	0.344	
150	0.192	0.210	0.229	0.263	0.288	0.318	0.358	0.403	0.443	0.443	0.505	0.532	0.556	0.540	0.412	
225	0.174	0.201	0.243	0.277	0.335	0.347	0.394	0.434	0.488	0.503	0.538	0.591	0.643	0.644	0.447	
337	0.168	0.211	0.018	0.316	0.343	0.368	0.406	0.471	0.505	0.528	0.594	0.631	0.679	0.714	0.543	
505	0.158	0.131	0.280	0.322	0.375	0.383	0.438	0.506	0.539	0.553	0.614	0.653	0.722	0.789	0.566	
757	0.170	0.242	0.305	0.331	0.388	0.413	0.456	0.535	0.550	0.585	0.638	0.683	0.732	0.817	0.699	
1135	0.185	0.260	0.323	0.354	0.409	0.432	0.465	0.544	0.588	0.602	0.643	0.696	0.760	0.846	0.756	
1702	0.197	0.279	0.339	0.372	0.426	0.447	0.490	0.575	0.612	0.607	0.674	0.708	0.775	0.859	0.815	
2553	0.214	0.291	0.352	0.391	0.444	0.452	0.506	0.583	0.610	0.632	0.697	0.714	0.794	0.873	0.924	
3829	0.215	0.302	0.361	0.394	0.459	0.467	0.510	0.595	0.618	0.645	0.698	0.736	0.799	0.884	1.161	
5743	0.239	0.313	0.379	0.410	0.472	0.482	0.524	0.613	0.636	0.662	0.705	0.755	0.815	0.889	1.105	
8614	0.248	0.327	0.388	0.418	0.473	0.493	0.531	0.621	0.657	0.656	0.717	0.753	0.822	0.886	1.009	
12921	0.252	0.330	0.397	0.433	0.475	0.496	0.539	0.630	0.648	0.672	0.729	0.762	0.822	0.892	1.000	
19381	0.293	0.353	0.424	0.433	0.500	0.514	0.546	0.641	0.663	0.684	0.726	0.767	0.829	0.897	0.920	
29071	0.310	0.377	0.451	0.465	0.518	0.541	0.591	0.683	0.696	0.673	0.738	0.778	0.834	0.903	1.071	
43606	0.324	0.405	0.452	0.499	0.549	0.580	0.623	0.699	0.739	0.696	0.763	0.764	0.839	0.899	0.913	
65409	0.357	0.428	0.494	0.563	0.625	0.617	0.639	0.735	0.786	0.766	0.789	0.781	0.831	0.903	0.844	
98113	0.382	0.452	0.578	0.605	0.653	0.649	0.710	0.846	0.792	0.780	0.827	0.826	0.840	0.902	0.900	
147169	0.409	0.519	0.617	0.612	0.709	0.719	0.752	0.882	0.851	0.816	0.833	0.845	0.874	0.904	0.858	
220753	0.436	0.563	0.652	0.664	0.767	0.747	0.778	0.921	0.928	0.870	0.875	0.868	0.885	0.913	0.850	
331129	0.529	0.600	0.720	0.739	0.812	0.778	0.831	0.972	0.932	0.916	0.942	0.904	0.904	0.921	0.872	
496693	0.553	0.666	0.783	0.776	0.887	0.856	0.867	1.023	0.964	0.956	0.951	0.932	0.935	0.932	0.879	
745039	0.624	0.737	0.849	0.818	0.978	0.932	0.907	1.082	1.070	1.029	0.971	0.969	0.952	0.943	0.867	

--------------Boundary-00=_DG9615A6KF86G07HXJVF
Content-Type: text/plain;
  charset="us-ascii";
  name="tdiv_q_times_div_patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="tdiv_q_times_div_patch"

for mpz_tdiv_q
N limbs	0.067	0.133	0.200	0.267	0.333	0.400	0.467	0.533	0.600	0.667	0.733	0.800	0.867	0.933	1.000	

20	0.103	0.186	0.246	0.245	0.250	0.255	0.255	0.283	0.318	0.302	0.324	0.303	0.290	0.296	0.238	
30	0.058	0.237	0.260	0.267	0.274	0.271	0.292	0.288	0.345	0.367	0.386	0.368	0.364	0.296	0.286	
45	0.151	0.263	0.281	0.286	0.305	0.307	0.323	0.365	0.413	0.409	0.455	0.447	0.474	0.406	0.286	
67	0.239	0.253	0.286	0.297	0.325	0.346	0.371	0.409	0.463	0.517	0.554	0.600	0.553	0.543	0.450	
100	0.216	0.254	0.278	0.303	0.336	0.377	0.413	0.473	0.552	0.630	0.716	0.765	0.787	0.721	0.476	
150	0.196	0.239	0.271	0.321	0.358	0.413	0.473	0.553	0.665	0.712	0.885	1.042	1.111	1.122	0.571	
225	0.192	0.230	0.289	0.341	0.409	0.451	0.513	0.603	0.744	0.810	1.011	1.228	1.580	1.600	0.810	
337	0.181	0.251	0.316	0.374	0.432	0.484	0.550	0.673	0.017	0.890	1.147	1.393	1.860	2.312	1.238	
505	0.174	0.263	0.340	0.401	0.466	0.508	0.705	0.733	0.849	0.929	1.188	1.470	2.118	3.333	1.333	
757	0.190	0.285	0.368	0.142	0.485	0.536	0.616	0.773	0.862	0.966	1.278	1.532	2.329	3.918	2.318	
1135	0.208	0.305	0.391	0.442	0.517	0.571	0.638	0.789	0.927	1.020	1.301	1.651	2.447	4.476	3.762	
1702	0.225	0.327	0.411	0.468	0.527	0.588	0.668	0.835	0.968	1.030	1.371	1.678	2.543	4.924	5.292	
2553	0.244	0.346	0.428	0.492	0.563	0.599	0.694	0.852	0.964	1.072	1.417	1.698	2.605	4.969	9.462	
3829	0.252	0.353	0.438	0.494	0.582	0.621	0.701	0.871	0.985	1.091	1.422	1.769	2.668	5.223	24.933	
5743	0.271	0.370	0.460	0.515	0.599	0.639	0.718	0.901	1.005	1.110	1.437	1.806	2.743	5.180	36.125	
8614	0.282	0.386	0.474	0.528	0.603	0.655	0.725	0.912	1.040	1.106	1.467	1.801	2.722	5.209	49.906	
12921	0.285	0.391	0.482	0.541	0.599	0.659	0.740	0.928	1.027	1.132	1.481	1.805	2.732	5.259	70.235	
19381	0.338	0.423	0.522	0.543	0.636	0.687	0.747	0.945	1.061	1.156	1.490	1.829	2.755	5.268	118.200	
29071	0.360	0.455	0.557	0.590	0.668	0.720	0.808	1.002	1.105	1.140	1.512	1.846	2.765	5.287	222.111	
43606	0.380	0.493	0.568	0.640	0.710	0.777	0.862	1.025	1.183	1.168	1.567	1.826	2.785	5.321	437.684	
65409	0.422	0.534	0.617	0.724	0.798	0.821	0.884	1.088	1.253	1.258	1.605	1.840	2.765	5.323	599.000	
98113	0.457	0.560	0.724	0.778	0.835	0.865	0.972	1.261	1.250	1.278	1.663	1.902	2.769	5.270	889.659	
147169	0.495	0.642	0.774	0.789	0.911	0.969	1.038	1.319	1.344	1.325	1.695	1.925	2.825	5.279	1288.073	
220753	0.523	0.701	0.822	0.859	0.982	0.992	1.077	1.399	1.475	1.417	1.777	1.982	2.843	5.233	1910.525	
331129	0.643	0.753	0.907	0.954	1.043	1.039	1.153	1.468	1.470	1.495	1.890	2.018	2.862	5.101	2773.122	
496693	0.683	0.834	0.989	1.012	1.140	1.157	1.203	1.553	1.509	1.546	1.893	2.076	2.884	5.098	4659.750	
745039	0.768	0.925	1.069	1.061	1.252	1.252	1.250	1.646	1.688	1.654	1.925	2.152	2.943	5.036	6573.455	

--------------Boundary-00=_DG9615A6KF86G07HXJVF--