This chapter describes low-level GMP functions, used to implement the high-level GMP functions, but also intended for time-critical user code.

These functions start with the prefix `mpn_`.

The `mpn` functions are designed to be as fast as possible, **not** to provide a coherent calling interface. The different functions have somewhat similar interfaces, but there are variations that make them hard to use. These functions do as little as possible apart from the real multiple-precision computation, so that no time is spent on things that not all callers need.

A source operand is specified by a pointer to the least significant limb and a limb count. A destination operand is specified by just a pointer. It is the responsibility of the caller to ensure that the destination has enough space for storing the result.

With this way of specifying operands, it is possible to perform computations on subranges of an argument, and store the result into a subrange of a destination.

A common requirement for all functions is that each source area needs at least one limb. No size argument may be zero. Unless otherwise stated, in-place operations are allowed where source and destination are the same, but not where they only partly overlap.

The `mpn` functions are the base for the implementation of the `mpz_`, `mpf_`, and `mpq_` functions.

This example adds the number beginning at `s1p` and the number beginning at
`s2p` and writes the sum at `destp`. All areas have `n` limbs.

```c
cy = mpn_add_n (destp, s1p, s2p, n);
```

It should be noted that the `mpn` functions make no attempt to identify high or low zero limbs on their operands, or other special forms. On random data such cases will be unlikely and it'd be wasteful for every function to check every time. An application knowing something about its data can take steps to trim or perhaps split its calculations.

In the notation used below, a source operand is identified by the pointer to the least significant limb, and the limb count in braces. For example, {s1p, s1n}.

— Function: mp_limb_t **mpn_add_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Add {s1p, n} and {s2p, n}, and write the `n` least significant limbs of the result to `rp`. Return carry, either 0 or 1.

This is the lowest-level function for addition. It is the preferred function for addition, since it is written in assembly for most CPUs. For addition of a variable to itself (i.e., `s1p` equals `s2p`) use `mpn_lshift` with a count of 1 for optimal speed.

— Function: mp_limb_t **mpn_add_1** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n, mp_limb_t s2limb`)

Add {s1p, n} and `s2limb`, and write the `n` least significant limbs of the result to `rp`. Return carry, either 0 or 1.

— Function: mp_limb_t **mpn_add** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t s1n, const mp_limb_t *s2p, mp_size_t s2n`)

Add {s1p, s1n} and {s2p, s2n}, and write the `s1n` least significant limbs of the result to `rp`. Return carry, either 0 or 1.

This function requires that `s1n` is greater than or equal to `s2n`.

— Function: mp_limb_t **mpn_sub_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Subtract {s2p, n} from {s1p, n}, and write the `n` least significant limbs of the result to `rp`. Return borrow, either 0 or 1.

This is the lowest-level function for subtraction. It is the preferred function for subtraction, since it is written in assembly for most CPUs.

— Function: mp_limb_t **mpn_sub_1** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n, mp_limb_t s2limb`)

Subtract `s2limb` from {s1p, n}, and write the `n` least significant limbs of the result to `rp`. Return borrow, either 0 or 1.

— Function: mp_limb_t **mpn_sub** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t s1n, const mp_limb_t *s2p, mp_size_t s2n`)

Subtract {s2p, s2n} from {s1p, s1n}, and write the `s1n` least significant limbs of the result to `rp`. Return borrow, either 0 or 1.

This function requires that `s1n` is greater than or equal to `s2n`.

— Function: mp_limb_t **mpn_neg** (`mp_limb_t *rp, const mp_limb_t *sp, mp_size_t n`)

Perform the negation of {sp, n}, and write the result to {rp, n}. This is equivalent to calling `mpn_sub_n` with an `n`-limb zero minuend and passing {sp, n} as subtrahend. Return borrow, either 0 or 1.

— Function: void **mpn_mul_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Multiply {s1p, n} and {s2p, n}, and write the 2*`n`-limb result to `rp`.

The destination has to have space for 2*`n` limbs, even if the product's most significant limb is zero. No overlap is permitted between the destination and either source.

If the two input operands are the same, use `mpn_sqr`.

— Function: mp_limb_t **mpn_mul** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t s1n, const mp_limb_t *s2p, mp_size_t s2n`)

Multiply {s1p, s1n} and {s2p, s2n}, and write the (`s1n`+`s2n`)-limb result to `rp`. Return the most significant limb of the result.

The destination has to have space for `s1n`+`s2n` limbs, even if the product's most significant limb is zero. No overlap is permitted between the destination and either source.

This function requires that `s1n` is greater than or equal to `s2n`.

— Function: void **mpn_sqr** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n`)

Compute the square of {s1p, n} and write the 2*`n`-limb result to `rp`.

The destination has to have space for 2*`n` limbs, even if the result's most significant limb is zero. No overlap is permitted between the destination and the source.

— Function: mp_limb_t **mpn_mul_1** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n, mp_limb_t s2limb`)

Multiply {s1p, n} by `s2limb`, and write the `n` least significant limbs of the product to `rp`. Return the most significant limb of the product. {s1p, n} and {rp, n} are allowed to overlap provided `rp` <= `s1p`.

This is a low-level function that is a building block for general multiplication as well as other operations in GMP. It is written in assembly for most CPUs.

Don't call this function if `s2limb` is a power of 2; use `mpn_lshift` with a count equal to the logarithm of `s2limb` instead, for optimal speed.

— Function: mp_limb_t **mpn_addmul_1** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n, mp_limb_t s2limb`)

Multiply {s1p, n} and `s2limb`, and add the `n` least significant limbs of the product to {rp, n} and write the result to `rp`. Return the most significant limb of the product, plus carry-out from the addition.

This is a low-level function that is a building block for general multiplication as well as other operations in GMP. It is written in assembly for most CPUs.

— Function: mp_limb_t **mpn_submul_1** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n, mp_limb_t s2limb`)

Multiply {s1p, n} and `s2limb`, and subtract the `n` least significant limbs of the product from {rp, n} and write the result to `rp`. Return the most significant limb of the product, plus borrow-out from the subtraction.

This is a low-level function that is a building block for general multiplication and division as well as other operations in GMP. It is written in assembly for most CPUs.

— Function: void **mpn_tdiv_qr** (`mp_limb_t *qp, mp_limb_t *rp, mp_size_t qxn, const mp_limb_t *np, mp_size_t nn, const mp_limb_t *dp, mp_size_t dn`)

Divide {np, nn} by {dp, dn} and put the quotient at {qp, nn−dn+1} and the remainder at {rp, dn}. The quotient is rounded towards 0.

No overlap is permitted between arguments, except that `np` might equal `rp`. The dividend size `nn` must be greater than or equal to divisor size `dn`. The most significant limb of the divisor must be non-zero. The `qxn` operand must be zero.

— Function: mp_limb_t **mpn_divrem** (`mp_limb_t *r1p, mp_size_t qxn, mp_limb_t *rs2p, mp_size_t rs2n, const mp_limb_t *s3p, mp_size_t s3n`)

[This function is obsolete. Please call `mpn_tdiv_qr` instead for best performance.]

Divide {rs2p, rs2n} by {s3p, s3n}, and write the quotient at `r1p`, with the exception of the most significant limb, which is returned. The remainder replaces the dividend at `rs2p`; it will be `s3n` limbs long (i.e., as many limbs as the divisor).

In addition to an integer quotient, `qxn` fraction limbs are developed, and stored after the integral limbs. For most usages, `qxn` will be zero.

It is required that `rs2n` is greater than or equal to `s3n`. It is required that the most significant bit of the divisor is set.

If the quotient is not needed, pass `rs2p` + `s3n` as `r1p`. Aside from that special case, no overlap between arguments is permitted.

Return the most significant limb of the quotient, either 0 or 1.

The area at `r1p` needs to be `rs2n` − `s3n` + `qxn` limbs large.

— Function: mp_limb_t **mpn_divrem_1** (`mp_limb_t *r1p, mp_size_t qxn, mp_limb_t *s2p, mp_size_t s2n, mp_limb_t s3limb`)

— Macro: mp_limb_t **mpn_divmod_1** (`mp_limb_t *r1p, mp_limb_t *s2p, mp_size_t s2n, mp_limb_t s3limb`)

Divide {s2p, s2n} by `s3limb`, and write the quotient at `r1p`. Return the remainder.

The integer quotient is written to {r1p+qxn, s2n} and in addition `qxn` fraction limbs are developed and written to {r1p, qxn}. Either or both `s2n` and `qxn` can be zero. For most usages, `qxn` will be zero.

`mpn_divmod_1` exists for upward source compatibility and is simply a macro calling `mpn_divrem_1` with a `qxn` of 0.

The areas at `r1p` and `s2p` have to be identical or completely separate, not partially overlapping.

— Function: mp_limb_t **mpn_divmod** (`mp_limb_t *r1p, mp_limb_t *rs2p, mp_size_t rs2n, const mp_limb_t *s3p, mp_size_t s3n`)

[This function is obsolete. Please call `mpn_tdiv_qr` instead for best performance.]

— Macro: mp_limb_t **mpn_divexact_by3** (`mp_limb_t *rp, mp_limb_t *sp, mp_size_t n`)

— Function: mp_limb_t **mpn_divexact_by3c** (`mp_limb_t *rp, mp_limb_t *sp, mp_size_t n, mp_limb_t carry`)

Divide {sp, n} by 3, expecting it to divide exactly, and writing the result to {rp, n}. If 3 divides exactly, the return value is zero and the result is the quotient. If not, the return value is non-zero and the result won't be anything useful.

`mpn_divexact_by3c` takes an initial carry parameter, which can be the return value from a previous call, so a large calculation can be done piece by piece from low to high. `mpn_divexact_by3` is simply a macro calling `mpn_divexact_by3c` with a 0 carry parameter.

These routines use a multiply-by-inverse and will be faster than `mpn_divrem_1` on CPUs with fast multiplication but slow division.

The source `a`, result `q`, size `n`, initial carry `i`, and return value `c` satisfy c*b^n + a − i = 3*q, where b = 2^GMP_NUMB_BITS. The return `c` is always 0, 1 or 2, and the initial carry `i` must also be 0, 1 or 2 (these are both borrows really). When c = 0, clearly q = (a−i)/3. When c != 0, the remainder (a−i) mod 3 is given by 3−c, because b ≡ 1 mod 3 (when `mp_bits_per_limb` is even, which is always so currently).

— Function: mp_limb_t **mpn_mod_1** (`const mp_limb_t *s1p, mp_size_t s1n, mp_limb_t s2limb`)

Divide {s1p, s1n} by `s2limb`, and return the remainder. `s1n` can be zero.

— Function: mp_limb_t **mpn_lshift** (`mp_limb_t *rp, const mp_limb_t *sp, mp_size_t n, unsigned int count`)

Shift {sp, n} left by `count` bits, and write the result to {rp, n}. The bits shifted out at the left are returned in the least significant `count` bits of the return value (the rest of the return value is zero).

`count` must be in the range 1 to `mp_bits_per_limb`−1. The regions {sp, n} and {rp, n} may overlap, provided `rp` >= `sp`.

This function is written in assembly for most CPUs.

— Function: mp_limb_t **mpn_rshift** (`mp_limb_t *rp, const mp_limb_t *sp, mp_size_t n, unsigned int count`)

Shift {sp, n} right by `count` bits, and write the result to {rp, n}. The bits shifted out at the right are returned in the most significant `count` bits of the return value (the rest of the return value is zero).

`count` must be in the range 1 to `mp_bits_per_limb`−1. The regions {sp, n} and {rp, n} may overlap, provided `rp` <= `sp`.

This function is written in assembly for most CPUs.

— Function: int **mpn_cmp** (`const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Compare {s1p, n} and {s2p, n} and return a positive value if s1 > s2, 0 if they are equal, or a negative value if s1 < s2.

— Function: mp_size_t **mpn_gcd** (`mp_limb_t *rp, mp_limb_t *xp, mp_size_t xn, mp_limb_t *yp, mp_size_t yn`)

Set {rp, retval} to the greatest common divisor of {xp, xn} and {yp, yn}. The result can be up to `yn` limbs, the return value is the actual number produced. Both source operands are destroyed.

It is required that `xn` >= `yn` > 0, and the most significant limb of {yp, yn} must be non-zero. No overlap is permitted between {xp, xn} and {yp, yn}.

— Function: mp_limb_t **mpn_gcd_1** (`const mp_limb_t *xp, mp_size_t xn, mp_limb_t ylimb`)

Return the greatest common divisor of {xp, xn} and `ylimb`. Both operands must be non-zero.

— Function: mp_size_t **mpn_gcdext** (`mp_limb_t *gp, mp_limb_t *sp, mp_size_t *sn, mp_limb_t *up, mp_size_t un, mp_limb_t *vp, mp_size_t vn`)

Let U be defined by {up, un} and let V be defined by {vp, vn}.

Compute the greatest common divisor G of U and V. Compute a cofactor S such that G = US + VT. The second cofactor T is not computed but can easily be obtained from (G − U*S) / V (the division will be exact). It is required that `un` >= `vn` > 0, and the most significant limb of {vp, vn} must be non-zero.

S satisfies S = 1 or abs(S) < V / (2 G). S = 0 if and only if V divides U (i.e., G = V).

Store G at `gp` and let the return value define its limb count. Store S at `sp` and let |*`sn`| define its limb count. S can be negative; when this happens *`sn` will be negative. The area at `gp` should have room for `vn` limbs and the area at `sp` should have room for `vn` + 1 limbs.

Both source operands are destroyed.

Compatibility notes: GMP 4.3.0 and 4.3.1 defined S less strictly. Earlier as well as later GMP releases define S as described here. GMP releases before GMP 4.3.0 required additional space for both input and output areas. More precisely, the areas {up, un+1} and {vp, vn+1} were destroyed (i.e. the operands plus an extra limb past the end of each), and the areas pointed to by `gp` and `sp` should each have room for `un` + 1 limbs.

— Function: mp_size_t **mpn_sqrtrem** (`mp_limb_t *r1p, mp_limb_t *r2p, const mp_limb_t *sp, mp_size_t n`)

Compute the square root of {sp, n} and put the result at {r1p, ceil(n/2)} and the remainder at {r2p, retval}. `r2p` needs space for `n` limbs, but the return value indicates how many are produced.

The most significant limb of {sp, n} must be non-zero. The areas {r1p, ceil(n/2)} and {sp, n} must be completely separate. The areas {r2p, n} and {sp, n} must be either identical or completely separate.

If the remainder is not wanted then `r2p` can be `NULL`, and in this case the return value is zero or non-zero according to whether the remainder would have been zero or non-zero.

A return value of zero indicates a perfect square. See also `mpn_perfect_square_p`.

— Function: size_t **mpn_sizeinbase** (`const mp_limb_t *xp, mp_size_t n, int base`)

Return the size of {xp, n} measured in number of digits in the given `base`. `base` can vary from 2 to 62. Requires `n` > 0 and `xp`[`n`−1] > 0. The result will be either exact or 1 too big. If `base` is a power of 2, the result is always exact.

— Function: mp_size_t **mpn_get_str** (`unsigned char *str, int base, mp_limb_t *s1p, mp_size_t s1n`)

Convert {s1p, s1n} to a raw unsigned char array at `str` in base `base`, and return the number of characters produced. There may be leading zeros in the string. The string is not in ASCII; to convert it to printable format, add the ASCII codes for ‘0’ or ‘A’, depending on the base and range. `base` can vary from 2 to 256.

The most significant limb of the input {s1p, s1n} must be non-zero. The input {s1p, s1n} is clobbered, except when `base` is a power of 2, in which case it's unchanged.

The area at `str` has to have space for the largest possible number represented by a `s1n`-long limb array, plus one extra character.

— Function: mp_size_t **mpn_set_str** (`mp_limb_t *rp, const unsigned char *str, size_t strsize, int base`)

Convert bytes {str, strsize} in the given `base` to limbs at `rp`.

`str`[0] is the most significant input byte and `str`[`strsize`−1] is the least significant input byte. Each byte should be a value in the range 0 to `base`−1, not an ASCII character. `base` can vary from 2 to 256.

The converted value is {rp, rn} where `rn` is the return value. If the most significant input byte `str`[0] is non-zero, then `rp`[`rn`−1] will be non-zero, else `rp`[`rn`−1] and some number of subsequent limbs may be zero.

The area at `rp` has to have space for the largest possible number with `strsize` digits in the chosen base, plus one extra limb.

The input must have at least one byte, and no overlap is permitted between {str, strsize} and the result at `rp`.

— Function: mp_bitcnt_t **mpn_scan0** (`const mp_limb_t *s1p, mp_bitcnt_t bit`)

Scan `s1p` from bit position `bit` for the next clear bit.

It is required that there be a clear bit within the area at `s1p` at or beyond bit position `bit`, so that the function has something to return.

— Function: mp_bitcnt_t **mpn_scan1** (`const mp_limb_t *s1p, mp_bitcnt_t bit`)

Scan `s1p` from bit position `bit` for the next set bit.

It is required that there be a set bit within the area at `s1p` at or beyond bit position `bit`, so that the function has something to return.

— Function: void **mpn_random** (`mp_limb_t *r1p, mp_size_t r1n`)

— Function: void **mpn_random2** (`mp_limb_t *r1p, mp_size_t r1n`)

Generate a random number of length `r1n` and store it at `r1p`. The most significant limb is always non-zero. `mpn_random` generates uniformly distributed limb data, `mpn_random2` generates long strings of zeros and ones in the binary representation.

`mpn_random2` is intended for testing the correctness of the `mpn` routines.

— Function: mp_bitcnt_t **mpn_popcount** (`const mp_limb_t *s1p, mp_size_t n`)

Count the number of set bits in {s1p, n}.

— Function: mp_bitcnt_t **mpn_hamdist** (`const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Compute the Hamming distance between {s1p, n} and {s2p, n}, which is the number of bit positions where the two operands have different bit values.

— Function: int **mpn_perfect_square_p** (`const mp_limb_t *s1p, mp_size_t n`)

Return non-zero iff {s1p, n} is a perfect square. The most significant limb of the input {s1p, n} must be non-zero.

— Function: void **mpn_and_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical and of {s1p, n} and {s2p, n}, and write the result to {rp, n}.

— Function: void **mpn_ior_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical inclusive or of {s1p, n} and {s2p, n}, and write the result to {rp, n}.

— Function: void **mpn_xor_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical exclusive or of {s1p, n} and {s2p, n}, and write the result to {rp, n}.

— Function: void **mpn_andn_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical and of {s1p, n} and the bitwise complement of {s2p, n}, and write the result to {rp, n}.

— Function: void **mpn_iorn_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical inclusive or of {s1p, n} and the bitwise complement of {s2p, n}, and write the result to {rp, n}.

— Function: void **mpn_nand_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical and of {s1p, n} and {s2p, n}, and write the bitwise complement of the result to {rp, n}.

— Function: void **mpn_nior_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical inclusive or of {s1p, n} and {s2p, n}, and write the bitwise complement of the result to {rp, n}.

— Function: void **mpn_xnor_n** (`mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

Perform the bitwise logical exclusive or of {s1p, n} and {s2p, n}, and write the bitwise complement of the result to {rp, n}.

— Function: void **mpn_com** (`mp_limb_t *rp, const mp_limb_t *sp, mp_size_t n`)

Perform the bitwise complement of {sp, n}, and write the result to {rp, n}.

— Function: void **mpn_copyi** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n`)

Copy from {s1p, n} to {rp, n}, increasingly.

— Function: void **mpn_copyd** (`mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n`)

Copy from {s1p, n} to {rp, n}, decreasingly.

The functions prefixed with `mpn_sec_` and `mpn_cnd_` are designed to perform the exact same low-level operations and have the same cache access patterns for any two same-size arguments, assuming that function arguments are placed at the same position and that the machine state is identical upon function entry. These functions are intended for cryptographic purposes, where resilience to side-channel attacks is desired.

These functions are less efficient than their “leaky” counterparts; their performance for operands of the sizes typically used for cryptographic applications is between 15% and 100% worse. For larger operands, these functions might be inadequate, since they rely on asymptotically elementary algorithms.

These functions do not make any explicit allocations. Those of these functions that need scratch space accept a scratch space operand. This convention allows callers to keep sensitive data in designated memory areas. Note however that compilers may choose to spill scalar values used within these functions to their stack frame and that such scalars may contain sensitive data.

In addition to these specially crafted functions, the following `mpn` functions are naturally side-channel resistant: `mpn_add_n`, `mpn_sub_n`, `mpn_lshift`, `mpn_rshift`, `mpn_zero`, `mpn_copyi`, `mpn_copyd`, `mpn_com`, and the logical functions (`mpn_and_n`, etc.).

There are some exceptions from the side-channel resilience: (1) Some assembly implementations of `mpn_lshift` identify shift-by-one as a special case. This is a problem iff the shift count is a function of sensitive data. (2) Alpha ev6 and Pentium4 using 64-bit limbs have leaky `mpn_add_n` and `mpn_sub_n`. (3) Alpha ev6 has a leaky `mpn_mul_1`, which also makes `mpn_sec_mul` on those systems unsafe.

— Function: mp_limb_t **mpn_cnd_add_n** (`mp_limb_t cnd, mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

— Function: mp_limb_t **mpn_cnd_sub_n** (`mp_limb_t cnd, mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n`)

These functions do conditional addition and subtraction. If `cnd` is non-zero, they produce the same result as a regular `mpn_add_n` or `mpn_sub_n`, and if `cnd` is zero, they copy {s1p, n} to the result area and return zero. The functions are designed to have timing and memory access patterns depending only on size and location of the data areas, but independent of the condition `cnd`. Like for `mpn_add_n` and `mpn_sub_n`, on most machines, the timing will also be independent of the actual limb values.

— Function: mp_limb_t **mpn_sec_add_1** (`mp_limb_t *rp, const mp_limb_t *ap, mp_size_t n, mp_limb_t b, mp_limb_t *tp`)

— Function: mp_limb_t **mpn_sec_sub_1** (`mp_limb_t *rp, const mp_limb_t *ap, mp_size_t n, mp_limb_t b, mp_limb_t *tp`)

Set R to A + b or A − b, respectively, where R = {rp, n}, A = {ap, n}, and b is a single limb. Returns carry.

These functions take O(N) time, unlike the leaky functions `mpn_add_1` and `mpn_sub_1`, which are O(1) on average. They require scratch space of `mpn_sec_add_1_itch(n)` and `mpn_sec_sub_1_itch(n)` limbs, respectively, to be passed in the `tp` parameter. The scratch space requirements are guaranteed to increase monotonically in the operand size.

— Function: void **mpn_sec_mul** (`mp_limb_t *rp, const mp_limb_t *ap, mp_size_t an, const mp_limb_t *bp, mp_size_t bn, mp_limb_t *tp`)

— Function: mp_size_t **mpn_sec_mul_itch** (`mp_size_t an, mp_size_t bn`)

Set R to A * B, where A = {ap, an}, B = {bp, bn}, and R = {rp, an+bn}.

It is required that `an` >= `bn` > 0.

No overlapping between R and the input operands is allowed. For A = B, use `mpn_sec_sqr` for optimal performance.

This function requires scratch space of `mpn_sec_mul_itch(an, bn)` limbs to be passed in the `tp` parameter. The scratch space requirements are guaranteed to increase monotonically in the operand sizes.

— Function: void **mpn_sec_sqr** (`mp_limb_t *rp, const mp_limb_t *ap, mp_size_t an, mp_limb_t *tp`)

— Function: mp_size_t **mpn_sec_sqr_itch** (`mp_size_t an`)

Set R to A^2, where A = {ap, an}, and R = {rp, 2an}.

It is required that `an` > 0.

No overlapping between R and the input operands is allowed.

This function requires scratch space of `mpn_sec_sqr_itch(an)` limbs to be passed in the `tp` parameter. The scratch space requirements are guaranteed to increase monotonically in the operand size.

— Function: void **mpn_sec_powm** (`mp_limb_t *rp, const mp_limb_t *bp, mp_size_t bn, const mp_limb_t *ep, mp_bitcnt_t enb, const mp_limb_t *mp, mp_size_t n, mp_limb_t *tp`)

— Function: mp_size_t **mpn_sec_powm_itch** (`mp_size_t bn, mp_bitcnt_t enb, size_t n`)

Set R to (B raised to E) modulo M, where R = {rp, n}, M = {mp, n}, and E = {ep, ceil(enb / `GMP_NUMB_BITS`)}.

It is required that B > 0, that M > 0 is odd, and that E < 2^enb.

No overlapping between R and the input operands is allowed.

This function requires scratch space of `mpn_sec_powm_itch(bn, enb, n)` limbs to be passed in the `tp` parameter. The scratch space requirements are guaranteed to increase monotonically in the operand sizes.

— Function: void **mpn_sec_tabselect** (`mp_limb_t *rp, const mp_limb_t *tab, mp_size_t n, mp_size_t nents, mp_size_t which`)

Select entry `which` from table `tab`, which has `nents` entries, each `n` limbs. Store the selected entry at `rp`.

This function reads the entire table to avoid side-channel information leaks.

— Function: mp_limb_t **mpn_sec_div_qr** (`mp_limb_t *qp, mp_limb_t *np, mp_size_t nn, const mp_limb_t *dp, mp_size_t dn, mp_limb_t *tp`)

— Function: mp_size_t **mpn_sec_div_qr_itch** (`mp_size_t nn, mp_size_t dn`)

Set Q to the truncated quotient N / D and R to N modulo D, where N = {np, nn}, D = {dp, dn}, Q's most significant limb is the function return value and the remaining limbs are {qp, nn−dn}, and R = {np, dn}.

It is required that `nn` >= `dn` >= 1, and that `dp`[`dn`−1] != 0. This does not imply that N >= D since N might be zero-padded.

Note the overlapping between N and R. No other operand overlapping is allowed. The entire space occupied by N is overwritten.

This function requires scratch space of `mpn_sec_div_qr_itch(nn, dn)` limbs to be passed in the `tp` parameter.

— Function: void **mpn_sec_div_r** (`mp_limb_t *np, mp_size_t nn, const mp_limb_t *dp, mp_size_t dn, mp_limb_t *tp`)

— Function: mp_size_t **mpn_sec_div_r_itch** (`mp_size_t nn, mp_size_t dn`)

Set R to N modulo D, where N = {np, nn}, D = {dp, dn}, and R = {np, dn}.

It is required that `nn` >= `dn` >= 1, and that `dp`[`dn`−1] != 0. This does not imply that N >= D since N might be zero-padded.

Note the overlapping between N and R. No other operand overlapping is allowed. The entire space occupied by N is overwritten.

This function requires scratch space of `mpn_sec_div_r_itch(nn, dn)` limbs to be passed in the `tp` parameter.

— Function: int **mpn_sec_invert** (`mp_limb_t *rp, mp_limb_t *ap, const mp_limb_t *mp, mp_size_t n, mp_bitcnt_t nbcnt, mp_limb_t *tp`)

— Function: mp_size_t **mpn_sec_invert_itch** (`mp_size_t n`)

Set R to the inverse of A modulo M, where R = {rp, n}, A = {ap, n}, and M = {mp, n}. This function's interface is preliminary.

If an inverse exists, return 1, otherwise return 0 and leave R undefined. In either case, the input A is destroyed.

It is required that M is odd, and that `nbcnt` >= ceil(log(A+1)) + ceil(log(M+1)). A safe choice is `nbcnt` = 2 * `n` * GMP_NUMB_BITS, but a smaller value might improve performance if M or A are known to have leading zero bits.

This function requires scratch space of `mpn_sec_invert_itch(n)` limbs to be passed in the `tp` parameter.

**Everything in this section is highly experimental and may disappear or
be subject to incompatible changes in a future version of GMP.**

Nails are an experimental feature whereby a few bits are left unused at the top of each `mp_limb_t`. This can significantly improve carry handling on some processors.

All the `mpn` functions accepting limb data will expect the nail bits to be zero on entry, and will return data with the nails similarly all zero. This applies both to limb vectors and to single limb arguments.

Nails can be enabled by configuring with ‘`--enable-nails`’. By default
the number of bits will be chosen according to what suits the host processor,
but a particular number can be selected with ‘`--enable-nails=N`’.

At the mpn level, a nail build is neither source nor binary compatible with a non-nail build, strictly speaking. But programs acting on limbs only through the mpn functions are likely to work equally well with either build, and judicious use of the definitions below should make any program compatible with either build, at the source level.

For the higher level routines, meaning `mpz` etc., a nail build should be fully source and binary compatible with a non-nail build.

— Macro: **GMP_NAIL_BITS**

— Macro: **GMP_NUMB_BITS**

— Macro: **GMP_LIMB_BITS**

`GMP_NAIL_BITS` is the number of nail bits, or 0 when nails are not in use. `GMP_NUMB_BITS` is the number of data bits in a limb. `GMP_LIMB_BITS` is the total number of bits in an `mp_limb_t`. In all cases, GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS.

— Macro: **GMP_NAIL_MASK**

— Macro: **GMP_NUMB_MASK**

Bit masks for the nail and number parts of a limb. `GMP_NAIL_MASK` is 0 when nails are not in use.

`GMP_NAIL_MASK` is not often needed, since the nail part can be obtained with `x >> GMP_NUMB_BITS`, and that means one less large constant, which can help various RISC chips.

— Macro: **GMP_NUMB_MAX**

The maximum value that can be stored in the number part of a limb. This is the same as `GMP_NUMB_MASK`, but can be used for clarity when doing comparisons rather than bit-wise operations.

The term “nails” comes from finger or toe nails, which are at the ends of a limb (arm or leg). “numb” is short for number, but is also how the developers felt after trying for a long time to come up with sensible names for these things.

In the future (the distant future most likely) a non-zero nail might be permitted, giving non-unique representations for numbers in a limb vector. This would help vector processors since carries would only ever need to propagate one or two limbs.