Uninitialized memory bug found in /mpn/generic/mod_1_1.c
Brett Kuntz
kuntz at shaw.ca
Thu Aug 31 02:21:45 CEST 2023
Hello, today I found a bug in the latest GMP (6.3) using uninitialized memory in /mpn/generic/mod_1_1.c while trying to understand some functions like mpn_mod_1_1p_cps.
Take a look at function mpn_mod_1() in /mpn/generic/ mod_1.c on lines 248 - 250
mp_limb_t pre[4];
mpn_mod_1_1p_cps (pre, b);
return mpn_mod_1_1p (ap, n, b, pre);
mp_limb_t pre[4] is not initialized and the mpn_mod_1_1p_cps() function never writes to pre[2]. So if we change that to mp_limb_t pre[4] = { -1, -1, -1, -1 }; as a test we'll quickly see that inside mpn_mod_1_1p_cps() the value cps[2] (pre[2]) is never written to and if we print out pre[4] after running it we'll get output like: 0x21CFE6CFC938B36BU, 0x0000000000000000U, 0xFFFFFFFFFFFFFFFFU, 0x9581CA13918612E1U (pre[2] is -1)
/mpn/generic/mod_1_1.c lines 260-266:
if (LIKELY (cnt != 0))
{
mp_limb_t B1modb = -b;
B1modb *= ((bi >> (GMP_LIMB_BITS-cnt)) | (CNST_LIMB(1) << cnt));
ASSERT (B1modb <= b); /* NB: not fully reduced mod b */
cps[2] = B1modb >> cnt;
}
cnt will ALWAYS equal 0 since there will be NO leading 0's when this is called since /mpn/generic/mod_1_1.c checks for the high bit being SET before calling.. which means this function only ever gets called with NO leading zeros!!!
ALSO there are unexplained differences between MOD_1_1P_METHOD 1 and MOD_1_1P_METHOD 2 when it comes to the function mpn_mod_1_1p_cps(). In one of them pre[4] is always set and in the other (method 2) pre[2] is skipped! Why are these two functions different? It looks like their job is to calculate the inverses for the following mod functions so one would expect them to be the same.
Method 2 is the function that was written in x64 asm.
ALSO the x64 assembly versions of these codes also does not always set all 4 values of pre[4] and this has lead to incorrect results when using the exported function mpn_mod_1().
/mpn/x86_64/mod_1_1.asm
neg %r12
mov %r12, %r8
mov %rax, (%rbx) C store bi
mov %rbp, 8(%rbx) C store cnt
imul %rax, %r12
mov %r12, 24(%rbx) C store B2modb
mov R32(%rbp), R32(%rcx)
test R32(%rcx), R32(%rcx)
jz L(z)
mov $1, R32(%rdx)
ifdef(`SHLD_SLOW',`
C Destroys %rax, unlike shld. Otherwise, we could do B1modb
C before B2modb, and get rid of the move %r12, %r8 above.
shl R8(%rcx), %rdx
neg R32(%rcx)
shr R8(%rcx), %rax
or %rax, %rdx
neg R32(%rcx)
shld R8(%rcx), %rax, %rdx
imul %rdx, %r8
shr R8(%rcx), %r8
mov %r8, 16(%rbx) C store B1modb
L(z):
pop %r12
pop %rbx
pop %rbp
FUNC_EXIT()
ret
Notice on line 230 of the asm file (mov %r8, 16(%rbx) // C store B1modb) gets skipped over when count is 0 (which it will be 100% of the time when the high-bit is set).
I checked mod_1_2.c, mod_1_3.c, and mod_1_4.c and they do not seem to have this problem.
Also note that most compilers and run-times for me seem to have a flukey '0' here in memory (which will be the correct value when the high-bit is set) so that might be why no one has stumbled upon this bug yet or why it might pass tests.
I do not understand these functions enough to suggest a fix. I do not know if initializing pre[4] to all 0's will produce correct results.
Which is the correct version of mpn_mod_1_1p_cps() to use for now? (Method 1 or Method 2)
My apologies if this is not the best bug report ever.
-Brett Kuntz
More information about the gmp-bugs
mailing list