From marc.glisse at inria.fr  Tue Apr  1 15:30:53 2025
From: marc.glisse at inria.fr (Marc Glisse)
Date: Tue, 1 Apr 2025 15:30:53 +0200 (CEST)
Subject: [PATCH] acinclude.m4: Add parameter names in prototype for
 g().
In-Reply-To: <20250315165840.2519326-1-raj.khem@gmail.com>
References: <20250315165840.2519326-1-raj.khem@gmail.com>
Message-ID: <41b67e80-7012-2730-96b9-5d19ab816903@inria.fr>

Done. Thanks, and sorry for breaking it.

-- 
Marc Glisse

On Sat, 15 Mar 2025, Khem Raj wrote:

> This allows it to compile with older gcc, e.g. gcc-10,
> which does not allow parameter name omission. Otherwise it
> results in
>
> a.c: In function 'g':
> a.c:3:8: error: parameter name omitted
>    3 | void g(int,t1 const*,t1,t2,t1 const*,int){}
>      |        ^~~
>
> Support for this was added to gcc via [1], which is why it is
> accepted by newer gcc.
>
> Adding the parameter names makes it compatible with
> old and new gcc.
>
> [1] https://gcc.gnu.org/pipermail/gcc-cvs/2020-October/336068.html
>
> Signed-off-by: Khem Raj <raj.khem at gmail.com>
> ---
> ChangeLog
>
> 2025-03-15  Khem Raj <raj.khem at gmail.com>
>
>   * acinclude.m4: Add parameter names to function prototype.
>
> acinclude.m4 | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/acinclude.m4 b/acinclude.m4
> index 4fca12de2..b9d1eacfe 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -609,7 +609,7 @@ GMP_PROG_CC_WORKS_PART([$1], [long long reliability test 1],
> 
> #if defined (__GNUC__) && ! defined (__cplusplus)
> typedef unsigned long long t1;typedef t1*t2;
> -void g(int,t1 const*,t1,t2,t1 const*,int){}
> +void g(int a,t1 const* b,t1 c,t2 d,t1 const* e,int f){}
> void h(){}
> static __inline__ t1 e(t2 rp,t2 up,int n,t1 v0)
> {t1 c,x,r;int i;if(v0){c=1;for(i=1;i<n;i++){x=up[i];r=x+1;rp[i]=r;}}return c;}
> _______________________________________________
> gmp-devel mailing list
> gmp-devel at gmplib.org
> https://gmplib.org/mailman/listinfo/gmp-devel
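
For reference, a minimal sketch of the two spellings discussed in the patch. The named form is what the patch switches to; the `(void)` casts and the `call_g` helper are illustrative additions and not part of acinclude.m4. Omitting parameter names in a function definition is, to my knowledge, a C23 feature that GCC accepts from version 11 on, so only the named form is safe for older compilers.

```c
typedef unsigned long long t1;
typedef t1 *t2;

/* Rejected by gcc-10 with "error: parameter name omitted":
 *   void g(int, t1 const *, t1, t2, t1 const *, int) {}
 */

/* Accepted by old and new compilers: every parameter is named. */
void g(int a, t1 const *b, t1 c, t2 d, t1 const *e, int f)
{
    /* Silence unused-parameter warnings; the body is intentionally empty. */
    (void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
}

/* Hypothetical helper: calls g() once and reports that the call returned. */
int call_g(void)
{
    t1 x = 1;
    g(0, &x, x, &x, &x, 0);
    return 1;
}
```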

From jeremy.linton at arm.com  Fri Apr 18 02:56:30 2025
From: jeremy.linton at arm.com (Jeremy Linton)
Date: Thu, 17 Apr 2025 19:56:30 -0500
Subject: [PATCH v5 1/1] aarch64: support PAC and BTI
Message-ID: <06a7d350-7923-4208-9fb7-0c5eabeacd1d@arm.com>

Hi,

First, I apologize that this mail will likely not thread properly.

Secondly, thanks for working on this!

I spun this up on an orion6 (PAC+BTI+MTE) board running Fedora 42 and it 
seems to be working correctly. The library now has appropriate gnu 
notes, and the unit tests/etc all seem to be working as expected. I 
noticed a few largely trivial things while reading the patch.


On 3/25/25, Bill Roberts wrote:
> 
> Enable Pointer Authentication Codes (PAC) and Branch Target
> Identification (BTI) support for ARM 64 targets.
> 
> PAC works by signing the LR with either an A key or B key and verifying
> the return address. There are quite a few instructions capable of doing
> this, however, the Linux ARM ABI is to use hint compatible instructions
This might be better worded something similar to:

"While there are several instructions that can perform this operation, 
the Linux ARM ABI uses hint-space instructions which execute as NOPs on 
older hardware."
> that can be safely NOP'd on older hardware and can be assembled and
> linked with older binutils. This limits the instruction set to paciasp,
> pacibsp, autiasp and autibsp. Instructions prefixed with pac are for
> signing and instructions prefixed with aut are for signing. Both

aut is for authentication.

> instructions are then followed with an a or b to indicate which signing
> key they are using. The keys can be controlled using
> -mbranch-protection=pac-ret for the A key and
> -mbranch-protection=pac-ret+b-key for the B key.
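
To illustrate why these four instructions (and bti c) are safe on older hardware, here is a minimal sketch of how they map onto the A64 HINT space. It assumes the usual HINT encoding, i.e. the NOP word 0xd503201f with a 7-bit immediate in bits [11:5]; the function name `hint_encoding` is hypothetical. The immediates 25, 27, 29, 31 and 34 are the ones used in the patch's arm64-defs.m4.

```c
#include <stdint.h>

/* Encode "hint #imm7". NOP is hint #0, so every hint shares the NOP
 * bit pattern apart from the immediate field in bits [11:5]. */
uint32_t hint_encoding(unsigned imm7)
{
    return 0xd503201fu | ((uint32_t)(imm7 & 0x7fu) << 5);
}
```

With this, hint #25 yields the paciasp word and hint #34 the bti c word, which is why an assembler that predates PAC/BTI can still emit them.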
> 
> BTI works by marking all call and jump positions with bti c and bti
> j instructions. If execution control transfers to an instruction other
> than a BTI instruction, the execution is killed via SIGILL. Note that
> to remove one instruction, the aforementioned pac instructions will
> also work as a BTI landing pad for bti c usages.
> 
> For BTI to work, all object files linked for a unit of execution,
> whether an executable or a library must have the GNU Notes section of
> the ELF file marked to indicate BTI support. This is so loader/linkers
> can apply the proper permission bits (PROT_BTI) on the memory region.
> 
> PAC can also be annotated in the GNU ELF notes section, but it's not
> required for enablement, as interleaved PAC and non-pac code works as
> expected since it's the callee that performs all the checking. The
> linker follows the same rules as BTI for discarding the PAC flag from
> the GNU Notes section.
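
As a rough sketch of what ends up in that notes section, the following builds the same eight 32-bit words that the patch's ADD_GNU_NOTES_IF_NEEDED macro emits. The constants are defined locally here for self-containment (they normally come from the ELF headers), and the names `feature_1_and` and `build_note` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define NT_GNU_PROPERTY_TYPE_0             5u
#define GNU_PROPERTY_AARCH64_FEATURE_1_AND 0xc0000000u
#define GNU_PROPERTY_AARCH64_FEATURE_1_BTI 0x1u
#define GNU_PROPERTY_AARCH64_FEATURE_1_PAC 0x2u

/* Combine the two feature bits, as the m4 eval() in the patch does. */
uint32_t feature_1_and(int bti_enabled, int pac_enabled)
{
    uint32_t bits = 0;
    if (bti_enabled) bits |= GNU_PROPERTY_AARCH64_FEATURE_1_BTI;
    if (pac_enabled) bits |= GNU_PROPERTY_AARCH64_FEATURE_1_PAC;
    return bits;
}

/* Fill the note: header, "GNU" name, then one property with padding. */
void build_note(uint32_t note[8], int bti_enabled, int pac_enabled)
{
    note[0] = 4;                                  /* namesz: "GNU\0" */
    note[1] = 0x10;                               /* descsz */
    note[2] = NT_GNU_PROPERTY_TYPE_0;             /* note type */
    memcpy(&note[3], "GNU", 4);                   /* name incl. NUL */
    note[4] = GNU_PROPERTY_AARCH64_FEATURE_1_AND; /* pr_type */
    note[5] = 4;                                  /* pr_datasz */
    note[6] = feature_1_and(bti_enabled, pac_enabled);
    note[7] = 0;                                  /* pad to 8 bytes */
}
```

Since the linker ANDs this word across all input objects, a single object without the note clears the bit for the whole output.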
> 
> Testing was done under the following CFLAGS and CXXFLAGS for all
> combinations:
> 1. -mbranch-protection=none
> 2. -mbranch-protection=standard
> 3. -mbranch-protection=pac-ret
> 4. -mbranch-protection=pac-ret+b-key
> 5. -mbranch-protection=bti
> 
> Add tests that get skipped on non-pac and bti enabled systems,
> so this safely limits the tests to aarch64 platforms with support.
> One test dynamically tests that an mpn assembly routine supports
> BTI when the binary is enabled AND the system has support by
> calling that routine and verifying it's functionality and then
> by calling it one instruction past the correct entry point, and
> thus missing the landing pad.
> 
> The other test added, tests that the ELF binary has the proper
> GNU Notes section for the set of build flags.
> 
> Signed-off-by: Bill Roberts <bill.roberts at arm.com>
> ---
>   acinclude.m4                   |  33 +++++++++++
>   configure.ac                   |  22 +++++++-
>   mpn/Makeasm.am                 |   3 +-
>   mpn/arm64/arm64-defs.m4        | 100 +++++++++++++++++++++++++++++++++
>   mpn/arm64/divrem_1.asm         |   8 ++-
>   tests/mpn/Makefile.am          |  43 +++++++++-----
>   tests/mpn/log-compiler.sh      |  21 +++++++
>   tests/mpn/t-arm64_bti.c        |  86 ++++++++++++++++++++++++++++
>   tests/mpn/t-arm64_elf_check.sh |  96 +++++++++++++++++++++++++++++++
>   9 files changed, 394 insertions(+), 18 deletions(-)
>   create mode 100755 tests/mpn/log-compiler.sh
>   create mode 100644 tests/mpn/t-arm64_bti.c
>   create mode 100755 tests/mpn/t-arm64_elf_check.sh
> 
> diff --git a/acinclude.m4 b/acinclude.m4
> index 4fca12de2..4b9a579b1 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -3992,3 +3992,36 @@ case $gmp_cv_check_libm_for_build in
>     *)   LIBM_FOR_BUILD=$gmp_cv_check_libm_for_build ;;
>   esac
>   ])
> +
> +# Define GMP_GET_MACRO_VALUE to capture the value of a C preprocessor symbol via compilation.
> +# This is useful when something like AC_EGREP_CPP doesn't have the correct environment.
> +# Arg 1 - The name of the macro to check in the compiled program.
> +# Arg 2 - The variable name to define the value of the macro to.
> +# Arg 3 - The default value if not defined.
> +#
> +# Example: GMP_GET_MACRO_VALUE([FOO], [BAR], [0])
> +# This will check for macro FOO and define in a new variable BAR the value
> +# of FOO as derived from invoking the C pre-processor or the default value
> +# as specified by the caller.
> +#
> +AC_DEFUN([GMP_GET_MACRO_VALUE], [
> +  AC_MSG_CHECKING([value of $1])
> +
> +  $2=$(printf "#ifdef $1\n$1_VALUE=$1\n#else\n$1_VALUE=$3\n#endif\n" | ${CC} ${CFLAGS} -E - | grep "$1_VALUE" | cut -d'=' -f2-)
> +  AC_MSG_RESULT([$$2])
> +])
> +
> +# Define GMP_CHECK_PROG to find a host program using AC_CHECK_PROG and fail if not found.
> +#
> +# Arg 1 - The name of the variable to define if found.
> +# Arg 2 - The program to check for, and the value of the variable named in argument 1.
> +#
> +# Example: GMP_CHECK_PROG([GREP], [grep])
> +# This will check for program grep and define GREP equal to "grep"
> +#
> +AC_DEFUN([GMP_CHECK_PROG], [
> +  AC_CHECK_PROG([$1], [$2], [$2])
> +  if test "$$1" != "$2"; then
> +    AC_MSG_FAILURE([Could not find $2! Ensure it's on PATH and/or installed."])

Is there an extraneous quote at the end?


> +  fi
> +])
> diff --git a/configure.ac b/configure.ac
> index edee25fae..bd3524cb9 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -82,6 +82,8 @@ AM_INIT_AUTOMAKE([1.8 gnu no-dependencies subdir-objects])
>   AC_CONFIG_HEADERS(config.h:config.in)
>   AM_MAINTAINER_MODE
>   
> +GMP_CHECK_PROG([GREP], [grep])
> +GMP_CHECK_PROG([CUT], [cut])
>   
>   AC_ARG_ENABLE(assert,
>   AS_HELP_STRING([--enable-assert],[enable ASSERT checking [default=no]]),
> @@ -3767,7 +3769,17 @@ if test "$gmp_asm_syntax_testing" != no; then
>   	    *-*-darwin*)
>   	      GMP_INCLUDE_MPN(arm64/darwin.m4) ;;
>   	    *)
> -	      GMP_INCLUDE_MPN(arm64/arm64-defs.m4) ;;
> +	      GMP_INCLUDE_MPN(arm64/arm64-defs.m4)
> +	      GMP_GET_MACRO_VALUE([__ARM_FEATURE_BTI_DEFAULT], [ARM64_FEATURE_BTI_DEFAULT], [0])
> +		  GMP_DEFINE_RAW(["define(<ARM64_FEATURE_BTI_DEFAULT>,<$ARM64_FEATURE_BTI_DEFAULT>)"])
> +	      AC_SUBST([ARM64_FEATURE_BTI_DEFAULT])
> +
> +	      GMP_GET_MACRO_VALUE([__ARM_FEATURE_PAC_DEFAULT], [ARM64_FEATURE_PAC_DEFAULT], [0])
> +		  GMP_DEFINE_RAW(["define(<ARM64_FEATURE_PAC_DEFAULT>,<$ARM64_FEATURE_PAC_DEFAULT>)"])
> +	      AC_SUBST([ARM64_FEATURE_PAC_DEFAULT])
> +
> +	      GMP_GET_MACRO_VALUE([__ELF__], [ARM64_ELF], [0])
> +		  GMP_DEFINE_RAW(["define(<ARM64_ELF>,<$ARM64_ELF>)"])
>             esac
>   	  ;;
>         esac
> @@ -4058,6 +4070,12 @@ fi
>   AC_PROG_YACC
>   AM_PROG_LEX
>   
> +AC_CHECK_TOOL([HAVE_BASH], [bash], [no])
> +AM_CONDITIONAL([HAVE_BASH], [test "$HAVE_BASH" != "no"])
> +
> +AC_CHECK_TOOL([HAVE_READELF], [readelf], [no])
> +AM_CONDITIONAL([HAVE_READELF], [test "$HAVE_READELF" != "no"])
> +
>   # Create config.m4.
>   GMP_FINISH
>   
> @@ -4069,7 +4087,7 @@ AC_CONFIG_FILES([Makefile						\
>     tests/Makefile tests/devel/Makefile					\
>     tests/mpf/Makefile tests/mpn/Makefile tests/mpq/Makefile		\
>     tests/mpz/Makefile tests/rand/Makefile tests/misc/Makefile		\
> -  tests/cxx/Makefile							\
> +  tests/cxx/Makefile						\

Whitespace?

>     doc/Makefile tune/Makefile						\
>     demos/Makefile demos/calc/Makefile demos/expr/Makefile		\
>     gmp.h:gmp-h.in gmp.pc:gmp.pc.in gmpxx.pc:gmpxx.pc.in])
> diff --git a/mpn/Makeasm.am b/mpn/Makeasm.am
> index 5d7306c22..527bf41cf 100644
> --- a/mpn/Makeasm.am
> +++ b/mpn/Makeasm.am
> @@ -115,4 +115,5 @@ RM_TMP = rm -f
>   	$(CCAS) $(COMPILE_FLAGS) tmp-$*.s -o $@
>   	$(RM_TMP) tmp-$*.s
>   .asm.lo:
> -	$(LIBTOOL) --mode=compile --tag=CC $(top_srcdir)/mpn/m4-ccas --m4="$(M4)" $(CCAS) $(COMPILE_FLAGS) `test -f '$<' || echo '$(srcdir)/'`$<
> +	$(LIBTOOL) --mode=compile --tag=CC $(top_srcdir)/mpn/m4-ccas --m4="$(M4)" \
> +		$(CCAS) $(COMPILE_FLAGS) `test -f '$<' || echo '$(srcdir)/'`$<
> diff --git a/mpn/arm64/arm64-defs.m4 b/mpn/arm64/arm64-defs.m4
> index 46149f7bf..c717e5ebd 100644
> --- a/mpn/arm64/arm64-defs.m4
> +++ b/mpn/arm64/arm64-defs.m4
> @@ -36,6 +36,101 @@ dnl  don't want to disable macro expansions in or after them.
>   
>   changecom
>   
> +dnl use the hint instructions so they NOP on older machines.
> +dnl Add comments so the assembly is notated with the instruction
> +
> +
> +define(`PACIASP', `hint #25  /* paciasp */')
> +define(`AUTIASP', `hint #29  /* autiasp */')
> +define(`PACIBSP', `hint #27  /* pacibsp */')
> +define(`AUTIBSP', `hint #31  /* autibsp */')
> +
> +dnl if BTI is enabled we want the SIGN_LR to be a valid
> +dnl landing pad, we don't need VERIFY_LR and we need to
> +dnl indicate the valid BTI support for gnu notes.
> +
> +
> +ifelse(ARM64_FEATURE_BTI_DEFAULT, `1',
> +  `define(`BTI_C',   `hint #34  /* bti c */')
> +   define(`SIGN_LR', `BTI_C')
> +   define(`GNU_PROPERTY_AARCH64_BTI', `1')
> +   define(`PAC_OR_BTI')', `
> +   define(`BTI_C', `')
> +   define(`GNU_PROPERTY_AARCH64_BTI', `0')'
> +')
> +
> +dnl define instructions for PAC, which can use the A
> +dnl or the B key. PAC instructions are also valid BTI
> +dnl landing pads, so we re-define SIGN_LR if BTI is
> +dnl enabled.
> +
> +
> +ifelse(ARM64_FEATURE_PAC_DEFAULT, `1',
> +    `define(`SIGN_LR', `PACIASP')
> +     define(`VERIFY_LR', `AUTIASP')
> +     define(`GNU_PROPERTY_AARCH64_POINTER_AUTH', `2')
> +     define(`PAC_OR_BTI')',
> +   ARM64_FEATURE_PAC_DEFAULT, `2',
> +    `define(`SIGN_LR', `PACIBSP')
> +     define(`VERIFY_LR', `AUTIBSP')
> +     define(`GNU_PROPERTY_AARCH64_POINTER_AUTH', `2')
> +     define(`PAC_OR_BTI')',
> +    `ifdef(`SIGN_LR', , `define(`SIGN_LR', `')')
> +     define(`VERIFY_LR', `')
> +     define(`GNU_PROPERTY_AARCH64_POINTER_AUTH', `0')'
> +')
> +
> +dnl NOTE OVERRIDES asm-defs.m4 definition for arch specific functionality
> +dnl
> +dnl Usage: PROLOGUE_cpu(GSYM_PREFIX`'foo[,param])
> +dnl         EPILOGUE_cpu(GSYM_PREFIX`'foo)
> +dnl
> +dnl  These macros hold the CPU-specific parts of PROLOGUE and is called
> +dnl  with the function name, with GSYM_PREFIX already prepended.
> +dnl
> +dnl  By default, it marks entry points with a bti c instruction unless
> +dnl  the second argument is true and it marks it using SIGN_LR which expands
> +dnl  to the proper paci instruction OR bti c instruction depending on
> +dnl  compilation flags. In the case of an instruction that uses paci, this
> +dnl  provides a one instruction advantage over having a bti c followed by
> +dnl  a paci instruction.
> +
> +define(`PROLOGUE_cpu',
> +m4_assert_numargs_range(1,2)
> +`	TEXT
> +	ALIGN(8)
> +	GLOBL	`$1' GLOBL_ATTR
> +	TYPE(`$1',`function')
> +`$1'LABEL_SUFFIX
> +	ifelse(`$2',`true',
> +		`SIGN_LR',
> +		`BTI_C')
> +')
> +
> +dnl ADD_GNU_NOTES_IF_NEEDED
> +dnl
> +dnl Conditionally add into ELF assembly files the GNU notes indicating if
> +dnl BTI or PAC is supported. BTI is required by the linkers and loaders, however
> +dnl PAC is a nice to have for auditing. Use readelf -n to display.
> +
> +
> +define(`ADD_GNU_NOTES_IF_NEEDED', `
> +  ifdef(`ARM64_ELF', `
> +    ifdef(`PAC_OR_BTI', `
> +      .pushsection .note.gnu.property, "a";
> +      .balign 8;
> +      .long 4;
> +      .long 0x10;
> +      .long 0x5;
> +      .asciz "GNU";
> +      .long 0xc0000000; /* GNU_PROPERTY_AARCH64_FEATURE_1_AND */
> +      .long 4;
> +      .long eval(indir(`GNU_PROPERTY_AARCH64_POINTER_AUTH') + indir(`GNU_PROPERTY_AARCH64_BTI'));
> +      .long 0;
> +      .popsection;
> +    ')
> +  ')
> +')
>   
>   dnl  LEA_HI(reg,gmp_symbol), LEA_LO(reg,gmp_symbol)
>   dnl
> @@ -50,4 +145,9 @@ define(`LEA_HI', `adrp	$1, $2')dnl
>   define(`LEA_LO', `add	$1, $1, :lo12:$2')dnl
>   ')dnl
>   
> +dnl divert output to the following m4 file to shove the GNU Notes section into subsequent
> +dnl files implicitly.
> +divert(1)
> +ADD_GNU_NOTES_IF_NEEDED
> +
>   divert`'dnl
> diff --git a/mpn/arm64/divrem_1.asm b/mpn/arm64/divrem_1.asm
> index 9d5bb5959..2c5265780 100644
> --- a/mpn/arm64/divrem_1.asm
> +++ b/mpn/arm64/divrem_1.asm
> @@ -65,7 +65,7 @@ dnl                      mp_limb_t d_unnorm, mp_limb_t dinv, int cnt)
>   
>   ASM_START()
>   
> -PROLOGUE(mpn_preinv_divrem_1)
> +PROLOGUE(mpn_preinv_divrem_1, true)
>   	cbz	n_arg, L(fz)
>   	stp	x29, x30, [sp, #-80]!
>   	mov	x29, sp
> @@ -85,7 +85,7 @@ PROLOGUE(mpn_preinv_divrem_1)
>   	b	L(uentry)
>   EPILOGUE()
>   
> -PROLOGUE(mpn_divrem_1)
> +PROLOGUE(mpn_divrem_1, true)
>   	cbz	n_arg, L(fz)
>   	stp	x29, x30, [sp, #-80]!
>   	mov	x29, sp
> @@ -154,6 +154,7 @@ L(uend):add	x2, x11, #1
>   	ldp	x21, x22, [sp, #32]
>   	ldp	x23, x24, [sp, #48]
>   	ldp	x29, x30, [sp], #80
> +	VERIFY_LR
>   	ret
>   
>   L(ufx):	add	x2, x2, #1
> @@ -194,6 +195,7 @@ L(nend):cbnz	fn, L(frac)
>   	ldp	x21, x22, [sp, #32]
>   	ldp	x23, x24, [sp, #48]
>   	ldp	x29, x30, [sp], #80
> +	VERIFY_LR
>   	ret
>   
>   L(nfx):	add	x2, x2, #1
> @@ -219,6 +221,7 @@ L(ftop):add	x2, x11, #1
>   	ldp	x21, x22, [sp, #32]
>   	ldp	x23, x24, [sp, #48]
>   	ldp	x29, x30, [sp], #80
> +	VERIFY_LR
>   	ret
>   
>   C Block zero. We need this for the degenerated case of n = 0, fn != 0.
> @@ -227,5 +230,6 @@ L(ztop):str	xzr, [qp_arg], #8
>   	sub	fn_arg, fn_arg, #1
>   	cbnz	fn_arg, L(ztop)
>   L(zend):mov	x0, #0
> +	VERIFY_LR
>   	ret
>   EPILOGUE()
> diff --git a/tests/mpn/Makefile.am b/tests/mpn/Makefile.am
> index 0e979a3ad..16d4d2dc6 100644
> --- a/tests/mpn/Makefile.am
> +++ b/tests/mpn/Makefile.am
> @@ -22,19 +22,36 @@ AM_CPPFLAGS = -I$(top_srcdir) -I$(top_srcdir)/tests
>   AM_LDFLAGS = -no-install
>   LDADD = $(top_builddir)/tests/libtests.la $(top_builddir)/libgmp.la
>   
> -check_PROGRAMS = t-asmtype t-aors_1 t-divrem_1 t-mod_1 t-fat t-get_d	\
> -  t-instrument t-iord_u t-mp_bases t-perfsqr t-scan logic		\
> -  t-toom22 t-toom32 t-toom33 t-toom42 t-toom43 t-toom44			\
> -  t-toom52 t-toom53 t-toom54 t-toom62 t-toom63 t-toom6h t-toom8h	\
> -  t-toom2-sqr t-toom3-sqr t-toom4-sqr t-toom6-sqr t-toom8-sqr		\
> -  t-div t-mul t-mullo t-sqrlo t-mulmod_bnm1 t-sqrmod_bnm1 t-mulmid	\
> -  t-mulmod_bknp1 t-sqrmod_bknp1						\
> -  t-addaddmul t-hgcd t-hgcd_appr t-matrix22 t-invert t-bdiv t-fib2m	\
> -  t-broot t-brootinv t-minvert t-sizeinbase t-gcd_11 t-gcd_22 t-gcdext_1
> -
> -EXTRA_DIST = toom-shared.h toom-sqr-shared.h
> -
> -TESTS = $(check_PROGRAMS)
> +TEST_EXTENSIONS = .sh
> +AM_SH_LOG_FLAGS = --enable-pac=@ARM64_FEATURE_PAC_DEFAULT@ \
> +  --enable-bti=@ARM64_FEATURE_BTI_DEFAULT@ \
> +  $(top_builddir)/.libs/libgmp.so
> +SH_LOG_COMPILER = $(srcdir)/log-compiler.sh
> +
> +check_PROGRAMS = t-asmtype t-aors_1 t-divrem_1 t-mod_1 t-fat t-get_d	 \
> +  t-instrument t-iord_u t-mp_bases t-perfsqr t-scan logic		 \
> +  t-toom22 t-toom32 t-toom33 t-toom42 t-toom43 t-toom44			 \
> +  t-toom52 t-toom53 t-toom54 t-toom62 t-toom63 t-toom6h t-toom8h	 \
> +  t-toom2-sqr t-toom3-sqr t-toom4-sqr t-toom6-sqr t-toom8-sqr		 \
> +  t-div t-mul t-mullo t-sqrlo t-mulmod_bnm1 t-sqrmod_bnm1 t-mulmid	 \
> +  t-mulmod_bknp1 t-sqrmod_bknp1						 \
> +  t-addaddmul t-hgcd t-hgcd_appr t-matrix22 t-invert t-bdiv t-fib2m	 \
> +  t-broot t-brootinv t-minvert t-sizeinbase t-gcd_11 t-gcd_22 t-gcdext_1 \
> +  t-arm64_bti
> +
> +test_scripts =
> +if HAVE_BASH
> +if HAVE_READELF
> +  test_scripts += t-arm64_elf_check.sh
> +endif
> +endif
> +check_SCRIPTS = $(test_scripts)
> +
> +EXTRA_DIST = toom-shared.h toom-sqr-shared.h t-arm64_elf_check.sh
> +
> +TESTS = $(check_PROGRAMS) $(check_SCRIPTS)
> +
> +XFAIL_TESTS = t-arm64_bti
>   
>   $(top_builddir)/tests/libtests.la:
>   	cd $(top_builddir)/tests; $(MAKE) $(AM_MAKEFLAGS) libtests.la
> diff --git a/tests/mpn/log-compiler.sh b/tests/mpn/log-compiler.sh
> new file mode 100755
> index 000000000..092b21b33
> --- /dev/null
> +++ b/tests/mpn/log-compiler.sh
> @@ -0,0 +1,21 @@
> +#!/usr/bin/env bash
> +
> +echo "Log Compiler: $@"
> +
> +# Flip <options> command to command <options> by swapping the
> +# first and last elements of the argv array
> +# Convert "$@" to an array for easy manipulation
> +args=("$@")
> +
> +# Get the indices for the first and last elements
> +first=0
> +last=$((${#args[@]} - 1))
> +
> +# Swap the first and last elements
> +temp="${args[$first]}"
> +args[$first]="${args[$last]}"
> +args[$last]="$temp"
> +
> +# Run the script
> +./${args[@]}
> +exit $?
> diff --git a/tests/mpn/t-arm64_bti.c b/tests/mpn/t-arm64_bti.c
> new file mode 100644
> index 000000000..6c36da2d5
> --- /dev/null
> +++ b/tests/mpn/t-arm64_bti.c
> @@ -0,0 +1,86 @@
> +/*
> +Copyright 2024 Free Software Foundation, Inc.
> +
> +This file is part of the GNU MP Library test suite.
> +
> +The GNU MP Library test suite is free software; you can redistribute it
> +and/or modify it under the terms of the GNU General Public License as
> +published by the Free Software Foundation; either version 3 of the License,
> +or (at your option) any later version.
> +
> +The GNU MP Library test suite is distributed in the hope that it will be
> +useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
> +Public License for more details.
> +
> +You should have received a copy of the GNU General Public License along with
> +the GNU MP Library test suite.  If not, see https://www.gnu.org/licenses/.  */
> +
> +/*
> + * Test if BTI is working within the GMP assembly stubs for AArch64 aka arm64
> + * within GMP. This test gets a function pointer to mpn_lshift avoiding the PLT
> + * using dlsym and calls the function and checks for a valid return. It then
> + * advances the function pointer by 2, which points us to the next instruction,
> + * and calls. The following scenarios are possible:
> + * | Binary BTI Enabled | Hardware BTI Enabled | Executable Outcome  | Test Outcome |
> + * | 0                  | 0                    | Works returning 77  | SKIP          |
> + * | 0                  | 1                    | Works returning 77  | SKIP          |
> + * | 1                  | 0                    | Works returning 77  | SKIP          |
> + * | 1                  | 1                    | BTI Exception       | PASS          |
> + * Note: 77 is the magic value for autotools to indicate to skip a test.
> + * Note: You MUST run this test when enabled on a BTI enabled hardware setup.
> + * Note: That for non-aarch64 platforms, this also just skips.
> + */
> +
> +#define SKIP 77
> +
> +/* AArch64 BTI Binary enabled code ONLY */
> +#ifdef __ARM_FEATURE_BTI_DEFAULT
> +
> +#include <stdint.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +
> +#include <dlfcn.h>
> +#include <sys/auxv.h>
> +#include <asm/hwcap.h>
> +
> +#include "gmp-impl.h"
> +#include "tests.h"
> +
> +typedef mp_limb_t (*fn_mpn_lshift)(mp_ptr, mp_srcptr, mp_size_t, unsigned int);
> +
> +int
> +main (int argc, char **argv)
> +{
> +	unsigned long hwcap2 = getauxval(AT_HWCAP2);
> +	if (!(hwcap2 & HWCAP2_BTI)) {
> +		fprintf(stderr, "Hardware does not support BTI\n");
> +		return SKIP;
> +	}
> +
> +	mp_limb_t xp = 0x1001, wp;
> +
> +	fn_mpn_lshift fn = dlsym(RTLD_DEFAULT, "__gmpn_lshift");
> +	if (!fn) {
> +	  fprintf(stderr, "Could not find the symbol __gmpn_lshift\n");
> +	  return 0;

Right, so we return 'success' when the harness is expecting failure. 
It took me a bit to understand what was going on; it might be worth a comment.
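
To spell out the interaction with XFAIL_TESTS, here is a minimal sketch of how I understand the Automake harness to classify exit statuses. The function names are hypothetical; the 77 (skip) and 99 (hard error) values are Automake's conventions, and 132 below stands for a shell-reported SIGILL (128 + 4).

```c
#include <string.h>

/* Classify a test's exit status; expected_to_fail is nonzero when the
 * test is listed in XFAIL_TESTS. */
const char *classify(int exit_status, int expected_to_fail)
{
    if (exit_status == 77) return "SKIP";
    if (exit_status == 99) return "ERROR";
    if (exit_status == 0)  return expected_to_fail ? "XPASS" : "PASS";
    return expected_to_fail ? "XFAIL" : "FAIL";
}

/* FAIL, XPASS and ERROR make "make check" fail; the rest do not. */
int counts_as_failure(const char *result)
{
    return strcmp(result, "FAIL") == 0
        || strcmp(result, "XPASS") == 0
        || strcmp(result, "ERROR") == 0;
}
```

Under this reading, the "return 0" path is reported as XPASS and fails the suite, while the BTI exception is reported as XFAIL and counts as success.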

> +	}
> +
> +	/* should work as this will land on a BTI landing pad as expected */
> +	fn (&wp, &xp, (mp_size_t) 1, 1);
> +	ASSERT_ALWAYS (wp == 0x2002);
> +
> +	/* this should fail as it's off 1 instruction */
> +	fn = (fn_mpn_lshift)((uintptr_t)fn + 4);

Caveat emptor here: calling through the offset function pointer is UB and 
might result in a different kind of crash on future compilers. Inline 
assembly might be able to pin that down, but is it going to result in 
portability issues on mac/windows?

> +	fn(&wp, &xp, (mp_size_t) 1, 1);
> +	fprintf(stderr, "This should cause an exception, does your system support BTI?\n");
> +	return 0;
> +}
> +#else
> +/* No binary support for BTI or another arch, just skips */
> +int
> +main (int argc, char **argv) {
> +	return SKIP;
> +}
> +#endif
> diff --git a/tests/mpn/t-arm64_elf_check.sh b/tests/mpn/t-arm64_elf_check.sh
> new file mode 100755
> index 000000000..b0d294692
> --- /dev/null
> +++ b/tests/mpn/t-arm64_elf_check.sh
> @@ -0,0 +1,96 @@
> +#!/usr/bin/env bash
> +
> +set -e -o pipefail
> +
> +check_val() {
> +
> +  local grep_flags="-qi"
> +  local not_msg=""
> +  # invert the grep match if it SHOULDN'T be found in the flags.
> +  # ie BTI 0 means BTI should not be in the notes.
> +  if [ "${2}" -eq 0 ]; then
> +    grep_flags+="v"
> +    not_msg="Not "
> +  fi
> +
> +  printf 'Checking for %s in "%s". Expecting "%sPresent", ' "${1}" "${ELF_BINARY}" "${not_msg}"
> +
> +  set +e
> +  readelf -n "${ELF_BINARY}" | grep $grep_flags -- "${1}"
> +  local r="${?}"
> +  set -e
> +  # Possible states we care about, which grep will fail under:
> +  #   - State 1: Not expecting and Found
> +  #   - State 2: Expecting and not Found
> +  if [[ "${r}" -ne 0 ]]; then
> +    # Flip the not message
> +    if [ -z "${not_msg}" ]; then
> +      not_msg="Not "
> +    else
> +      not_msg=""
> +    fi
> +  fi
> +
> +  # print found or not found
> +  printf 'got "%sPresent."\n' "${not_msg}"
> +
> +  # The grep result means we return the rc through the named variable
> +  # this way consumers can just add all the values to determine if its
> +  # a failure.
> +  eval "${1}=\"${r}\""
This is just setting the global BTI and PAC variables, right? 'declare -g' 
is generally safer than eval, right?

> +}
> +
> +# Initialize variables
> +BTI="0"
> +PAC="0"
> +ELF_BINARY=""
> +
> +# Loop through the arguments
> +while [[ "${#}" -gt 0 ]]; do
> +  case "${1}" in
> +  --enable-bti=*)
> +    BTI="${1#*=}"
> +    shift
> +    ;;
> +  --enable-pac=*)
> +    PAC="${1#*=}"
> +    shift
> +    ;;
> +  --enable-bti | --enable-pac)
> +    # If the argument is in the form --enable-bti value (without =)
> +    printf 'Error: Option %s requires a value, like --enable-bti=value' "${1}"
> +    exit 1
> +    ;;
> +  *)
> +    # Handle the non-option argument
> +    if [[ -z "${ELF_BINARY}" ]]; then
> +      ELF_BINARY="${1}"
> +    else
> +      printf 'Error: More than one non-option argument provided: %s\n' "${1}"
> +      exit 1
> +    fi
> +    shift
> +    ;;
> +  esac
> +done
> +
> +if [ -z "${ELF_BINARY}" ]; then
> +  printf "Must specify the ELF binary ast he ONLY script argument\n"

"binary as the ONLY"

> +  exit 1
> +fi
> +
> +# Skip if nothing is enabled, 77 is automake magic for SKIP this test.
> +# For non-supporting architectures and ABIs both of these will be 0
> +# and thus skip.
> +if [[ "${BTI}" -eq 0 && "${PAC}" -eq 0 ]]; then
> +  printf "PAC and BTI disabled...skipping\n"
> +  exit 77
> +fi
> +
> +check_val "BTI" "${BTI}"
> +check_val "PAC" "${PAC}"
> +
> +# don't use expr as it returns non-zero when the addition result is non-zero
> +# and causes the set -e script to fail.
> +rc=$((BTI + PAC))
> +exit ${rc}


Otherwise generally looks good. Are people using this library on arm 
mac/windows machines? If so, was it validated there?


Thanks again,


From krishilsheth at gmail.com  Sun Apr 20 13:03:34 2025
From: krishilsheth at gmail.com (KRISHIL SHETH)
Date: Sun, 20 Apr 2025 16:33:34 +0530
Subject: Proposal : Introducing RPF: A Faster and More Efficient Alternative
 to Karatsuba for Large-Number Operations
Message-ID: <CACAyDrjm+5tDcqHFc6EP4yMccqYaKaeMMa3GB78kM8T4e9so8g@mail.gmail.com>

Hi GMP Team,

I'm Krishil Rohit Sheth, an independent researcher and developer.
Over the past 4 years, I've developed a new squaring algorithm, which I
call *RPF* (Rapid Precision Formula).

In my benchmarks, *RPF consistently outperforms Karatsuba*, both in raw
performance and when enhanced using GMP, especially for real-world input
sizes (small to mid-sized big numbers).
Notably, RPF also shows faster results than FFT-based methods for numbers
up to ~800 digits.

I believe this could bring measurable improvements to GMP's already
excellent performance, especially in areas like cryptography, scientific
computation, and finance where big number squaring is critical.

I would love to discuss:

   - Sharing detailed benchmarks and technical information
   - Exploring possible collaboration or contribution pathways
   - Understanding your process for reviewing and integrating algorithmic
     enhancements

I deeply respect GMP's impact on the open-source and mathematics
communities and would be honored to contribute meaningfully.

Please let me know if we could schedule a brief discussion or if you'd
prefer a formal technical submission first.

Looking forward to hearing from you.

Best regards,


Krishil Rohit Sheth
India
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RPF_Vs_Karatsuba.pdf
Type: application/pdf
Size: 186673 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20250420/fde6a94d/attachment-0001.pdf>

From Paul.Zimmermann at inria.fr  Mon Apr 21 09:01:13 2025
From: Paul.Zimmermann at inria.fr (Paul Zimmermann)
Date: Mon, 21 Apr 2025 09:01:13 +0200
Subject: Proposal : Introducing RPF: A Faster and More Efficient
 Alternative to Karatsuba for Large-Number Operations
In-Reply-To: <CACAyDrjm+5tDcqHFc6EP4yMccqYaKaeMMa3GB78kM8T4e9so8g@mail.gmail.com>
 (message from KRISHIL SHETH on Sun, 20 Apr 2025 16:33:34 +0530)
References: <CACAyDrjm+5tDcqHFc6EP4yMccqYaKaeMMa3GB78kM8T4e9so8g@mail.gmail.com>
Message-ID: <p9u0plh6q9qe.fsf@araignee.loria.fr>

       Hi Krishil Rohit Sheth,

if you want to convince people that your algorithm is faster than Karatsuba,
a simple figure is not enough:

* please publish your algorithm

* please publish an implementation that people can play with, and check it is faster
  than GMP
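
To make the comparison target concrete, here is a minimal sketch of the Karatsuba identity applied to squaring: with x = a*2^32 + b, the cross term 2ab is obtained as a^2 + b^2 - (a-b)^2, so three squarings replace two squarings and one multiplication. This is a toy on a single 64-bit word, not GMP's code and not RPF; the names `u128` and `karatsuba_sqr64` are hypothetical.

```c
#include <stdint.h>

/* 128-bit result as two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128;

u128 karatsuba_sqr64(uint64_t x)
{
    uint64_t a = x >> 32;             /* high half */
    uint64_t b = x & 0xffffffffu;     /* low half */
    uint64_t d = a > b ? a - b : b - a;

    uint64_t a2 = a * a;
    uint64_t b2 = b * b;
    uint64_t d2 = d * d;

    /* mid = 2ab = a^2 + b^2 - (a-b)^2, a value below 2^65,
     * kept as a 64-bit low part plus a one-bit high part. */
    uint64_t s = a2 + b2;
    unsigned carry = s < a2;
    uint64_t mid_lo = s - d2;
    unsigned borrow = s < d2;
    uint64_t mid_hi = (uint64_t)carry - borrow;   /* 0 or 1 */

    /* x^2 = a2 * 2^64 + mid * 2^32 + b2 */
    u128 r;
    r.lo = b2 + (mid_lo << 32);
    uint64_t c = r.lo < b2;
    r.hi = a2 + (mid_lo >> 32) + (mid_hi << 32) + c;
    return r;
}
```

A published implementation of a competing method could be checked against such a reference for correctness first, and then timed against GMP's own squaring on the same inputs.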

Best regards,
Paul Zimmermann

> From: KRISHIL SHETH <krishilsheth at gmail.com>
> Date: Sun, 20 Apr 2025 16:33:34 +0530
> 
> Hi GMP Team,
> 
> I'm Krishil Rohit Sheth, an independent researcher and developer.
> Over the past 4 years, I've developed a new squaring algorithm, which I
> call *RPF* (Rapid Precision Formula).
> 
> In my benchmarks, *RPF consistently outperforms Karatsuba*, both in raw
> performance and when enhanced using GMP, especially for real-world input
> sizes (small to mid-sized big numbers).
> Notably, RPF also shows faster results than FFT-based methods for numbers
> up to ~800 digits.
> 
> I believe this could bring measurable improvements to GMP's already
> excellent performance, especially in areas like cryptography, scientific
> computation, and finance where big number squaring is critical.