lib/libc/aarch64/string: add strncmp SIMD implementation
This changeset ports the amd64 SIMD implementation of strncmp
to AArch64.
It is based on D45839, with added handling for the length limit.
An extended unit test for strncmp is being written to verify that
the bounds checks for page crossings work as expected.
Performance is significantly better than the existing
implementation from the Arm Optimized Routines repository.
Benchmark results were generated with fuz's strperf utility.
See the Differential Revision for the results.
Tested by: fuz (exprun)
Reviewed by: fuz, emaste
Sponsored by: Google LLC (GSoC 2024)
PR: 281175
Differential Revision: https://reviews.freebsd.org/D45943