lib/libc/aarch64/string: add memccpy SIMD implementation
This changeset ports the amd64 SIMD implementation of memccpy
to AArch64.
Performance is significantly better than the scalar implementation
except for short strings.
As usual, benchmark results were generated with the strperf utility
written by fuz.
See the Differential Revision for the results.
Tested by: fuz (exprun)
Reviewed by: fuz, emaste
Sponsored by: Google LLC (GSoC 2024)
PR: 281175
Differential Revision: https://reviews.freebsd.org/D46170