lib/libc/amd64/string: add strncmp scalar, baseline implementation
The scalar implementation is fairly straightforward and merely unrolled
four times. The baseline implementation closely follows D41971 with
appropriate extensions and extra code paths to pay attention to string
length.
Performance is quite good. We beat both glibc (except for very long
strings, but they likely use AVX which we don't) and Bionic (except for
medium-sized aligned strings, where we are still in the same ballpark).
Sponsored by: The FreeBSD Foundation
Tested by: developers@, exp-run
Approved by: mjg
MFC after: 1 month
MFC to: stable/14
PR: 275785
Differential Revision: https://reviews.freebsd.org/D42122