arm: Implement atomic_testandset_acq_long as a simple wrapper
Use a memory barrier after calling the existing atomic_testandset_long
rather than using the fcmpset-based fallback version from
<sys/_atomic_subword.h>.
Reviewed by: kib
Sponsored by: AFRL, DARPA
Differential Revision: https://reviews.freebsd.org/D47628
(cherry picked from commit 987c5a1944183cf82884fdb96b0432bd810aa8b9)