Ensure 'struct thread' is aligned to a cache line
Using the new UMA_ALIGN_CACHE_AND_MASK() facility, which allows to
simultaneously guarantee a minimum of 32 bytes of alignment (the 5 lower
bits are always 0).
For the record, to this day, here's a (possibly non-exhaustive) list of
synchronization primitives using lower bits to store flags in pointers
to thread structures:
- lockmgr, rwlock and sx all use the 5 bits directly.
- rmlock indirectly relies on sx, so can use the 5 bits.
- mtx (non-spin) relies on the 3 lower bits.
Reviewed by: markj, kib
MFC after: 2 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42266
(cherry picked from commit 7d1469e555bdce32b3dfc898478ae5564d5072b1)