amd64: Allocate TCB with alignment of 16 rather than 8.

This matches the TLS_TCB_ALIGN definition in libc.
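
For illustration only, a minimal sketch of allocating a TCB-sized block
with 16-byte alignment. This is not the code changed by this commit:
TCB_ALIGN, TCB_SIZE, round_up(), alloc_tcb() and is_aligned() are
hypothetical names, and the three-pointer TCB size is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values for illustration; not taken from the sources. */
#define TCB_ALIGN 16                    /* assumed to mirror TLS_TCB_ALIGN */
#define TCB_SIZE  (3 * sizeof(void *))  /* assumed three-pointer TCB */

/* Round n up to a multiple of align; align must be a power of two. */
static size_t
round_up(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((n + align - 1) & ~(align - 1));
}

/* Return nonzero if p is aligned to align (a power of two). */
static int
is_aligned(const void *p, size_t align)
{
	return (((uintptr_t)p & (align - 1)) == 0);
}

/*
 * Allocate a zeroed block of at least size bytes with the requested
 * alignment.  C11 aligned_alloc() expects the size to be a multiple
 * of the alignment, so the size is padded first.
 */
static void *
alloc_tcb(size_t size, size_t align)
{
	size_t padded = round_up(size, align);
	void *p = aligned_alloc(align, padded);

	if (p != NULL)
		memset(p, 0, padded);
	return (p);
}
```

Under these assumptions, alloc_tcb(TCB_SIZE, TCB_ALIGN) returns a
pointer for which is_aligned(p, 16) holds, whereas an allocation that
only requests 8-byte alignment carries no such guarantee.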

Reviewed by: kib, jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33349

(cherry picked from commit 299617496cc3c525a63833894fd8dbdc4e5de6a7)