Remove the secondary_stacks array in arm64 and riscv kernels.
Instead, dynamically allocate a page for the boot stack of each AP when
starting them up, like we do on x86. This shrinks the bss by
MAXCPU*KSTACK_PAGES pages, which corresponds to 4MB on arm64 and 256KB
on riscv.
Duplicate the logic used on x86 to free the bootstacks, by using a
sysinit to wait for each AP to switch to a thread before freeing its
stack.
While here, mark some static MD variables as such.
Reviewed by: kib
MFC after: 1 month
Sponsored by: Juniper Networks, Klara Inc.
Differential Revision: https://reviews.freebsd.org/D24158