r327899 added per-domain vmem arenas which import superpage-sized and
-aligned chunks from the global kernel_arena. On platforms without a
direct map, this increases the maximum number of boundary tags required
by vmem_bt_alloc(). In particular, BT_MAXALLOC is 4, so vmem_bt_alloc()
may require up to 8 tags since it allocates from a per-domain arena,
which may import from kernel_arena, which may import from kernel_map.
vmem reserves items in the boundary tag UMA zone to handle the recursion
in vmem_bt_alloc(), but with the above-mentioned change this reservation
may be insufficient. For example, on a system with 2 CPUs, the old
calculation reserves only 6 tags, fewer than the 8 that a single
allocation may require. Increase the
reservation so that vmem_bt_alloc() is more likely to succeed.[*]
Also reduce KVA_QUANTUM on systems where VM_NRESERVLEVEL is 0. On
systems with limited KVA, this import size is too large. On a 32-bit
powerpc system, I saw a case where per-domain kernel arena allocations
were failing with 30MB free in kernel_arena but no 4MB chunks available.
An import size of 1MB (assuming a 4KB page size) seems more reasonable.
(*) The bug described in PR 235747 can cause reserved tags to be leaked,
causing subsequent failures to allocate KVA in vmem_bt_alloc(). I believe
that bug is harder to solve, and the change in this diff is required
regardless.