When deploying encrypted swap to a large number of machines under memory pressure, we noticed periodic hangs during which a machine would kill off process after process because it could not allocate memory for them.
It looks like one source of this problem might be the circular dependency inherent in allocating memory in the swap path. When we get low on memory, the VM system tries to free some by swapping out pages. However, if we are so low on free pages that GELI allocations block, then the swapout operation cannot complete. That in turn keeps the VM system from freeing the memory the blocked allocation is waiting for.
To alleviate this, keep a UMA pool at the GELI layer which is used for data buffer allocation in the fast path, and reserve some of that memory for swap operations. Signal to the GELI layer that a BIO is part of a swap operation. If so, use the reserved memory. If the allocation still fails, return ENOMEM instead of blocking.
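The reservation idea can be modeled in a few lines of userland C. This is only a sketch of the accounting, not the kernel code: struct buf_pool, pool_take() and pool_release() are hypothetical names, and the real change uses a UMA zone rather than a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model of a buffer pool with a swap-only reserve. */
struct buf_pool {
	size_t total;		/* buffers in the pool */
	size_t reserved;	/* buffers that only swap I/O may consume */
	size_t in_use;		/* buffers currently handed out */
};

/*
 * Take one buffer without blocking.  Non-swap requests may not dip into
 * the reserve; swap requests may use the whole pool.  Returns 0 on
 * success and ENOMEM when the applicable limit has been reached.
 */
static int
pool_take(struct buf_pool *p, bool is_swap)
{
	size_t limit;

	assert(p->reserved <= p->total);
	limit = is_swap ? p->total : p->total - p->reserved;
	if (p->in_use >= limit)
		return (ENOMEM);
	p->in_use++;
	return (0);
}

/* Return one buffer to the pool. */
static void
pool_release(struct buf_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}
```

In this model, ordinary I/O starts failing with ENOMEM while the reserve is still untouched, so a swapout that arrives under memory pressure can still get a buffer and make progress.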
For non-swap allocations, change the default to using M_NOWAIT. In general, this *should* be better, since it gives upper layers a signal of the memory pressure and a chance to manage their failure strategy appropriately. However, a user can set the kern.geom.eli.blocking_malloc sysctl/tunable to restore the previous M_WAITOK strategy.
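The resulting policy can be summarized as a small decision function. Again, this is a hypothetical userland sketch: enum alloc_mode and choose_alloc_mode() are illustrative names, and it assumes that the tunable does not affect swap requests, which the description above implies but does not state outright.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's M_NOWAIT / M_WAITOK flags. */
enum alloc_mode {
	ALLOC_NOWAIT,	/* fail with ENOMEM rather than sleep */
	ALLOC_WAITOK	/* sleep until memory is available */
};

/*
 * Pick the allocation mode for a request.  Swap I/O never blocks
 * (assumption: regardless of the tunable).  Other I/O blocks only when
 * the administrator has opted back in via blocking_malloc.
 */
static enum alloc_mode
choose_alloc_mode(bool is_swap, bool blocking_malloc)
{
	if (is_swap)
		return (ALLOC_NOWAIT);
	return (blocking_malloc ? ALLOC_WAITOK : ALLOC_NOWAIT);
}
```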