uma: Micro-optimize memory trashing
Use u_long for memory accesses instead of uint32_t. On my tests on
amd64 this by ~30% reduces time spent in those functions thanks to
bigger 64bit accesses. i386 still uses 32bit accesses.
MFC after: 1 month
(cherry picked from commit 7c566d6cfc7bfb913bad89d87386fa21dce8c2e6)