More aggsum optimizations
- Avoid atomic_add() when updating as_lower_bound/as_upper_bound.
Previous code was excessively strong on 64bit systems while not
strong enough on 32bit ones. Instead introduce and use real
atomic_load() and atomic_store() operations, just an assignments
on 64bit machines, but using proper atomics on 32bit ones to avoid
torn reads/writes.
- Reduce number of buckets on large systems. Extra buckets not as
much improve add speed, as hurt reads. Unlike wmsum for aggsum
reads are still important.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12145