uma: increase alignment to 128 bytes on amd64

Description

Current UMA internals are not suited for efficient operation in
multi-socket environments. In particular, MAXCPU-sized arrays and other
fields are used throughout; they are not always properly aligned and are
not local to the target threads (apart from the first node, of course).
It turns out the existing UMA_ALIGN macro can be used to mostly work
around the problem until the code gets fixed. The current setting of 64
bytes runs into trouble when the adjacent cache line prefetcher gets to
work.
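
To illustrate the effect, here is a minimal userland sketch in C11; it is
not the actual UMA code. All demo_* names are hypothetical, and the sketch
assumes 64-byte cache lines, a prefetcher that pulls in the other half of
each aligned 128-byte pair, and a per-CPU array whose base is itself
128-byte aligned.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint64_t */

#define DEMO_LINE  64   /* assumed cache line size on amd64 */
#define DEMO_PAIR  128  /* assumed span touched by the adjacent line prefetcher */

/* Hypothetical per-CPU slot padded to one cache line (the old setting). */
struct demo_slot64 {
	alignas(DEMO_LINE) uint64_t allocs;
	uint64_t frees;
};

/* The same slot padded to a full 128-byte pair (the new setting). */
struct demo_slot128 {
	alignas(DEMO_PAIR) uint64_t allocs;
	uint64_t frees;
};

/* The alignment requirement also rounds the struct size up. */
static_assert(sizeof(struct demo_slot64) == DEMO_LINE, "one line per slot");
static_assert(sizeof(struct demo_slot128) == DEMO_PAIR, "one pair per slot");

/* Index of the aligned 128-byte pair containing byte offset off. */
static size_t
demo_pair_index(size_t off)
{
	return (off / DEMO_PAIR);
}

/*
 * Returns 1 if the slots of CPUs a and b, laid out back to back in an
 * array starting at a 128-byte boundary, begin in the same 128-byte pair.
 */
static int
demo_share_pair(size_t slot_size, unsigned a, unsigned b)
{
	return (demo_pair_index(slot_size * a) == demo_pair_index(slot_size * b));
}
```

With 64-byte slots, demo_share_pair() reports that CPUs 0 and 1 land in the
same pair, so a prefetch triggered by one CPU can pull in the line the other
CPU is writing. With 128-byte slots every CPU gets a pair to itself, at the
cost of doubling the array size.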

An example 128-way benchmark doing a lot of malloc/frees has the following
instruction samples:

before:
kernel`lf_advlockasync+0x43b   32940
          kernel`malloc+0xe5   42380
           kernel`bzero+0x19   47798
   kernel`spinlock_exit+0x26   60423
         kernel`0xffffffff80   78238
                         0x0  136947
   kernel`uma_zfree_arg+0x46  159594
 kernel`uma_zalloc_arg+0x672  180556
   kernel`uma_zfree_arg+0x2a  459923
 kernel`uma_zalloc_arg+0x5ec  489910

after:
            kernel`bzero+0xd   46115
kernel`lf_advlockasync+0x25f   46134
kernel`lf_advlockasync+0x38a   49078
   kernel`fget_unlocked+0xd1   49942
kernel`lf_advlockasync+0x43b   55392
          kernel`copyin+0x4a   56963
           kernel`bzero+0x19   81983
   kernel`spinlock_exit+0x26   91889
         kernel`0xffffffff80  136357
                         0x0  239424

See the review for more details.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D15346

Details

Provenance
mjg authored on May 11 2018, 7:04 AM
Parents
rG85c1b3c1cbb1: rmlock: partially depessimize lock/unlock fastpath