Page MenuHomeFreeBSD

Ensure I/O buffers in libufs(3) are 128-byte aligned
ClosedPublic

Authored by mckusick on Sep 5 2023, 6:34 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Nov 14, 2:56 AM
Unknown Object (File)
Thu, Nov 14, 2:54 AM
Unknown Object (File)
Sun, Oct 27, 4:40 PM
Unknown Object (File)
Sun, Oct 20, 5:03 PM
Unknown Object (File)
Sun, Oct 20, 5:02 PM
Unknown Object (File)
Sun, Oct 20, 5:02 PM
Unknown Object (File)
Sun, Oct 20, 5:02 PM
Unknown Object (File)
Oct 12 2024, 6:27 PM
Subscribers

Details

Summary

Various disk controllers require their buffers to be aligned to a cache-line size (128 bytes). For buffers allocated in structures, ensure that they are 128-byte aligned. Use aligned_malloc to allocate memory to ensure that the returned memory is 128-byte aligned.

This can be removed if/when the kernel is fixed. Because this problem has existed on one I/O subsystem or another since the 1990's, we are probably stuck with dealing with it forever.

While we are here, we replace the dynamically allocated inode buffer with a buffer allocated in the uufsd structure just as the superblock and cylinder group buffers do.

Test Plan

The problem most recent showed up in Azure, see https://reviews.freebsd.org/D41728 and https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267654. Before these fixes were applied, it was confirmed that these changes also fixed the issue in Azure.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

How is struct uufsd allocated? Is it libufs consumer duty to allocate it, is there an initialization step for the struct? I was not able to find it from the look over libufs.h.

lib/libufs/libufs.h
76

We have the __aligned(64) attribute provided by sys/cdefs.h, which should be used instead of the direct compiler-specific magic.

104

In userspace, you can use posix_memalign(3)

Update to reflect suggested changes by kib.

The struct uufsd is provided by the user. Typically it is declared as a global variable or as a stack variable in main().

It is initialized by ufs_disk_fillout(3) or ufs_disk_fillout_blank(3) as part of opening the filesystem with which it is associated.

Since the user-exposed structure layout and alignment requirements changed, you need to bump libufs.so version.

lib/libufs/block.c
84

No need to check for p2 != NULL, free() handles the case.

lib/libufs/libufs.h
92

I would added a constant like LIBUFS_BALIGN, defined as 64. Then it can be used for aligned_alloc() call, and 0x3f replaced by (LIBUFS_BALIGN-1).

lib/libufs/libufs.h
63

LIBUFS_BALIGN could be used there as well

64 is a magic number that may change in time.
In the past, though, these misaligned I/O requests were supposed to be aligned with a bounce buffer in the kernel when the I/O is being done.
I ran into this with armv4/armv5 and wound up being one of the reasons we stepped away from that platform: the busdma routines that were supposed to guarantee coherency wouldn't if the I/O wasn't cache line aligned in some cases (related to USB IIRC, but it's been a while: either usb worked and mmc didn't, or vice versa).
Page aligned is likely a bit of overkill, but might be a good balance between page aligned and MAXBSIZE aligned for things that aren't inside of internal structures.

lib/libufs/libufs.h
63

I'd make it a #define for alignment
What happens when we find a controller that has more strict alignment requirements?
Though I'd be tempted to make them MAXBSIZE aligned at least here.
It's a bit of a waste, but will be maximally compatible for quite some time.
Also, since these are 64k buffers, one after the other, subsequent alignment won't change anything (at the moment).

92

agreed.

iirc nvme has a stricter requirement, but I believe already bounces its requests when that requirement isn't met.

It would also be good to document exactly which HBAs have this issue as well, since otherwise it becomes kinda cargo-culty.

Issue reported as not reproducible with D41728 applied

Issue reported as not reproducible with D41728 applied

That's the kind of fix I'd expect...

Updates reflecting comments from kib and imp.

I do not know which if any disk controllers are currently affected. It is a problem that comes and goes. Adding the alignment hook allows buffer alignment to be easily adjusted as needed. It also has the performance improvement of not requiring disk drivers to use bounce buffers in cases that they previously needed to do so.

kib added inline comments.
lib/libufs/libufs.h
40

It is usually better to put asserts into a .c file instead of header. This way when assert fails, user get single error message instead of bunch of it for parallel build.

This revision is now accepted and ready to land.Sep 8 2023, 3:32 PM
mckusick edited the test plan for this revision. (Show Details)

Comments from kib.

Reorganize libufs structure with explicit type sizes ordered from most to least aligned.

This revision now requires review to proceed.Nov 15 2023, 1:26 AM
mckusick added inline comments.
lib/libufs/libufs.h
40

No need to check as the compiler will complain if an __aligned() request is not a power of 2.

I think this represents a good compromise between the different factor.

This revision is now accepted and ready to land.Nov 15 2023, 3:14 AM

This looks fine overall.

But now I have two suggestions:

  • libufs(3) perhaps should mention that struct uufsd requires non-standard alignment if allocated dynamically, expressed as LIBUFS_BUFALIGN
  • it would be useful if ufs_disk_fillout(3) and related functions checked alignment of the pointer to uufsd and returned error if not properly aligned.
mckusick marked an inline comment as done.
mckusick retitled this revision from Ensure I/O buffers in libufs(3) are 64-byte aligned to Ensure I/O buffers in libufs(3) are 128-byte aligned.

Per kib suggestions:

Add check for disk structure to be LIBUFS_ALIGN'ed in ufs_disk_fillout() in the file libufs/type.c

Add description of disk structure alignment requirement in ufs_disk_fillout() that is in the ufs_disk_close.3 manual page

This revision now requires review to proceed.Nov 17 2023, 1:52 AM
This revision is now accepted and ready to land.Nov 17 2023, 1:58 AM