Page MenuHomeFreeBSD

Don't use sleeping allocations for ufs dirhash blocks when holding directory vnode
ClosedPublic

Authored by dgmorris_earthlink.net on Mar 3 2021, 6:37 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Dec 29, 5:57 AM
Unknown Object (File)
Dec 10 2024, 5:45 AM
Unknown Object (File)
Nov 28 2024, 7:42 AM
Unknown Object (File)
Nov 14 2024, 7:25 PM
Unknown Object (File)
Nov 7 2024, 1:24 PM
Unknown Object (File)
Oct 25 2024, 8:19 AM
Unknown Object (File)
Oct 19 2024, 7:32 PM
Unknown Object (File)
Oct 14 2024, 6:11 AM
Subscribers
None

Details

Summary

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253992

System with a based-on-FreeBSD-12 kernel but with a timeout panic for the UFS root vnode lock paniced due to a ufsdirhash_build() call with M_WAITOK, UMA_ZONE_FIRSTTOUCH (the default) and a domain which was OOM. The ufsdirhash_build() code is invoked with the directory vnode exclusively locked (per ufs_lookup.c / the caller) and in all other places takes care to allocate M_NOWAIT to avoid such issues.

So it seems reasonable to make this one allocation also M_NOWAIT instead of M_WAITOK, especially given the FIRSTTOUCH default behavior where only the calling cpu's domain can satisfy the request.

Test Plan

Organizational stress testing, no unit test planned -- change is pretty straightforward.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This revision is now accepted and ready to land.Mar 3 2021, 7:50 PM
markj added reviewers: mckusick, kib.
markj added inline comments.
sys/ufs/ufs/ufs_dirhash.c
111

I would just drop the NOWAIT suffix, or get rid of the UMA wrappers entirely.

Took Mark's suggestion on dropping _NOWAIT suffix on allocation macro.
Left the macro for symmetry with the FREE macros in the file.

This revision now requires review to proceed.Mar 3 2021, 8:09 PM

This seems like a reasonable solution to the problem. The allocation will fail in a few cases where it previously would have succeeded, but hopefully those will be rare. The effect of failing will simply be slower lookups rather than unexpected errors to applications.

This revision is now accepted and ready to land.Mar 3 2021, 9:19 PM