Page MenuHomeFreeBSD

aio: Fix a test and set race in aio_biowakeup.
ClosedPublic

Authored by jhb on Feb 13 2023, 6:42 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Jan 19, 1:41 AM
Unknown Object (File)
Sat, Jan 18, 9:59 PM
Unknown Object (File)
Dec 5 2024, 12:40 PM
Unknown Object (File)
Nov 21 2024, 1:49 AM
Unknown Object (File)
Nov 17 2024, 1:17 AM
Unknown Object (File)
Oct 14 2024, 7:50 PM
Unknown Object (File)
Oct 4 2024, 9:06 PM
Unknown Object (File)
Oct 4 2024, 7:13 PM
Subscribers

Details

Summary

Use atomic_fetchadd in place of separate atomic_subtract / atomic_load.

Sponsored by: HPE TidalScale

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jhb requested review of this revision.Feb 13 2023, 6:42 PM
sys/kern/vfs_aio.c
2501

Given that we want to read the value of job->error that might have only been set by the other job we are racing with, do we need a memory barrier of sorts here? Alternatively, given that we have several atomics in a row here, I wonder if instead we should just use a mtxpool mutex here and combine all the atomics under that. It would cover job->error, the updates to oublock/inblock, and you could do the job->nbio-- under it as well.

markj added inline comments.
sys/kern/vfs_aio.c
2497

Zap the extra newline while you're here?

This revision is now accepted and ready to land.Feb 13 2023, 6:48 PM
sys/kern/vfs_aio.c
2501

Yes, I think you need to follow the example of refcount_release(): issue a release fence before the decrement so that all prior stores are visible to the thread which decrements 1->0, and issue an acquire fence in the thread that loads job->error.

sys/kern/vfs_aio.c
2491

This should be atomic_store_int().

mjg added inline comments.
sys/kern/vfs_aio.c
2501

acquire fence is not enough to synchro against the previous store

you need seq cst

sys/kern/vfs_aio.c
2501

Could you please say more? What exactly is the problematic reordering, and why doesn't it impact consumers of refcount_release()?

sys/kern/vfs_aio.c
2497

I have that ready in a followup that fixes a few other nits. :)

sys/kern/vfs_aio.c
2501

Would it perhaps be simplest to just use the refcount API for the nbio field rather than home-rolled atomics?

sys/kern/vfs_aio.c
2501

Yes, I think that's a good idea, then you get over/underflow checking as well.

I don't have any strong feelings regarding the use of a mutex pool, that would work too of course.

sys/kern/vfs_aio.c
2501

sorry, i somehow did not register this is basically hand-rolled refcounting. there is the general scheme where you sync with seq cst if you store first and load something later, but given the situation this indeed is doable differently

so ye i vote for just employing refcount(9)

sys/kern/vfs_aio.c
2491

this is another case where atomic_set is used wrong, the name is highly misleading at best

how about retiring it in favor of atomic_setbit or similar to prevent future breakage

sys/kern/vfs_aio.c
2491

Even _setbit is kind of misleading since atomic_set(p, v) does *p |= v, not *p |= (1 << v), as I'd expect from something called "setbit". In particular, some consumers use it to set multiple bits in one operation. Maybe atomic_or_* would be better.

But yes, it's not the first time I've seen this bug, so a rename is a good idea.

sys/kern/vfs_aio.c
2491

_orbits? :]

i don't care what it is specifically, as long as it is ont easily confused with something blatantly different.

as shown above, as is people blindly assume it is an equivalent of atomic_store.

sys/kern/vfs_aio.c
2491

I would honestly prefer that we start migrating to C11 atomics in the kernel and eventually retire <machine/atomic.h>. However, if we want to rename it, let's pick a name that aligns with C11 which would be atomic_or_*. Our atomic_clear_* doesn't quite have a C11 name since it is a NAND rather than an AND.