Page MenuHomeFreeBSD

Add unlocked/SMR fast path to getblk()
ClosedPublic

Authored by cem on Jul 22 2020, 11:29 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 11, 12:38 PM
Unknown Object (File)
Wed, Jan 1, 1:49 PM
Unknown Object (File)
Wed, Dec 18, 4:09 AM
Unknown Object (File)
Nov 26 2024, 11:27 PM
Unknown Object (File)
Nov 23 2024, 4:31 PM
Unknown Object (File)
Oct 4 2024, 6:57 PM
Unknown Object (File)
Oct 1 2024, 11:38 PM
Unknown Object (File)
Oct 1 2024, 10:00 PM
Subscribers

Details

Summary

Convert the bufobj tries to an SMR zone/PCTRIE and add a gbincore_unlocked()
API wrapping this functionality. Use it for a fast path in getblkx(),
falling back to locked lookup if we raced a thread changing the buf's
identity.

Test Plan

Quite surprisingly to me, it doesn't seem to immediately panic. So I'm probably missing something. ;-)

In our getblk-heavy workloads (which have other scaling limitations beyond those in FreeBSD), on 2x48 CPU (2P) metal, we see significantly better scaling with this change at 6+ threads (4.3x single-thread vs 3.7x single-thread), and the gap gets wider at higher thread counts. I don't have any great benchmarks on vanilla FreeBSD.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 32501
Build 29974: arc lint + arc unit

Event Timeline

cem requested review of this revision.Jul 22 2020, 11:29 PM
cem created this revision.
cem added reviewers: alc, dougm, markj, rlibby, kib, jeff.

Quite surprisingly to me, it doesn't seem to immediately panic. So I'm probably missing something.

Perhaps @pho would be willing to try and test this together with D25781?

Quite surprisingly to me, it doesn't seem to immediately panic. So I'm probably missing something.

Perhaps @pho would be willing to try and test this together with D25781?

I'd certainly appreciate it.

In D25782#571014, @cem wrote:

Quite surprisingly to me, it doesn't seem to immediately panic. So I'm probably missing something.

Perhaps @pho would be willing to try and test this together with D25781?

I'd certainly appreciate it.

I'm building a kernel right now.

sys/kern/vfs_bio.c
3872

The blocking there is not needed.

3884

I think the rule should be that any failure on lockless path for buffer lookup results in locked retry. Imagine that buffer was reclaimed meantime, then you would not reach the check for bufobj/blkno at all.

3889

Why do you think it should be handled there ?

sys/kern/vfs_subr.c
2352

return ()

Thanks!

sys/kern/vfs_bio.c
3872

will fix

3884

We can abort without retry for EINTR/ERESTART, right? Retrying in all other cases makes sense to me.

3889
#define BUF_UNLOCK(bp) do {                                             \
        KASSERT(((bp)->b_flags & B_REMFREE) == 0,                       \
            ("BUF_UNLOCK %p while B_REMFREE is still set.", (bp)));     \
                                                                        \
        (void)_lockmgr_args(&(bp)->b_lock, LK_RELEASE, NULL,            \
            LK_WMESG_DEFAULT, LK_PRIO_DEFAULT, LK_TIMO_DEFAULT,         \
            LOCK_FILE, LOCK_LINE);                                      \
sys/kern/vfs_subr.c
2352

will fix

cem marked 2 inline comments as done.
  • Fix style issues
  • Fallback to locked getblk path on all BUF_TIMELOCK errors except EINTR/ERESTART.
sys/kern/vfs_bio.c
3889

Huh. I think you should use lockmgr() directly there, to not touch buffer we do not intend to operate on, at all.

Add BUF_UNLOCK_NOASSERT macro for raw lockmgr unlock operation, and use it in unlocked identity-race path.

sys/kern/vfs_bio.c
3923

I must admit that 'unlocked' suffix is misleading.

sys/sys/buf.h
331

I would name it BUF_UNLOCK_RAW()

cem marked 2 inline comments as done.Jul 24 2020, 2:06 PM
cem added inline comments.
sys/kern/vfs_bio.c
3923

Will change.

sys/sys/buf.h
331

Will change

Rename new BUF_UNLOCK variant, goto label, for clarity.

This revision is now accepted and ready to land.Jul 24 2020, 2:24 PM
sys/kern/vfs_bio.c
3887

If you invert the condition, the control flow can be simplified a bit:

if (bp->b_bufobj == && bp->b_lblkno == blkno)
    goto foundbuf_fastpath;
/* It changed, fall back to locked lookup. */
BUF_UNLOCK_RAW(bp);
loop:
sys/kern/vfs_subr.c
495

Missing parens.

cem marked 2 inline comments as done.Jul 24 2020, 2:30 PM
cem added inline comments.
sys/kern/vfs_bio.c
3887

Sure, will do.

cem marked an inline comment as done.

Invert sense of identity check to simplify goto logic

This revision now requires review to proceed.Jul 24 2020, 2:32 PM
cem marked an inline comment as done.

Fix MarkJ's style comment I missed earlier.

This revision is now accepted and ready to land.Jul 24 2020, 2:49 PM
This revision was automatically updated to reflect the committed changes.
bdrewery added inline comments.
head/sys/kern/vfs_subr.c
2348–2352 ↗(On Diff #74887)

Suppose a buf moved from dirty to clean here, after checking the clean queue, before checking the dirty queue. No buf is found. Is SMR intended to avoid that, or is it only consistent per queue?

We found, presumably, that getblk(GB_NOCREAT) gets the wrong result because of this.

head/sys/kern/vfs_subr.c
2348–2352 ↗(On Diff #74887)

SMR only helps you on a per-queue basis. I think maybe ideally there would be a single pctrie here and some other mechanism for efficiently tracking dirty bufs, but I don't know how realistic that is.

Probably the right fix for getblk GB_NOCREAT is to re-check with the lock if unlocked lookup returned NULL. Something like:

	bp = gbincore_unlocked(bo, blkno);
	if (bp == NULL) {
		if (flags & GB_NOCREAT)
			goto loop;
		else
			goto newbuf_unlocked;
	}
head/sys/kern/vfs_subr.c
2348–2352 ↗(On Diff #74887)

(In the !GB_NOCREAT case, we might see the same race from gbincore_unlocked, but it is harmless: we'd create a useless buf at the same lbn, re-lock to attempt to insert it, and discover we'd lost the race. That race can happen even without SMR-gbincore.)

head/sys/kern/vfs_subr.c
2348–2352 ↗(On Diff #74887)

Suraj came up with the same patch idea. I'm wondering if it could be more efficient to just try gbincore_unlocked a second time rather than wait for the bo lock, in the GB_NOCREAT case.

head/sys/kern/vfs_subr.c
2348–2352 ↗(On Diff #74887)

Ah, cool. I hope Suraj is doing well. Spinning locklessly again (GB_NOCREAT case) might be worth doing, but you'll still need the same locked fallback if you continue to get NULL (for correctness). If you should spin, and for how long, probably depends on how many GB_NOCREAT getblk requests will already exist in the bufq and be transitioning queues, vs true-positive misses in your workload. (And relative costs of a few lockless rechecks vs taking the bufobj lock.)