Page MenuHomeFreeBSD

vm_object: add getpages utility
ClosedPublic

Authored by dougm on Mar 4 2025, 1:42 AM.
Tags
None
Referenced Files
F115913009: D49224.diff
Wed, Apr 30, 8:16 AM
Unknown Object (File)
Sun, Apr 20, 5:39 AM
Unknown Object (File)
Sat, Apr 12, 10:10 PM
Unknown Object (File)
Wed, Apr 9, 8:26 AM
Unknown Object (File)
Tue, Apr 8, 11:55 PM
Unknown Object (File)
Tue, Apr 8, 11:36 AM
Unknown Object (File)
Mon, Apr 7, 9:03 PM
Unknown Object (File)
Sun, Apr 6, 10:39 AM
Subscribers

Details

Summary

vnode_pager_generic_getpages() and swap_pager_getpages_locked() each include code to read a few pages behind, and a few pages ahead of, a specified page range. The same code can serve each function, and that code is encapsulated in a new vm_object_page function. For the swap case, this also eliminates some needless linked-list traversal.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

dougm requested review of this revision.Mar 4 2025, 1:42 AM
dougm created this revision.
sys/vm/swap_pager.c
1393–1400

Style(9) requests that local variables are declared at the beginning of block. Also a blank line should delineate var block from the first line of executable statements (if you decide to not move the declaration).

Also VLA is somewhat frowned upon, but I personally think it is fine, and we use it in other places in the kernel.

dougm retitled this revision from swap_pager: avoid linked list walk in getpages_locked to vm_object: add readbehind, readahead utilities.
dougm edited the summary of this revision. (Show Details)
dougm added reviewers: alc, markj.
This revision is now accepted and ready to land.Mar 7 2025, 5:48 AM
sys/vm/swap_pager.c
1376

We should also assert that rahead and rbehind are no larger than SWB_NPAGES - 1. Or even clamp those values below to imin(*a_rbehind, nitems(behind)) etc..

1399

Logically this block belongs inside the previous one, though there's no correctness issue as it is. I'd find the code a bit clearer with the blocks nested, though, i.e.

if (rbehind != 0) {
    ...
    rbehind = ...
    if (rbehind != 0) {
    ...
sys/vm/vm_object.c
2303

I'd find it clearer to use a separate variable for the next page, as you did in vm_object_page_readbehind(), rather than using p for two purposes.

dougm marked 4 inline comments as done.
dougm edited the summary of this revision. (Show Details)

Combine the two vm_object functions into one. Add assertions.

This revision now requires review to proceed.Mar 8 2025, 8:51 PM
dougm retitled this revision from vm_object: add readbehind, readahead utilities to vm_object: add getpages utility.Mar 8 2025, 8:58 PM
dougm added inline comments.
sys/vm/swap_pager.c
1402

If we could allocate bp before acquiring the object lock, then we could avoid using behind and ahead and have just one call to vm_object_getpages here. But a first attempt to have bp allocated before lock acquisition had some synchronization problem I don't know how to deal with.

dougm marked 2 inline comments as done.

Fix comments.

sys/vm/swap_pager.c
1402

How did you tried to allocate the buffer before taking the object lock?

sys/vm/swap_pager.c
1402

Pass sb to swap_pager_getpages_locked.

Alloc sb in swap_pager_getpages, and free it if PAGER_FAIL returned.

Alloc sb in swap_pager_swapoff, pass the address of it to swap_pager_swapoff_object. Free sb after objects all processed.

In swap_pager_swapoff_object, pass *sb to swap_pager_getpages_locked and allocate a new *sb before reacquiring object write lock.

But then I failed

			KASSERT(!pmap_page_is_mapped(m),

in swp_pager_async_iodone when I started running stress tests.

sys/vm/swap_pager.c
1402

Could you please share the patch you were testing?

Allocate the struct buf for swap_pager_getpages_locked early.

I'll test this version again to see if it crashes like it did yesterday.

The latest version has held up under a few hours of stress testing. Drop the handling of NULL parameters from vm_object_getpages().

sys/vm/swap_pager.c
2085

you won't have an allocated buf

Move the umz_zfree() in swap_pager_swapoff() to past the last backjumping goto.

sys/vm/swap_pager.c
1403

Rather than have three loops to set VPO_SWAPINPROG, it'd be cleaner to first copy ma into bp->b_pages[], then set VPO_SWAPINPROG on all pages in the buf. This means that we would be doing slightly more work under the object lock, but I think that's quite negligible, especially given that we already have to visit each page in ma[] while holding the object lock.

1411–1412

This comment was referring to the now-deleted assertion in the loop. I would just remove it too.

dougm marked 2 inline comments as done.

Let vm_object_getpages() do the copying of the pages between the borders, even though it happens with a lock held when that's not strictly necessary.

sys/vm/swap_pager.c
1390–1391

I do not understand the new comment ('requested pages')

1403
1405

I suggest to add yet another local for rbehind + count + rahead.

1428
sys/vm/vm_object.c
2270

I am sorry for this, but I suggest a different name. E.g. vm_object_prepare_getpages().
The function does not get pages from the pager.

dougm marked 2 inline comments as done.

Rewrite a comment. Take some suggestions from @kib, adapt some others. Rename the new function - though still need a better name.

sys/vm/swap_pager.c
1390–1391

@alc, this is your suggested change. I don't understand it any better than @kib does.

1405

I'll just assign to b_npages earlier.

sys/vm/swap_pager.c
1372

You could free the buffer here and avoid an added 'if' statement in swap_pager_getpages(). If swap_pager_swapoff_object() is ever modified not to panic on this failure, that would be the right action so as not to leak a buffer.

1390–1391

@kib is correct. Restore the word 'resident'. I would add the word 'already' preceding it.

1495–1496

This can be done before the object lock is acquired.

2087

I would restore this back to its original location, after the unlock, and move the allocation after the label full_rescan. There is no virtue to holding a buffer while pausing. Somebody else could be using the buffer instead.

sys/vm/vm_object.h
379–380

Once the name is finalized, sorted order should be restored.

sys/vm/swap_pager.c
1409

I would place this right before the unlock.

sys/vm/vm_object.c
2270

I'm going to suggest vm_object_prepare_buf_pages.

dougm marked 9 inline comments as done.

Apply @alc suggestions.

sys/vm/swap_pager.c
1372

If swap_pager_swapoff_object() does not panic, and returns immediately, then swap_pager_swapoff() would be left freeing an already freed buf. If it does not panic and does not return immediately, then it could later feed a freed buf to swap_pager_getpages_locked().

So I'm making this change for the sake of the optimization now, but I disagree that it helps when panic() is dropped later.

sys/vm/swap_pager.c
1495–1496

Only if an assertion is dropped from swblk_iter_init_only().

Undo a fatal line-swap from the last version.

alc added inline comments.
sys/vm/swap_pager.c
1347
1391

This line would be better placed immediately before the vm_object_prepare_buf_pages call.

sys/vm/vm_object.c
2270–2271
2283

Blank lines are more common before multiline comments. I would delete this one.

2292

If this vm_page_alloc_after failed, there is no point in trying another below. Set *rahead to 0.

This revision is now accepted and ready to land.Mar 14 2025, 6:22 AM
dougm added inline comments.
sys/vm/swap_pager.c
1391

Generally, 'behind' is handled before 'ahead'. What motivates the preference for the other order?

sys/vm/swap_pager.c
1391

Wait .. maybe I've misunderstood what line 'This line' refers to. I thought it was an assignment to rbehind. But I guess it's the comment before it. Okay, I can move the comment.

sys/vm/swap_pager.c
1391

Yes, I was referring to: "Allocate pages."

Apply suggestions from @alc. I expect to commit it tonight, so further comments should be made promptly.

This revision now requires review to proceed.Mar 14 2025, 9:48 PM
This revision is now accepted and ready to land.Mar 14 2025, 10:07 PM
This revision was automatically updated to reflect the committed changes.