swap_pager: aggregate freeswapspace calls in swapoff_object
Closed, Public

Authored by dougm on Jun 21 2024, 6:30 AM.
Details

Summary

Function swap_pager_swapoff_object calls vm_pager_unswapped (via swp_pager_force_dirty) for every page that must be unswapped. Each such call repeats work the caller has already done: an unneeded check for lock ownership (the caller always owns the lock), a needless PCTRIE_LOOKUP (the caller has already found the swblk), a free of only one page of swap space at a time, and a check for whether all blocks are empty, when the caller already knows whether that check is useless.

Isolate the essential part, needed however swap_pager_unswapped is invoked, into a smaller function swap_pager_unswapped_acct (suggestions for a better name are welcome). From swapoff_object, invoke swp_pager_update_freerange for each appropriate page, so that there are potentially fewer calls to swp_pager_freeswapspace. Avoid re-fetching swblks and re-scanning them in cases where the object lock has been held since the last fetch and scan.
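
The shape of the split, as a minimal sketch; the helper body and the exact calls here are assumptions for illustration, not the committed hunk:

    /*
     * Sketch: the part needed however swap_pager_unswapped() is
     * invoked.  No lock-ownership check, no trie lookup, no
     * swap-space free; those stay with the pager-op entry point.
     */
    static void
    swap_pager_unswapped_acct(vm_page_t m)
    {
            vm_page_aflag_clear(m, PGA_SWAP_FREE | PGA_SWAP_SPACE);
    }

    static void
    swap_pager_unswapped(vm_page_t m)
    {
            struct swblk *sb;

            VM_OBJECT_ASSERT_WLOCKED(m->object);
            swap_pager_unswapped_acct(m);
            sb = SWAP_PCTRIE_LOOKUP(&m->object->un_pager.swp.swp_blks,
                rounddown(m->pindex, SWAP_META_PAGES));
            if (sb == NULL)
                    return;
            /* ... free the one block, and the swblk itself if empty ... */
    }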

Test Plan

I've been running the stress2 swapout test for 90 minutes. No problems seen yet.

Diff Detail

Repository: rG FreeBSD src repository
Lint: Not Applicable
Unit Tests: Not Applicable

Event Timeline

dougm requested review of this revision. Jun 21 2024, 6:30 AM
dougm created this revision.

Abort that. The test failed after a few more minutes.

Fixed a couple of errors. Test has run for 71 minutes now.

dougm edited the test plan for this revision.
dougm added a reviewer: kib.
sys/vm/swap_pager.c
1197

I would move the assert to the beginning of the new function.
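
That is, something like this (placement illustration only; which KASSERT is meant is assumed here):

    static void
    swap_pager_unswapped_acct(vm_page_t m)
    {
            /* Assert preconditions before any work is done. */
            KASSERT((m->object->flags & OBJ_SWAP) != 0,
                ("object not swappable"));
            /* ... accounting as before ... */
    }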

1868

Is such a condition (an empty swblk there) allowed now?

Also, this code fragment can be written as swp_pager_free_empty_swblk() AFAIS.
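
Roughly, assuming the existing swp_pager_swblk_empty() helper and swblk_zone:

    static void
    swp_pager_free_empty_swblk(vm_object_t object, struct swblk *sb)
    {
            /* Drop the swblk once no block in it remains allocated. */
            if (swp_pager_swblk_empty(sb, 0, SWAP_META_PAGES)) {
                    SWAP_PCTRIE_REMOVE(&object->un_pager.swp.swp_blks,
                        sb->p);
                    uma_zfree(swblk_zone, sb);
            }
    }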

dougm marked 2 inline comments as done.

Assume that once we've found that one block in a swblk is invalid, it stays invalid, so that we don't have to restart i from zero on every iteration over the same swblk, and only have to reset it to zero when we start the next swblk. Look for a block that is not on the device but is valid, and if we find one of these 'bad_blocks', don't free the swblk when we're done.

When we manage to launder the last block of the swblk, don't do another lookup of the same swblk in the trie, since we're done with that swblk already.
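
A condensed sketch of that bookkeeping (fragment; object, sp, and the launder step are assumed in scope, and control flow is simplified):

    /*
     * i persists across passes over the same swblk and is reset to 0
     * only when the scan moves to the next swblk.  A valid block that
     * is not on the device being swapped off means the swblk cannot
     * become empty, so don't try to free it.
     */
    for (; i < SWAP_META_PAGES; i++) {
            if (sb->d[i] == SWAPBLK_NONE)
                    continue;
            if (!swp_pager_isondev(sb->d[i], sp)) {
                    bad_block = true;   /* valid, on another device */
                    continue;
            }
            /* ... launder or page in the page for sb->d[i] ... */
    }
    if (i == SWAP_META_PAGES && !bad_block)
            swp_pager_free_empty_swblk(object, sb);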

A stress test with this version has been running for 72 minutes so far.

sys/vm/swap_pager.c
1868

I think I can remove the swblk_empty test and replace it with something else.

A question for @alc:
I have a concern that the latest version is over-optimized, and that the over-optimization introduces a bug when two processes invoke swap_pager_swapoff_object for two different swdevt values but look at the same swblk at once.

In the current committed code, every time the swblk is fetched from the trie, a scan starts from i=0 looking for the first valid block, and for the end of the range of valid blocks that begins there (which end up as i and i+nv in the committed code). In the next iteration of the swblk loop, when the sb is fetched from the trie to resume cleanup work, do we need to start again from zero looking for the first valid block, or can we start where the last search left off? Might another process have made an early block valid again?

Also, if we find that block 0 is invalid because it has a real swapblock but is not on the sp device, can I assume that it will still be invalid and not on the sp device the next time I iterate the loop on the same sb?

If this is a valid concern, then the fix is to reset i=0 and bad_block = false not right after incrementing pi in line 1918, but instead near the top of the loop at around line 1853.

Reset the counter i to 0 after losing and regaining the object lock. Don't re-lookup in the pctrie when i != 0, since we haven't released the object lock since the last lookup. Rename bad_block to something hopefully more appropriate (sb_empty).
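
The resulting loop shape, approximately (fragment; the per-block work is elided):

    for (pi = 0, i = 0;;) {
            if (i == 0) {
                    /*
                     * Either we advanced to the next swblk or the
                     * object lock was dropped since the last lookup;
                     * fetch (or re-fetch) and rescan from the start.
                     */
                    sb = SWAP_PCTRIE_LOOKUP_GE(
                        &object->un_pager.swp.swp_blks, pi);
                    if (sb == NULL)
                            break;
            }
            for (; i < SWAP_META_PAGES; i++) {
                    /*
                     * ... handle sb->d[i]; a page-in releases the
                     * object lock, in which case set i = 0 and restart
                     * the outer loop so the swblk is looked up afresh ...
                     */
            }
            /* Scanned the whole swblk without losing the lock. */
            pi = sb->p + SWAP_META_PAGES;
            i = 0;
    }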

I ran a very short test (4 hours) with D45668.140098.patch, without observing any issues.

sys/vm/swap_pager.c
1881–1882

Is this safe now? Imagine that we paged in two pages that are not yet marked as dirty. Then on restart we could drop the object lock while still not yet having gotten to the clean paged-in pages.

sys/vm/swap_pager.c
1881–1882

In the current code, we call break in the next line, and since i < SWAP_META_PAGES here, we cycle back to line 1839 and lookup sb in the trie again.

In the modified code, we set i to 0 in the next line and cycle back to line 1840, and since i is 0 we lookup sb in the trie again.

So it's the same. I haven't made anything more or less safe.

sys/vm/swap_pager.c
1881–1882

Maybe your concern is that some page that we've found empty and added to &range will no longer be empty by the time we call swp_pager_freeswapspace to free the range. If that's a legitimate concern, I can move the range init and freeswapspace calls so that we init after regaining the lock and call freeswapspace before releasing it. However, if that's a concern here, then it's also a concern in swp_pager_meta_transfer, where we have a similar range_init/freeswapspace pair around a loop in which the lock may be released and reacquired. So if this code is wrong for that reason, then so is that other code.
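
The pairing in question, roughly (using the page_range helpers swap_pager.c already has; exact signatures may differ):

    struct page_range range;

    swp_pager_init_freerange(&range);
    for (i = 0; i < SWAP_META_PAGES; i++) {
            if (sb->d[i] == SWAPBLK_NONE)
                    continue;
            /* Accumulate device blocks instead of freeing one by one. */
            swp_pager_update_freerange(&range, sb->d[i]);
            sb->d[i] = SWAPBLK_NONE;
    }
    /*
     * One free call for the whole run; the question above is whether
     * this init/free pair must bracket only lock-held code.
     */
    swp_pager_freeswapspace(&range);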

Kostik, does this address your concern?

sys/vm/swap_pager.c
1881–1882

No, this is not what I tried to explain. Also I do think now that the old code has the same problem.

Let me try again: if we paged in some page, it is not marked dirty yet. It should be marked dirty later, when the cycle is restarted and we get to the same page again. But suppose that we unlocked the object before we got to the page. Then another thread can see the clean page and, e.g., free it.

sys/vm/swap_pager.c
1881–1882

I think I understand. However, when I tried to implement something based on that understanding, which would mark dirty all paged-in pages before we call swap_pager_getpages_locked and release the lock, I found that swap_pager_getpages_locked fails an assertion when any pages after the first one we're getting have been marked dirty.

@alc has tried to explain this behavior to me. Maybe he can explain why your concern is not a valid one. Anyway, if it's not a problem introduced by this change, then it should affect whether or not this change is acceptable.

sys/vm/swap_pager.c
1881–1882

Make that "should not affect".

Back up to the previous version of this patch, since this version wasn't addressing Kostik's concern.

kib added inline comments.
sys/vm/swap_pager.c
1881–1882

Why not mark the page dirty right after page-in, before unbusy?
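
That is, roughly (sketch; the surrounding page-in code is elided):

    /*
     * Dirty the page while it is still exclusive-busied from the
     * page-in, so no other thread can observe it clean and free it.
     */
    vm_page_dirty(m);
    vm_page_launder(m);
    vm_page_xunbusy(m);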

This revision is now accepted and ready to land. Jun 26 2024, 8:54 AM
sys/vm/swap_pager.c
1881–1882

I would have to duplicate five lines of code. What would the benefit be?

sys/vm/swap_pager.c
1881–1882

The benefit would be that other threads could not find the clean but logically dirty page whenever the object is unlocked, thus fixing the bug I am talking about above.

Flatten out the loop - no more double nesting, no more counting the length of valid page runs. As per Kostik's suggestion, dirty the page that has just been read.
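
Approximate shape of the flattened scan (fragment; details differ in the real diff):

    /* One flat pass over the swblk; no inner run-length scan. */
    for (; i < SWAP_META_PAGES; i++) {
            if (sb->d[i] == SWAPBLK_NONE ||
                !swp_pager_isondev(sb->d[i], sp))
                    continue;
            /*
             * A resident, fully valid page is just forced dirty.
             * Otherwise the page is read in and dirtied before it is
             * unbusied, per the suggestion above; that path drops the
             * object lock and restarts the scan of this swblk.
             */
    }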

This revision now requires review to proceed. Jun 27 2024, 6:52 AM
kib added inline comments.
sys/vm/swap_pager.c
1896
1911

Can we please increment i in a separate statement?
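
For example (hypothetical statement, not the actual line in the diff):

    /* Instead of the fused form ... */
    sb->d[i++] = SWAPBLK_NONE;

    /* ... write the increment separately: */
    sb->d[i] = SWAPBLK_NONE;
    i++;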

1922

This if is too dense even by my standards. I suggest calculating the bool value outside the if condition, in several statements.
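
In the suggested style, with hypothetical names:

    bool blk_valid, blk_ondev;

    blk_valid = sb->d[i] != SWAPBLK_NONE;
    blk_ondev = blk_valid && swp_pager_isondev(sb->d[i], sp);
    /* The condition now reads as words, not a dense expression. */
    if (blk_valid && !blk_ondev)
            sb_empty = false;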

This revision is now accepted and ready to land. Jun 27 2024, 7:54 AM
This revision was automatically updated to reflect the committed changes.