Page MenuHomeFreeBSD

vm_page: Allow PG_NOFREE pages to be freed
Needs ReviewPublic

Authored by markj on Sun, Mar 23, 10:56 PM.

Details

Summary

There is at least one case where we need to support it: kmem_malloc()
might need to allocate multiple pages to satisfy a NOFREE allocation,
which it implements by calling vm_page_alloc() in a loop. If it fails
part-way though, it needs to free already-allocated pages, but this was
illegal.

Convert the bump allocator to a linked list; (ab)use the pindex field of
each page in the list to store the number of contiguous pages in the
block. (Originally I added a new plinks member for this purpose, but
it's not safe to use that until after vm_page_dequeue() is called due to
lazy page queue removal.) Then, modify vm_page_free() to support freeing
pages to this list.

Reported by: syzbot+93bc9edd2d0f22ae426a@syzkaller.appspotmail.com
Fixes: a8693e89e3e4 ("vm: Introduce vm_page_alloc_nofree_domain")

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 63087
Build 59971: arc lint + arc unit

Event Timeline

sys/vm/vm_page.c
4148

With this change, everyone can free NOFREE memory. Is there some way to have a function specifically for freeing NOFREE memory so that this assertion can remain in place for freeing "normal" memory?

Do we want some counter to report how many non-freeable pages were freed? It is some kind of leak, I must admit.

The change looks reasonable, but I'm personally not that keen on introducing the ability to free NOFREE pages.
IMO the key issue here is that vm_page_alloc_nofree should not be used to allocate a batch of pages.

One alternative I can think of is to extend the existing NOFREE allocator with a batched allocation routine, similar to the one proposed in D47050.
This would allow us to easily roll the allocator state back if an allocation fails without having to introduce a way of freeing NOFREE pages.

sys/vm/vm_page.c
2651

Is the __noinline here on purpose or a leftover from a debugging session?

The change looks reasonable, but I'm personally not that keen on introducing the ability to free NOFREE pages.
IMO the key issue here is that vm_page_alloc_nofree should not be used to allocate a batch of pages.

I don't think that's necessarily true: suppose that we want to allocate a single NOFREE page and some other resource in that order, and the resource allocation fails.

One alternative I can think of is to extend the existing NOFREE allocator with a batched allocation routine, similar to the one proposed in D47050.
This would allow us to easily roll the allocator state back if an allocation fails without having to introduce a way of freeing NOFREE pages.

Indeed, I was wondering if the batch allocator interface would be useful here. One problem with that interface as implemented is that it can return partial allocations, but for NOFREE allocations in this particular case one really wants everything or nothing. Is that true for all NOFREE allocations which ask for multiple pages?

Note though that there are really two changes here: one to change the internals of the NOFREE queue such that pages can be put back into the queue, and one to enable freeing of NOFREE pages via vm_page_free(). The first one is desirable in any case, I think, as the batch allocator would need a way to undo a partial allocation. For instance, if the queue contains two free pages, and a caller asks for three, and the allocation of the third page fails, we'd need to put the first two back. In the current scheme that particular case is doable but a bit complicated; I think having an explicit queue is simpler.

In D49480#1128459, @kib wrote:

Do we want some counter to report how many non-freeable pages were freed? It is some kind of leak, I must admit.

Well, the pages are still available for future NOFREE allocations. Or do you think we should go as far as reconstructing the NOFREE import chunks and freeing them back to the buddy allocator when possible?

sys/vm/vm_page.c
2651

It's intentional but admittedly somewhat unrelated to the rest of the patch: I just wanted to ensure that it doesn't get inlined into its callers, as it's rarely executed but its callers are frequently executed. Same with vm_page_free_nofree().

4148

Yes, I think that's reasonable.