Page MenuHomeFreeBSD

linuxkpi: Add `folio` and `folio_batch` APIs
Needs ReviewPublic

Authored by dumbbell on Jan 31 2025, 2:08 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Apr 7, 11:40 AM
Unknown Object (File)
Mon, Mar 24, 8:31 PM
Unknown Object (File)
Tue, Mar 18, 1:34 PM
Unknown Object (File)
Wed, Mar 12, 1:01 AM
Unknown Object (File)
Tue, Mar 11, 12:07 AM
Unknown Object (File)
Mar 9 2025, 10:16 PM
Unknown Object (File)
Mar 8 2025, 6:47 PM
Unknown Object (File)
Mar 5 2025, 10:00 PM

Details

Reviewers
bz
kib
markj
Group Reviewers
linuxkpi
Summary

They are used by the i915 DRM driver in Linux 6.6 (although this change was only backported with Linux 6.7 DRM drivers...).

struct folio simply wraps struct page for now.

struct folio_batch is the same as struct pagevec but it works with struct folio instead of struct page directly.

This is part of the update of DRM drivers to Linux 6.7.

Sponsored by: The FreeBSD Foundation

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

(For next time please upload with context, e.g. -U999999)

Add diff context. No other changes.

Manu told me the same, so while I’m uploading patches to address bz feedback, I’m updating existing patch with their context.

bz requested changes to this revision.Feb 8 2025, 8:37 PM
bz added a subscriber: bz.
bz added inline comments.
sys/compat/linuxkpi/common/include/linux/gfp.h
137

Can we please move the alloc (and free) functions into C files so we can change them in the future without breaking the KBI?

sys/compat/linuxkpi/common/include/linux/mm.h
169

return ()

173

This indicates there is some kind of reference counting on struct folio? Are you sure you can always just free here? Hmm given there is no folio_get yet you may indeed still be able to get away with it... but more sophisticated use will likely require this one day.

451

return ()

457

return ()

463

return ()

475

return ()

sys/compat/linuxkpi/common/include/linux/mm_types.h
83
/*
 * text
 * more text
 */
84

Also __folio_batch_release() relies on this currently.

sys/compat/linuxkpi/common/include/linux/pagevec.h
93

return ()

99

return ()

106

return ()

113

This also has the assumption that struct folio is struct page.

sys/compat/linuxkpi/common/include/linux/shmem_fs.h
64

return ()

sys/compat/linuxkpi/common/include/linux/swap.h
63

XXX-BZ look this up as we should not do that anymore; there should be a page_* KPI for this? It'll break when we will have a real struct page and not just a typedef.

Can we try calling mark_page_accessed() for the page? That way we contain the problem to the one function for now?

This revision now requires changes to proceed.Feb 8 2025, 8:37 PM
sys/compat/linuxkpi/common/include/linux/mm.h
173

You are right. I will rework the patch to fix it.

I spent some time to review the folio code and the existing page allocation code (alloc_pages(), __free_pages(), put_page(), release_pages(), ...).

Pages from alloc_pages() are not attached to a VM object (unmanaged). Our code assumes that a page allocated this way will be freed using __free_pages() which takes care of the unwiring and if 0, caliing vm_page_free().

put_page() and release_pages() assume that the page was allocated by something else and pages are managed and that something else will take care of the page free. They don’t do anything special when the wiring count reaches zero.

Now from folio_batch_release(), I wonder if I should check the VPO_MANAGED flag to determine if the page should simply be unwired (same as put_page()) or if it should call __free_pages(). In other words, I wonder if this flag can be used to unify the put_page(), release_pages() and folio_batch_release() functions.

FYI, on Linux, put_page() calls folio_put() (which I will add) and folio_batch_release() calls release_pages() (which accept a struct page ** or a struct folio ** through an anonymous union and assumes that a page can be casted to a folio). When the refcount reaches zero, both code paths call the same code to free pages.

What do you think, @bz and @kib?

I reworked the patch to unify the put_page() and __free_pages() code paths. Now, all put/freed pages are handled by __free_pages().

release_pages() accepts either a list of struct page or a list of struct folio, like Linux.

bz added a subscriber: markj.

I would be tempted to break out the move of release_pages() w/o modifications into an earlier commit.

Likewise depending on outcome of the review the adjustments for put_page etc. (but that's partially beyond me; I hope markj or kib can help).

Then the remaining bits a pure folio.

sys/compat/linuxkpi/common/include/linux/mm.h
292

I would put this before folio_get() so get_page and put_page are together.

296

Are we sure that that is the case in both instances in linux_free_pages(), especially the else path?

Wouldn't the original vm_page_unwire() do the right thing as well if I read the comment above vm_page_unwire_managed() correctly? @markj?

308

can be cast to struct page. But a native English speaker may know better.

sys/compat/linuxkpi/common/src/linux_page.c
167

This really ignores the else case I believe; but see comment in mm_types.h for put_page(). Someone like @markj who knows the VM bits more needs to have a look at this. get_page and put_page need to keep working for both the direct map case and the else case here.

sys/compat/linuxkpi/common/src/linux_page.c
167

Why do we need to handle managed pages here? I'd assume that this function is a complement of linux_alloc_pages(), which returns unmanaged pages when PMAP_HAS_DMAP is true.

sys/compat/linuxkpi/common/include/linux/mm.h
308

yes, "can be cast to"

sys/compat/linuxkpi/common/include/linux/mm_types.h
89

Then write a static_assert that checks that offsetof is zero?
And more, since struct page and struct folio seems to be mutually equivalent, assert that sizes are same?

sys/compat/linuxkpi/common/include/linux/pagevec.h
69

We usually do not add such delimiters. At best you could write a multi-line herald like

/*
 * folio-batch stuff
 */
sys/compat/linuxkpi/common/include/linux/mm_types.h
89

I do not think we can assert that sizes are the same. It's just one is embedded in the other. It's the other way round on Linux by the way if I understand it correctly. But I like the CTASSERT idea for the offset!

sys/compat/linuxkpi/common/include/linux/pagevec.h
69

I do. Very old school. Sorry. Massively help if you need to delimit entire sections. No multi-line comment will ever do that. But I was wondering too how that appeared here ;-)

dumbbell added inline comments.
sys/compat/linuxkpi/common/include/linux/mm.h
296

I can read again the code in Linux because I don’t remember the details. I can write something about what I understand from the Linux code.

sys/compat/linuxkpi/common/src/linux_page.c
167

I will test again tomorrow without this but IIRC, I got a panic. Once I can reproduce, I will share details here and add a comment to the code.

On Linux, the "free page" code path is not the complement of alloc_pages() because the page may come from another "source".