Page MenuHomeFreeBSD

DMAR pagetable free: handle ref_count special references
ClosedPublic

Authored by kib on Jul 7 2024, 11:29 PM.
Tags
None
Referenced Files
F107135537: D45910.diff
Fri, Jan 10, 4:27 PM
F107088318: D45910.id140679.diff
Thu, Jan 9, 10:27 PM
F107081553: D45910.id140702.diff
Thu, Jan 9, 8:05 PM
Unknown Object (File)
Mon, Jan 6, 10:30 PM
Unknown Object (File)
Wed, Dec 18, 4:19 AM
Unknown Object (File)
Nov 29 2024, 6:46 PM
Unknown Object (File)
Nov 18 2024, 4:26 AM
Unknown Object (File)
Nov 18 2024, 4:26 AM
Subscribers

Details

Summary

Preserve object reference and block bit during refcount arithmetic and obliteration.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Jul 7 2024, 11:29 PM
kib edited the summary of this revision. (Show Details)

Use of vm_page_drop() is somewhat weird of course, but IMO it is better then rolling the atomics manually.

sys/x86/iommu/intel_idpgtbl.c
393

I'd prefer to add a vm_page_hold() which specifies the number of references to add. It can share the implementation with vm_page_drop(), I just don't like using "drop" here.

748

Why is it ok to simply clear references like this? I'm not very familiar with this code but it looks odd. Is ref_count tracking the number of PTEs referencing the page?

I think I'd prefer to have a vm_page_clear() helper for this.

kib marked 2 inline comments as done.Jul 8 2024, 5:51 PM
kib added inline comments.
sys/x86/iommu/intel_idpgtbl.c
748

Yes, the ref_count is used as a population count for PTEs on the page table page. But also, the page is owned by the page table' vm_object to keep them in some container for convenience of bulk ops like page table demolition.

kib marked an inline comment as done.

vm_page_ref/vm_page_clearref

sys/vm/vm_page.h
953

It would be good to check for overflow like vm_page_wire() does.

Actually, why can't the iommu driver use vm_page_wire()?

kib marked an inline comment as done.Jul 12 2024, 4:02 AM
kib added inline comments.
sys/vm/vm_page.h
953

I think vm_page_wire() is semantically unsuitable. For instance, the function sets PGA_DEQUEUE, which makes no sense for the use case, and I am not even sure if it is harmless WRT assertions. Similarly, if used _wire(), then _unwire() is expected by reader, but it is definitely wrong there.

Also, this special case would require special handling if vm_page_wire() grows further needs in future.

kib marked an inline comment as done.

Assert that vm_page_ref() does not overflow nor underflow.

sys/vm/vm_page.h
953

What you say is true for managed pages/objects, but from my reading, that's not the case here. Do you have some plan to use managed pages here?

sys/vm/vm_page.h
953

Wiring makes sense for managed pages as well. For all pages, vm_page_wire() requires an object lock, which would force me to use the object lock for page tables lock (this is fine now, but might be not so good in future) etc.

I do not intend to make page table pages pageable for IOMMUs, this is not how any CPU pmap works on FreeBSD, and I do not see why would it makes sense for IOMMU page tables.

But still, my feel is that wiring is the wrong concept to apply for arbitrary use of the ref_count. It happens that some flags were moved into ref_count and that started colliding with use of object container.

sys/vm/vm_page.h
953

For all pages, vm_page_wire() requires an object lock

The xbusy lock is also sufficient, and I see no reason these pages cannot be kept busy all the time. iommu_pgalloc() specifies NOBUSY, and iommu_pgfree() uses vm_page_grab(NOCREAT) instead of vm_page_lookup(), but if those functions are changed, I think vm_page_wire() will work without modification.

(In fact, the assertion at the beginning of vm_page_wire() really only applies to managed pages...)

But still, my feel is that wiring is the wrong concept to apply for arbitrary use of the ref_count.

I don't really understand why. "Wiring" has no special meaning that makes it different from a refcount, especially for unmanaged pages. In the past we used to have separate counters, wire_count and hold_count, with some semantic differences, but after they were merged, vm_page_wire() might as well be called vm_page_ref(). I think I didn't rename them those functions only to avoid churn, I just renamed wire_count -> ref_count.

I don't really object to the change, but IMO the added vm_page_ref() function is gratuitous. It is hard to know when to use vm_page_wire() vs. vm_page_ref().

This revision is now accepted and ready to land.Jul 14 2024, 4:05 PM