Page MenuHomeFreeBSD

Simplify flow of pmap_demote_pde_locked() and add more comprehensive debugging checks.
ClosedPublic

Authored by kib on May 15 2019, 8:54 PM.
Tags
None
Referenced Files
F108210717: D20266.diff
Wed, Jan 22, 4:52 PM
Unknown Object (File)
Fri, Jan 17, 5:21 AM
Unknown Object (File)
Mon, Jan 13, 7:32 AM
Unknown Object (File)
Sat, Jan 11, 6:17 PM
Unknown Object (File)
Thu, Jan 2, 3:47 PM
Unknown Object (File)
Thu, Dec 26, 11:00 AM
Unknown Object (File)
Dec 19 2024, 10:19 AM
Unknown Object (File)
Dec 19 2024, 10:19 AM
Subscribers

Details

Summary

In particular,

  • Move the code to handle failure to allocate page table page into helper
  • After the previous item is done, it is possible to distinguish !PG_A case and case of missed page, in the control flow.
  • Make the variable to indicate that in-kernel mapping is demoted.
  • Assert that missed page table page can only happen for in-kernel mapping when demoting direct map.
  • If DIAGNOSTIC is enabled, and the page table page should be already filled, check all ptes instead of only first one.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 24490

Event Timeline

sys/amd64/amd64/pmap.c
1402

If you're using rounded-up *firstaddr for this loop, you should just move the loop to after the point where you've rounded up *firstaddr in line 1045.

sys/amd64/amd64/pmap.c
1402

This operation is boot-time only, so micro-optimizing it is not too important. The current sequence of initialization actually makes some sense from the hierarchical view on the page tables, and I want to keep it as is.

What are the errors that Peter reported?

sys/amd64/amd64/pmap.c
1752

The kernel page table pages that are dynamically allocated by pmap_growkernel() will have a wire count of 1.

In D20266#437675, @alc wrote:

What are the errors that Peter reported?

It started with assert *firstpte & PG_FRAME != newpte & PG_FRAME at old line 4179 for pipe_map. Then I added the comparision and dump of the whole page table page and saw some other cases.

sys/amd64/amd64/pmap.c
4616โ€“4623

When I wrote this block of code, the PG_PROMOTED flag did not yet exist. If, today, I were writing this block for the first time, I would test (oldpte & PG_PROMOTED) == 0 instead of mpte->wire_count == 1.

4629โ€“4630

I speculate that this block would no longer be needed if we test (oldpte & PG_PROMOTED) == 0 above. In particular, pmap_protect_pde() clears PG_PROMOTED if it removes access rights from the PDE.

Also, I never intended for promotion and demotion on the kernel pmap to create an inconsistency in how we maintain the wire count on kernel page table pages. Switching to testing (oldpte & PG_PROMOTED) == 0 would allow for eliminating that inconsistency.

Index: amd64/amd64/pmap.c
===================================================================
--- amd64/amd64/pmap.c  (revision 348078)
+++ amd64/amd64/pmap.c  (working copy)
@@ -4537,8 +4537,10 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pd
                            " in pmap %p", va, pmap);
                        return (FALSE);
                }
-               if (va < VM_MAXUSER_ADDRESS)
+               if (va < VM_MAXUSER_ADDRESS) {
+                       mpte->wire_count = NPTEPG;
                        pmap_resident_count_inc(pmap, 1);
+               }
        }
        mptepa = VM_PAGE_TO_PHYS(mpte);
        firstpte = (pt_entry_t *)PHYS_TO_DMAP(mptepa);
@@ -4553,10 +4555,8 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pd
        /*
         * If the page table page is new, initialize it.
         */
-       if (mpte->wire_count == 1) {
-               mpte->wire_count = NPTEPG;
+       if ((oldpde & PG_PROMOTED) == 0)
                pmap_fill_ptp(firstpte, newpte);
-       }
        KASSERT((*firstpte & PG_FRAME) == (newpte & PG_FRAME),
            ("pmap_demote_pde: firstpte and newpte map different physical"
            " addresses"));

Use PG_PROMOTED == 0 as a predicate for the pmap_pt_fill() call.

sys/amd64/amd64/pmap.c
4549

I guess it's supposed to be MAXUSER_ADDRESS. Technically this is a functional change since before we were only allocating with interrupt priority for demotions in the direct map. I think it's fine, just contradicts the review description.

sys/amd64/amd64/pmap.c
1402

Unless you want to set PG_PROMOTED on the PDEs below, this loop is pointless (and has been for quite some time).

If we actually want to support demotion on the virtual address range being mapped here, we should be inserting these page table pages into the pmap's radix tree.

sys/amd64/amd64/pmap.c
1402

... and, in fact, pmap_init() is doing the insertion.

sys/amd64/amd64/pmap.c
1753

I'm not convinced that KERNend is the correct value to use here. I think that it is too small.

Fix VM_MAXUSER_ADDRESS spelling.
Remember the last initialized page table page for insertion into the pmap radix.

sys/amd64/amd64/pmap.c
1334

Before introducing a new variable, please take a look at how KERNend is initialized and used. I don't think that we need both KERNend and kern_promoted_end.

sys/amd64/amd64/pmap.c
4549

The only situation in which this code should ever allocate a kernel page table page is when it is called upon to demote a direct map mapping, so this doesn't really represent a functional change.

sys/amd64/amd64/pmap.c
1334

We probably can remove KERNend if going ahead with my change. I think you mean that KERNend is de-facto unused in the patched pmap, and kern_promoted_end can be int, i.e. 4 bytes shorter.

Can you separate the pmap_demote_pde_locked() changes from the create_pagetables()/pmap_init() changes, i.e., create a separate patch.

Leave only refactoring of the demotion in this review.

kib retitled this revision from Fixes for some corner cases of amd64 demotions. to Simplify flow of pmap_demote_pde_locked() and add more comprehensive debugging checks..May 24 2019, 5:34 PM
kib edited the summary of this revision. (Show Details)

i386 has similar code.

sys/amd64/amd64/pmap.c
4554

_fail is a bit misleading here. Maybe _abort?

This revision is now accepted and ready to land.May 24 2019, 6:09 PM
sys/amd64/amd64/pmap.c
4541

You've eliminated the only use of PG_G in this function.

sys/amd64/amd64/pmap.c
4569โ€“4577

The first part of this comment no longer belongs here.

kib marked 2 inline comments as done.May 24 2019, 10:11 PM
kib added inline comments.
sys/amd64/amd64/pmap.c
4541

It is used by PG_PTE_PROMOTED macro. I tried that in initial version.

Split large comment.
_fault() -> _abort().

This revision now requires review to proceed.May 24 2019, 10:12 PM
This revision is now accepted and ready to land.May 24 2019, 11:10 PM

I ran tests on D20266.57868.diff for 24 hours without seeing any problems.

In D20266#440649, @pho wrote:

I ran tests on D20266.57868.diff for 24 hours without seeing any problems.

Thank you. It would require a combined test with D20380 after both settle.

sys/amd64/amd64/pmap.c
4492

You could simply add PAGE_SIZE to newpte at the end of each iteration.

4496

The "XX" should be replaced.

4524โ€“4525

You should be able to unwrap the format string.

4571

"is for a kernel address, ..."

4594

"abort" could reasonably be interpreted as leave the mapping in place. Instead, I would explicitly say, "invalidate the 2MB page mapping".

Handle Alan' comments. Adjust the output of the diagnostic version of the checker even more.

This revision now requires review to proceed.May 27 2019, 3:43 PM
markj added inline comments.
sys/amd64/amd64/pmap.c
4669

This format string may be unwrapped as well.

This revision is now accepted and ready to land.May 27 2019, 6:41 PM

Unwrap one more CTR format string.

This revision now requires review to proceed.May 27 2019, 7:13 PM
alc added inline comments.
sys/amd64/amd64/pmap.c
4494

I suggest: "different pages: found %#lx, expected %#lx\n"

This revision is now accepted and ready to land.May 31 2019, 4:31 PM
This revision was automatically updated to reflect the committed changes.