
riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion
ClosedPublic

Authored by jrtc27 on Jul 21 2021, 2:09 AM.

Details

Summary

This repeats amd64's cfcbf8c6fd3b (r180498) and i386's cf3508519c5e
(r202894) but for riscv; pmap_kextract must be lock-free and so it can
race with superpage promotion and demotion, thus the L2 entry must only
be loaded once to avoid using inconsistent state.
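
The single-load requirement can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the committed diff: `model_ram`, `model_make_pte`, `model_pte_to_phys`, `model_phys_to_ptr` and `model_kextract` are hypothetical names, physical memory is faked with a small array, and returning 0 for an unmapped address is a choice made for the model. The PTE bit positions follow the Sv39 format.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Sv39 PTE layout: V/R/W/X in the low bits, PPN in bits 53:10. */
#define PTE_V		0x001ULL
#define PTE_R		0x002ULL
#define PTE_W		0x004ULL
#define PTE_X		0x008ULL
#define PTE_RWX		(PTE_R | PTE_W | PTE_X)
#define PTE_PPN_SHIFT	10
#define PTE_PPN_MASK	((1ULL << 44) - 1)
#define PAGE_SHIFT	12
#define L3_OFFSET	((1ULL << PAGE_SHIFT) - 1)	/* 4 KiB page */
#define L2_OFFSET	((1ULL << 21) - 1)		/* 2 MiB superpage */
#define L3_INDEX(va)	(((va) >> PAGE_SHIFT) & 0x1ffULL)

/* Hypothetical stand-in for physical memory: two 4 KiB pages of PTEs. */
#define MODEL_RAM_WORDS	1024
static _Atomic uint64_t model_ram[MODEL_RAM_WORDS];

/* Build a PTE that points at the page-aligned physical address pa. */
static uint64_t
model_make_pte(uint64_t pa, uint64_t flags)
{
	assert((pa & L3_OFFSET) == 0);
	return (((pa >> PAGE_SHIFT) << PTE_PPN_SHIFT) | flags);
}

/* Extract the physical address a PTE points at. */
static uint64_t
model_pte_to_phys(uint64_t pte)
{
	return (((pte >> PTE_PPN_SHIFT) & PTE_PPN_MASK) << PAGE_SHIFT);
}

/* Model-only replacement for a direct-map lookup of a page-table page. */
static _Atomic uint64_t *
model_phys_to_ptr(uint64_t pa)
{
	assert(pa % 8 == 0 && pa / 8 < MODEL_RAM_WORDS);
	return (&model_ram[pa / 8]);
}

/*
 * Lock-free VA-to-PA lookup that loads the L2 entry exactly once.
 * Every later decision uses the local snapshot l2e, so a concurrent
 * promotion or demotion cannot make the leaf check and the address
 * computation disagree.  Returns 0 if unmapped (a model choice).
 */
static uint64_t
model_kextract(_Atomic uint64_t *l2p, uint64_t va)
{
	uint64_t l2e, l3e;

	l2e = atomic_load(l2p);		/* the only load of the L2 entry */
	if ((l2e & PTE_V) == 0)
		return (0);
	if ((l2e & PTE_RWX) != 0)	/* leaf: 2 MiB superpage */
		return (model_pte_to_phys(l2e) | (va & L2_OFFSET));

	/* Non-leaf: the snapshot points at an L3 table. */
	l3e = atomic_load(model_phys_to_ptr(model_pte_to_phys(l2e) +
	    L3_INDEX(va) * sizeof(uint64_t)));
	if ((l3e & PTE_V) == 0)
		return (0);
	return (model_pte_to_phys(l3e) | (va & L3_OFFSET));
}
```

In this model the same virtual address resolves to the same physical address whether the 2 MiB range is mapped by one L2 leaf or by an L3 table of 4 KiB pages, because the walk never re-reads the L2 entry after classifying it.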

PR: 250866

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 40590
Build 37479: arc lint + arc unit

Event Timeline

Since updating my Unmatched tree with this patch 11 days ago, I haven't heard of any more panics from zBeeble (dgilbert), who has continued to build various ports locally and possibly even a full buildworld+buildkernel. I've asked for confirmation of the distinct lack of panics (compared with one every day or so before), but I strongly believe this was indeed the issue. The amd64 and i386 commits explicitly mention the bug causing panics with ZFS, which is what was seen here, though despite trawling the web I couldn't find any record of *what* those panics were.

EDIT: Lack of panics with this patch applied has been confirmed.

What is the panic in this case?

This revision is now accepted and ready to land. Jul 21 2021, 1:00 PM

For example:

panic: pmap_l2_to_l3: PA out of range, PA: 0x0
cpuid = 1
time = 1625512247
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
kdb_backtrace() at kdb_backtrace+0x2c
vpanic() at vpanic+0x148
panic() at panic+0x2a
pmap_remove_write() at pmap_remove_write+0x56a
vm_object_page_collect_flush() at vm_object_page_collect_flush+0xf8
vm_object_page_clean() at vm_object_page_clean+0x144
vinactivef() at vinactivef+0x90
vput_final() at vput_final+0x2ea
vput() at vput+0x32
vn_close1() at vn_close1+0x13c
vn_closefile() at vn_closefile+0x44
_fdrop() at _fdrop+0x18
closef() at closef+0x1b8
closefp_impl() at closefp_impl+0x78
closefp() at closefp+0x52
kern_close() at kern_close+0x134
sys_close() at sys_close+0xe
do_trap_user() at do_trap_user+0x208
cpu_exception_handler_user() at cpu_exception_handler_user+0x72
--- exception 8, tval = 0x6
KDB: enter: panic

The change itself is easy enough to understand, but I don't see exactly how the issue correlates to the panic. Are you able to explain it?

From a quick look:
If pmap_kextract() races with demotion, then it's possible that the pa returned points to the l3 table, rather than the expected physical address corresponding to va. There aren't a ton of callers of pmap_kextract(), but one interesting one is pcpu_page_free(), which looks like it could inadvertently free the wrong vm page if the race happens as I described. Could this lead to the panics observed?
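
The demotion scenario described above can be replayed deterministically in user space. This is a hedged sketch, not kernel code: all `model_*` names and `MODEL_L3_TABLE_PA` are hypothetical, and the `between` hook is a stand-in for another CPU demoting the superpage in the window between two loads of the same L2 entry.

```c
#include <assert.h>
#include <stdint.h>

/* Sv39 PTE layout: V/R/W/X in the low bits, PPN in bits 53:10. */
#define PTE_V		0x001ULL
#define PTE_RWX		0x00eULL
#define PTE_PPN_SHIFT	10
#define PTE_PPN_MASK	((1ULL << 44) - 1)
#define PAGE_SHIFT	12
#define PAGE_MASK_4K	((1ULL << PAGE_SHIFT) - 1)
#define L2_OFFSET	((1ULL << 21) - 1)	/* 2 MiB superpage */

/* Hypothetical physical address of the L3 table created by demotion. */
#define MODEL_L3_TABLE_PA	0x80405000ULL

typedef void (*model_hook_t)(volatile uint64_t *);

/* Build a PTE that points at the page-aligned physical address pa. */
static uint64_t
model_make_pte(uint64_t pa, uint64_t flags)
{
	assert((pa & PAGE_MASK_4K) == 0);
	return (((pa >> PAGE_SHIFT) << PTE_PPN_SHIFT) | flags);
}

/* Extract the physical address a PTE points at. */
static uint64_t
model_pte_to_phys(uint64_t pte)
{
	return (((pte >> PTE_PPN_SHIFT) & PTE_PPN_MASK) << PAGE_SHIFT);
}

/* Simulated demotion: the leaf becomes a pointer to an L3 table. */
static void
model_demote(volatile uint64_t *l2p)
{
	*l2p = model_make_pte(MODEL_L3_TABLE_PA, PTE_V);
}

/* Racy pattern: the entry is read once to classify it and again to use it. */
static uint64_t
model_racy_superpage_pa(volatile uint64_t *l2p, uint64_t va,
    model_hook_t between)
{
	if ((*l2p & PTE_RWX) != 0) {		/* first load: sees a leaf */
		if (between != 0)
			between(l2p);		/* demotion lands here */
		/* Second load: now sees the pointer to the L3 table. */
		return (model_pte_to_phys(*l2p) | (va & L2_OFFSET));
	}
	return (0);	/* 4 KiB walk not modelled in this sketch */
}

/* Single-load pattern: classify and use the same snapshot. */
static uint64_t
model_single_load_superpage_pa(volatile uint64_t *l2p, uint64_t va,
    model_hook_t between)
{
	uint64_t l2e;

	l2e = *l2p;				/* the only load */
	if ((l2e & PTE_RWX) != 0) {
		if (between != 0)
			between(l2p);		/* demotion no longer matters */
		return (model_pte_to_phys(l2e) | (va & L2_OFFSET));
	}
	return (0);	/* 4 KiB walk not modelled in this sketch */
}
```

With a superpage at 0x80200000 and a virtual address whose offset within the superpage is 0x234, the racy variant returns 0x80405234, an address inside the L3 table page, while the single-load variant returns 0x80200234. A caller that then frees or modifies the page at the returned address would be operating on page-table memory.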

> The change itself is easy enough to understand, but I don't see exactly how the issue correlates to the panic. Are you able to explain it?

Not really; it was a shot in the dark that seemed relevant to ZFS, so the justification for why it fixes the bug is rather empirical.

> From a quick look:
> If pmap_kextract() races with demotion, then it's possible that the pa returned points to the l3 table, rather than the expected physical address corresponding to va. There aren't a ton of callers of pmap_kextract(), but one interesting one is pcpu_page_free(), which looks like it could inadvertently free the wrong vm page if the race happens as I described. Could this lead to the panics observed?

I didn't really chase it through. But yes, that kind of thing is what I was thinking might be happening: you end up manipulating the "wrong" page because pmap_kextract gave you back the wrong address, which corrupts the pmap, and you only discover it later when you come to do another operation on the now-corrupted part of the pmap.