
amd64: Make it possible to grow the KERNBASE region of KVA
ClosedPublic

Authored by markj on Sep 22 2022, 11:07 PM.

Details

Summary

pmap_growkernel() may be called when mapping a region above KERNBASE,
typically for a kernel module. If we have enough PTPs left over from
bootstrap, pmap_growkernel() does nothing. However, it's possible to
run out, and in this case pmap_growkernel() will try to grow the kernel
map all the way from kernel_vm_end to somewhere past KERNBASE, which can
easily run the system out of memory. This happens with large kernel
modules like the NVIDIA GPU driver; see PR 265019. I also have a WIP
dtrace provider which needs to map KVA in the region above KERNBASE (to
provide trampolines which allow a copy of a traced kernel instruction to
be executed), and it could potentially trigger this scenario.

This change modifies pmap_growkernel() to manage the two regions
separately, allowing them to grow independently. The end of the
KERNBASE region is tracked by modifying "nkpt".
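
The idea of growing the two regions independently can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code from the diff: the struct kva_state, its fields other than the names borrowed from the summary, and growkernel_sketch() are hypothetical, the starting values are made up, and real page table allocation (including page directory pages) is reduced to a counter.

```c
#include <assert.h>
#include <stdint.h>

#define NBPDR    (1ULL << 21)           /* bytes mapped by one page table page (2MB) */
#define KERNBASE 0xffffffff80000000ULL  /* amd64: start of the uppermost 2GB */

/* Hypothetical model of the state that tracks how far each region has grown. */
struct kva_state {
	uint64_t kernel_vm_end; /* end of the mapped part of the main kernel map */
	uint64_t nkpt;          /* page table pages backing the KERNBASE region */
	uint64_t allocated;     /* page table pages this sketch has "allocated" */
};

/*
 * Grow only the region that contains the requested end address.
 * A request above KERNBASE never touches kernel_vm_end, and vice versa.
 */
static void
growkernel_sketch(struct kva_state *s, uint64_t addr)
{
	assert(addr != 0);
	if (addr > KERNBASE) {
		/* Page table pages needed to cover [KERNBASE, addr). */
		uint64_t need = ((addr - 1 - KERNBASE) / NBPDR) + 1;

		while (s->nkpt < need) {
			s->nkpt++;
			s->allocated++;
		}
	} else {
		/* Round the requested end up to a 2MB boundary. */
		uint64_t end = (addr + NBPDR - 1) & ~(NBPDR - 1);

		while (s->kernel_vm_end < end) {
			s->kernel_vm_end += NBPDR;
			s->allocated++;
		}
	}
}
```

In this model, a request that ends 40MB above KERNBASE with nkpt starting at 16 allocates four page table pages and leaves kernel_vm_end unchanged, instead of trying to extend the main map all the way up to KERNBASE.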

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

sys/amd64/amd64/pmap.c
5051

Is there a reason the second parameter here isn't pmap_pdpe_pindex(end)?

Looks sensible to me after the kboot stuff I've done, but the x86 pmap makes me a bit nervous so you might want to get more confirmation from others.

sys/amd64/amd64/pmap.c
5037–5048

It took me a second to puzzle out that this is the 'already mapped in' case, so a comment might not be amiss.

This revision is now accepted and ready to land. Sep 23 2022, 3:18 PM

sys/amd64/amd64/pmap.c
5006

Maybe add the word "dynamic" to the first case, i.e., "dynamic kernel memory allocations".

I think that it would be clearer to simply say the uppermost 2GB and avoid the word "negative" that may briefly confuse people.

5027

Add: if (end == 0) return;
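
One plausible reading of this suggestion, assuming the end of the KERNBASE region is computed as KERNBASE + nkpt * NBPDR (the helper name below is hypothetical): once the region covers the whole uppermost 2GB, the sum wraps around to 0 in 64-bit arithmetic, so a zero end would mean there is nothing left to grow.

```c
#include <stdint.h>

#define NBPDR    (1ULL << 21)           /* bytes mapped by one page table page (2MB) */
#define KERNBASE 0xffffffff80000000ULL  /* amd64: start of the uppermost 2GB */

/*
 * Hypothetical helper: end of the KERNBASE region for a given nkpt.
 * Unsigned arithmetic wraps modulo 2^64, so a fully populated region
 * (1024 page table pages of 2MB each) yields 0.
 */
static uint64_t
kernbase_region_end(uint64_t nkpt)
{
	return (KERNBASE + nkpt * NBPDR);
}
```

With 1023 page table pages the computed end is still a valid address just below the top of the address space; with 1024 it wraps to 0.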

5051

No. Once upon a time, the value passed here was nkpt. I changed it to its current value during an ill-fated attempt to eliminate nkpt, before nkpt became a tunable.

Should we start passing the start address to pmap_growkernel() as well? I mean, vm_map_insert() can pass the range, instead of the end of the range. Then the end calculation should be more clean, and the interface more future-proof.

In D36673#832777, @kib wrote:

Should we start passing the start address to pmap_growkernel() as well? I mean, vm_map_insert() can pass the range, instead of the end of the range. Then the end calculation should be more clean, and the interface more future-proof.

I'm not sure how it would make the calculation of end cleaner. We'd still have to determine whether the range comes before or after KERNBASE, or else I misunderstand.

This revision now requires review to proceed. Sep 23 2022, 9:11 PM

IMO it is cleaner to compare start and not end. Also, you can assert that the region does not intersect with the kernel.

This revision is now accepted and ready to land. Sep 23 2022, 9:19 PM