Page MenuHomeFreeBSD

linuxkpi: Handle direct-mapped addresses in linux_free_kmem()
ClosedPublic

Authored by markj on May 9 2023, 7:22 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Dec 28, 4:50 PM
Unknown Object (File)
Dec 21 2024, 5:58 PM
Unknown Object (File)
Dec 15 2024, 8:02 PM
Unknown Object (File)
Dec 4 2024, 12:34 AM
Unknown Object (File)
Dec 1 2024, 12:58 AM
Unknown Object (File)
Nov 28 2024, 4:58 PM
Unknown Object (File)
Nov 26 2024, 9:28 PM
Unknown Object (File)
Nov 25 2024, 3:39 PM

Details

Summary

See the analysis in PR 271333. It is possible for driver code to
allocate a page, store its address as returned by page_address(), then
call free_page() on that address. On most systems that'll result in the
LinuxKPI calling kmem_free() with a direct-mapped address, which is not
legal.

Fix the problem by making linux_free_kmem() check the address to see
whether it's direct-mapped or not, and handling it appropriately.

PR: 271333

Test Plan

Untested so far.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj requested review of this revision.May 9 2023, 7:22 PM
This revision is now accepted and ready to land.May 9 2023, 8:50 PM

Two comments:

  1. DMAP_MAX_ADDRESS is inclusive, not exclusive, so it should be <= not < there. See this mistake repeated a few places. Maybe fix that too sparately.

/usr/img/freebsd.git/sys/amd64/include/param.h:#define INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
/usr/img/freebsd.git/sys/amd64/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL(va)
/usr/img/freebsd.git/sys/arm/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL((va))
/usr/img/freebsd.git/sys/arm/include/stack.h:#define INKERNEL(va) (((vm_offset_t)(va)) >= VM_MIN_KERNEL_ADDRESS)
/usr/img/freebsd.git/sys/arm64/include/csan.h: if (!INKERNEL((vm_offset_t)frame.pc))
/usr/img/freebsd.git/sys/arm64/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL((va))
/usr/img/freebsd.git/sys/arm64/include/stack.h:#define INKERNEL(va) \
/usr/img/freebsd.git/sys/i386/include/param.h:#define INKERNEL(va) (TRUE)
/usr/img/freebsd.git/sys/i386/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL(va)
/usr/img/freebsd.git/sys/riscv/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL((va))
/usr/img/freebsd.git/sys/riscv/include/stack.h:#define INKERNEL(va) ((va) >= VM_MIN_KERNEL_ADDRESS && \

  1. Use the INKERNEL() macro instead?
  1. Test compile on a platform without DMAP support, so that you don't get a compile error!
This revision now requires changes to proceed.May 10 2023, 9:20 AM
sys/amd64/compile/GENERIC/machine/param.h:#define	INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
sys/amd64/compile/GENERIC/machine/vmparam.h:#define	DMAP_MAX_ADDRESS	KV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0)
sys/amd64/compile/GENERIC/assym.inc:#define	DMAP_MAX_ADDRESS	0xfffffc0000000000
sys/amd64/include/param.h:#define	INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
sys/amd64/include/vmparam.h:#define	DMAP_MAX_ADDRESS	KV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0)
sys/arm64/include/vmparam.h:#define	DMAP_MAX_ADDRESS	(0xffffff0000000000UL)
sys/arm64/include/vmparam.h:#define	DMAP_MAX_PHYSADDR	(dmap_phys_max)
sys/cddl/contrib/opensolaris/uts/common/sys/idmap.h:#define	IDMAP_MAX_DOOR_RPC		(256 * 1024)
sys/contrib/openzfs/include/os/freebsd/spl/sys/idmap.h:#define	IDMAP_MAX_DOOR_RPC		(256 * 1024)
sys/dev/hid/hidmap.h:#define	HIDMAP_MAX_MAPS	4
sys/powerpc/include/vmparam.h:#define	DMAP_MAX_ADDRESS	0xc007ffffffffffffUL
sys/powerpc/include/vmparam.h:#define	DMAP_MAX_ADDRESS	0xbfffffffUL
sys/riscv/include/vmparam.h:#define	DMAP_MAX_ADDRESS	(0xfffffff000000000UL)
sys/riscv/include/vmparam.h:#define	DMAP_MAX_PHYSADDR	(dmap_phys_max)

@kib : Shouldn't DMAP_MAX_ADDRESS and DMAP_MAX_PHYSADDR be inclusive maximums, so that we can cover the full 64-bit range? I see we have both variants in the kernel?

sys/compat/linuxkpi/common/src/linux_page.c
196–197

I think this should simply be:

if (INKERNEL(addr))
markj removed a subscriber: kib.

Invert the test, so that DMAP_(MIN|MAX)_ADDRESS aren't referenced.

Two comments:

  1. DMAP_MAX_ADDRESS is inclusive, not exclusive, so it should be <= not < there. See this mistake repeated a few places. Maybe fix that too sparately.

How did you come to that conclusion? On amd64 DMAP_MAX_ADDRESS is KV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0) = 0xfffff80000000000 + NDMPML4E*0x8000000000. On arm64 it is 0xffffff0000000000UL.

/usr/img/freebsd.git/sys/amd64/include/param.h:#define INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
/usr/img/freebsd.git/sys/amd64/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL(va)
/usr/img/freebsd.git/sys/arm/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL((va))
/usr/img/freebsd.git/sys/arm/include/stack.h:#define INKERNEL(va) (((vm_offset_t)(va)) >= VM_MIN_KERNEL_ADDRESS)
/usr/img/freebsd.git/sys/arm64/include/csan.h: if (!INKERNEL((vm_offset_t)frame.pc))
/usr/img/freebsd.git/sys/arm64/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL((va))
/usr/img/freebsd.git/sys/arm64/include/stack.h:#define INKERNEL(va) \
/usr/img/freebsd.git/sys/i386/include/param.h:#define INKERNEL(va) (TRUE)
/usr/img/freebsd.git/sys/i386/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL(va)
/usr/img/freebsd.git/sys/riscv/include/pmc_mdep.h:#define PMC_IN_KERNEL(va) INKERNEL((va))
/usr/img/freebsd.git/sys/riscv/include/stack.h:#define INKERNEL(va) ((va) >= VM_MIN_KERNEL_ADDRESS && \

  1. Use the INKERNEL() macro instead?

That is not equivalent, and would re-introduce the bug. The problem is that we're treating a page mapped in the range [VM_MIN_KERNEL_ADDRESS, VM_MAX_KERNEL_ADDRESS] as being mapped in the range [DMAP_MIN_ADDRESS, DMAP_MAX_ADDRESS].

  1. Test compile on a platform without DMAP support, so that you don't get a compile error!

Indeed, this doesn't compile on !DMAP platforms.

@markj :

Could you figure out if VM_MAX_KERNEL_ADDRESS is inclusive or exclusive.

To me it looks platform dependent. Not that it matters, because I suppose allocations won't happen for one byte anyway .. though ...

grep -r VM_MAX_KERNEL_ADDRESS /usr/src/sys/ | grep define

@markj :

Could you figure out if VM_MAX_KERNEL_ADDRESS is inclusive or exclusive.

To me it looks platform dependent. Not that it matters, because I suppose allocations won't happen for one byte anyway .. though ...

grep -r VM_MAX_KERNEL_ADDRESS /usr/src/sys/ | grep define

It is exclusive. The range of addresses in the kernel map is [VM_MIN_KERNEL_ADDRESS, VM_MAX_KERNEL_ADDRESS).

sys/compat/linuxkpi/common/src/linux_page.c
196–197

That would defeat the purpose of the patch.

@markj: How did you come to that conclusion? On amd64 DMAP_MAX_ADDRESS is KV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0) = 0xfffff80000000000 + NDMPML4E*0x8000000000. On arm64 it is 0xffffff0000000000UL.

I just grepped and found:

/usr/src/sys/amd64/compile/GENERIC/machine/param.h:#define INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
/usr/src/sys/amd64/compile/GENERIC/machine/vmparam.h:#define DMAP_MAX_ADDRESSKV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0)
/usr/src/sys/amd64/compile/GENERIC/assym.inc:#define DMAP_MAX_ADDRESS 0xfffffc0000000000
/usr/src/sys/amd64/include/param.h:#define INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
/usr/src/sys/amd64/include/vmparam.h:#define DMAP_MAX_ADDRESS KV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0)
/usr/src/sys/arm64/include/vmparam.h:#define DMAP_MAX_ADDRESS (0xffffff0000000000UL)
/usr/src/sys/powerpc/include/vmparam.h:#define DMAP_MAX_ADDRESS 0xc007ffffffffffffUL
/usr/src/sys/powerpc/include/vmparam.h:#define DMAP_MAX_ADDRESS 0xbfffffffUL
/usr/src/sys/riscv/include/vmparam.h:#define DMAP_MAX_ADDRESS (0xfffffff000000000UL)

powerpc seems inclusive. You see the 0xf's going all the way down. So I immediately thought there is something not clear there.

If the value is exclusive you can cover -1ULL ... Is that an issue?

--HPS

sys/compat/linuxkpi/common/src/linux_page.c
196–197

I see.

sys/compat/linuxkpi/common/src/linux_page.c
199–206

Don't you want to check that

  1. addr is in DMAP
  2. the physical address corresponds to the initialized vm_page
markj added inline comments.
sys/compat/linuxkpi/common/src/linux_page.c
199–206

The allocator gives out either direct-mapped pages, or mappings returned by kmem_malloc(). So, I think it's safe to assume here that addr is within the direct map and is valid.

PHYS_TO_VM_PAGE() will return NULL if the physaddr does not belong to an initialized page, so in the case of a bug this code will fail closed. Furthermore, DMAP_TO_PHYS() asserts that the address lies within the direct map.

@markj: How did you come to that conclusion? On amd64 DMAP_MAX_ADDRESS is KV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0) = 0xfffff80000000000 + NDMPML4E*0x8000000000. On arm64 it is 0xffffff0000000000UL.

I just grepped and found:

/usr/src/sys/amd64/compile/GENERIC/machine/param.h:#define INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
/usr/src/sys/amd64/compile/GENERIC/machine/vmparam.h:#define DMAP_MAX_ADDRESSKV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0)
/usr/src/sys/amd64/compile/GENERIC/assym.inc:#define DMAP_MAX_ADDRESS 0xfffffc0000000000
/usr/src/sys/amd64/include/param.h:#define INKERNEL(va) (((va) >= DMAP_MIN_ADDRESS && (va) < DMAP_MAX_ADDRESS) \
/usr/src/sys/amd64/include/vmparam.h:#define DMAP_MAX_ADDRESS KV4ADDR(DMPML4I + NDMPML4E, 0, 0, 0)
/usr/src/sys/arm64/include/vmparam.h:#define DMAP_MAX_ADDRESS (0xffffff0000000000UL)
/usr/src/sys/powerpc/include/vmparam.h:#define DMAP_MAX_ADDRESS 0xc007ffffffffffffUL
/usr/src/sys/powerpc/include/vmparam.h:#define DMAP_MAX_ADDRESS 0xbfffffffUL
/usr/src/sys/riscv/include/vmparam.h:#define DMAP_MAX_ADDRESS (0xfffffff000000000UL)

powerpc seems inclusive. You see the 0xf's going all the way down. So I immediately thought there is something not clear there.

Indeed, I think powerpc should be fixed.

If the value is exclusive you can cover -1ULL ... Is that an issue?

I don't think it's an issue here, since addr should be page-aligned (I added an assertion to this effect).

Is there anything else I should do for this diff?

Change looks good, and the questions raised have been answered.

One last thought:

Maybe KASSERT() the "order" is in some sane range, so the shift operation is not invalid. Should also be done for allocations.

This revision is now accepted and ready to land.May 30 2023, 7:51 PM

Maybe KASSERT() the "order" is in some sane range, so the shift operation is not invalid. Should also be done for allocations.

Done here since existing functions also need updating: https://reviews.freebsd.org/D40398