Minidumps are machine-dependent, so when the pmap changes, some new functions (accessors) need to be implemented. The pmap must provide a method of resolving page table/directory entries. Minidump parses that data and stores it inside the dump area. The most important part of this change is switching minidumps to use the structures offered by NEW_PMAP.
Event Timeline
Well, we (Michal and I) do not want to expose internal pmap things (pte definitions) to other files. Note that LPAE has another (third) variant of pte definitions.
In fact, three things are needed from pmap in minidumpsys().
(1) size of all L2 page tables for <KERNBASE, kernel_vm_end) interval,
(2) VA to PA resolution -> safe variant of pmap_kextract() should work,
(3) PA and size of each kernel L2 page table, if it exists.
In our opinion, it should not be hard to implement three simple new functions in each pmap. If the old armv6 pmap is removed from the tree, it should be even easier.
lib/libkvm/kvm_minidump_arm.c

| Line | Inline comment |
|---|---|
| 245 | This should be `#include <machine/acle-compat.h>` ... `#if __ARM_ARCH > 6` |
| 250–254 | This should be a macro, e.g. `#if __ARM_ARCH >= 6` `#define PTE_L2_TYPE_S(pte) (...)` `#else` `#define PTE_L2_TYPE_S(pte) (...)` `#endif` ... `if (PTE_L2_TYPE_S(pte)) {` ... |

sys/arm/arm/minidump_machdep.c

| Line | Inline comment |
|---|---|
| 268 | `__ARM_ARCH >= 6` |
One thing I would ask is that you come up with a pmap-independent on-disk format. In my changes in D3341 I make the minidump parsing code machine-independent to support cross-debugging. This means that the libkvm bits in this new world order have to support all possible vmcore formats. If you can support only one, that would be ideal.
I'm not familiar with vmcore debugging, but I assume that there should be two pmap-related tasks:
(1) virtual to physical address translation,
(2) exact value of any page table entry,
right?
If the second point is a goal too, then there will always be several PTE formats on arm: one for armv4/armv5, one for armv6/armv7, and one for armv6/armv7 LPAE (similar to PAE on i386). And if one vmcore format means one page table format, I do not see any good solution (just various hacks).
BTW, I wonder why the vmcore does not contain only the first (highest) level page table. Physical pages are in the vmcore, so it should not be a problem to get the other page tables.
Minidump tries to write only the pages mapped into KVA. For instance, on amd64 you do not want to dump the direct map, which is the point of the minidump. Page tables usually become mapped through non-regular mechanisms, which means that they might not be eligible for dumping by default.
Yes, libkvm must be able to convert a given KVA to a PA and then to a file offset. The typical technique is to walk the page tables and then locate the PA in the core. So libkvm is aware of PAE/non-PAE i386, AFAIR. I do not see why libkvm on arm must avoid knowing about the hardware page table entry formats, since there is a finite number of variants (3), and the formats are defined by the hardware.
IMO creating some abstracted MD page table format for KVA->PA translation in minidumps might be an interesting idea, but somebody would need to implement it.
The minidump formats do seem to often be much simpler. In a minidump we don't store the entire page table hierarchy. Instead, only a flat array of the lowest-level PTEs is stored. For example, on amd64 a superpage is actually written out as a series of 4k PTEs even though it is mapped by a single PDE. For the "raw" dumps that support access via something like /dev/fwmem you are stuck with walking the page table in whatever format it is, yes. However, it would be nice to use something a bit more abstract for minidumps to avoid a proliferation of various vmcore parsers in libkvm.
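To make the flat-array idea concrete, here is a minimal user-space sketch, not the actual libkvm or minidump code. All `sk_`/`SK_` names, the valid bit, and the 64-bit entry layout are assumptions chosen for illustration. It expands one superpage into consecutive 4k entries and then resolves a VA with a single index into the flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SHIFT	12
#define SK_PAGE_SIZE	(1u << SK_PAGE_SHIFT)
#define SK_PTE_VALID	0x1u	/* hypothetical "valid" bit, not a real hardware layout */

/*
 * Expand one superpage mapping into consecutive 4k entries of the flat
 * array. Returns the number of entries written, or 0 if they do not fit.
 */
static size_t
sk_expand_superpage(uint64_t *ptes, size_t cap, uint64_t base_pa,
    uint64_t super_size)
{
	size_t i, n = (size_t)(super_size >> SK_PAGE_SHIFT);

	/* A superpage must be naturally aligned. */
	assert((base_pa & (super_size - 1)) == 0);
	if (n > cap)
		return (0);
	for (i = 0; i < n; i++)
		ptes[i] = (base_pa + ((uint64_t)i << SK_PAGE_SHIFT)) |
		    SK_PTE_VALID;
	return (n);
}

/*
 * Resolve a VA through a flat array that covers
 * [kva_base, kva_base + npte * SK_PAGE_SIZE). Returns 0 and stores the PA
 * on success, -1 if the VA is out of range or not mapped.
 */
static int
sk_flat_lookup(const uint64_t *ptes, size_t npte, uint64_t kva_base,
    uint64_t va, uint64_t *pa)
{
	uint64_t idx, pte;

	if (va < kva_base)
		return (-1);
	idx = (va - kva_base) >> SK_PAGE_SHIFT;
	if (idx >= npte)
		return (-1);
	pte = ptes[idx];
	if ((pte & SK_PTE_VALID) == 0)
		return (-1);
	*pa = (pte & ~(uint64_t)(SK_PAGE_SIZE - 1)) |
	    (va & (SK_PAGE_SIZE - 1));
	return (0);
}
```

With a 2 MiB superpage at PA 0x40000000 expanded into 512 entries, a lookup of `kva_base + 0x3abc` yields PA 0x40003abc without any knowledge of the higher-level table format, which is what keeps the parser simple.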
If you look at the review I mentioned above, you could make this work by having multiple backends (one per PTE format) if that is simpler. You would just need a way to know which format to use, either at runtime (for live kernels) or from a vmcore.
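One way such per-format backends could be selected is a small dispatch table keyed by a format tag. The sketch below is only an illustration: the tag values, the `sk_` names, and the idea that the tag comes from a vmcore header field are all assumptions, and the decoders are deliberately simplified (they handle small pages only and ignore large pages and sections).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format tags, e.g. as they might be recorded in a vmcore header. */
enum sk_pte_format {
	SK_FMT_ARM_V4 = 0,
	SK_FMT_ARM_V6 = 1,
	SK_FMT_ARM_LPAE = 2
};

struct sk_backend {
	uint32_t	format;
	const char	*name;
	size_t		pte_size;	/* bytes per entry */
	int		(*pte_to_pa)(uint64_t pte, uint64_t *pa);
};

/* Simplified 32-bit decoder: treats any entry with bit 1 set as a small page. */
static int
sk_short_pte_to_pa(uint64_t pte, uint64_t *pa)
{
	if ((pte & 0x2) == 0)
		return (-1);
	*pa = pte & 0xfffff000ULL;
	return (0);
}

/* Simplified 64-bit decoder: low bits 0b11 mark a page, PA in bits 39:12. */
static int
sk_lpae_pte_to_pa(uint64_t pte, uint64_t *pa)
{
	if ((pte & 0x3) != 0x3)
		return (-1);
	*pa = pte & 0x000000fffffff000ULL;
	return (0);
}

static const struct sk_backend sk_backends[] = {
	{ SK_FMT_ARM_V4,   "armv4/v5",   4, sk_short_pte_to_pa },
	{ SK_FMT_ARM_V6,   "armv6/v7",   4, sk_short_pte_to_pa },
	{ SK_FMT_ARM_LPAE, "armv7 LPAE", 8, sk_lpae_pte_to_pa },
};

/* Return the backend for a format tag, or NULL if the tag is unknown. */
static const struct sk_backend *
sk_backend_find(uint32_t format)
{
	size_t i;

	for (i = 0; i < sizeof(sk_backends) / sizeof(sk_backends[0]); i++)
		if (sk_backends[i].format == format)
			return (&sk_backends[i]);
	return (NULL);
}

/* Decode one entry with whichever backend matches the tag. */
static int
sk_translate(uint32_t format, uint64_t pte, uint64_t *pa)
{
	const struct sk_backend *b = sk_backend_find(format);

	assert(pa != NULL);
	return (b == NULL ? -1 : b->pte_to_pa(pte, pa));
}
```

The rest of the parser would call only `sk_translate()` (or the backend's function pointer), so adding a fourth format would mean adding one table row and one decoder.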
Ok, thanks for all the comments. So... what are we going to do with this patch in general?
I assume that it might be impossible to keep libkvm and minidump from using PTE tables directly. Yes, we can use pmap_kextract() in minidumps to check whether a given page is mapped into the kernel, but we'd still need to create "fake" page tables to be compatible with libkvm. What do you think?
Well, in general, I'm not happy that there are two minidumpsys() functions in this patch that are almost the same. With LPAE, there will be three. So I suggest again creating a pmap minidump (MD) interface which makes it possible to have only one minidumpsys() and, as a bonus, hides the details of the various pmap definitions. I do not think that it's impossible.
For example, I can imagine the following:
    typedef void (dump_add_page_t)(void *arg, paddr_t pa);
    typedef int (dump_write_t)(void *arg, paddr_t pa);

    uint32_t pmap_dump_add_pages(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, dump_add_page_t *func, void *arg);
    int pmap_dump_write_tables(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, dump_write_t *func, void *arg);
The first function returns a size of memory needed for all pagetables in requested range. The second function writes them. Of course, pmap, sva, and eva arguments are there only for clarity and may be omitted fully. And again, it's only an example.
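As a rough illustration of how these two calls might behave, here is a user-space mock. It is a sketch under several assumptions: the toy `struct sk_pmap`, the `sk_` types standing in for `paddr_t`, `vm_offset_t` and `pmap_t`, and the choice to invoke the add callback once per present L2 table are all made up for this example, since the proposal above does not specify them. The only hardware fact relied on is that an ARM short-descriptor L2 table is 1 KiB and covers 1 MiB of VA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sk_paddr_t;		/* stand-in for paddr_t */
typedef uint32_t sk_vm_offset_t;	/* stand-in for vm_offset_t */

#define SK_L1_SPAN		(1u << 20)	/* VA covered by one L2 table */
#define SK_L2_TABLE_SIZE	1024u		/* 256 entries * 4 bytes */
#define SK_NSLOTS		8

/* Toy pmap: only remembers where each L2 table lives (0 means absent). */
struct sk_pmap {
	sk_vm_offset_t	base;			/* VA mapped by slot 0 */
	sk_paddr_t	l2_pa[SK_NSLOTS];
};
typedef struct sk_pmap *sk_pmap_t;		/* stand-in for pmap_t */

typedef void (dump_add_page_t)(void *arg, sk_paddr_t pa);
typedef int (dump_write_t)(void *arg, sk_paddr_t pa);

/* Convert [sva, eva) into a half-open range of slot indices. */
static void
sk_slot_range(sk_pmap_t pmap, sk_vm_offset_t sva, sk_vm_offset_t eva,
    size_t *first, size_t *last)
{
	assert(sva >= pmap->base && sva <= eva);
	*first = (size_t)(sva - pmap->base) / SK_L1_SPAN;
	*last = ((size_t)(eva - pmap->base) + SK_L1_SPAN - 1) / SK_L1_SPAN;
	if (*last > SK_NSLOTS)
		*last = SK_NSLOTS;
}

/* Report every present table via func; return the bytes needed for them. */
uint32_t
pmap_dump_add_pages(sk_pmap_t pmap, sk_vm_offset_t sva, sk_vm_offset_t eva,
    dump_add_page_t *func, void *arg)
{
	size_t i, first, last;
	uint32_t size = 0;

	sk_slot_range(pmap, sva, eva, &first, &last);
	for (i = first; i < last; i++) {
		if (pmap->l2_pa[i] == 0)
			continue;
		func(arg, pmap->l2_pa[i]);
		size += SK_L2_TABLE_SIZE;
	}
	return (size);
}

/* Hand every present table to func; stop at the first error. */
int
pmap_dump_write_tables(sk_pmap_t pmap, sk_vm_offset_t sva, sk_vm_offset_t eva,
    dump_write_t *func, void *arg)
{
	size_t i, first, last;
	int error;

	sk_slot_range(pmap, sva, eva, &first, &last);
	for (i = first; i < last; i++) {
		if (pmap->l2_pa[i] == 0)
			continue;
		error = func(arg, pmap->l2_pa[i]);
		if (error != 0)
			return (error);
	}
	return (0);
}

/* Example callbacks that just record the addresses they are given. */
struct sk_log {
	sk_paddr_t	pa[SK_NSLOTS];
	size_t		n;
};

static void
sk_log_add(void *arg, sk_paddr_t pa)
{
	struct sk_log *l = arg;

	if (l->n < SK_NSLOTS)
		l->pa[l->n++] = pa;
}

static int
sk_log_write(void *arg, sk_paddr_t pa)
{
	sk_log_add(arg, pa);
	return (0);
}
```

The point of this shape is that the caller, a single minidumpsys(), only ever sees physical addresses and sizes; everything specific to a PTE format stays behind the two functions.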
Further, I like the idea of an abstracted MD page table format and/or making the current simple format more complex. Creating "fake" page tables for section mappings looks weird to me. Even if there were some duplication in such a format, I suppose the space saved would be significant when one section entry in the L1 table is not replaced by a whole L2 page table. I vaguely remember that there were around 130 promotions in the kernel on a Panda with 1G RAM after some time (not counting direct section mappings). However, I do not suggest taking any step toward it now. But if it becomes a thing in the future, it would be easier to implement with a pmap minidump interface.
Considering minidumpsys() itself, I further wonder how important the trick of writing contiguous physical blocks still is.
Ok, I see. I'll try to change the code and use pmap_kextract to check which pages are mapped to avoid parsing of pagetables directly. However I still will need to access some of PMAP structures to prepare fake-pagetables for libkvm... Please let me know if that approach would be acceptable. I don't want to spend much time doing that and then find that everything is wrong :)
Well, you already got hints from me. But if you are in a hurry due to D3341 and will do the following, I won't stand against it.
(1) Do not implement any new function in pmap-v6-new.c.
(2) Do not use pmap_kextract() from pmap-v6-new.c, as it panics if you try to extract an unmapped address.
(3) Use functions from pmap_var.h; they can be used in other MD files. It's allowed to use L2_TABLE_SIZE from pte-v6.h, but using any of the PTE-specific defines directly is off limits.
(4) Use only one minidumpsys(). Implement the differences between pmaps in (I guess) two specific functions for each of them within minidump_machdep.c.
(5) Note that pmap-v6-new.c uses neither 64K nor 16M mappings.
Hopefully, I did not forget anything. ;)
Thanks. In that case I'll wait for the generic minidump format to be integrated first.
Done in D5023. Not a generic minidump format, but a general pmap dump interface (for ARM platforms, not ARM64). Minidump for ARM_NEW_PMAP was implemented there as well.