Page MenuHomeFreeBSD

Permit deferred creation of SMR structures for the VM radix zone.
Needs ReviewPublic

Authored by markj on May 8 2020, 12:20 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 9, 4:00 AM
Unknown Object (File)
Dec 5 2024, 11:04 AM
Unknown Object (File)
Oct 5 2024, 10:09 AM
Unknown Object (File)
Oct 5 2024, 7:36 AM
Unknown Object (File)
Oct 4 2024, 7:47 AM
Unknown Object (File)
Oct 2 2024, 11:19 AM
Unknown Object (File)
Oct 2 2024, 12:04 AM
Unknown Object (File)
Oct 1 2024, 5:37 PM
Subscribers
None

Details

Reviewers
rlibby
jeff
Summary

With a dynamic pcpu layout we cannot use startup_alloc() to allocate
per-CPU data. The radix SMR structure was the only dynamically
allocated per-CPU structure that violated this rule. I think it is
preferable to defer creation of SMR structures since startup_alloc()
does not provide domain-local memory.

Defer initialization of pwd_zone for the same reason.

Permit allocation using uma_zalloc_smr() before the SMR structure
is attached. This is fine so long as the attach happens before APs are
started.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 31246
Build 28894: arc lint + arc unit

Event Timeline

markj requested review of this revision.May 8 2020, 12:20 AM
markj created this revision.
markj added reviewers: rlibby, jeff.

Remove handling of pcpu kegs from zone_kva_available(). Now that such
kegs never use startup_alloc(), we don't need code to switch them from
startup_alloc() to the real pcpu slab allocator.

This makes me a little uneasy. Allocation before attach is okay, but free is not. Do we have some kind of guarantee that no free will occur?

So I gather you are saying that we can now create UMA_ZONE_PCPU zones no earlier than BOOT_KVA. But that happens under SI_SUB_VM. Why is smr_init() pushed out so far? ... Okay I see it is relevant to your point #4 in D24758. I think I should go understand that first.

A timeline for reference:

uma booted	func			init order
BOOT_COLD
		uma_startup1		SI_SUB_VM	SI_ORDER_FIRST
		vm_radix_zinit		SI_SUB_VM	SI_ORDER_FIRST
BOOT_KVA	uma_startup2		SI_SUB_VM	SI_ORDER_FIRST
BOOT_PCPU	uma_startup_pcpu	SI_SUB_COUNTER	SI_ORDER_ANY
					SI_SUB_CPU
		smr_init		SI_SUB_SMR	SI_ORDER_FIRST
		vm_radix_zinit2		SI_SUB_SMR	SI_ORDER_ANY
BOOT_RUNNING	uma_startup3		SI_SUB_VM_CONF	SI_ORDER_SECOND
BOOT_SHUTDOWN	uma_shutdown		shutdown_post_sync
sys/kern/subr_smr.c
627–628

Style?

This makes me a little uneasy. Allocation before attach is okay, but free is not. Do we have some kind of guarantee that no free will occur?

My reasoning is that SI_SUB_SMR happens before an APs are started, so SMR doesn't need to provide any synchronization before that point. It is true though that an early attempt to call uma_zfree_smr() could panic the system, though.

So I gather you are saying that we can now create UMA_ZONE_PCPU zones no earlier than BOOT_KVA. But that happens under SI_SUB_VM. Why is smr_init() pushed out so far? ... Okay I see it is relevant to your point #4 in D24758. I think I should go understand that first.

Sorry, I meant to explain the reasoning in the review description. The issue is that smr_create() wants to initialize all of the per-CPU structures, but with D24758, memory returned by uma_zalloc_pcpu() is not fully mapped: only the BSP's memory is mapped. Per-CPU zones which just rely on zero-initialization don't require any special handling; for instance, SI_SUB_COUNTER is right after SI_SUB_VM.

So the two options I had were:

  • this, which works fine since nothing using unlocked lookups during early boot
  • modify smr_create() to detect the case where it is allocating per-CPU structures before SI_SUB_CPU and only initialize the BSP's structure, and then use some mechanism to defer the rest of the per-CPU initialization to some point between SI_SUB_CPU and SI_SUB_SMP

I think the latter is reasonable, but is more complicated and unnecessary for now, so I went with the simpler approach. Based on your comments though, it feels too fragile.

Create the radix SMR structures earlier, during SI_SUB_SMR, right after
SI_SUB_VM. Use a global list of SMR structures to perform deferred
initialization of per-CPU structures. This allows us to choose a layout
for per-CPU data at SI_SUB_CPU.

Any objections to this change? It's a prerequisite for using 2MB pages for per-CPU data.