In D33662#760509, @bz wrote:

Can you please tell me what you need it for?
The bandaid is very specific to if.c and should really never be needed anywhere else.

Sure!
Let me start with my view on the generic situation first.
When we tear down a VNET, we need to gradually shut down all virtualized subsystems. This shutdown is by no means atomic, which means that some subsystems shut down faster than others.
Some are coupled to others either explicitly (like PCBs referencing nhops and LLEs) or implicitly (by passing mbufs up or down the stack). The problem is made harder by the fact that certain entities, like nhops or LLEs, are epoch-protected and can extend the required subsystem lifespan non-deterministically. Such couplings impose two contradictory requirements on a subsystem: (1) die early to make progress, and (2) keep the necessary data structures alive as long as possible so that other subsystems don't crash. That said, the ability to "close the incoming gate" for a subsystem is important: once the gate is closed, nothing can cause new state to be added, so all existing state can be wiped, leaving only a minimal set of data structures that can die close to the end of teardown.

Specifically, there is D33658, which introduces a data structure, struct nhop_neigh, that adds _some_ coupling between nexthops and llentries. The idea behind using the VNET_IS_SHUTTING_DOWN macro is to stop propagating new llentry state updates once vnet shutdown starts. I could avoid the macro by adding an explicit hook somewhere at the beginning of teardown that updates an internal flag serving a similar purpose. However, I'd prefer to have a generic function that can be used in the same fashion.
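
To make the "gate" idea concrete, here is a minimal user-space sketch in C. It is not kernel code and not what D33658 does: struct subsys and the subsys_* functions are hypothetical names invented for illustration, and a plain atomic flag stands in for a check such as VNET_IS_SHUTTING_DOWN. It also assumes a single writer; real code would need locking or epoch handling around the check and the update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-VNET subsystem (illustration only). */
struct subsys {
	atomic_bool gate_closed;	/* set once when teardown starts */
	int nstate;			/* number of live state entries */
};

static void
subsys_init(struct subsys *s)
{
	atomic_init(&s->gate_closed, false);
	s->nstate = 0;
}

/* Close the incoming gate: no new state may be added from now on. */
static void
subsys_close_gate(struct subsys *s)
{
	atomic_store(&s->gate_closed, true);
}

/* Try to add a state entry; refused once the gate is closed. */
static bool
subsys_add_state(struct subsys *s)
{
	if (atomic_load(&s->gate_closed))
		return (false);
	s->nstate++;
	return (true);
}

/* Wipe all state; only valid after the gate has been closed. */
static void
subsys_flush(struct subsys *s)
{
	assert(atomic_load(&s->gate_closed));
	s->nstate = 0;
}
```

In this model, teardown would call subsys_close_gate() first and subsys_flush() afterwards; any late subsys_add_state() call returns false instead of re-creating state behind the flush.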

Thoughts?

The problem you are describing is called a "layer violation". It was indeed one of the harder things to solve initially for VNET teardown, but it is not impossible.
The broader discussion to have is whether each layer (feature, protocol, whatever you'd call it) should be responsible for itself.

The moment you stop thinking of a monolithic network stack and instead think of each part as a module, you'll quickly find that when unloading a module (say, a protocol such as TCP or IPv6), the protocol needs to take care of itself and clean up all its related state on its own. This is what VNET shutdown is modeled after.
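
As a rough illustration of "each module cleans up after itself", the following user-space C sketch registers one teardown callback per module and runs them in reverse registration order. All names here (struct netmod, netmod_register, netmod_teardown_all, the sample callbacks) are hypothetical, and the reverse ordering is an assumption made for the sketch, not a description of the actual VNET teardown code.

```c
#include <stddef.h>

#define MAXMODS 8

/* Hypothetical module descriptor (illustration only). */
struct netmod {
	const char *name;
	void (*uninit)(void);	/* the module frees its own state here */
};

static struct netmod mods[MAXMODS];
static size_t nmods;

/* Trace of teardown order, recorded by the sample callbacks below. */
static char order[MAXMODS];
static size_t norder;

/* Register a module; returns 0 on success, -1 if the table is full. */
static int
netmod_register(const char *name, void (*uninit)(void))
{
	if (nmods == MAXMODS)
		return (-1);
	mods[nmods].name = name;
	mods[nmods].uninit = uninit;
	nmods++;
	return (0);
}

/* Tear down in reverse registration order: upper layers go first. */
static void
netmod_teardown_all(void)
{
	while (nmods > 0) {
		nmods--;
		mods[nmods].uninit();
	}
}

/* Sample callbacks standing in for per-protocol cleanup. */
static void ip_uninit(void) { order[norder++] = 'i'; }
static void tcp_uninit(void) { order[norder++] = 't'; }
```

Registering "ip" and then "tcp" and calling netmod_teardown_all() runs tcp_uninit() before ip_uninit(), so the upper layer releases its references before the layer it depends on goes away.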

The places where these protocol layers interact are tricky but nonetheless it can be done.

I would really love to hear what other people think about this. We already have protocol load/unload with the TCP CC algorithms and various TCP stacks these days. There is in theory nothing that prevents us from making IPv4 and IPv6 loadable (other OSes have long done that).

zlei added a subscriber: zlei. (Jan 18 2022, 10:58 AM)

konrad.kreciwilk_korbank.pl mentioned this in D33658: Pre-calculate L2 prepends for routes with gateway and avoid arp/nd lookup. (Aug 5 2022, 10:06 AM)

melifaro abandoned this revision. (Feb 20 2023, 10:22 AM)