It appears that newhook is (sometimes?) called by the ngthread, which is
not allowed to sleep. This causes occasional panics in CI runs, e.g.,
https://ci.freebsd.org/job/FreeBSD-main-aarch64-test/340/console
Details
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 40966 Build 37855: arc lint + arc unit
Event Timeline
I think the problem is somewhat broader. Any message item can be added to the node's queue under load or if the queue is blocked. This message item will be processed by ngthread() in the EPOCH section. There are many places in the code of the nodes that perform actions prohibited in the EPOCH section.
sys/netgraph/ng_bridge.c | ||
---|---|---|
398–399 | malloc(9) has KASSERT https://github.com/freebsd/freebsd-src/blob/main/sys/kern/kern_malloc.c#L542 What about uma_zalloc(9)? I think it should also be prohibited in the EPOCH section. |
JFYI, the same situation. If you call "ngctl shutdown ngeth0:" under load:
panic: epoch_wait_preempt() called in the middle of an epoch section of the same epoch cpuid = 0 time = 1600268186 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0154b9e880 vpanic() at vpanic+0x182/frame 0xfffffe0154b9e8d0 panic() at panic+0x43/frame 0xfffffe0154b9e930 epoch_wait_preempt() at epoch_wait_preempt+0x293/frame 0xfffffe0154b9e980 if_detach_internal() at if_detach_internal+0x1ca/frame 0xfffffe0154b9e9e0 if_detach() at if_detach+0x3d/frame 0xfffffe0154b9ea00 ng_eiface_rmnode() at ng_eiface_rmnode+0x55/frame 0xfffffe0154b9ea40 ng_rmnode() at ng_rmnode+0x188/frame 0xfffffe0154b9ea70 ng_mkpeer() at ng_mkpeer+0x7b/frame 0xfffffe0154b9eac0 ng_apply_item() at ng_apply_item+0x547/frame 0xfffffe0154b9eb40 ngthread() at ngthread+0x26e/frame 0xfffffe0154b9ebb0 fork_exit() at fork_exit+0x80/frame 0xfffffe0154b9ebf0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0154b9ebf0
Why does it have to be under load? I thought it was just stack usage that determined whether we queue messages for ngthread.
sys/netgraph/ng_bridge.c | ||
---|---|---|
398–399 | Thanks, yes. UMA doesn't seem to have this assertion for some reason... |
The load simply allows this bug to manifest itself. Under load, the node's queue already contains the item's, so if a message item is sent at that moment (ngtstl shutdown), it will also go to the queue and will be processed by ngthread(). Without load, the queue is mostly empty, the message item will be delivered/processed by the thread of the calling process directly: ngctl -> syscall (sandto(2)) -> ngc_send (ng_socket(4)) -> ng_eiface(4) without entering to the EPOCH section. Therefore, the bug does not manifest itself.
There are many conditions when ng_item (data or the message) will get into the queue: stack usage, the queue is not empty, node flags, etc.
The main problem is that we are processing message items in the EPOCH section. Although the whole idea is that the processing of data items is in the EPOCH section (hot path), and messages in the non-EPOCH of the section.
Each netgraph node implements the following interface:
struct ng_type { u_int32_t version; /* must equal NG_API_VERSION */ const char *name; /* Unique type name */ modeventhand_t mod_event; /* Module event handler (optional) */ ng_constructor_t *constructor; /* Node constructor */ ng_rcvmsg_t *rcvmsg; /* control messages come here */ ng_close_t *close; /* warn about forthcoming shutdown */ ng_shutdown_t *shutdown; /* reset, and free resources */ ng_newhook_t *newhook; /* first notification of new hook */ ng_findhook_t *findhook; /* only if you have lots of hooks */ ng_connect_t *connect; /* final notification of new hook */ ng_rcvdata_t *rcvdata; /* data comes here */ ng_disconnect_t *disconnect; /* notify on disconnect */ const struct ng_cmdlist *cmdlist; /* commands we can convert */ /* R/W data private to the base netgraph code DON'T TOUCH! */ LIST_ENTRY(ng_type) types; /* linked list of all types */ int refs; /* number of instances */ };
I think only two methods should have execution restrictions in the EPOCH section: ng_rcvdata_t and ng_findhook_t
Could be, yes. I don't know netgraph well enough to say much. I do think that the epoch_wait_preempt() in if_detach_internal() is a hack. If nothing else, it slows down ifnet destruction quite a lot. Ideally we could synchronously unlink the ifnet and use epoch_call() to handle teardown asynchronously, but that might have other ramifications.
The whole netgraph code predates any epoch, so the interactions are interesting.
Currently I do not have the nerve to check in detail, how such a call to ng_(add/rm)hook is possible and which of those situations are covered by epoch, which should be covered, and which can't be.
Removing potential complicated operations like malloc(M_WAITOK) is a acceptable compromise, but should be raised to the whole netgraph framework.
sys/netgraph/ng_bridge.c | ||
---|---|---|
406–407 | I'd like to see a goto nomem here, which is the common idiom for a more complex failure handling. |
I looked at other nodes that implement the ng_newhook_t method. They already use M_NOWAIT, so I think this patch is good.
ng_eiface(4) issues should be resolved in a separate review.