It's not enough to IFNET_RLOCK() before iterating over the network
interfaces, the network epoch needs entered (see NET_EPOCH_ASSERT() in
if_foreach().
Details
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Skipped - Unit
Tests Skipped - Build Status
Buildable 49922 Build 46814: arc lint + arc unit
Event Timeline
sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c | ||
---|---|---|
3160 ↗ | (On Diff #117301) | This seems unrelated to Linuxolator. |
sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c | ||
---|---|---|
3160 ↗ | (On Diff #117301) | Since epoch isn't a lock, it is okay to wrap entire VNET_FOREACH() into epoch. |
I named it as such because I touched all uses of if_foreach() I added. I can break it up, though, if you'd prefer.
I think it should be linux(4):, cause you are fixing the linux(4) module, not the IfAPI.
Wait a second... Are we sure that IFNET_RLOCK() isn't enough for that access? Maybe we need to adjust the assertion that we recently added? That it asserts epoch OR the lock. @melifaro What do you think?
Do we have a way to assert "or" the lock? I don't see a "sx_has_lock" or any equivalent at first glance. Otherwise I'd happily use that instead.
sys/compat/linux/linux.c | ||
---|---|---|
304 ↗ | (On Diff #117301) | There's no guarantee if it was simply RLOCK()d either. |
I prefer keeping it simple and require network epoch. Locking is not easy and "this or that" makes it harder for the human to grasp. Network epoch is widely present in the stack, while the amount of remaining IFNET_RLOCK places is low and in most of the code I checked it can be easily converted to NET_EPOCH. Maybe it can be removed entirely.
sys/compat/linux/linux.c | ||
---|---|---|
304 ↗ | (On Diff #117301) | I agree. In the previous approach, one should either refcount the found interface or require IFNET_RLOCK held by the caller. |
sys/compat/linux/linux.c | ||
---|---|---|
304 ↗ | (On Diff #117301) | It's definitely beyond my grasp of epoch at this time :) I only want to fix what I broke right now, but it should be done correctly, which you're probably better able to do. I guess this affects everywhere that calls if_foreach() inside a IFNET_RLOCK() block? |
sys/compat/linux/linux.c | ||
---|---|---|
304 ↗ | (On Diff #117301) |
What should be the best path forward here? If you could skip this block from this change, I'll come up with a diff ensuring ifname_linux_to_bsd() callers are inside the epoch ~tomorrow or so. Does it sound reasonable to you?
If they're not inside the epoch (and if they are they probably don't need IFNET_RLOCK). |
sys/compat/linux/linux.c | ||
---|---|---|
304 ↗ | (On Diff #117301) | That sounds reasonable to me. |
sys/dev/mlx5/mlx5_ib/mlx5_ib_main.c | ||
3160 ↗ | (On Diff #117301) | Okay, I'll split this one out. Given what @melifaro mentioned above about the lifetime of an ifp in the epoch, is it fine as-is in this file, or does the epoch need expanded? Because the ifp's lifetime is longer without any refcounting in the callback. |
sys/compat/linux/linux_ioctl.c | ||
---|---|---|
2131–2138 | agree |
sys/compat/linux/linux_ioctl.c | ||
---|---|---|
2137 | In the callback case, '0' means 'continue' while 'non-zero' means 'exit out'. I was on the fence implementing it that way, but chose that route because I found iterating over interfaces we generally exit when an interface is "found", whereas with the other iterators, over the addresses, operations are performed over all addresses. @glebius do you think the logic should be reversed: return '0' on 'found' (exit loop) and return 'ENODEV' (or non-zero) to continue? Or should it really behave just like the addr iterators? |
latest diff is not cleanly applies:
dchagin@mordor:~/freebsd % cat ./sys/compat/linux/linux_ioctl.c.rej
@@ -2125,18 +2125,16 @@
return (error); CURVNET_SET(TD_TO_VNET(curthread));
- IFNET_RLOCK();
+ NET_EPOCH_ENTER(et);
cbs.index = 1; /* ifr.ifr_ifindex starts from 1 */ cbs.ethno = 0; error = if_foreach(linux_ioctl_ifname_cb, &cbs);
+ NET_EPOCH_EXIT(et);
+ CURVNET_RESTORE();
if (error == 0) error = ENODEV;
- IFNET_RUNLOCK(); if (error == -1)
- error = copyout(&cbs.ifr, uifr, sizeof(cbs.ifr));
- CURVNET_RESTORE(); -
+ error = copyout(&cbs.ifr, uifr, sizeof(ifr));
return (error);
}
Hm, apparently I borked the rebase. That copyout() should not have changed in this diff, and should be in a different diff...
With two last patches the kernel paniced: Feb 21 19:27:01 mordor savecore[1488]: reboot after panic: Assertion in_epoch(net_epoch_preempt) failed at /home/dchagin/freebsd/sys/net/if.c:4520
As we found some problem in Linuxulator netlink which is merged to stable/13 branch and requires simple change to the code modified by a 52d98483 commit, I propose to revert 52d98483 until it addressed, and on a weekend I’ll help to fix it.