Page MenuHomeFreeBSD

tcp_ratelimit: provide a hook for drivers to release ratesets at detach
ClosedPublic

Authored by gallatin on Aug 4 2024, 4:10 PM.
Tags
None
Referenced Files
F102798930: D46221.diff
Sun, Nov 17, 8:24 AM
Unknown Object (File)
Tue, Oct 22, 5:56 AM
Unknown Object (File)
Tue, Oct 22, 5:56 AM
Unknown Object (File)
Tue, Oct 22, 5:56 AM
Unknown Object (File)
Tue, Oct 22, 5:43 AM
Unknown Object (File)
Oct 12 2024, 2:22 AM
Unknown Object (File)
Oct 2 2024, 6:07 AM
Unknown Object (File)
Sep 27 2024, 12:32 PM

Details

Summary

When the kernel is compiled with options TCP_RATELIMIT, the mlx5en driver cannot detach. It gets stuck waiting for all kernel users of its rates to drop to zero before finally calling ether_ifdetach.

The tcp ratelimit code has an eventhandler for ifnet departure which causes rates to be released. However, this is called as an ifnet departure eventhandler, which is invoked as part of ifdetach(), via either_ifdetach(). This means that the tcp ratelimit code holds down many hw rates when the mlx5en driver is waiting for the rate count to go to 0. Thus devctl detach will deadlock on mlx5 with this stack:
mi_switch+0xcf sleepq_timedwait+0x2f _sleep+0x1a3 pause_sbt+0x77 mlx5e_destroy_ifp+0xaf mlx5_remove_device+0xa7 mlx5_unregister_device+0x78 mlx5_unload_one+0x10a remove_one+0x1e linux_pci_detach_device+0x36 linux_pci_detach+0x24 device_detach+0x180 devctl2_ioctl+0x3dc devfs_ioctl+0xbb vn_ioctl+0xca devfs_ioctl_f+0x1e kern_ioctl+0x1c3 sys_ioctl+0x10a

To fix this, provide an explicit API for a driver to call the tcp ratelimit code telling it to detach itself from an ifnet. This allows the mlx5 driver to unload cleanly. I considered adding an ifnet pre-departure eventhandler. However, that would need to be invoked by the driver, so a simple function call seemed better.

The mlx5en driver has been updated to call this function. I believe cxgbe does not have this issue, as I could not find code in that driver waiting for the rate usage to drop to 0, but I've added Navdeep to this as a reviewer in case I'm missing something.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib added inline comments.
sys/netinet/tcp_ratelimit.c
1301

Stylistically, this should be

sys/netinet/tcp_ratelimit.h
136

This and next line should be joined perhaps. The style of the prototypes in the header is broken systematically.

I am fine with this Drew, just went through and verified the departure code running earlier should have no impact...

This revision is now accepted and ready to land.Aug 5 2024, 2:40 PM
gallatin added inline comments.
sys/netinet/tcp_ratelimit.h
136

Yeah, I just wanted to stick with the existing style.