This is counterpart to e87c4940156c, which did the same for ethernet.
Reported by: hselasky
Tested by: hselasky
MFC after: 2 weeks
Differential D39405
infiniband: Opt-in for net epoch zlei on Apr 4 2023, 4:09 PM. Authored by Tags None Referenced Files
Details
This is counterpart to e87c4940156c, which did the same for ethernet. Reported by: hselasky
Diff Detail
Event TimelineComment Actions I follow .git/hooks/prepare-commit-msg (from doc) which does not have meta Suggested by: . CC @carlavilla .
Comment Actions Did you considered how much efforts would be to convert all five or six ib drivers we have in tree, instead of adding this flag? Comment Actions Actually not too much. The problem is I don't have infiniband devices and I'm not able to test (any of) them. The maintainer of qlnx driver does not response for quite a while and I hesitate to change it. IIUC converting all ib drivers is the final goal (of widen net epoch), this (IFF_KNOWSEPOCH or IB_DEVICE_KNOWSEPOCH) is short term and will be eventually retired. So if you @kib encourage me to convert then I would like to do and prefer. @hselasky What do you think ? Comment Actions Of course it is better to convert everything, but for now I think you can go ahead with the patch. The difference, in my opinion, is that we have several dozens of ethernet drivers, while only five or like ib drivers. Comment Actions
I think I was too optimistic. Adding NET_EPOCH_ENTER() / NET_EPOCH_EXIT() around cq->comp() does not benefits from EPOCH(9) (reduce locks / refcounts in fast path). So which one should we do then?
For the third one, ib drivers can be still improved to benefit EPOCH(9). Comment Actions EPOCH() does not reduce locks contention just be fact that it is applied, so I am not sure what do you mean there. Entering epoch should fix bugs with accesses to freed resources protected by epoch guarantee. Just commit the patch for now. Comment Actions Found a possible flaw with this change while I was preparing to commit. infiniband_input() if_input() ipoib_cm_handle_rx_wc() ipoib_ib_handle_rx_wc() ipoib_drain_cq() ipoib_cm_dev_stop() ipoib_ib_dev_stop() __ipoib_ib_dev_flush() ipoib_ib_dev_flush_light() // INIT_WORK(&priv->flush_light, ipoib_ib_dev_flush_light); ipoib_ib_dev_flush_normal() // INIT_WORK(&priv->flush_normal, ipoib_ib_dev_flush_normal); ipoib_ib_dev_flush_heavy() // INIT_WORK(&priv->flush_heavy, ipoib_ib_dev_flush_heavy); If some ib driver is marked with IB_DEVICE_KNOWSEPOCH, on ipoib_ib_dev_flush_[light | normal | heavy]() the needs_epoch is false and net stack is not in net epoch, then the kernel will assert panic (with WITNESS), or there is a potential use-after-free (without WITNESS). Maybe the right approach is moving the needs_epoch check to ipoib_cm_handle_rx_wc() and ipoib_ib_handle_rx_wc(). Comment Actions
No. Just leave it like is for now. I think @glebius is the owner of that technique. There is some magic cruft up in the IRQ handler too. The mlx5 core may process multiple events per completion, so it won't be per-packet. It's pretty clever that way. |