Page MenuHomeFreeBSD

vmxnet3: make descriptor count checks more robust
ClosedPublic

Authored by kp on Feb 2 2024, 5:03 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 30, 6:42 AM
Unknown Object (File)
Thu, Jan 30, 4:10 AM
Unknown Object (File)
Wed, Jan 22, 11:21 AM
Unknown Object (File)
Tue, Jan 21, 4:39 PM
Unknown Object (File)
Tue, Jan 21, 8:47 AM
Unknown Object (File)
Sat, Jan 18, 2:25 PM
Unknown Object (File)
Sat, Jan 18, 11:15 AM
Unknown Object (File)
Fri, Jan 17, 11:56 PM
Subscribers

Details

Summary

When we update credits there is a potential for a race causing an
overflow of vxcr_next (i.e. incrementing it past vxcr_ndesc). Change the
check to >= rather than == to be more robust against this.

Sponsored by: Rubicon Communications, LLC ("Netgate")

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

kp requested review of this revision.Feb 2 2024, 5:03 PM

Specifically because we've seen users report panics like this one:

db:0:kdb.enter.default>  bt
Tracing pid 11 tid 100007 td 0xfffffe001ebbe720
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00c56849c0
vpanic() at vpanic+0x183/frame 0xfffffe00c5684a10
panic() at panic+0x43/frame 0xfffffe00c5684a70
trap_fatal() at trap_fatal+0x409/frame 0xfffffe00c5684ad0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00c5684b30
calltrap() at calltrap+0x8/frame 0xfffffe00c5684b30
--- trap 0xc, rip = 0xffffffff80b05c80, rsp = 0xfffffe00c5684c00, rbp = 0xfffffe00c5684c00 ---
vmxnet3_isc_txd_credits_update() at vmxnet3_isc_txd_credits_update+0x20/frame 0xfffffe00c5684c00
iflib_fast_intr_rxtx() at iflib_fast_intr_rxtx+0xf7/frame 0xfffffe00c5684c60
intr_event_handle() at intr_event_handle+0x123/frame 0xfffffe00c5684cd0
intr_execute_handlers() at intr_execute_handlers+0x4a/frame 0xfffffe00c5684d00
Xapic_isr1() at Xapic_isr1+0xdc/frame 0xfffffe00c5684d00
--- interrupt, rip = 0xffffffff8125b026, rsp = 0xfffffe00c5684dd0, rbp = 0xfffffe00c5684dd0 ---
zlei added inline comments.
sys/dev/vmware/vmxnet3/if_vmx.c
1432–1433

I'm not familiar with IFLIB. From the driver code it looks only this function vmxnet3_isc_txd_credits_update() can increase txc->vxcr_next. So if ++txc->vxcr_next > txc->vxcr_ndesc happens then I guess the function vmxnet3_isc_txd_credits_update() is called by multiple threads concurrently. If that is the desired behavior we probably want atomic increasing of txc->vxcr_next.

1484

The idx is on local stack, so if ++idx > rxc->vxcr_ndesc happens, it actually hints that the passed in idx is wrong, since &rxc->vxcr_u.rxcd[idx] would lead to OOB access.

Hence I'd prefer a KASSERT or panic right before the for loop, rather than *HIDING* the real problem and let driver code just WORK.

KASSERT(idx < rxc->vxcr_ndesc);
sys/dev/vmware/vmxnet3/if_vmx.c
1432–1433

That's my understanding as well, yes.

I am insufficiently familiar with this driver to do more than fix the immediate problem. I'm hoping that the maintainers and/or original authors will look at fixing the fundamental problem.

In the mean time this workaround should stop the panics we see.

1484

We could do both while we wait for those familiar with this code to fix it fully.

That would mean that non-debug kernel will work, and debug kernels will assert and demonstrate the problem.

Although that's really only for the ++txc->vxcr_next >= txc->vxcr_ndesc case, because it's indeed a local variable here and there's no way for that to race.

  • add assert
  • remove unneeded changes
This revision was not accepted when it landed; it landed in state Needs Review.Jun 10 2024, 9:06 AM
This revision was automatically updated to reflect the committed changes.
avg added inline comments.
sys/dev/vmware/vmxnet3/if_vmx.c
1425–1426

Dereferencing txcd may result in crash if vxcr_next happens to be >= txc->vxcr_ndesc at this point of execution because of the same race.

I think that vmxnet3_isc_txd_credits_update can be made similar to, e.g., igb_isc_txd_credits_update with respect to the array index (tx_rs_cidx and vxcr_next respectively).
That is, we can stash vxcr_next into a local variable and then use the local variable for iteration.
At the end, we can update vxcr_next from the local variable.
This won't fix other potential issues with correctness / concurrency (data races), but at least it will ensure that vxcr_next will not have an out-of-bounds value.