Page MenuHomeFreeBSD

iflib: Discern when to reset on isc_rxd_pkt_get errors
Needs ReviewPublic

Authored by kbowling on Sun, Oct 27, 5:54 AM.
Tags
None
Referenced Files
F101975913: D47296.diff
Wed, Nov 6, 1:31 AM
F101892428: D47296.diff
Tue, Nov 5, 7:15 AM
Unknown Object (File)
Fri, Nov 1, 11:01 PM
Unknown Object (File)
Wed, Oct 30, 9:49 PM
Unknown Object (File)
Wed, Oct 30, 9:46 PM
Unknown Object (File)
Tue, Oct 29, 8:08 AM
Unknown Object (File)
Sun, Oct 27, 5:55 AM

Details

Reviewers
markj
erj
shurd
Group Reviewers
iflib
Intel Networking
Summary

erj@ provides a good analysis in the PR.. I checked the various iflib drivers and the only error cases are Intel drivers for transient receive issues or mgb which uses EINVAL for contractual enforcement of its descriptors.

I think we have some dealer's choice here.. we could return 0 for the intel isc_rxd_pkt_get instead of EBADMSG or discern EBADMSG as a transient issue and consider anything else a contractual issue as I am proposing here.

PR: 262024

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

sys/net/iflib.c
3100

Why not patch drivers to not return an error for that case? It looks like iflib is not really doing anything with EBADMSG except returning early here.

Also, looking at em_isc_rxd_pkt_get(), is there some missing cleanup in the error path? It's not really clear to me how the driver is supposed to recover.

sys/net/iflib.c
3100

It doesn't look like there's any critical cleanup for em_isc_rxd_pkt_get() to do in that error case; all the other code in there after that just manipulates local variables or the packet info struct (which iflib could maybe choose to modify or do something with when it sees EBADMSG).

I don't know the specifics of em hardware, but I'd think that in theory the hardware would be fine just moving onto the next packet; I don't think it stops or needs some special routine to happen if it gets a bad packet.

But as for the error code, yeah, it doesn't currently functionally make sense to treat that error specially if nothing special happens as a result; returning 0 would be ok. Though, I guess keeping the error code would help with writing unit test code for this function. :p

sys/net/iflib.c
3100

all the other code in there after that just manipulates local variables or the packet info struct

I was wondering specifically about:

/* Zero out the receive descriptors status. */
rxd->wb.upper.status_error &= htole32(~0xFF);

Do we need to do that in the error case as well?

sys/net/iflib.c
3100

I'm not sure. We do clear those bits in the newer drivers, too, but it's not clear if that's a holdover from the legacy em to prevent processing the same descriptors multiple times in one function call/loop or if that's something that's necessary to keep things working normally.