iflib: fix panic during driver reload stress test
Closed, Public

Authored by przemyslawx.lewandowski_intel.com on Apr 7 2023, 7:42 AM.

Details

Summary

During a driver reload stress test, a panic occurs after 50-300 reloads.
When sleeps are added after loading and unloading the driver, the issue does not occur.
Loading and unloading too quickly may cause the gt_taskqueue pointer to be freed before it is last used.
Checking the pointer for NULL before use fixes the panic.
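
The guard described above can be sketched in user-space C. This is only an illustration under stated assumptions: the fake_* types and functions are hypothetical stand-ins, not the real iflib or taskqueue(9) code, and only the gt_taskqueue field name comes from the summary. The sketch is single-threaded, so it shows the effect of the check but does not model the race itself.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real iflib or taskqueue(9) types. */
struct fake_taskqueue {
	int enqueued;				/* number of tasks queued so far */
};

struct fake_grouptask {
	struct fake_taskqueue *gt_taskqueue;	/* NULL once torn down */
};

/*
 * Enqueue a task only if the queue pointer is still set.
 * Returns 1 if the task was queued, 0 if it was skipped.
 */
static int
fake_grouptask_enqueue(struct fake_grouptask *gt)
{
	if (gt->gt_taskqueue == NULL)
		return (0);	/* queue already gone: skip, do not dereference */
	gt->gt_taskqueue->enqueued++;
	return (1);
}

/* Simulate unload: the queue pointer is cleared before a late enqueue. */
static void
fake_grouptask_teardown(struct fake_grouptask *gt)
{
	gt->gt_taskqueue = NULL;
}
```

Calling fake_grouptask_enqueue() after fake_grouptask_teardown() returns 0 instead of dereferencing a NULL pointer. With real concurrency the pointer could still change between the check and the use, so a check like this narrows the window rather than closing it.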

Script used to test it:

#!/bin/sh
i=1
while true
do
        echo "Attempt number $i"
        echo "Loading if_ice"
        kldload if_ice
        echo "Unloading if_ice"
        kldunload if_ice
        # Record the attempt that just completed, then move to the next one.
        echo "Last good try: $i" > ice_load_unload_lastGoodTry.txt
        i=$((i + 1))
done

Event Timeline

Hmm, this seems like a race where the taskqueue needs to be destroyed after the interrupt handler is fully deregistered instead. Several other handlers also queue tasks and would otherwise need similar checks. It may be that the current interrupt code has a race in bus_teardown_intr() when used with filters in that we can't ensure that bus_teardown_intr() will block if the filter handler is running on another core, but in that case that's the real race to fix.
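
The teardown ordering suggested in this comment can be sketched as a small user-space model. Everything here is an assumption made for illustration: the model_* names are hypothetical and do not exist in FreeBSD, and a boolean flag stands in for interrupt handler registration. The model is single-threaded, so it does not cover a handler already running on another core, which is the bus_teardown_intr() question raised above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model; none of these names exist in FreeBSD. */
struct model_queue {
	int enqueued;
};

struct model_dev {
	bool handler_registered;	/* stands in for an active interrupt handler */
	struct model_queue *queue;	/* stands in for the taskqueue */
};

/* Interrupt delivery: only a registered handler may touch the queue. */
static int
model_deliver_interrupt(struct model_dev *dev)
{
	if (!dev->handler_registered)
		return (0);		/* no handler, nothing runs */
	dev->queue->enqueued++;		/* handler queues work unconditionally */
	return (1);
}

/* Attach: create the queue first, then allow the handler to run. */
static int
model_attach(struct model_dev *dev)
{
	dev->queue = calloc(1, sizeof(*dev->queue));
	if (dev->queue == NULL)
		return (-1);
	dev->handler_registered = true;
	return (0);
}

/* Detach in the suggested order: handler first, queue second. */
static void
model_detach(struct model_dev *dev)
{
	dev->handler_registered = false;	/* step 1: no new handler invocations */
	free(dev->queue);			/* step 2: destroy the queue */
	dev->queue = NULL;
}
```

With this ordering the handler needs no NULL check, because it can no longer run by the time the queue is freed. In a real kernel, step 1 would also have to wait for any in-flight handler to finish before step 2 is safe.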

In D39457#900952, @jhb wrote:

It may be that the current interrupt code has a race in bus_teardown_intr() when used with filters in that we can't ensure that bus_teardown_intr() will block if the filter handler is running on another core, but in that case that's the real race to fix.

@jhb, are you volunteering to debug and fix that? :p

This revision is now accepted and ready to land. Jul 27 2023, 10:45 PM