HomeFreeBSD

linuxkpi: races between linux_queue_delayed_work_on() and…

Description

linuxkpi: races between linux_queue_delayed_work_on() and linux_cancel_delayed_work_sync()

  1. Suppose that linux_queue_delayed_work_on() is called with non-zero delay and found the work.state WORK_ST_IDLE. It resets the state to WORK_ST_TIMER and locks timer.mtx. Now, if linux_cancel_delayed_work_sync() was also called meantime, read state as WORK_ST_TIMER and already taken the mutex, it is executing callout_stop() on non-armed callout. Then linux_queue_delayed_work_on() continues and schedules callout. But the return value from cancel() is false, making it possible to the requeue from callback to slip in.
  1. If linux_cancel_delayed_work_sync() returned true, we need to cancel again. The requeue from callback could have revived the work.

The end result is that we schedule callout that might be freed, since
cancel_delayed_work_sync() claims that everything was stopped. This
contradicts the way the KPI is used in Linux, where consumers expect
that cancel_delayed_work_sync() is reliable on its own.

Reviewed by: markj
Discussed with: bz
Sponsored by: NVidia networking
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42468

Details