1. Suppose that linux_queue_delayed_work_on() is called with a non-zero delay and finds work.state == WORK_ST_IDLE. It sets the state to WORK_ST_TIMER and locks timer.mtx. If linux_cancel_delayed_work_sync() was also called in the meantime, read the state as WORK_ST_TIMER, and took the mutex first, it executes callout_stop() on a callout that is not yet armed. linux_queue_delayed_work_on() then continues and schedules the callout, but the return value from cancel() is false, which makes it possible for a requeue from the callback to slip in.
2. If linux_cancel_delayed_work_sync() returned true, we need to cancel again, because a requeue from the callback could have revived the work. The end result is that we may schedule a callout that might already be freed, since cancel_delayed_work_sync() claimed that everything was stopped. This contradicts the way the KPI is used in Linux, where consumers expect that cancel_delayed_work_sync() is reliable on its own.
Details
Diff Detail
- Repository: rG FreeBSD src repository
- Lint: Not Applicable
- Unit Tests: Not Applicable
Event Timeline
sys/compat/linuxkpi/common/src/linux_work.c, line 475:
All functions in this file are prefixed with linux_, even static ones. I can use _internal; _int is just shorter.
My suggestions are for pre-existing nits, since I had to read the code anyway. Please feel free to ignore them.
> If linux_cancel_delayed_work_sync() returned true, we need to cancel again. The requeue from callback could have revived the work.
Assuming that the first bug is fixed, how can this happen? taskqueue_drain() should only return once the task has finished executing and is no longer pending. But if the task queued itself with a timeout, it is the callout that is pending, not the task. Flipping the order of the callout_drain() and taskqueue_drain() calls does not solve the problem either. Ok.