Page MenuHomeFreeBSD

Different fix for the killpg race, part2
ClosedPublic

Authored by kib on Jul 21 2023, 9:47 AM.
Tags
None
Referenced Files
F108593984: D41128.id125092.diff
Sun, Jan 26, 6:14 PM
Unknown Object (File)
Thu, Jan 23, 6:25 PM
Unknown Object (File)
Sun, Jan 19, 1:07 PM
Unknown Object (File)
Fri, Jan 10, 3:03 AM
Unknown Object (File)
Dec 10 2024, 4:46 PM
Unknown Object (File)
Dec 6 2024, 1:58 PM
Unknown Object (File)
Dec 2 2024, 11:17 PM
Unknown Object (File)
Nov 26 2024, 5:37 AM
Subscribers

Details

Summary

Note: the revert is not included into the differential because it is trivial and only hinders readability.

Revert "killpg(): close a race with fork(), part 2"

This reverts commits 81a37995c757b4e3ad8a5c699864197fd1ebdcf5 and
565a343ae3a30bc2973182ff8dfd2fa37d7f615f.

There is still a leakage of the p_killpg_cnt, some but not all sources
of which were identified.

Second, and more important, is that there is a fundamental issue with
blocked signals having KSI_KILLPG flag set.  Queueing of such signal
increments p_killpg_cnt, but it cannot be decremented until the signal
is delivered.  If, for instance, a single-threaded process with blocked
signal receives killpg-kill and executes fork(2), the fork enter check
returns with ERESTART.  And since signal is blocked, the condition
cannot be cleared.
sigtd(): prefer non-stopped thread as a target for signal queue

This should improve signal delivery latency and better expose the
process state to the executing threads.
killpg(): close a race with fork(), part 2

When we are sending terminating signal to the group, killpg() needs
to guarantee that all group members are to be terminated (it does not
need to ensure that they are terminated on return from killpg()).  The
pg_killsx change eliminates the largest window there, but still, if
a multithreaded process is signalled, the following could happen:
- thread 1 is selected for the signal delivery and gets descheduled
- thread 2 waits for pg_killsx lock, obtains it and forks
- thread 1 continue executing and terminates the process
This scenario allows the child to escape still.

Fix it by single-threading forking parent if a conflict with pg_killsx
is noted.  We try to lock pg_killsx without sleeping, and failure to
acquire it means that a parallel killpg(2) is executed.  Then, stop
other threads for running and in particular, receive signals, to avoid
the situation explained above.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

kib requested review of this revision.Jul 21 2023, 9:47 AM
sys/kern/kern_fork.c
954

This comment should move down, I think, and a comment should explain why we are single-threading.

957

This is a racy load, right?

1077

I suggest putting this hunk at the end of the function, with an unconditional goto cleanup before the fail0 label.

kib marked 3 inline comments as done.Jul 24 2023, 7:57 PM
kib added inline comments.
sys/kern/kern_fork.c
957

Yes, it is same as in many other places. It is actually safe because p_numthreads cannot become > 1 if it is == 1 right now.

kib marked an inline comment as done.

Changes: move cleanup, update comments.

Upload the right patch, that includes the comments changes.

sys/kern/kern_fork.c
955
956

Suppose the current thread has the signal masked. Then, won't sigtd() still potentially choose a "random" stopped thread, allowing the child to escape?

958
kib marked 3 inline comments as done.Jul 25 2023, 3:51 PM
kib added inline comments.
sys/kern/kern_fork.c
956

If sigtd() is tasked with selecting some thread, then we must already block on pg_killsx. Now, if the signal to be delivered can be masked, we are in the same situation as with any non-threaded process that masked signal generated by killpg(2). There is nothing to guarantee there, because process actively interacts with the signal.

kib marked an inline comment as done.Jul 25 2023, 11:18 PM
This revision is now accepted and ready to land.Jul 26 2023, 2:54 PM