Otherwise consumers get unexpected EINTR errors without seeing a properly discarded signal.
Details
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
This fixes my testcase with strace(1), and also seems to fix problems with man(1) and ninja(8) on Bionic. Thank you :-)
sys/kern/kern_sig.c | ||
---|---|---|
2227 | Why isn't it ERESTART? How can a sleepq_wait_sig() caller distinguish between a proper wakeup and a wakeup due to an ignored signal? |
sys/kern/kern_sig.c | ||
---|---|---|
2227 | I do not believe ERESTART would work. First, several interesting syscalls (WRT to this change) convert ERESTART to EINTR. Second, sleepq_wait_sig() caller does not need to distinguish these cases. For caller, this looks like a spurious wakeup which we allow anyway. For instance, msleep() would return 0 (AKA spurious wakeup), sx retries because the lock cannot be taken still, cv gets spurious wakeup etc. There, we are waken up anyway, so we need to only prevent a damage to the state. I was worried that this change breaks sigwait() back to the state before sig_discard was added, but both code reading and trasz' testing show that it is not. |
sys/kern/kern_sig.c | ||
---|---|---|
2227 | I think we have discussed this before but I can't remember any details: how can spurious wakeups occur? I do not believe it is possible with the existing sleepqueue design, so long as the wchan is unique. I believe it is harmless for many consumers, especially when an interlock is used (as in msleep() etc.), but I'm sure spurious wakeups would be problematic in general. Consider the implementation of pause() for instance. |
sys/kern/kern_sig.c | ||
---|---|---|
2227 | For instance, spurious wakeup could happen due to race, where the address is reused for different kind of object, and old object is woken up by code assuming its previous use (and waking up without interlock). Another example, in case of wait(2), if more than one thread does the syscall, only one thread would find the zombie that signaled them. |
sys/kern/kern_sig.c | ||
---|---|---|
2227 | These scenarios are possible only when the consumer allows them to happen. Arbitrary sleepq interface consumers might not handle spurious wakeups. Consider the cv_wait_signal() call in vn_sendfile(). It does not handle spurious wakeups. Even if it did, when a signal is received, the system call should return, but with this change vn_sendfile() cannot determine whether to sleep again or not. |
sys/kern/kern_sig.c | ||
---|---|---|
2227 | Well, address collisions cannot be controlled by the caller. Looking at vn_sendfile(), I think it is at least weird, to say it mildly. Intent of SF_SYNC is to provide the caller of sendfile() a notification that the operation finished, e.g. to allow further modifications of the file/shmfd. Now, a signal send does not stop the operation, it only stops the wait for the finish. Since signals are external events that might be created by actors not controlled by the code requested sendfile(), getting EINTR (ERESTART is translated) means that they cannot correctly continue at all. And of course, secondary problem is that they are not prepared for spurious wakeups that can happen. |
sys/kern/kern_sig.c | ||
---|---|---|
2227 |
|
When queuing ignored signal, only abort target thread' sleep if it is inside sigwait().
sys/kern/subr_sleepqueue.c | ||
---|---|---|
1129 | Can it be (intrval == 0 && (td->td_flags & TDF_SIGWAIT) != 0) || intrval == EINTR || intrval == ERESTART? |
I'm afraid it panics for me: panic: mtx_lock_spin: recursed on non-recursive mutex sleepq chain @ /usr/home/trasz/git/freebsd/sys/kern/subr_sleepqueue.c:267. Traceback looks like this:
panic() at panic+0x44 __mtx_lock_spin_flags() at __mtx_lock_spin_flags+0x190 wakeup() at wakeup+0x10 exit1() at exit1+0xb8c linux_exit_group() at linux_exit_group+0x10 do_el0_sync() at do_el0_sync+0x4a8 handle_el0_sync() at handle_el0_sync+0x90