Page MenuHomeFreeBSD

dd: Fix SIGINT handling.
ClosedPublic

Authored by des on Apr 18 2023, 8:13 AM.
Tags
None
Referenced Files
F102833697: D39641.id121290.diff
Sun, Nov 17, 6:43 PM
Unknown Object (File)
Wed, Nov 13, 9:21 PM
Unknown Object (File)
Wed, Nov 13, 9:08 PM
Unknown Object (File)
Wed, Nov 13, 7:21 PM
Unknown Object (File)
Wed, Nov 13, 7:13 PM
Unknown Object (File)
Wed, Nov 13, 1:44 PM
Unknown Object (File)
Tue, Nov 12, 6:47 PM
Unknown Object (File)
Mon, Nov 11, 5:53 PM

Details

Summary

Currently, we handle SIGINT by calling summary() and _exit() directly from the signal handler, which we install after setup(). There are several issues with this:

  • summary() is not signal safe;
  • the parent is not informed about the signal;
  • setup() can block on open(), and catching SIGINT at that stage will produce the correct exit status but will not print anything to stderr as POSIX demands.

Fix this by making SIGINT non-restartable, changing our signal handler to only set a flag, installing it before setup(), and checking the termination flag before and after every blocking operation, i.e. open(), read(), write().

Also add two test cases, one for catching SIGINT while opening the input and one for catching it while reading. I couldn't think of an easy way to test catching SIGINT while writing (it's certainly feasible, but perhaps not from a shell script).

MFC after: 1 week
Sponsored by: Klara, Inc.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 50997
Build 47888: arc lint + arc unit

Event Timeline

des requested review of this revision.Apr 18 2023, 8:13 AM
ngie added a subscriber: ngie.
ngie added inline comments.
bin/dd/tests/dd2_test.sh
80

(picking a line)
These tests are started in subprocesses. I don't think it harms anything to leave set -m enabled.

This revision is now accepted and ready to land.Apr 18 2023, 5:19 PM

There seem to be a couple of read()/write() in position.c that seem like they should also get a check_terminate call -- I agree that this approach is sane, though.

kib added inline comments.
bin/dd/dd.c
103

Why not use sigaction(2) instead of these two obsoleted interfaces?

bin/dd/misc.c
164

You should consider flushing stdout and might be stderr before raising the signal, since with the default disposition, raise(2) behaves like _exit(2), leaving stdio buffers non-flushed.

des marked 3 inline comments as done.Apr 20 2023, 6:28 AM

There seem to be a couple of read()/write() in position.c that seem like they should also get a check_terminate call -- I agree that this approach is sane, though.

Good catch, I didn't realize. They already _almost_ do the right thing as long as the calls aren't restarted but the exit code would be 1 instead of 130.

bin/dd/dd.c
103

Because it takes at least five lines of code to do the exact same job using sigaction(2) as these two lines.

bin/dd/misc.c
164

We don't use stdout, unless no output file was specified, in which case we use it but not with stdio. We do use stderr but it should be unbuffered.

bin/dd/tests/dd2_test.sh
80

I don't think it harms anything not to, either :)

des marked 3 inline comments as done.Apr 20 2023, 6:29 AM

Incorporate review feedback.

This revision now requires review to proceed.Apr 20 2023, 6:53 AM
This revision is now accepted and ready to land.Apr 20 2023, 3:31 PM

Note that such signal handling makes the program behavior strange. Suppose that the signal was delivered in time when the code was executing, not during the read or write syscall. Then, you enter the next read/write call, and only after it returns, the new code reacts to the ctrl-C request. This would annoy users if they operate over a slow device.

In D39641#904361, @kib wrote:

Note that such signal handling makes the program behavior strange. Suppose that the signal was delivered in time when the code was executing, not during the read or write syscall. Then, you enter the next read/write call, and only after it returns, the new code reacts to the ctrl-C request. This would annoy users if they operate over a slow device.

You're right, I didn't think of that. It's a very small window but it's not zero. Should I call check_terminate() immediately before each blocking call, in addition to after? I suppose the canonical answer is longjmp() but if we catch a signal somewhere inside printf() and longjmp() out of the handler and then call printf() from summary() we're up shit creek.

In D39641#906503, @des wrote:
In D39641#904361, @kib wrote:

Note that such signal handling makes the program behavior strange. Suppose that the signal was delivered in time when the code was executing, not during the read or write syscall. Then, you enter the next read/write call, and only after it returns, the new code reacts to the ctrl-C request. This would annoy users if they operate over a slow device.

You're right, I didn't think of that. It's a very small window but it's not zero. Should I call check_terminate() immediately before each blocking call, in addition to after? I suppose the canonical answer is longjmp() but if we catch a signal somewhere inside printf() and longjmp() out of the handler and then call printf() from summary() we're up shit creek.

Checking for pending request before syscalls would reduce this window, but the window is not completely eliminated. It is enough to receive a signal after the check and before entering kernel, to get into the same issue. Yes of course doing setjmp/longjmp causes the same signal unsafety issues.

It seems that the easiest way to handle the issue is to block SIGINFO on the program start, then unblock it right before doing a syscall, and blocking after the return from syscall. This way you know that the signal delivery cannot interrupt any of the signal-unsafe function calls. The signal handler would print summary and terminate the process. Note that this is still not POSIX-compliant, but our implementation (as well as any reasonable Unix) guarantees that it is safe to do anything in signal handler if you did not interrupted not-async-signal-safe function.

In D39641#906525, @kib wrote:

It seems that the easiest way to handle the issue is to block SIGINFO on the program start, then unblock it right before doing a syscall, and blocking after the return from syscall. This way you know that the signal delivery cannot interrupt any of the signal-unsafe function calls. The signal handler would print summary and terminate the process. Note that this is still not POSIX-compliant, but our implementation (as well as any reasonable Unix) guarantees that it is safe to do anything in signal handler if you did not interrupted not-async-signal-safe function.

If blocking just delayed delivery of the signal I'd agree but it actually inhibits it and I'm not comfortable with that. I'll just add check_terminate() calls immediately prior to blocking calls.

In D39641#908089, @des wrote:
In D39641#906525, @kib wrote:

It seems that the easiest way to handle the issue is to block SIGINFO on the program start, then unblock it right before doing a syscall, and blocking after the return from syscall. This way you know that the signal delivery cannot interrupt any of the signal-unsafe function calls. The signal handler would print summary and terminate the process. Note that this is still not POSIX-compliant, but our implementation (as well as any reasonable Unix) guarantees that it is safe to do anything in signal handler if you did not interrupted not-async-signal-safe function.

If blocking just delayed delivery of the signal I'd agree but it actually inhibits it and I'm not comfortable with that. I'll just add check_terminate() calls immediately prior to blocking calls.

Blocking does not inhibit delivery, blocked signals are queued. Setting signal disposition to ignored makes kernel drop signal.

Check termination flag before blocking operations as well as after.

This revision now requires review to proceed.May 1 2023, 1:31 PM

This looks good to me. I also ran it through my old SIGINT tests.

This revision is now accepted and ready to land.May 5 2023, 1:48 AM
This revision was automatically updated to reflect the committed changes.

I have to agree with kib here, the right fix is to unblock the signal around open/read/write/close rather than this business with flags.