Page MenuHomeFreeBSD

ptrace(PT_COREDUMP)
ClosedPublic

Authored by kib on Apr 23 2021, 7:36 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jan 21, 1:46 AM
Unknown Object (File)
Sat, Jan 18, 3:53 AM
Unknown Object (File)
Fri, Jan 17, 4:00 PM
Unknown Object (File)
Fri, Jan 17, 3:08 PM
Unknown Object (File)
Sun, Jan 12, 2:03 PM
Unknown Object (File)
Sat, Jan 11, 3:39 AM
Unknown Object (File)
Sat, Jan 11, 3:18 AM
Unknown Object (File)
Sat, Jan 4, 8:56 AM

Details

Summary

Right now this is a continuation of D29691, I decided to not commandeer the original review.

I switched to a mode where only one thread is temporary run, only to dump core, instead of unsuspending everything, and then trying to catch them in AST. So the process is kept stopped after PT_COREDUMP, but debugger needs to wait on it before PT_CONTINUE is allowed.

I am sure that this requires much more thinking and testing, but I want to provide you with something to not block any other work.

Also I disallowed parallel ptrace(2) operations, so that PT_DETACH cannot occur while PT_COREDUMP is running.

I also tried to plug as many holes in the patch as possible, non-complete list is

  • usermode flags
  • compat32
  • added padding for future ABI-compat extensions
  • minor bugs

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
sys/kern/kern_thread.c
1518 ↗(On Diff #88251)

I mean that run is scope-bound, the thread is run for limited block of code, and then put back to suspension. It is not intended to be fully runnable state, e.g. an attempt to return to userspace would cause invariants check.

Like flashlight flash.

1535 ↗(On Diff #88251)

TD_IS_SUSPENDED should be enough, but ok.

sys/kern/subr_sleepqueue.c
858 ↗(On Diff #88251)

You mean, 'sleepq_remove' as the possible name?

Issue there is that PT_COREDUMP might select a target thread that is suspended while it is already put on sleepqueue. This is the sleepq_catch_signals()->sig_ast_needsigchk()->issignal()->ptracestop() case of ptracestop() suspension. But sv_coredump() needs to sleep for misc. VFS resources like rangelocks/vnode/buffer locks etc. So I remove the thread from the sleepq if any (resulting in the usual spurious EINTR from ptrace(2) anyway).

No, I cannot select a better thread generally, this might be the only thread at all.

Nested there denotes the situation of sleep while we are on sleepqueue, if not done.

sys/kern/sys_process.c
826 ↗(On Diff #88251)

Also added missed wmsg

838 ↗(On Diff #88251)

Well, this is the place which informs other threads to block.

1110 ↗(On Diff #88251)

I think this is stray/mismerge.

1376 ↗(On Diff #88251)

Yes this was the intent, initial version lack any writeability check on fd, and fget_write() is the same method as used e.g. by write(2).

1399 ↗(On Diff #88251)

Does not make sense to limit it to root, since, as you notes, PT_IO allows to read the same mappings anyway.

1409 ↗(On Diff #88251)

It is done by the victim thread, see the end of ptrace_coredump().

kib marked 10 inline comments as done.Apr 27 2021, 4:33 PM

Properly pass tc_flags to sv_coredump.
Make blocking in ptrace(2) for other ptrace call interruptible.
Remove stray chunk.
Update comments.

lib/libc/sys/ptrace.2
812 ↗(On Diff #88262)

The
.Fa addr
argument...

826 ↗(On Diff #88262)

Should we indicate what type(s) of file descriptors are permissible?

834 ↗(On Diff #88262)

Can we clarify non-dumpable?
Do we need a comment in mmap(2) alongside MAP_NOCORE? I assume there are multiple reasons MAP_NOCORE may be used.

846 ↗(On Diff #88262)

Maybe "must already be stopped"?

sys/kern/imgact_elf.c
1709

So if global compress_user_cores is not set PT_COREDUMP cores are also uncompressed?

usr.bin/gcore/gcore.1
40–41 ↗(On Diff #88262)

I think convention is for non-arg flags to come first, all together? So [-fk] [-c core]?

70 ↗(On Diff #88262)

Probably "Use the ptrace(2) PT_COREDUMP kernel facility to..."

kib marked 7 inline comments as done.Apr 27 2021, 6:58 PM
kib added inline comments.
sys/kern/imgact_elf.c
1709

Initial intent was to make a flag to adjust global settings. But ok, I now made PC_COMPRESS completely independent from the configuration for automatic core dump.

Next logical step would be to add an option to gcore to request compressed dump, but then it raises the question why this option is only implemented for -k.

kib marked an inline comment as done.

Make PC_COREDUMP self-contained.
Man page updates.

lib/libc/sys/ptrace.2
828 ↗(On Diff #88274)
859 ↗(On Diff #88274)

So in some error cases this is not true. How does userspace know whether it needs to wait?

sys/kern/imgact_elf.c
1709

If it's completely independent, shouldn't the check be if (((flags & SVC_PT_COREDUMP) == 0 && compress_user_cores) || (flags & (SVC_PT_COREDUMP | SVC_NOCOMPRESS)) == SVC_PT_COREDUMP)?

Also we may use either zlib or zstd for compression, and this is controlled by a different global setting.

sys/kern/kern_thread.c
1538 ↗(On Diff #88274)

Why is the caller responsible for kicking the swapper? It seems reasonable to handle it here. I also don't see why the caller drops the proc lock to call kick_proc0().

sys/kern/subr_sleepqueue.c
858 ↗(On Diff #88251)

Yes, sleepq_remove() seems like a better name for this. I don't have a strong feeling about it though.

sys/kern/sys_process.c
1375 ↗(On Diff #88274)

Is there some fundamental reason we couldn't support dumping to a named pipe as well? It could be useful for doing compression in userspace, for instance. I am not suggesting to do it in this review.

1409 ↗(On Diff #88251)

Ok.

usr.bin/gcore/gcore.c
125 ↗(On Diff #88274)

Shouldn't the wait happen only if PT_COREDUMP was successful?

134 ↗(On Diff #88274)
kib marked 8 inline comments as done.Apr 28 2021, 7:06 PM
kib added inline comments.
lib/libc/sys/ptrace.2
859 ↗(On Diff #88274)

I propose to not care about the difference, see updated man page and my other reply.

sys/kern/imgact_elf.c
1709

I think it is good enough for now to not provide a selection of compression method through flags.

sys/kern/subr_sleepqueue.c
858 ↗(On Diff #88251)

IMO sleepq_remove() is too generic, this situation is specific. I want to leave the generic name unused.

sys/kern/sys_process.c
1375 ↗(On Diff #88274)

I did not inspected code to claim that we can.

One easy to see reason is that we do not want a user process (reader) to be able to block the process in this state for indefinite time. This might cause cascaded debugger' hangs.

usr.bin/gcore/gcore.c
125 ↗(On Diff #88274)

No, in all cases when the victim thread was unsuspended. As you noted in ptrace(2) comment, it is hard for userspace to know exactly when to wait, but also it is quite dirty for kernel either to try to suppress clearing of P_WAITED, or to wait internally.

I think the acceptable solution is to wait there with WNOHANG. It would eat stray notification if there is any, and waitpid(2) returns 0 when there was no process to report.

I admit that it is inelegant but it is the least evil as I see the problem.

kib marked 5 inline comments as done.

Kick swapper in thread_run_flash()
waitpid(WNOHANG) after PT_COREDUMP
improve SVC_COMPRESS interpretation

lib/libc/sys/ptrace.2
853 ↗(On Diff #88340)
856 ↗(On Diff #88340)
860 ↗(On Diff #88340)
864 ↗(On Diff #88340)
871 ↗(On Diff #88340)

I believe there is no race here if a thread was unsuspended: after ptrace(PT_COREDUMP) returns, the use of the proc lock in ptracestop() ensures that the new stop event is immediately visible to a waitpid(WNOHANG) caller. Is that correct? I am just concerned about waitpid(WNOHANG) returning a false negative, i.e., there is a new stop event to consume but waitpid(WNOHANG) did not consume it because the re-suspension happens asynchronously.

sys/kern/imgact_elf.c
1709

I'd be fine with simply not supporting compression at all through this interface, at least initially.

In fact I believe the interface is somewhat broken: if SVC_NOCOMPRESS is not set, we will try to use the value of compress_user_cores as the compression algorithm, and if it is not configured compressor_init() will return an error. I think it was correct before my earlier suggestion, sorry.

At the very least I think this knob, kern.compress_user_cores, should be referenced in the documentation.

sys/kern/sys_process.c
1375 ↗(On Diff #88274)

Ok, but the debugger can decide who the reader is. One can imagine e.g., gcore that uses a named pipe to stream the core dump through a compression library, using a second thread to read from the pipe. You don't even need a named pipe for this.

kib marked 7 inline comments as done.Apr 29 2021, 6:09 PM
kib added inline comments.
lib/libc/sys/ptrace.2
871 ↗(On Diff #88340)

The waitpid(WNOHANG) should be executed atfer ptrace(PT_COREDUMP) returned. The unsuspended thread suspends itself, which generates the event by thread_suspend_switch()->thread_stopped()->childproc_stopped() before wakeup.

So you are right that there is no race, but the main reason is that the event generation is in fact synchronous.

sys/kern/sys_process.c
1375 ↗(On Diff #88274)

It still means that user SIGSTOPping the debugger leaves debuggee in unacceptable state. Lets postpone this till real consumers appear.

kib marked 2 inline comments as done.

Set compression method to something if PR_COREDUMP requested compression.
Editing for man page.

Lets postpone this till real consumers appear.

A consumer should arrive with LLDB 13

Lets postpone this till real consumers appear.

A consumer should arrive with LLDB 13

I mean, a consumer that wants to pipe the core dump and do something with it on flight.

markj added inline comments.
lib/libc/sys/ptrace.2
871 ↗(On Diff #88340)

Doesn't the wakeup happen in ptrace_coredump(), before the thread_suspend_switch() call?

sys/compat/freebsd32/freebsd32_misc.c
1033 ↗(On Diff #88385)

Do you have to update data to make it equal to sizeof(r.pc)?

usr.bin/gcore/gcore.1
73 ↗(On Diff #88385)
74 ↗(On Diff #88385)
78 ↗(On Diff #88385)

I'm not sure what "precise" really means here. AFAIK gcore also stops the process while collecting a core, so there shouldn't be any consistency problems, and gcore includes most (all?) of the same notes as the kernel core dumper.

This revision is now accepted and ready to land.Apr 30 2021, 1:50 PM
kib marked 5 inline comments as done.Apr 30 2021, 4:34 PM
kib added inline comments.
lib/libc/sys/ptrace.2
871 ↗(On Diff #88340)

Hm, yes. I think I will not change anything, because indeed the process lock is held. But the sleep/wakeup there can be changed to p->p_pptr wchan, and then wakeup in ptrace_coredump() is not needed.

usr.bin/gcore/gcore.1
78 ↗(On Diff #88385)

By 'precise' I mean that it match the kernel-generated dumps both now and in future, if e.g. some notes or new kinds of segments are added to the kernel dumper, which not yet copied to gcore.

kib marked 2 inline comments as done.

Fix compat32.
Man page editing

This revision now requires review to proceed.Apr 30 2021, 4:43 PM
markj added inline comments.
usr.bin/gcore/gcore.1
78 ↗(On Diff #88385)

You already say that in the same sentence though, so I would drop "precise". Without this explanation the meaning of "precise" there is not obvious IMO.

This revision is now accepted and ready to land.Apr 30 2021, 4:48 PM
kib marked an inline comment as done.Apr 30 2021, 5:02 PM

Fixes for allowing only one ptrace request:

  • recheck p_cansee and p_candebug after process lock was dropped
  • do not sleep with proctree_lock owned. _exit1() waits for this lock after setting P_WEXIT
  • since p_lock/PHOLD can be taken for other reasons, and since p->p_lock-- == 0 condition causes wakeup only when P_WEXIT is set, use a p_flag2 flag instead of p_lock as indicator of ptrace request in flight

Passed pho tested.

This revision now requires review to proceed.May 2 2021, 1:53 PM
sys/kern/sys_process.c
845 ↗(On Diff #88502)

Why not check td2->td_proc != p before calling proc_can_trace()?

kib marked an inline comment as done.

Reorder checks after sleep for P2_PTRACEREQ

This revision is now accepted and ready to land.May 3 2021, 3:59 PM