Details

Reviewers

Group Reviewers

bhyve
manpages

Commits

rS355724: Support software breakpoints in the debug server on Intel CPUs.
rS355686: Fix a mismerge in r355683 and remove the local gdb_port from main.
rS355683: Don't call into the debug server if it isn't configured.

Summary

Allow the userland hypervisor to intercept breakpoint exceptions (BP#) in the guest. A new capability (VM_CAP_BPT_EXIT) is used to enable this feature. These exceptions are reported to userland via a new VM_EXITCODE_BPT that includes the length of the original breakpoint instruction. If userland wishes to pass the exception through to the guest, it must be explicitly re-injected via vm_inject_exception().
Export VMCS_ENTRY_INST_LENGTH as a VM_REG_GUEST_ENTRY_INST_LENGTH pseudo-register. Injecting a BP# on Intel requires setting this to the length of the breakpoint instruction. AMD SVM currently ignores writes to this register (but reports success) and fails to read it.
Rework the per-vCPU state tracked by the debug server. Rather than a single 'stepping_vcpu' global, add a structure for each vCPU that tracks state about that vCPU ('stepping', 'stepped', and 'hit_swbreak'). A global 'stopped_vcpu' tracks which vCPU is currently reporting an event. Event handlers for MTRAP and breakpoint exits loop until the associated event is reported to the debugger.

Breakpoint events are discarded if the breakpoint is not present when a vCPU resumes in the breakpoint handler to retry submitting the breakpoint event.
Maintain a linked-list of active breakpoints in response to the GDB 'Z0' and 'z0' commands.

Test Plan

tested breakpoints while booting a freebsd kernel. Ran ptrace_test via kyua which used software breakpoints in the guest while kernel breakpoints were active to test BP# injection.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Herald added subscribers: bcran, rgrimes, emaste. · View Herald TranscriptMay 18 2019, 6:29 PM

Harbormaster completed remote builds in B24306: Diff 57537.May 18 2019, 6:29 PM

Harbormaster completed remote builds in B24306: Diff 57537.

jhb added a child revision: D20308: Add initial support for qSupported..May 18 2019, 6:31 PM

jhb removed a child revision: D20308: Add initial support for qSupported..

jhb added a parent revision: D20308: Add initial support for qSupported..

jhb added a subscriber: scottph.

Still working on fully reviewing this, but I noticed this one thing in testing.

usr.sbin/bhyve/gdb.c
708 ↗	(On Diff #57537)	gdb_cpu_add is called whether gdb is in use or not, but vcpu_state is only initialized when gdb is in use. So this will segfault when not using gdb.

jhb added inline comments.May 30 2019, 5:07 PM

usr.sbin/bhyve/gdb.c
708 ↗	(On Diff #57537)	Huh, that is actually an old bug. PTHREAD_MUTEX_INITIALIZER just happens to be all zeros such that the first call to pthread_mutex_lock initialized it implicitly, but I should really be bailing from several of these gdb_cpu_ hooks if the stub isn't initialized.

Rebase.
Don't call into the debug server if it isn't configured.

Harbormaster completed remote builds in B24613: Diff 58097.May 31 2019, 12:28 AM

jhb marked an inline comment as done.May 31 2019, 12:30 AM

jhb added inline comments.

usr.sbin/bhyve/bhyverun.c
170 ↗	(On Diff #58097)	The changes in this file to avoid calling into the debug server I will commit first as a separate change before the actual breakpoint change, but I've uploaded to them this review for now.

markj added inline comments.Jun 3 2019, 9:07 PM

usr.sbin/bhyve/gdb.c
1196 ↗	(On Diff #58097)	Extra space after `==`, ditto below.
1215 ↗	(On Diff #58097)	The protocol specification says that a 'Z0' packet may include conditional expressions, delimited by a semi-colon. Shouldn't we search for that too?
1229 ↗	(On Diff #58097)	If we're not honouring the full request, I'd think that we should return an empty string.

jhb marked 3 inline comments as done.Jun 4 2019, 3:54 PM

jhb added inline comments.

usr.sbin/bhyve/gdb.c
1215 ↗	(On Diff #58097)	This is supposed to be a ';' rather than a ':' so we can detect the expressions. It then follows into the else block below. Note that the debugger should only send those to us if we advertise support for ConditionalBreakpoints and/or BreakpointCommands in qSupported (which we don't).

Fixes from Mark.

Harbormaster completed remote builds in B24681: Diff 58232.Jun 4 2019, 3:55 PM

markj added inline comments.Jun 4 2019, 9:21 PM

usr.sbin/bhyve/gdb.c
1236 ↗	(On Diff #58232)	Should this be a `return`?

markj added inline comments.Jun 4 2019, 9:27 PM

usr.sbin/bhyve/gdb.c
102 ↗	(On Diff #58232)	Could you add a comment explaining the state machine? It's not very easy to reconstruct it from reading the code.

More fixes from Mark.

Harbormaster completed remote builds in B24691: Diff 58252.Jun 5 2019, 12:26 AM

markj added inline comments.Jun 5 2019, 1:32 AM

usr.sbin/bhyve/gdb.c
672 ↗	(On Diff #58252)	I'd add braces here too.
1129 ↗	(On Diff #58252)	Missing space before the colon.
1254 ↗	(On Diff #58252)	I don't think we need to do any of this work if we're just going to return.
1773 ↗	(On Diff #58252)	We never initialize `vcpu` in the members of this array.

markj added inline comments.Jun 5 2019, 1:40 AM

usr.sbin/bhyve/gdb.c
1773 ↗	(On Diff #58252)	Oops, sorry, that's not true. The problem is that new_connection() may zero the array after the fact.

Simplify parsing of breakpoint kind.
Don't trash vcpu member of vcpu state.
Style fixes.

Harbormaster completed remote builds in B24706: Diff 58280.Jun 5 2019, 7:37 PM

jhb added inline comments.Jun 5 2019, 7:38 PM

usr.sbin/bhyve/gdb.c
672 ↗	(On Diff #58252)	Or remove them for stepped (I removed another line in that block in the previous revision).
1254 ↗	(On Diff #58252)	I'd kind of left it as a placeholder in case we were going to support it in the future, but I can fix that. In fact, I'll probably restructure it a bit so we just fail for the cp != NULL case first and then fall through to the valid case.
1773 ↗	(On Diff #58252)	Oh, that is a bug. I had this in C++ where that was a vector and the vcpu was an index at one point, but went back to C and the bug originates there. I will probably fix new_connection to clear vCPU state explicitly instead.

markj added inline comments.Jun 6 2019, 7:36 PM

usr.sbin/bhyve/gdb.c
664 ↗	(On Diff #58280)	I think we want to update cur_vcpu here too.

Set cur_vcpu when reporting a new stop event.

Harbormaster completed remote builds in B24752: Diff 58367.Jun 7 2019, 3:59 PM

Use vcpu instead of vs->vcpu in one place.

Harbormaster completed remote builds in B24753: Diff 58368.Jun 7 2019, 4:02 PM

markj accepted this revision as: markj.Jun 7 2019, 5:16 PM

This revision is now accepted and ready to land.Jun 7 2019, 5:16 PM

I need to rework this a bit more. If the debugger removes the breakpoint, then we shouldn't report any pending events for that breakpoint. I had this happen during testing where I set a breakpoint on Xinvlpg and during boot the stub did the nice thing of reporting a breakpoint for each vCPU thread in turn. However, when I disabled the breakpoint so I could make forward progress, I got two more spurious SIGTRAP events in the debugger due to breakpoint events that were queued but no longer valid.

The simple implementation would be to dispense with the queue of events entirely and just report the most recent event each time. The vCPUs would hit the breakpoint again when they all resumed in that case. I think what I might do though which is slightly more elegant will be to have the case where 'continue' wants to report an already-queued event is to discard swbreak events if the associated breakpoint is no longer active. OTOH, I still need to think a bit about how to handle threads that have stepped and how to report a step. I think the vCPU needs to stay in the "stepped" state perhaps until it's step is actually reported. I can maybe handle these things by having the vCPU threads keep reporting the event if it wasn't ACKed and the event is still true perhaps? Maybe that is the better way to do this.

Rework event reporting.
Fix test to report stepped event.
Make breakpoint packets idempotent.
Remove any active breakpoints when the debugger goes away.

This revision now requires review to proceed.Jun 8 2019, 5:23 PM

Harbormaster completed remote builds in B24771: Diff 58405.Jun 8 2019, 5:23 PM

Style fixes.

Harbormaster completed remote builds in B24775: Diff 58418.Jun 8 2019, 9:40 PM

Harbormaster completed remote builds in B24775: Diff 58418.

markj added inline comments.Jun 10 2019, 6:25 PM

usr.sbin/bhyve/bhyverun.c
699 ↗	(On Diff #58418)	In vmexit_debug() and vmexit_breakpoint() you omitted the extra newline.
usr.sbin/bhyve/gdb.c
919 ↗	(On Diff #58418)	What's the point of fetching the capability here?

jhb marked 2 inline comments as done.Jun 10 2019, 7:40 PM

jhb added inline comments.

usr.sbin/bhyve/gdb.c
919 ↗	(On Diff #58418)	MTRAP exits are an Intel-only feature. AMD CPUs don't support them. That said, what I probably should do is invoke this once during init_gdb to set a global 'mtrap_supported' bool that I then check here instead of calling it each time. I might do that as a followup change.

Extra blank line.

Harbormaster completed remote builds in B24799: Diff 58488.Jun 10 2019, 7:40 PM

Harbormaster completed remote builds in B24799: Diff 58488.

I've been testing this. I observe the following problem when I put a breakpoint on pmap_pinit(). The breakpoint fires, I continue, and gdb/bhyve single steps the trapping CPU by enabling monitor traps. We get a monitor trap with $rip = Xtimerint_pti, gdb re-enables the breakpoint, and in the guest we re-execute the breakpoint when returning from the timer interrupt, so we're basically stuck in a loop.

Weirdly, I was testing an earlier version of the diff last week, and I didn't see this problem. I've tried going back to earlier versions of the diff and earlier kernels, but I keep seeing the same problem.

In D20309#445175, @markj wrote:

I've been testing this. I observe the following problem when I put a breakpoint on pmap_pinit(). The breakpoint fires, I continue, and gdb/bhyve single steps the trapping CPU by enabling monitor traps. We get a monitor trap with $rip = Xtimerint_pti, gdb re-enables the breakpoint, and in the guest we re-execute the breakpoint when returning from the timer interrupt, so we're basically stuck in a loop.

Weirdly, I was testing an earlier version of the diff last week, and I didn't see this problem. I've tried going back to earlier versions of the diff and earlier kernels, but I keep seeing the same problem.

Hummmm. I haven't seen this behavior, but I can try it. I wonder what qemu does in this case. On a purely software model it probably pauses all the timers, etc. while stopped in the debugger, but I'm not sure if it does that if you are running on top of KVM?

The situation is akin to if the breakpoint in a userland program causes a signal handler to be invoked. On Linux single stepping will enter the signal handler. On FreeBSD we disable TF while entering the signal handler and restore it on exit so that we end up stepping over the entire signal. Probably on GDB this is handled by having the signal be reported to the debugger first.

In D20309#445179, @jhb wrote:

In D20309#445175, @markj wrote:

I've been testing this. I observe the following problem when I put a breakpoint on pmap_pinit(). The breakpoint fires, I continue, and gdb/bhyve single steps the trapping CPU by enabling monitor traps. We get a monitor trap with $rip = Xtimerint_pti, gdb re-enables the breakpoint, and in the guest we re-execute the breakpoint when returning from the timer interrupt, so we're basically stuck in a loop.

Weirdly, I was testing an earlier version of the diff last week, and I didn't see this problem. I've tried going back to earlier versions of the diff and earlier kernels, but I keep seeing the same problem.

Hummmm. I haven't seen this behavior, but I can try it. I wonder what qemu does in this case. On a purely software model it probably pauses all the timers, etc. while stopped in the debugger, but I'm not sure if it does that if you are running on top of KVM?

The situation is akin to if the breakpoint in a userland program causes a signal handler to be invoked. On Linux single stepping will enter the signal handler. On FreeBSD we disable TF while entering the signal handler and restore it on exit so that we end up stepping over the entire signal. Probably on GDB this is handled by having the signal be reported to the debugger first.

In D20309#445179, @jhb wrote:

In D20309#445175, @markj wrote:

I've been testing this. I observe the following problem when I put a breakpoint on pmap_pinit(). The breakpoint fires, I continue, and gdb/bhyve single steps the trapping CPU by enabling monitor traps. We get a monitor trap with $rip = Xtimerint_pti, gdb re-enables the breakpoint, and in the guest we re-execute the breakpoint when returning from the timer interrupt, so we're basically stuck in a loop.

Weirdly, I was testing an earlier version of the diff last week, and I didn't see this problem. I've tried going back to earlier versions of the diff and earlier kernels, but I keep seeing the same problem.

Hummmm. I haven't seen this behavior, but I can try it. I wonder what qemu does in this case. On a purely software model it probably pauses all the timers, etc. while stopped in the debugger, but I'm not sure if it does that if you are running on top of KVM?

It might be possible to reproduce this by injecting an NMI while the vCPUs are suspended after a swbreak. I haven't tried it, though.

The situation is akin to if the breakpoint in a userland program causes a signal handler to be invoked. On Linux single stepping will enter the signal handler. On FreeBSD we disable TF while entering the signal handler and restore it on exit so that we end up stepping over the entire signal. Probably on GDB this is handled by having the signal be reported to the debugger first.

So I've thought some more and for one-off events like NMIs which should be rare I think the current behavior we are just stuck with. The same for synchronous exceptions or faults if the instruction being stepped raises an exception. However, for device interrupts like timers, it might be sensible to block those while stepping. I need to look to see if I can "cheat" by setting some state in the VMCS that lets me avoid altering IF in %rflags. If I have to mess with IF then it gets messier as I will need to manually emulate pushf to avoid leaking IF into the guest-visible state.

Do you think it's worth while to add the IF-frobbing for stepping as a separate commit btw?