Page MenuHomeFreeBSD

dtrace: Fix fbt return probes on RISC-V
ClosedPublic

Authored by kp on Sep 10 2020, 12:26 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Nov 14, 11:32 PM
Unknown Object (File)
Fri, Nov 8, 5:28 AM
Unknown Object (File)
Mon, Nov 4, 9:31 PM
Unknown Object (File)
Mon, Nov 4, 9:31 PM
Unknown Object (File)
Mon, Nov 4, 9:30 PM
Unknown Object (File)
Mon, Nov 4, 9:30 PM
Unknown Object (File)
Mon, Nov 4, 9:30 PM
Unknown Object (File)
Mon, Nov 4, 9:19 PM
Subscribers

Details

Summary
  • return values are passed in a0
  • set fbtp_roffset so that we get the correct return location in arg0

Sponsored by: Axiado

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 33495
Build 30768: arc lint + arc unit

Event Timeline

kp requested review of this revision.Sep 10 2020, 12:26 PM
kp created this revision.
This revision is now accepted and ready to land.Sep 10 2020, 12:33 PM

The original implementation is copied from AArch64. I'm not convinced this change is correct; RISC-V returns small structs in a0/a1, so this will miss half of such structs.

The original implementation is copied from AArch64.

The AArch64 implementation looks wrong to me. It only handles entry probes.

I'm not convinced this change is correct; RISC-V returns small structs in a0/a1, so this will miss half of such structs.

DTrace doesn't really handle function arguments or return values larger than the register width, unfortunately. The assumption that argument and return types are either pointers or base integer types appears in several places.

So, whatever the right way to fix this is, RISC-V, AArch64, Arm and MIPS should all be changed in the same way. PowerPC seems to be as is done in this patch. IMO the eax argument is a waste of time, and it would be far better if the FBT code just looked at the trapframe to get out whatever set of registers it needs; it's clearer, it can get at more than just the one register and it avoids redundancy.

The original implementation is copied from AArch64.

Yes. I think AArch64's fbt:return probe is similarly broken. I currently don't have any running aarch64 hardware to test changes on though.

I'm not convinced this change is correct; RISC-V returns small structs in a0/a1, so this will miss half of such structs.

I didn't know that. I'm not sure how to persuade DTrace that the return probe has more than one return value to read though. The value's always a uintptr_t, so we can't merge a0 and a1, and we can't assume that a1 is meaningful anyway.

The original implementation is copied from AArch64.

The AArch64 implementation looks wrong to me. It only handles entry probes.

I'm not convinced this change is correct; RISC-V returns small structs in a0/a1, so this will miss half of such structs.

DTrace doesn't really handle function arguments or return values larger than the register width, unfortunately. The assumption that argument and return types are either pointers or base integer types appears in several places.

If you gave both though a script could still look at both args items and extract out the bits it wants? It'd be unfriendly and ABI-specific, but the information would at least be accessible.

I'm not convinced this change is correct; RISC-V returns small structs in a0/a1, so this will miss half of such structs.

DTrace doesn't really handle function arguments or return values larger than the register width, unfortunately. The assumption that argument and return types are either pointers or base integer types appears in several places.

If you gave both though a script could still look at both args items and extract out the bits it wants? It'd be unfriendly and ABI-specific, but the information would at least be accessible.

That's true. I don't think we make the full trapframe available in probe context (e.g., using a thread field, so it's accessible via curthread), but that would also make the full return value accessible.

So, whatever the right way to fix this is, RISC-V, AArch64, Arm and MIPS should all be changed in the same way. PowerPC seems to be as is done in this patch. IMO the eax argument is a waste of time, and it would be far better if the FBT code just looked at the trapframe to get out whatever set of registers it needs; it's clearer, it can get at more than just the one register and it avoids redundancy.

Sounds reasonable to me. For this diff I guess we just want to change fbt_invop() to fetch the return value out of the trap frame instead of using the extra argument from dtrace_invop().

That's true. I don't think we make the full trapframe available in probe context (e.g., using a thread field, so it's accessible via curthread), but that would also make the full return value accessible.

I'm not very familiar with the internals of DTrace, but given that we don't pass the trapframe to dtrace_probe() I expect that to be a fairly invasive change.

Sounds reasonable to me. For this diff I guess we just want to change fbt_invop() to fetch the return value out of the trap frame instead of using the extra argument from dtrace_invop().

I have a weak objection to that, as that's different from what other platforms do. x86 also obtains the return value as the argument to fbt_invop().

In D26389#586864, @kp wrote:

That's true. I don't think we make the full trapframe available in probe context (e.g., using a thread field, so it's accessible via curthread), but that would also make the full return value accessible.

I'm not very familiar with the internals of DTrace, but given that we don't pass the trapframe to dtrace_probe() I expect that to be a fairly invasive change.

Why's that relevant? The only thing that matters is the call to dtrace_invop, which passes the trapframe everywhere except i386 (I think amd64 even passes it from glancing at the assembly). And everywhere except x86 is getting that argument out of the trapframe in the caller, which is just silly.

Sounds reasonable to me. For this diff I guess we just want to change fbt_invop() to fetch the return value out of the trap frame instead of using the extra argument from dtrace_invop().

I have a weak objection to that, as that's different from what other platforms do. x86 also obtains the return value as the argument to fbt_invop().

Yeah, but the other platforms only do that because that's what x86 does, which is special. You can see that from the fact that the dtrace_invop _prototype_ has the last argument called eax on all architectures.

In D26389#586864, @kp wrote:

That's true. I don't think we make the full trapframe available in probe context (e.g., using a thread field, so it's accessible via curthread), but that would also make the full return value accessible.

I'm not very familiar with the internals of DTrace, but given that we don't pass the trapframe to dtrace_probe() I expect that to be a fairly invasive change.

Why's that relevant? The only thing that matters is the call to dtrace_invop, which passes the trapframe everywhere except i386 (I think amd64 even passes it from glancing at the assembly). And everywhere except x86 is getting that argument out of the trapframe in the caller, which is just silly.

I'm thinking about how we'd get the trapframe to the userspace dtrace process, to allow the script to extract information form a0/a1 in case there's relevant information there. That flow passes through dtrace_probe().

An alternative might be to pass a1 (as opposed to the 0 we pass now) in the return dtrace_probe(). That'll let scripts access it as arg2. That'd be different from all other platforms, but arg2 is not defined for the return probe anyway.

Sounds reasonable to me. For this diff I guess we just want to change fbt_invop() to fetch the return value out of the trap frame instead of using the extra argument from dtrace_invop().

I have a weak objection to that, as that's different from what other platforms do. x86 also obtains the return value as the argument to fbt_invop().

Yeah, but the other platforms only do that because that's what x86 does, which is special. You can see that from the fact that the dtrace_invop _prototype_ has the last argument called eax on all architectures.

Sure. I'll post an update in a few minutes. I'm inclined to keep the change in riscv/dtrace_subr.c, because passing the sepc isn't useful either.

Use trapframe rather than rval

This revision now requires review to proceed.Sep 10 2020, 3:21 PM
In D26389#586875, @kp wrote:

An alternative might be to pass a1 (as opposed to the 0 we pass now) in the return dtrace_probe(). That'll let scripts access it as arg2. That'd be different from all other platforms, but arg2 is not defined for the return probe anyway.

So this:

diff --git a/sys/cddl/dev/fbt/riscv/fbt_isa.c b/sys/cddl/dev/fbt/riscv/fbt_isa.c
index 92a0397cefb..406fcd3f9be 100644
--- a/sys/cddl/dev/fbt/riscv/fbt_isa.c
+++ b/sys/cddl/dev/fbt/riscv/fbt_isa.c
@@ -65,7 +65,7 @@ fbt_invop(uintptr_t addr, struct trapframe *frame, uintptr_t rval)
                                    frame->tf_a[3], frame->tf_a[4]);
                        } else {
                                dtrace_probe(fbt->fbtp_id, fbt->fbtp_roffset,
-                                   frame->tf_a[0], 0, 0, 0);
+                                   frame->tf_a[0], frame->tf_a[1], 0, 0);
                        }

                        cpu->cpu_dtrace_caller = 0;

I'd hate to commit us to supporting this indefinitely across all platforms though.

In D26389#586875, @kp wrote:

I'm thinking about how we'd get the trapframe to the userspace dtrace process, to allow the script to extract information form a0/a1 in case there's relevant information there. That flow passes through dtrace_probe().

An alternative might be to pass a1 (as opposed to the 0 we pass now) in the return dtrace_probe(). That'll let scripts access it as arg2. That'd be different from all other platforms, but arg2 is not defined for the return probe anyway.

What I meant was, add a field to struct thread, akin to td_frame, allowing dtrace scripts to access any register by dereferencing curthread-><new field>. Then scripts can access the full value, assuming they understand the calling convention. It's less convenient than just adding another probe argument, but more flexible.

In fact, DTrace has a built-in variables, regs, which is supposed to provide something similar. We don't implement it, presumably because we have no good way to get at the trap frame from probe context (but also because some types of probes, like SDT, don't call dtrace_probe() in an exception context on FreeBSD). Probably the right solution would be to add a trapframe pointer field to struct kdtrace_thread instead. fbt_invop() would set it, allowing the DIF_VAR_REGS handler access to the register file.

In D26389#586881, @kp wrote:
In D26389#586875, @kp wrote:

An alternative might be to pass a1 (as opposed to the 0 we pass now) in the return dtrace_probe(). That'll let scripts access it as arg2. That'd be different from all other platforms, but arg2 is not defined for the return probe anyway.

So this:

diff --git a/sys/cddl/dev/fbt/riscv/fbt_isa.c b/sys/cddl/dev/fbt/riscv/fbt_isa.c
index 92a0397cefb..406fcd3f9be 100644
--- a/sys/cddl/dev/fbt/riscv/fbt_isa.c
+++ b/sys/cddl/dev/fbt/riscv/fbt_isa.c
@@ -65,7 +65,7 @@ fbt_invop(uintptr_t addr, struct trapframe *frame, uintptr_t rval)
                                    frame->tf_a[3], frame->tf_a[4]);
                        } else {
                                dtrace_probe(fbt->fbtp_id, fbt->fbtp_roffset,
-                                   frame->tf_a[0], 0, 0, 0);
+                                   frame->tf_a[0], frame->tf_a[1], 0, 0);
                        }

                        cpu->cpu_dtrace_caller = 0;

I'd hate to commit us to supporting this indefinitely across all platforms though.

Yeah that's what I had in mind. I think it's completely fine to leave it implementation-defined whether an architecture supports multiple register returns; the calling convention for anything other than a single general-purpose register is already implementation-specific.

In D26389#586875, @kp wrote:
In D26389#586864, @kp wrote:

That's true. I don't think we make the full trapframe available in probe context (e.g., using a thread field, so it's accessible via curthread), but that would also make the full return value accessible.

I'm not very familiar with the internals of DTrace, but given that we don't pass the trapframe to dtrace_probe() I expect that to be a fairly invasive change.

Why's that relevant? The only thing that matters is the call to dtrace_invop, which passes the trapframe everywhere except i386 (I think amd64 even passes it from glancing at the assembly). And everywhere except x86 is getting that argument out of the trapframe in the caller, which is just silly.

I'm thinking about how we'd get the trapframe to the userspace dtrace process, to allow the script to extract information form a0/a1 in case there's relevant information there. That flow passes through dtrace_probe().

An alternative might be to pass a1 (as opposed to the 0 we pass now) in the return dtrace_probe(). That'll let scripts access it as arg2. That'd be different from all other platforms, but arg2 is not defined for the return probe anyway.

Sounds reasonable to me. For this diff I guess we just want to change fbt_invop() to fetch the return value out of the trap frame instead of using the extra argument from dtrace_invop().

I have a weak objection to that, as that's different from what other platforms do. x86 also obtains the return value as the argument to fbt_invop().

Yeah, but the other platforms only do that because that's what x86 does, which is special. You can see that from the fact that the dtrace_invop _prototype_ has the last argument called eax on all architectures.

Sure. I'll post an update in a few minutes. I'm inclined to keep the change in riscv/dtrace_subr.c, because passing the sepc isn't useful either.

Alternatively it could be passed 0 to be clear it's unused/useless?

  • Return a1 as well as it can sometimes be used for return values.
  • Set rtval to 0 because it's never used. Remove it as an argument for dtrace_invop().
This revision is now accepted and ready to land.Sep 10 2020, 10:31 PM
This revision was automatically updated to reflect the committed changes.