sdt: Implement SDT probes using hot-patching
The idea here is to avoid a memory access and conditional branch per
probe site. Instead, the probe is represented by an "unreachable"
unconditional function call. asm goto is used to store the address of
the probe site (represented by a no-op sled) and the address of the
function call into a tracepoint record. Each SDT probe carries a list
of tracepoints.
When the probe is enabled, the no-op sled corresponding to each
tracepoint is overwritten with a jmp to the corresponding label. The
implementation uses smp_rendezvous() to park all other CPUs while the
instruction is being overwritten, as this can't be done atomically in
general. The compiler moves argument marshalling code and the
sdt_probe() function call out-of-line, i.e., to the end of the function.
Per gallatin@ in D43504, this approach has less overhead when probes are
disabled. To make the implementation a bit simpler, I removed support
for probes with 7 arguments; nothing makes use of this except a
regression test case. It could be re-added later if need be.
The approach taken in this patch enables some more improvements:
- We can now automatically fill out the "function" field of SDT probe names. The SDT macros let the programmer specify the function and module names, but this is really a bug and shouldn't have been allowed. The intent was to be able to have the same probe in multiple functions and to let the user restrict which probes actually get enabled by specifying a function name or glob.
- We can avoid branching on SDT_PROBES_ENABLED() by adding the ability to include blocks of code in the out-of-line path. For example:
if (SDT_PROBES_ENABLED()) {
int reason = CLD_EXITED; if (WCOREDUMP(signo)) reason = CLD_DUMPED; else if (WIFSIGNALED(signo)) reason = CLD_KILLED; SDT_PROBE1(proc, , , exit, reason);
}
could be written
SDT_PROBE1_EXT(proc, , , exit, reason,
int reason; reason = CLD_EXITED; if (WCOREDUMP(signo)) reason = CLD_DUMPED; else if (WIFSIGNALED(signo)) reason = CLD_KILLED;
);
In the future I would like to use this mechanism more generally, e.g.,
to remove branches and marshalling code used by hwpmc, and generally to
make it easier to add new tracepoint consumers without having to add
more conditional branches to hot code paths.
Reviewed by: Domagoj Stolfa, avg
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D44483