I was trying to use dtrace ustack() to log all of the malloc/free calls in
an SSL-using application (nginx), and I found that many of the stack traces
were missing library and function names for lots of the stack frames.
The same thing happens with the much simpler to use "openssl" command,
here is an example from that:
CPU ID FUNCTION:NAME
1 68835 malloc:entry libc.so.7`malloc+0x1 `0x8012c4fd3 `0x8012f8043 `0x8012fcc9c openssl`main+0xd2 openssl`_start+0x10d `0x8010f1008
It turned out that the mystery library is libcrypto.so, and here is how
that library is mapped into the openssl command address space:
2926 0x8011a9000 0x8012c2000 r-- 281 2408 17 0 CN--- vn /lib/libcrypto.so.111
2926 0x8012c2000 0x8012c3000 --- 0 0 0 0 CN--- gd
2926 0x8012c3000 0x80146b000 r-x 424 2408 17 0 CN--- vn /lib/libcrypto.so.111
2926 0x80146b000 0x801496000 r-- 43 0 1 0 C---- vn /lib/libcrypto.so.111
2926 0x801496000 0x80149c000 rw- 6 0 1 0 C---- vn /lib/libcrypto.so.111
The "gd" guard page left in the middle of the file mappings is a bit odd,
but it looks legitimate based on the ELF headers of the library:
/lib/libcrypto.so.111: file format elf64-x86-64-freebsd
Program Header:
PHDR off 0x0000000000000040 vaddr 0x0000000000000040 paddr 0x0000000000000040 align 2**3 filesz 0x0000000000000230 memsz 0x0000000000000230 flags r-- LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**12 filesz 0x00000000001185b4 memsz 0x00000000001185b4 flags r-- LOAD off 0x0000000000119000 vaddr 0x000000000011a000 paddr 0x000000000011a000 align 2**12 filesz 0x00000000001a7bc0 memsz 0x00000000001a7bc0 flags r-x LOAD off 0x00000000002c0bc0 vaddr 0x00000000002c2bc0 paddr 0x00000000002c2bc0 align 2**12 filesz 0x0000000000029678 memsz 0x0000000000029678 flags rw- LOAD off 0x00000000002ea240 vaddr 0x00000000002ed240 paddr 0x00000000002ed240 align 2**12 filesz 0x0000000000005d20 memsz 0x0000000000009400 flags rw-
The ELF phdrs do not map anything at vaddr 0x0000000000119000,
so that bit of the guard mapping that reserves the space for a
shared library is still there after the library is fully mapped.
The problem here is in librtld_db's rd_loadobj_iter(), where we translate
the kernel's mapping information obtained from kinfo_getvmmap() into
rtld_db's rd_loadobj_t records, which is then later converted by libproc's
proc_rdagent() and map_iter() into libproc's struct map_info records,
where the information is eventually used by proc_addr2sym() called from
libdtrace's dt_print_ustack(). There are two related bugs in
this code in rd_loadobj_iter() that are triggered by the guard page
that is left in between the first two libcrypto.so mappings:
if (kve->kve_type == KVME_TYPE_VNODE) { if (kve->kve_vn_fileid != fileid) { base = kve->kve_start; fileid = kve->kve_vn_fileid; path = kve->kve_path; } } else { base = 0; path = NULL; }
(1) we need to set "path" for all vnode mappings, not just ones
that are in an uninterrupted run after the initial mapping of a new fileid.
(2) we should not reset "base" to 0 when we see a non-vnode mapping,
because more vnode mappings for the same fileid might still be coming later.
With the patch in this review, dtrace ustack() now shows the library
and function names for the libcrypto functions:
CPU ID FUNCTION:NAME
3 68835 malloc:entry libc.so.7`malloc+0x1 libcrypto.so.111`CRYPTO_zalloc+0x43 libcrypto.so.111`BIO_new+0x23 libcrypto.so.111`BIO_new_fp+0x1c openssl`main+0xd2 openssl`_start+0x10d `0x8010f1008
The tests in /usr/tests/lib/libproc also continue to pass with this patch.