
TLS: Use <machine/tls.h> for libc and rtld.
ClosedPublic

Authored by jhb on Dec 8 2021, 9:52 PM.

Details

Summary
  • Include <machine/tls.h> in MD rtld_machdep.h headers.
  • Remove local definitions of TLS_* constants from rtld_machdep.h headers and libc, using the values from <machine/tls.h> instead.
  • Use _tcb_set() instead of inlined versions in MD allocate_initial_tls() routines in rtld. The one exception is amd64, which optionally writes to fsbase directly rather than always going through sysarch(), which is what _tcb_set() does.
  • Use _tcb_set() instead of _set_tp() in libc.
  • Use '&_tcb_get()->tcb_dtv' instead of _get_tp() in both rtld and libc. This permits removing _get_tp.c from rtld.
  • Use TLS_TCB_SIZE and TLS_TCB_ALIGN with allocate_tls() in MD allocate_initial_tls() routines in rtld.
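
To illustrate the `&_tcb_get()->tcb_dtv` idiom from the summary, here is a minimal stand-alone sketch. The `struct sketch_tcb` layout, the `sketch_` helpers, and the static variable standing in for the thread pointer register are all hypothetical so that it compiles anywhere; the real `struct tcb`, `_tcb_get()` and `_tcb_set()` come from each architecture's <machine/tls.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified TCB: only the DTV pointer matters here. */
struct sketch_tcb {
	uintptr_t *tcb_dtv;	/* dynamic thread vector */
	void *tcb_private;	/* placeholder for thread-library data */
};

/* Stand-in for the thread pointer register (assumption of this sketch). */
static struct sketch_tcb *sketch_thread_pointer;

/* Plays the role of _tcb_set(): install the TCB for the current thread. */
static void
sketch_tcb_set(struct sketch_tcb *tcb)
{
	assert(tcb != NULL);
	sketch_thread_pointer = tcb;
}

/* Plays the role of _tcb_get(): fetch the current thread's TCB. */
static struct sketch_tcb *
sketch_tcb_get(void)
{
	return (sketch_thread_pointer);
}

/* What former _get_tp() callers wanted: the address of the DTV slot. */
static uintptr_t **
sketch_dtv_slot(void)
{
	return (&sketch_tcb_get()->tcb_dtv);
}
```

A caller would install a TCB once with `sketch_tcb_set()` and afterwards read or update the DTV through `*sketch_dtv_slot()`, with no separate `_get_tp()` helper needed.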

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jhb requested review of this revision. Dec 8 2021, 9:52 PM
libexec/rtld-elf/aarch64/rtld_machdep.h
83

This structure is the same on all platforms. Not sure if that is accidental or if it is truly MI.

libexec/rtld-elf/amd64/reloc.c
525

All of the allocate_initial_tls routines are now almost identical. The use of tls_last_size in the computation of tls_static_space would need to be conditional on TLS variant I, e.g.:

    tls_static_space = tls_last_offset + RTLD_STATIC_TLS_EXTRA;
#ifdef TLS_VARIANT_I
    tls_static_space += tls_last_size;
#endif

(And in general it seems like all of tls_last_size should perhaps be under TLS_VARIANT_I whereas only places that set it to a non-zero value are currently under the #ifdef).

The one other exception here is amd64, where I have not used _tcb_set in order to preserve the use of wrfsbase. I'm not sure that optimization is worth preserving, however. If not, we could use an MI version of this function. Note that the equivalent code in libc is MI and used _set_tp, which always used sysarch().
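
For illustration only, an MI allocate_initial_tls along the lines discussed in this comment might look roughly like the sketch below. Every `sketch_` / `SKETCH_` name, the constant values, and the allocator are hypothetical stand-ins so the example compiles outside rtld; in particular the real allocate_tls() lays out the static TLS block and TCB per variant, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values; the real ones come from <machine/tls.h> and rtld. */
#define SKETCH_TLS_TCB_SIZE	(2 * sizeof(void *))
#define SKETCH_TLS_TCB_ALIGN	((size_t)16)
#define SKETCH_STATIC_TLS_EXTRA	((size_t)128)

static size_t sketch_tls_last_offset;	/* offset of the last static block */
static size_t sketch_tls_last_size;	/* its size (only used by Variant I) */
static size_t sketch_tls_static_space;
static void *sketch_thread_pointer;	/* stand-in for the TP register */

/* Stand-in for allocate_tls(): one zeroed, aligned allocation. */
static void *
sketch_allocate_tls(size_t tcbsize, size_t tcbalign)
{
	size_t total;
	void *p;

	total = sketch_tls_static_space + tcbsize;
	/* aligned_alloc() expects a size that is a multiple of the alignment. */
	total = (total + tcbalign - 1) / tcbalign * tcbalign;
	p = aligned_alloc(tcbalign, total);
	assert(p != NULL);
	memset(p, 0, total);
	return (p);
}

/* Stand-in for _tcb_set(). */
static void
sketch_tcb_set(void *tcb)
{
	sketch_thread_pointer = tcb;
}

static void
sketch_allocate_initial_tls(void)
{
	/* Fix the static TLS size, leaving room for dynamic modules. */
	sketch_tls_static_space = sketch_tls_last_offset +
	    SKETCH_STATIC_TLS_EXTRA;
#ifdef TLS_VARIANT_I
	sketch_tls_static_space += sketch_tls_last_size;
#else
	(void)sketch_tls_last_size;
#endif
	sketch_tcb_set(sketch_allocate_tls(SKETCH_TLS_TCB_SIZE,
	    SKETCH_TLS_TCB_ALIGN));
}
```

The only per-variant difference left in this shape is the `#ifdef TLS_VARIANT_I` line; whether amd64 can share it depends on the wrfsbase question raised in this comment.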

547

Similarly, if you use the powerpc/mips/risc-v version of this, which adds in TLS_DTV_OFFSET (zero on the other architectures), this function could also now be MI (as it is in libc).
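
A rough sketch of what such an MI lookup could look like follows. The `sketch_` names, the simplified DTV layout (one base pointer per module, indexed from 1), and the default of 0 for `SKETCH_TLS_DTV_OFFSET` are assumptions made for the example; rtld's real tls_get_addr_common() also handles generation checks and lazy allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for TLS_DTV_OFFSET.  Per the comment above it is
 * zero on most architectures and non-zero on powerpc, mips and risc-v.
 */
#ifndef SKETCH_TLS_DTV_OFFSET
#define SKETCH_TLS_DTV_OFFSET	0
#endif

/*
 * Simplified stand-in for tls_get_addr_common(): in this sketch
 * dtv[module - 1] holds the base address of that module's TLS block.
 */
static char *
sketch_tls_get_addr_common(uintptr_t *dtv, unsigned long module,
    unsigned long offset)
{
	assert(module >= 1);
	return ((char *)dtv[module - 1] + offset);
}

/* One body for every architecture: the bias is just a constant. */
static void *
sketch_tls_get_addr(uintptr_t *dtv, unsigned long module,
    unsigned long offset)
{
	char *p;

	p = sketch_tls_get_addr_common(dtv, module, offset);
	return (p + SKETCH_TLS_DTV_OFFSET);
}
```

Because the bias collapses to adding zero where TLS_DTV_OFFSET is 0, the same source would serve all architectures.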

libexec/rtld-elf/aarch64/rtld_machdep.h
83

Drepper's tls.pdf says the following:

The values for the module ID and the TLS block offset are determined by
the dynamic linker at run-time and then passed to the __tls_get_addr
function in the architecture-specific way.

Then each arch describes the layout of the place the GOT entry points to. I think it just happens to be this way,
since pointers are effectively longs on all supported arches.
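
As a small sketch of that observation, the structure under discussion can be written once with two `unsigned long` fields, and compile-time checks can record the assumption that a long is pointer-sized. The `sketch_tls_index` name is hypothetical; the checks hold on ILP32 and LP64 targets but would fail on an LLP64 ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Argument block handed to __tls_get_addr(): module ID plus offset. */
typedef struct {
	unsigned long ti_module;
	unsigned long ti_offset;
} sketch_tls_index;

/* The layout is only "accidentally MI" while long matches pointer size. */
_Static_assert(sizeof(unsigned long) == sizeof(void *),
    "unsigned long is not pointer-sized on this ABI");
_Static_assert(sizeof(sketch_tls_index) == 2 * sizeof(void *),
    "tls_index does not span exactly two GOT slots");
_Static_assert(offsetof(sketch_tls_index, ti_offset) == sizeof(void *),
    "ti_offset is not in the second GOT slot");
```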

libexec/rtld-elf/amd64/reloc.c
525

libc's use of _set_tp() is limited to static binaries AFAIR, so this is the place where I do not care much about one more syscall on startup. OTOH, I do care for dynamically linked (normal) binaries.

libexec/rtld-elf/amd64/reloc.c
525

Ok. Should _tcb_set in machine/tls.h for amd64 query cpuid and try to use fsbase? It would make it longer, but would permit avoiding a syscall in libc and libthr as well. It's not a requirement to make allocate_initial_tls MI, but it would be nice to do so. To be clear though, I would only suggest doing this as further followups, not as part of this series. (Including possibly changing _tcb_set for amd64 to use wrfsbase).
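
For reference, the cpuid query mentioned here could be sketched as below. FSGSBASE support is reported in CPUID leaf 7, subleaf 0, EBX bit 0; the `sketch_` names are hypothetical, and `__get_cpuid_count()` is the GCC/Clang <cpuid.h> helper rather than anything from the FreeBSD tree. CPU support alone is not sufficient: the kernel must also have enabled the instructions, which this sketch does not check, so it deliberately stops short of executing wrfsbase.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):EBX bit 0 reports the FSGSBASE instructions. */
#define SKETCH_CPUID_STDEXT_FSGSBASE	0x00000001u

static bool
sketch_cpu_has_fsgsbase(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Returns 0 when leaf 7 is not supported by this CPU. */
	if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) == 0)
		return (false);
	return ((ebx & SKETCH_CPUID_STDEXT_FSGSBASE) != 0);
#else
	/* Not an x86 CPU: there is no fsbase to write. */
	return (false);
#endif
}
```

A `_tcb_set()` built on such a check would still need a sysarch() fallback for CPUs or kernels without FSGSBASE.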

libexec/rtld-elf/amd64/reloc.c
525

libc and libthr can benefit from ifuncs, i.e. amd64_{s,g}et_fsbase are usable for libraries, but not for rtld. I believe this is the case, at least for dynamic binaries, right now.

I cannot say, without a lot of digging through the series of reviews, whether this is achievable with your restructuring.

This revision is now accepted and ready to land. Dec 9 2021, 1:05 AM
libexec/rtld-elf/amd64/reloc.c
525

Bah, I had misread the x86 cases and indeed I have regressed libc and libthr to stop using ifuncs for x86. I may have to revisit a bit what I put into <machine/tls.h> then.

libexec/rtld-elf/amd64/reloc.c
525

Upon further reflection, ifuncs for amd64_set_fsbase should still be working. The x86 tls.h uses i386_set_gsbase and amd64_set_fsbase, which are declared in <machine/sysarch.h>. We only use _tcb_set in libc and libthr for this. rtld still uses its inline version, which uses wrfsbase directly.

This revision was automatically updated to reflect the committed changes.