Paths

Table of Contentst

Differential D35537

maps
ClosedPublic
Actions

Authored by dchagin on Jun 21 2022, 1:07 PM.

Details

Reviewers

kib

Group Reviewers

Linux Emulation

Commits

rGb1d0fe755bb1: linprocfs: Skip printing of the guard page in the /proc/self/maps
rGef1976ccf542: linprocfs: Skip printing of the guard page in the /proc/self/maps

Summary

To calculate the base (lowest addressable) address of the stack of the
initial thread glibc parses /proc/self/maps. In fact, the base address is
calculated as 'to' value of stack entry of /proc/self/maps - stack size
limit (if the stack grows down). The base address should fit in between
preceding entry and stack entry of /proc/self/maps.
In FreeBSD, since 19bd0d9 (Implement address space guards), we actually
have two mappings for the stack region. The first one is the no-access
mapping for the region the stack can grow into (guard page), and the
second - initial stack region with size sgrowsiz.
The first mapping confuses Glibc, in the end which is improperly
calculate stack size and the base address.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 46049
Build 42938: arc lint + arc unit

Event Timeline

dchagin created this revision.Jun 21 2022, 1:07 PM

Herald added a subscriber: imp. · View Herald TranscriptJun 21 2022, 1:07 PM

dchagin requested review of this revision.Jun 21 2022, 1:07 PM

Harbormaster completed remote builds in B46049: Diff 107205.Jun 21 2022, 1:08 PM

dchagin added reviewers: kib, Linux Emulation.Jun 21 2022, 1:09 PM

dchagin changed the repository for this revision from rS FreeBSD src repository - subversion to rG FreeBSD src repository.

It looks like libc wants to find the mapping containing __libc_stack_end

if (sscanf (line, "%" SCNxPTR "-%" SCNxPTR, &from, &to) != 2)
  continue;
if (from <= (uintptr_t) __libc_stack_end
    && (uintptr_t) __libc_stack_end < to)
  {
    /* Found the entry.  Now we have the info we need.  */
    iattr->stackaddr = stack_end;
    iattr->stacksize =
      rl.rlim_cur - (size_t) (to - (uintptr_t) stack_end);

so I am curious why this does not already find only the expected entry?

Oh I see, it keeps track of last_to.

I wonder if instead of just omitting the MAP_ENTRY_GUARD we should keep track of its start/end addresses, and just add it to the subsequent stack region?

Hm, perhaps do it for Linux ABI process only? And please add some comment explaining the reason.

In D35537#806005, @kib wrote:

Hm, perhaps do it for Linux ABI process only? And please add some comment explaining the reason.

Sorry, I did not noticed that this is linprocfs, not procfs. So please add a comment.

sys/compat/linprocfs/linprocfs.c
1313	if ((entry->eflags & (MAP_ENTRY_IS_SUBMAP \| MAP_ENTRY_GUARD)) != 0) continue;

In D35537#806003, @emaste wrote:

Oh I see, it keeps track of last_to.

I wonder if instead of just omitting the MAP_ENTRY_GUARD we should keep track of its start/end addresses, and just add it to the subsequent stack region?

well, in that case we need traverse the maps in the back order, then calculate 'to' of guard page as stack entry 'to' - rlim_cur. I think it's too excessive

fixed, thank you

Harbormaster completed remote builds in B46050: Diff 107211.Jun 21 2022, 3:30 PM

In D35537#806012, @dchagin wrote:

well, in that case we need traverse the maps in the back order, then calculate 'to' of guard page as stack entry 'to' - rlim_cur. I think it's too excessive

I don't think we need to do it in reverse order, if the previous entry was MAP_ENTRY_GUARD and its end is equal to our start just use the previous start instead of ours. Anyway I think this is fine with the comment there and I'm not sure how Linux handles/reports stack guards in procfs.

In D35537#806020, @emaste wrote:

In D35537#806012, @dchagin wrote:

well, in that case we need traverse the maps in the back order, then calculate 'to' of guard page as stack entry 'to' - rlim_cur. I think it's too excessive

I don't think we need to do it in reverse order, if the previous entry was MAP_ENTRY_GUARD and its end is equal to our start just use the previous start instead of ours. Anyway I think this is fine with the comment there and I'm not sure how Linux handles/reports stack guards in procfs.

ah, but
#if _STACK_GROWS_DOWN

		      /* The limit might be too high.  */
		      if ((size_t) iattr->stacksize
			  > (size_t) iattr->stackaddr - last_to)
			iattr->stacksize = (size_t) iattr->stackaddr - last_to;

#else

breaks this logic, the previous entry should be corrected for last_to

kib accepted this revision.Jun 21 2022, 5:37 PM

This revision is now accepted and ready to land.Jun 21 2022, 5:37 PM

I mean still omit the MAP_ENTRY_GUARD as you have, just include its range in the subsequent stack entry - instead of (what we have right now)

...
0000000801600000-0000000801e00000 rw-p 00000000 00:00 0 
00007fffdffff000-00007ffffffdf000 ---p 00000000 00:00 0 
00007ffffffdf000-00007ffffffff000 rw-p 00000000 00:00 0 
00007ffffffff000-0000800000000000 r-xs 00000000 00:00 0

report it as

...
0000000801600000-0000000801e00000 rw-p 00000000 00:00 0 
00007fffdffff000-00007ffffffff000 rw-p 00000000 00:00 0 
00007ffffffff000-0000800000000000 r-xs 00000000 00:00 0

This is assuming we could do that only for the guard associated with a MAP_STACK region - we wouldn't want to fake this for an explicitly requested MAP_GUARD.

In any case I think the change as in review now is fine, I was just thinking about hypothetical Linux applications trying to determine their stack mapping -- in effect the whole guard+stack is the "available" stack region. We can always revisit this (including checking actual Linux behaviour) if it turns out a real Linux application cares.

In D35537#806030, @emaste wrote:

This is assuming we could do that only for the guard associated with a MAP_STACK region - we wouldn't want to fake this for an explicitly requested MAP_GUARD.

In any case I think the change as in review now is fine, I was just thinking about hypothetical Linux applications trying to determine their stack mapping -- in effect the whole guard+stack is the "available" stack region. We can always revisit this (including checking actual Linux behaviour) if it turns out a real Linux application cares.

yep, I got it. I put it on my watch-list.

I see another problem related to the stack.
There is a DSO with a GNU-stack note set for executable stack.
It is loaded, stack marked as executable, method from the DSO execute callback on stack.
Then app put big data on the stack, new non-executable stack mapping is created.
And if app call this DSO method with callback on the stack, it is aborted.