Page MenuHomeFreeBSD

Retrieve LOCALBASE with getlocalbase()
ClosedPublic

Authored by se on Nov 16 2020, 11:47 AM.
Tags
None
Referenced Files
F102368876: D27236.id79596.diff
Mon, Nov 11, 7:51 AM
Unknown Object (File)
Tue, Nov 5, 2:44 PM
Unknown Object (File)
Mon, Nov 4, 7:20 AM
Unknown Object (File)
Wed, Oct 30, 7:44 AM
Unknown Object (File)
Thu, Oct 24, 11:11 AM
Unknown Object (File)
Thu, Oct 17, 7:51 AM
Unknown Object (File)
Wed, Oct 16, 7:45 AM
Unknown Object (File)
Tue, Oct 15, 6:42 AM

Details

Summary

This is a different implementation of getlocalbase() than the one that has been reverted by Scott Long in SVN rev. r367711

Instead of a user provided buffer and calling conventions that require error checks, this version always returns a pointer to a valid string that is suitable as LOCALBASE path (but not validated).

Test Plan

Build libutil with these patches
Call getlocalbase() from a trivial test program that just prints the result returned, varying the available parameters (sysctl value, environment variable) and verify the result.
Build applications (nvmecontrol, mailwrapper, pkg) with patches that make them utilize the getlocalbase() function (provided in a follow-up review).

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

se requested review of this revision.Nov 16 2020, 11:47 AM
hselasky added inline comments.
lib/libutil/getlocalbase.c
43 ↗(On Diff #79596)

Doesn't this static steal memory from all linked programs?

The user.localbase sysctl seems like a poor design, and doesn't exist at the moment. That needs to be resolved before this change is committed.

lib/libutil/getlocalbase.c
31–34 ↗(On Diff #79596)

style(9):

If <sys/cdefs.h> is needed
for __FBSDID(), include it first.  If either <sys/types.h> or
<sys/param.h> is needed, include it before other include files.
(<sys/param.h> includes <sys/types.h>; do not include both.)

That is, delete the redundant sys/types include and move sys/param to the top.

43 ↗(On Diff #79596)

It might be more appropriate for the caller to provide a buffer to store the result.

bapt requested changes to this revision.Nov 16 2020, 5:30 PM
bapt added a subscriber: bapt.
bapt added inline comments.
lib/libutil/getlocalbase.c
48 ↗(On Diff #79596)

This will break the usage of many people and probably the main usage of getenv(LOCALBASE), one of the usage for instance is to build the ports tree locally as a user...

This revision now requires changes to proceed.Nov 16 2020, 5:30 PM
lib/libutil/getlocalbase.c
48 ↗(On Diff #79596)

my bad I mis read the function ;)

In D27236#608193, @cem wrote:

The user.localbase sysctl seems like a poor design, and doesn't exist at the moment. That needs to be resolved before this change is committed.

The advantage of the sysctl design (once you fix the not going to into the kernel for all user. sysctl) is (a) sysctl can get this value for shell scripts and (b) you can have different, per-jail values and (c) you can make it a tunable to avoid the recompile the world to change issue. the disadvantage is that we have to fix libc to not use compiled-in user.* values which effectively negates all the usual advantages of a sysctl. This could be avoided by selecting a different sysctl name space to use, though it's a poor fit to any of the others.

The advantage of getconf is that it doesn't require we fix the user namespace in sysctl and there's a program that can be used to retrieve it. However, sysconf() has to get the value from $SOMEWHERE and there's no way to make the value come from a config file to make it per-jail, which gets us back to the 'recompile the world' or at least a part thereof, to change it issue.

The advantage of just the env var is that it's simple and direct. However, setuid programs can't trust env values and there's no good central place to set it (though if there were, it would easily generalize into being jail friendly). This poor fit motivated the addition of sysctl in the first place.

lib/libutil/getlocalbase.c
43 ↗(On Diff #79596)

Yea,
char *getlocalbase(char *buffer, size_t buflen)
seems better and doesn't suffer the multi-threaded issues (though for this, there are really no MT issues).

But having that interface, though does require extra copies.

In D27236#608246, @imp wrote:
In D27236#608193, @cem wrote:

The user.localbase sysctl seems like a poor design, and doesn't exist at the moment. That needs to be resolved before this change is committed.

The user.localbase sysctl exists in -CURRENT since SVN rev. 367179.
It has not been merged to -STABLE, yet.

The advantage of the sysctl design (once you fix the not going to into the kernel for all user. sysctl) is (a) sysctl can get this value for shell scripts and (b) you can have different, per-jail values and (c) you can make it a tunable to avoid the recompile the world to change issue. the disadvantage is that we have to fix libc to not use compiled-in user.* values which effectively negates all the usual advantages of a sysctl. This could be avoided by selecting a different sysctl name space to use, though it's a poor fit to any of the others.

I have updated libc to still support the R/O user variables while supporting additional R/W variables like user.localbase in the kernel.
See SVN rev, 367243 for the patch that separates the libc-only and the (optionally) kernel provided variables (overriding a prior version that was not as easily extendable).

The advantage of getconf is that it doesn't require we fix the user namespace in sysctl and there's a program that can be used to retrieve it. However, sysconf() has to get the value from $SOMEWHERE and there's no way to make the value come from a config file to make it per-jail, which gets us back to the 'recompile the world' or at least a part thereof, to change it issue.

There is nothing to be fixed - it has been in -CURRENT for more than 2 weeks.

The advantage of just the env var is that it's simple and direct. However, setuid programs can't trust env values and there's no good central place to set it (though if there were, it would easily generalize into being jail friendly). This poor fit motivated the addition of sysctl in the first place.

The env variable LOCALBASE has been supported for this purpose by a (small) number of programs. The getlocalbase() function simplifies the access to this env variable, allows to set a system-wide (possibly jail-wide) override and returns a sane default to reduce the number of checks that need to be performed by the caller.

lib/libutil/getlocalbase.c
31–34 ↗(On Diff #79596)

Yes, I had to add sys/param.h for MAXPATHLEN and forgot to remove the include of sys/types.h at that time.

This will be fixed before the commit.

43 ↗(On Diff #79596)

Doesn't this static steal memory from all linked programs?

Yes, this is a static buffer that will be allocated (once) for programs that link against libutil.
It can easily be converted into a heap allocated variable, if the cost is considered too high.
The advantage of the static buffer is that this function is thread safe, as is, since even if entered multiple times, the result will be correct for each caller.
If a heap allocated variable is used, then there is a very short time window, where two threads could enter and allocate the storage, with one allocation being leaked.
There are ways around this, but I did not want to put the effort into code that currently is linked into 3 programs in base (but with more to come ...).

43 ↗(On Diff #79596)

Yea,
char *getlocalbase(char *buffer, size_t buflen)
seems better and doesn't suffer the multi-threaded issues (though for this, there are really no MT issues).

That would be a signature much like the one that Scott just reverted. It was much more complex, especially with regard to error handling, and required significantly more complex code in the caller.
One problem is, whether a shortened result or NULL is returned if the buffer length is insufficient, and each caller needs to check for errors.

The statically assigned buffer makes this program thread-safe since all threads see the same environment and the same sysctl values (as you already know).
Multiple invocations may cause overlapping calls to sysctl() - but since each invocation copies the same source value to the same destination, I'd expect that even this case is safe.

But having that interface, though does require extra copies.

This interface is an easy drop-in replacement for programs that used to query the environment (for LOCALBASE.
This code always returns a valid result (albeit possibly too long to be used as a path name, if the LOCALBASE env variable has such a long value - much like the getenv() calls it replaces in e.g. pkg or mailwrapper.

43 ↗(On Diff #79596)

Doesn't this static steal memory from all linked programs?

43 ↗(On Diff #79596)

It might be more appropriate for the caller to provide a buffer to store the result.

See the commit history - a version that did just that has been reverted.

48 ↗(On Diff #79596)

The issetugid check is required to prevent privilege escalations and is identical to e.g. the prevention of LD_PRELOAD being used for SUID binaries.

Updated diff with an alternate implementation that does not include a static buffer of length MAXPATHLEN.
(The actual memory requirements of that alternative implementation are larger than those of the version with the static buffer, though.)

I want to preserve the non-failing semantics of this functions, to make it safe to use without error checks in the caller.
Therefore, if the malloc() fails, the compiled in default path is returned.
An alternative could be to return NULL in that case (with errno being set to ENOMEM by the failed malloc call).

An attempt is made to prevent a memory leak in the case of a parallel execution of this function in multiple threads.
The version with the static buffer will have multiple sysctl() calls write the same data to the same buffer.
The version that allocates a buffer will in the worst case leak the amount of heap space allocated by one thread when another thread overwrites the localbase pointer.

I do prefer the version with the static buffer because of its simplicity. It adds 1 KB to the bss of the about 200 programs in base that are linked with libutil (but see below).
The version with the dynamically allocated buffer is more complex and thus adds to the code size, but that code is shared between all executing programs.

I do assume (without checking) that the bss of dynamic libraries is allocated independently for each library (not pooled with the bss of the program).
In that case, the static buffer version may need slightly less memory than the one with the dynamically allocated buffer, since either version will have a bss of size 8KB on amd64:

Version with static buffer:

$ size /usr/lib/libutil.so 
   text	   data	    bss	    dec	    hex	filename
  72986	   3120	   7952	  84058	  1485a	/usr/lib/libutil.so

Version with malloc():

$ size /usr/lib/libutil.so 
   text	   data	    bss	    dec	    hex	filename
  73082	   3120	   6944	  83146	  144ca	/usr/lib/libutil.so
se marked 3 inline comments as done.Nov 17 2020, 9:31 AM
lib/libutil/getlocalbase.c
46 ↗(On Diff #79641)

This should be "static const"

74 ↗(On Diff #79641)

"static const" here.

90 ↗(On Diff #79641)

One simple way to avoid issue with races, is to use the constructor keyword on a separate function, that only initializes "localbase". Then localbase only needs to return a pointer.

Hans Petter Selasky suggested the use of a constructor, something that I had not thought about at all ...

This makes us depend on a compiler that supports __attribute ((constructor)), but that has been in GCC for more than a decade and is provided in a compatible way in CLANG, too.

The resulting code is quite simple and brings does not require allocation of a static buffer:

$ size /usr/lib/libutil.so 
   text	   data	    bss	    dec	    hex	filename
  73122	   3128	   6928	  83178	  144ea	/usr/lib/libutil.so

If malloc() fails, the compiled in default value is returned, which should be a safe default even on systems that have a different LOCALBASE (since non-privileged users should not be able to create files in /usr/local).

se marked 4 inline comments as done.Nov 17 2020, 1:30 PM

Looks good to me. I hope it does not affect the performance of any apps, that are launched frequently.

Looks good to me. I hope it does not affect the performance of any apps, that are launched frequently.

Yes, I had thought about this but then forgotten to mention it in the comment to that version. This is the only drawback of this approach ...
The getenv() call should be cheap, but the 2 sysctl() invocations call into the kernel.
But compared to all the other start-up overhead I do not think that the effect is relevant in practice.

But you cannot have both, an on-demand invocation and guaranteed single invocation in a threaded process.
An alternative would be to use atomics or locking in an on-demand invocation, but I'd rather not add that complexity.

If you ktrace a regular application starting up you already see a lot of system calls, so I think adding one more is fine.

MIght be we also need a destructor to free the memory, if any, to make valgrind happy.

If you ktrace a regular application starting up you already see a lot of system calls, so I think adding one more is fine.

Except there's many efforts to reduce this. At present, this function is called by a tiny number of binaries, so I'd rather have any overhead be limited to those programs that use it.

honestly, I think we're through the looking glass with this discussion...

MIght be we also need a destructor to free the memory, if any, to make valgrind happy.

I could add a destructor, but this would complicate the code just to make a test tool happy ...

I'd need 2 temporary pointers then, one for getenv() - not to be freed - and one for sysctl.
I could assign NULL to tmppath after assigning to localbase in the getenv case, but this might be hard to understand (and somebody might later think that this assignment was not required, remove it, and make the getenv case crash in the destructor when free is applied to LOCALBASE in the environment).

If there is strong demand, I'll add the destructor.
But I really think an assumed valgrind user ought to be able to understand that this left-over variable is benign ...
(And anyway: It does only affect valgrind runs with non-default user.localbase setting in the kernel ...)

In D27236#608678, @imp wrote:

If you ktrace a regular application starting up you already see a lot of system calls, so I think adding one more is fine.

Except there's many efforts to reduce this. At present, this function is called by a tiny number of binaries, so I'd rather have any overhead be limited to those programs that use it.

+1

We don't want to saddle every program that links libutil with a constructor like this. Instead, construct on demand. This could be relatively trivial: drop the ctor attribute, and instead:

getlocalbase(void)
{
    static bool inited = false;
    if (!inited) {
        initlocalbase();
        inited = true;
    }
    return (foo);
}

That said, I still think this design is broken and a bad idea.

In D27236#608681, @cem wrote:
In D27236#608678, @imp wrote:

If you ktrace a regular application starting up you already see a lot of system calls, so I think adding one more is fine.

Except there's many efforts to reduce this. At present, this function is called by a tiny number of binaries, so I'd rather have any overhead be limited to those programs that use it.

+1

We don't want to saddle every program that links libutil with a constructor like this. Instead, construct on demand. This could be relatively trivial: drop the ctor attribute, and instead:

getlocalbase(void)
{
    static bool inited = false;
    if (!inited) {
        initlocalbase();
        inited = true;
    }
    return (foo);
}

That said, I still think this design is broken and a bad idea.

There's no benefit from optimizing this at all. None. This is called once, maybe twice in a few programs. The added complexity isn't worth the maint hassles, nor the added risk from there being an 'oops' in the code leading to some security hole.

In D27236#608678, @imp wrote:

If you ktrace a regular application starting up you already see a lot of system calls, so I think adding one more is fine.

Except there's many efforts to reduce this. At present, this function is called by a tiny number of binaries, so I'd rather have any overhead be limited to those programs that use it.

This affects only programs that link against libutil, some 200 of 1000 programs in base.

honestly, I think we're through the looking glass with this discussion...

There is either a statically allocated 1KB buffer or 2 system calls at start-up, or we go back to the 2nd version I suggested (with malloc, but before the one with the constructor).

We could easily change the implementation (given the call signature is kept) without problem for programs converted to use getlocalbase().
But I want to get us started using that feature ...

In D27236#608681, @cem wrote:
In D27236#608678, @imp wrote:

If you ktrace a regular application starting up you already see a lot of system calls, so I think adding one more is fine.

Except there's many efforts to reduce this. At present, this function is called by a tiny number of binaries, so I'd rather have any overhead be limited to those programs that use it.

+1

We don't want to saddle every program that links libutil with a constructor like this. Instead, construct on demand. This could be relatively trivial: drop the ctor attribute, and instead:

I had suggested 2 versions that did not rely on a constructor, before.

getlocalbase(void)
{
    static bool inited = false;
    if (!inited) {
        initlocalbase();
        inited = true;
    }
    return (foo);
}

This is similar to my 2nd suggested version.
It is not multi-threading safe (think multiple threads started on different cores at exactly the same time).
But the effect of multiple simultaneous invocations can be limited to a small memory leak, or atomics or locks could be used for a "safe" implementation.
Reducing the window was possible by setting inited to true before calling initlocalbase(), but my prior implementation was simpler.

That said, I still think this design is broken and a bad idea.

Why is it a bad idea?

A version that used a caller-provided buffer has been reverted after a number of iterations to get its details right.
It allowed to strcat to the end of the string returned by getlocalbase(), but only after checking for errors.

This version is trivial to use in programs that already use getenv("LOCALBASE") to retrieve a user-supplied path, and this is a common pattern, see D27237 for the required patches and compare with the reverted once that used the prior function signature of the reverted implementation.

Since neither the statically allocated 1024 char buffer nor the version with the constructor has made the reviewers happy, I'm going back to the 2nd proposed version, which does only perform system calls when this function is actually called and allocates a minimally sized buffer only when necessary to hold the sysctl result.

Overlapping invocation in multi-threaded programs is possible and will return a correct result, but might leak the heap memory allocated in one of the threads, in the worst case (and only on systems that have a non-default user.localbase sysctl value).

Calling this function from within threads is not really useful (the initialization should happen in the main thread) and not relevant for any of the programs that have been prepared to call this function.

If there are no strong objections, I'm going to commit this version and its invocations as offered for review in D27237.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 18 2020, 7:44 PM
This revision was automatically updated to reflect the committed changes.