Page MenuHomeFreeBSD

nfs, rpc: Ensure kernel credentials have at least one group
ClosedPublic

Authored by olce on Oct 4 2024, 8:10 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Nov 2, 11:51 PM
Unknown Object (File)
Thu, Oct 31, 8:59 AM
Unknown Object (File)
Thu, Oct 10, 1:44 AM
Unknown Object (File)
Wed, Oct 9, 4:32 AM
Unknown Object (File)
Oct 7 2024, 9:01 PM
Unknown Object (File)
Oct 5 2024, 6:50 AM
Subscribers

Details

Summary

This fixes several bugs where some 'struct ucred' in the kernel,
constructed from user input (via nmount(2)) or obtained from other
servers (e.g., gssd(8)), could have an unfilled 'cr_groups' field and
whose 'cr_groups[0]' (or 'cr_gid', which is an alias) was later
accessed, causing an uninitialized access giving random access rights.

Use crsetgroups_empty_fallback() to enforce a fallback group when
possible. For NFS, the chosen fallback group is that of the NFS server
in the current VNET (NFSD_VNET(nfsrv_defaultgid)).

There does not seem to be any sensible fallback available in rpc code
(sys/rpc/svc_auth.c, svc_getcred()) on AUTH_UNIX (TLS or not), so just
fail credential retrieval there. Stock NSS sources, rpc.tlsservd(8) or
rpc.tlsclntd(8) provide non-empty group lists, so will not be impacted.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

olce requested review of this revision.Oct 4 2024, 8:10 AM

You claim there are bugs that result in no groups
(cr_ngroups == 0), but you do not show a specific
test case that finds this or describe exactly how it
happens.
--> If there is such a case (I am not aware of one),

we should try to isolate/fix that such that there
is always at least one group in the credentials.

(There is always at least one group in the credentials
that come on on-the-wire, since there is a separate gid
in the Sun RPC AUTH_UNIX authenticator from the
list of additional groups.)

For the userpsace upcalls, the credentials are generated
from getpw{uid or name}() and there is always a pw_gid
in that passwd file entry, so the daemons should never
create a credential without at least one gid.
--> The exception might be mountd when the credentials

are listed as numbers for -maproot/-mapall.
We can check the code for that specific case,
although I doubt it is broken.

When/how do you detect the case where there is no
groups?

sys/rpc/svc_auth.c
206

This can never happen unless there is some sort of kernel
memory corruption.
If you look in _svcauth_unix() in sys/rpc/svc_auth_unix.c,
you will see that cr_ngroups is never set to less than 1.

That also implies that there will never be a case where
there are no gids.

If you are convinced that you have seen this, you can
add a KASSERT() and try and figure out how it can be
triggerred.

You claim there are bugs that result in no groups
(cr_ngroups == 0), but you do not show a specific
test case that finds this or describe exactly how it
happens.

See, e.g., current processing of a list uid:gid:.. when just uid: is passed (no GID, but a colon at the end), and the below patch. This possibility is incidentally documented as the way to specify no groups in the exports(5) man page (under -maproot).

All other cases I have seen are of insufficient validation of userland-provided input to the kernel, namely by gssd and rpc.tlsservd, so aren't bugs in these daemons, but rather kernel security bugs.

--> If there is such a case (I am not aware of one),

we should try to isolate/fix that such that there
is always at least one group in the credentials.

I have to disagree. We are talking here about the userland <-> kernel interface. Even if you're right that, normally:

(There is always at least one group in the credentials
that come on on-the-wire, since there is a separate gid
in the Sun RPC AUTH_UNIX authenticator from the
list of additional groups.)

(having learnt this the hard way, by spending between 1 and 2 days full-time diving into and analyzing (part of) RPC/NFS code related to authentication),
this is not a reason not to validate inputs in the kernel.

On the contrary, from the kernel side, we can never know if:

  • Some userland daemon has been compromised, and to which extent.
  • Whether the base-system-provided daemons have been entirely replaced, on purpose or maliciously.

Additionally, NFS in the kernel is quite a big pile of code, with lots of ramifications, special cases and complex call stacks. Relying on it working perfectly doesn't seem to be an appropriate strategy security-wise, I'd rather use a defensive approach, especially when it comes to invariants that are vital to correct functioning of other parts of the kernel.

For the userpsace upcalls, the credentials are generated
from getpw{uid or name}() and there is always a pw_gid
in that passwd file entry, so the daemons should never
create a credential without at least one gid.

Yes, they should. Provided they can't be forced to misbehave or are not replaced entirely with other implementations. I don't think it's a risk we should take in the kernel, however small.

--> The exception might be mountd when the credentials

are listed as numbers for -maproot/-mapall.
We can check the code for that specific case,
although I doubt it is broken.

Unless I'm mistaken, it is. I first produced a two-line patch to mountd.c to fix it:

    mountd(8): parsecred(): Credentials must always have one group
    
    Make sure that this is the case even when '<uid>:' (with no groups after
    the ':') is passed as an argument to '-maproot' or '-mapall'.

diff --git a/usr.sbin/mountd/mountd.c b/usr.sbin/mountd/mountd.c
index 81b038d18b56..eacb8146f467 100644
--- a/usr.sbin/mountd/mountd.c
+++ b/usr.sbin/mountd/mountd.c
@@ -3694,6 +3694,13 @@ parsecred(char *namelist, struct expcred *cr)
                }
                groups[cr->cr_ngroups++] = group;
        }
+       if (cr->cr_ngroups == 0)
+               /*
+                * 'groups' above was initialized with GID_NOGROUP in
+                * 'groups[0]'.  If no group was explicitly configured, we'll
+                * copy that one.  The kernel expects at least an effective GID.
+                */
+               cr->cr_ngroups = 1;
        if (cr->cr_ngroups > SMALLNGROUPS)
                cr->cr_groups = malloc(cr->cr_ngroups * sizeof(gid_t));
        memcpy(cr->cr_groups, groups, cr->cr_ngroups * sizeof(gid_t));

But then, for the reasons exposed above and just below, I considered that fix inappropriate and did not upload it.

Another, more specific reason is: Why does nmount() should decide which fallback group is supposed to represent an absence of group (given that, in the kernel, no group isn't a possibility in struct ucred), when there is one kernel-only NFS setting exactly meant to control this information (AFAICT)? For consistency, I think it's natural that this setting should apply, shouldn't it?

When/how do you detect the case where there is no groups?

In this series of revision, I'm adding assertions to the internal groups machinery precisely to prevent access to cr_gid (alias cr_groups[0]) without it having been initialized beforehand. Is that what you are asking?

Such uninitialized accesses (but no faults, as there is always some backing storage, cr_smallgroups) could, more often than not, lead to consider that wheel is associated to the credentials as the effective GID. This situation is a significant security risk, and should be prevented.

You claim there are bugs that result in no groups
(cr_ngroups == 0), but you do not show a specific
test case that finds this or describe exactly how it
happens.

See, e.g., current processing of a list uid:gid:.. when just uid: is passed (no GID, but a colon at the end), and the below patch. This possibility is incidentally documented as the way to specify no groups in the exports(5) man page (under -maproot).

Yuck! I never noticed this in the man page (I'm not going to look and
see who the author was, it might have been me long ago;-).

So, it says that an exports mapping credential can have no groups.

The easiest fix might be to just add a couple of lines at the beginning of grouplist()
retuning failed when cr_ngroups == 0.

I think you need to ask the "collective" (on a mailing list, such as freebsd-stable@
or freebsd-current@) what should be done about this.
I.e:
--> Fix grouplist() and any others to handle the case of cr_ngroups == 0 correctly.
OR
--> Fix the exports(5) man page and do not allow the case of "no groups".

rick

All other cases I have seen are of insufficient validation of userland-provided input to the kernel, namely by gssd and rpc.tlsservd, so aren't bugs in these daemons, but rather kernel security bugs.

--> If there is such a case (I am not aware of one),

we should try to isolate/fix that such that there
is always at least one group in the credentials.

I have to disagree. We are talking here about the userland <-> kernel interface. Even if you're right that, normally:

(There is always at least one group in the credentials
that come on on-the-wire, since there is a separate gid
in the Sun RPC AUTH_UNIX authenticator from the
list of additional groups.)

(having learnt this the hard way, by spending between 1 and 2 days full-time diving into and analyzing (part of) RPC/NFS code related to authentication),
this is not a reason not to validate inputs in the kernel.

On the contrary, from the kernel side, we can never know if:

  • Some userland daemon has been compromised, and to which extent.
  • Whether the base-system-provided daemons have been entirely replaced, on purpose or maliciously.

Additionally, NFS in the kernel is quite a big pile of code, with lots of ramifications, special cases and complex call stacks. Relying on it working perfectly doesn't seem to be an appropriate strategy security-wise, I'd rather use a defensive approach, especially when it comes to invariants that are vital to correct functioning of other parts of the kernel.

For the userpsace upcalls, the credentials are generated
from getpw{uid or name}() and there is always a pw_gid
in that passwd file entry, so the daemons should never
create a credential without at least one gid.

Yes, they should. Provided they can't be forced to misbehave or are not replaced entirely with other implementations. I don't think it's a risk we should take in the kernel, however small.

--> The exception might be mountd when the credentials

are listed as numbers for -maproot/-mapall.
We can check the code for that specific case,
although I doubt it is broken.

Unless I'm mistaken, it is. I first produced a two-line patch to mountd.c to fix it:

    mountd(8): parsecred(): Credentials must always have one group
    
    Make sure that this is the case even when '<uid>:' (with no groups after
    the ':') is passed as an argument to '-maproot' or '-mapall'.

diff --git a/usr.sbin/mountd/mountd.c b/usr.sbin/mountd/mountd.c
index 81b038d18b56..eacb8146f467 100644
--- a/usr.sbin/mountd/mountd.c
+++ b/usr.sbin/mountd/mountd.c
@@ -3694,6 +3694,13 @@ parsecred(char *namelist, struct expcred *cr)
                }
                groups[cr->cr_ngroups++] = group;
        }
+       if (cr->cr_ngroups == 0)
+               /*
+                * 'groups' above was initialized with GID_NOGROUP in
+                * 'groups[0]'.  If no group was explicitly configured, we'll
+                * copy that one.  The kernel expects at least an effective GID.
+                */
+               cr->cr_ngroups = 1;
        if (cr->cr_ngroups > SMALLNGROUPS)
                cr->cr_groups = malloc(cr->cr_ngroups * sizeof(gid_t));
        memcpy(cr->cr_groups, groups, cr->cr_ngroups * sizeof(gid_t));

But then, for the reasons exposed above and just below, I considered that fix inappropriate and did not upload it.

Another, more specific reason is: Why does nmount() should decide which fallback group is supposed to represent an absence of group (given that, in the kernel, no group isn't a possibility in struct ucred), when there is one kernel-only NFS setting exactly meant to control this information (AFAICT)? For consistency, I think it's natural that this setting should apply, shouldn't it?

When/how do you detect the case where there is no groups?

In this series of revision, I'm adding assertions to the internal groups machinery precisely to prevent access to cr_gid (alias cr_groups[0]) without it having been initialized beforehand. Is that what you are asking?

Such uninitialized accesses (but no faults, as there is always some backing storage, cr_smallgroups) could, more often than not, lead to consider that wheel is associated to the credentials as the effective GID. This situation is a significant security risk, and should be prevented.

You claim there are bugs that result in no groups
(cr_ngroups == 0), but you do not show a specific
test case that finds this or describe exactly how it
happens.

See, e.g., current processing of a list uid:gid:.. when just uid: is passed (no GID, but a colon at the end), and the below patch. This possibility is incidentally documented as the way to specify no groups in the exports(5) man page (under -maproot).

Yuck! I never noticed this in the man page (I'm not going to look and
see who the author was, it might have been me long ago;-).

So, it says that an exports mapping credential can have no groups.

The easiest fix might be to just add a couple of lines at the beginning of grouplist()
retuning failed when cr_ngroups == 0.

I think you need to ask the "collective" (on a mailing list, such as freebsd-stable@
or freebsd-current@) what should be done about this.
I.e:
--> Fix grouplist() and any others to handle the case of cr_ngroups == 0 correctly.
OR
--> Fix the exports(5) man page and do not allow the case of "no groups".

Oops, when I said grouplist() above, I meant groupmember(), which I
think is the only function that checks to see if a credential has a group
that matches the group ownership of the file (or the ACE in the ACL for
the file).

rick

All other cases I have seen are of insufficient validation of userland-provided input to the kernel, namely by gssd and rpc.tlsservd, so aren't bugs in these daemons, but rather kernel security bugs.

--> If there is such a case (I am not aware of one),

we should try to isolate/fix that such that there
is always at least one group in the credentials.

I have to disagree. We are talking here about the userland <-> kernel interface. Even if you're right that, normally:

(There is always at least one group in the credentials
that come on on-the-wire, since there is a separate gid
in the Sun RPC AUTH_UNIX authenticator from the
list of additional groups.)

(having learnt this the hard way, by spending between 1 and 2 days full-time diving into and analyzing (part of) RPC/NFS code related to authentication),
this is not a reason not to validate inputs in the kernel.

On the contrary, from the kernel side, we can never know if:

  • Some userland daemon has been compromised, and to which extent.
  • Whether the base-system-provided daemons have been entirely replaced, on purpose or maliciously.

Additionally, NFS in the kernel is quite a big pile of code, with lots of ramifications, special cases and complex call stacks. Relying on it working perfectly doesn't seem to be an appropriate strategy security-wise, I'd rather use a defensive approach, especially when it comes to invariants that are vital to correct functioning of other parts of the kernel.

For the userpsace upcalls, the credentials are generated
from getpw{uid or name}() and there is always a pw_gid
in that passwd file entry, so the daemons should never
create a credential without at least one gid.

Yes, they should. Provided they can't be forced to misbehave or are not replaced entirely with other implementations. I don't think it's a risk we should take in the kernel, however small.

--> The exception might be mountd when the credentials

are listed as numbers for -maproot/-mapall.
We can check the code for that specific case,
although I doubt it is broken.

Unless I'm mistaken, it is. I first produced a two-line patch to mountd.c to fix it:

    mountd(8): parsecred(): Credentials must always have one group
    
    Make sure that this is the case even when '<uid>:' (with no groups after
    the ':') is passed as an argument to '-maproot' or '-mapall'.

diff --git a/usr.sbin/mountd/mountd.c b/usr.sbin/mountd/mountd.c
index 81b038d18b56..eacb8146f467 100644
--- a/usr.sbin/mountd/mountd.c
+++ b/usr.sbin/mountd/mountd.c
@@ -3694,6 +3694,13 @@ parsecred(char *namelist, struct expcred *cr)
                }
                groups[cr->cr_ngroups++] = group;
        }
+       if (cr->cr_ngroups == 0)
+               /*
+                * 'groups' above was initialized with GID_NOGROUP in
+                * 'groups[0]'.  If no group was explicitly configured, we'll
+                * copy that one.  The kernel expects at least an effective GID.
+                */
+               cr->cr_ngroups = 1;
        if (cr->cr_ngroups > SMALLNGROUPS)
                cr->cr_groups = malloc(cr->cr_ngroups * sizeof(gid_t));
        memcpy(cr->cr_groups, groups, cr->cr_ngroups * sizeof(gid_t));

But then, for the reasons exposed above and just below, I considered that fix inappropriate and did not upload it.

Another, more specific reason is: Why does nmount() should decide which fallback group is supposed to represent an absence of group (given that, in the kernel, no group isn't a possibility in struct ucred), when there is one kernel-only NFS setting exactly meant to control this information (AFAICT)? For consistency, I think it's natural that this setting should apply, shouldn't it?

When/how do you detect the case where there is no groups?

In this series of revision, I'm adding assertions to the internal groups machinery precisely to prevent access to cr_gid (alias cr_groups[0]) without it having been initialized beforehand. Is that what you are asking?

Such uninitialized accesses (but no faults, as there is always some backing storage, cr_smallgroups) could, more often than not, lead to consider that wheel is associated to the credentials as the effective GID. This situation is a significant security risk, and should be prevented.

Yuck! I never noticed this in the man page (I'm not going to look and
see who the author was, it might have been me long ago;-).

So, it says that an exports mapping credential can have no groups.

The easiest fix might be to just add a couple of lines at the beginning of grouplist()
retuning failed when cr_ngroups == 0.

I think you need to ask the "collective" (on a mailing list, such as freebsd-stable@
or freebsd-current@) what should be done about this.
I.e:
--> Fix grouplist() and any others to handle the case of cr_ngroups == 0 correctly.
OR
--> Fix the exports(5) man page and do not allow the case of "no groups".

Oops, when I said grouplist() above, I meant groupmember(), which I
think is the only function that checks to see if a credential has a group
that matches the group ownership of the file (or the ACE in the ACL for
the file).

groupmember() (and realgroupmember() I added a while ago) are just the tip of the iceberg. The harder problem is that cr_groups[0] (or its alias, cr_gid) is accessed in multiple places without any checks. And even adding checks can only go so far, i.e., will cause crash (or malfunction in production if checks are under INVARIANTS) only in very specific cases that may be hard to trigger. There are simply places where not having a GID is not a possibility, especially for process credentials. struct ucred is arguably used for other purposes that may not formally require a GID (and other stuff like MAC labels), which would plead in favor of having other structures/simplified access check functions for other cases, which you had to do for the NFS interfaces. I doubt that trying to remove the one-group minimum (i.e., compulsory effective GID) requirement is worth it without also doing the latter part. Just moving the required one minimum slot from cr_groups to a really separate cr_gid field alone is probably not worth the trouble. In any case, I don't have time to devote to that for now.

I think a much less time-consuming alternative would be to sanctify the use of some special GID value as effectively marking no group membership, meaning, e.g., access control functions deny access to files having nogroup as their group, even if the passed credentials have nogroup in their GID set, i.e., treat it really specifically in the kernel.

Yuck! I never noticed this in the man page (I'm not going to look and
see who the author was, it might have been me long ago;-).

So, it says that an exports mapping credential can have no groups.

The easiest fix might be to just add a couple of lines at the beginning of grouplist()
retuning failed when cr_ngroups == 0.

I think you need to ask the "collective" (on a mailing list, such as freebsd-stable@
or freebsd-current@) what should be done about this.
I.e:
--> Fix grouplist() and any others to handle the case of cr_ngroups == 0 correctly.
OR
--> Fix the exports(5) man page and do not allow the case of "no groups".

Oops, when I said grouplist() above, I meant groupmember(), which I
think is the only function that checks to see if a credential has a group
that matches the group ownership of the file (or the ACE in the ACL for
the file).

groupmember() (and realgroupmember() I added a while ago) are just the tip of the iceberg. The harder problem is that cr_groups[0] (or its alias, cr_gid) is accessed in multiple places without any checks. And even adding checks can only go so far, i.e., will cause crash (or malfunction in production if checks are under INVARIANTS) only in very specific cases that may be hard to trigger. There are simply places where not having a GID is not a possibility, especially for process credentials. struct ucred is arguably used for other purposes that may not formally require a GID (and other stuff like MAC labels), which would plead in favor of having other structures/simplified access check functions for other cases, which you had to do for the NFS interfaces. I doubt that trying to remove the one-group minimum (i.e., compulsory effective GID) requirement is worth it without also doing the latter part. Just moving the required one minimum slot from cr_groups to a really separate cr_gid field alone is probably not worth the trouble. In any case, I don't have time to devote to that for now.

Well, the NFS server does not use these mapped exported credentials in many ways.
As far as I can see at a quick look, groupmember(), which is called from VOP_ACCESS()/VOP_ACCESSX()
is about it.
Since the exports(5) man page suggests that this is a valid case, then I think the code should be fixed
to handle cr_ngroups == 0 for this specific case (not all uses of cr_ngroups).
Note that this appears to have been broken by commit 7f92e57 on Jun 20, 2009.
(Prior to that, groupmember would always return failure for cr_ngroups == 0.)

I think the first step is to fix groupmember() by adding something like:

/*
 * The NFS server can use a credential with cr_ngroups == 0 when using a mapped
 * exported credential.
 */
if (cred->cr_ngroups == 0)
    return (false);

at the beginning of groupmember().
Since the code in vfs_domount_update() sets export.ex_groups = NULL, it should
be easy to find any other places that need to be fixed by running some testing.
(I will start doing that to-day.)

Making the credentials into a grouplist of one element set to GID_NOGROUP is
not the best plan, since it is possible to have a file with a gid ownership of GID_NOGROUP.
(It can happen frequently when NFSv4 is misconfigured, so the Owner_group translates
to GID_NOGROUP, since it cannot do the correct translation.)

The above change is obviously safe and can be MFC'd.

After doing this, you can ask a mailing list if the "collective" thinks supporting the
case of a 0 groups exported mapped credential makes sense.
If the "collective" opinion is no, then you can proceed to change the code
appropriately. (Fix mountd(8) and exports(5) to not allow that case.)

I think a much less time-consuming alternative would be to sanctify the use of some special GID value as effectively marking no group membership, meaning, e.g., access control functions deny access to files having nogroup as their group, even if the passed credentials have nogroup in their GID set, i.e., treat it really specifically in the kernel.

Yuck! I never noticed this in the man page (I'm not going to look and
see who the author was, it might have been me long ago;-).

So, it says that an exports mapping credential can have no groups.

The easiest fix might be to just add a couple of lines at the beginning of grouplist()
retuning failed when cr_ngroups == 0.

I think you need to ask the "collective" (on a mailing list, such as freebsd-stable@
or freebsd-current@) what should be done about this.
I.e:
--> Fix grouplist() and any others to handle the case of cr_ngroups == 0 correctly.
OR
--> Fix the exports(5) man page and do not allow the case of "no groups".

Oops, when I said grouplist() above, I meant groupmember(), which I
think is the only function that checks to see if a credential has a group
that matches the group ownership of the file (or the ACE in the ACL for
the file).

groupmember() (and realgroupmember() I added a while ago) are just the tip of the iceberg. The harder problem is that cr_groups[0] (or its alias, cr_gid) is accessed in multiple places without any checks. And even adding checks can only go so far, i.e., will cause crash (or malfunction in production if checks are under INVARIANTS) only in very specific cases that may be hard to trigger. There are simply places where not having a GID is not a possibility, especially for process credentials. struct ucred is arguably used for other purposes that may not formally require a GID (and other stuff like MAC labels), which would plead in favor of having other structures/simplified access check functions for other cases, which you had to do for the NFS interfaces. I doubt that trying to remove the one-group minimum (i.e., compulsory effective GID) requirement is worth it without also doing the latter part. Just moving the required one minimum slot from cr_groups to a really separate cr_gid field alone is probably not worth the trouble. In any case, I don't have time to devote to that for now.

Well, the NFS server does not use these mapped exported credentials in many ways.
As far as I can see at a quick look, groupmember(), which is called from VOP_ACCESS()/VOP_ACCESSX()
is about it.
Since the exports(5) man page suggests that this is a valid case, then I think the code should be fixed
to handle cr_ngroups == 0 for this specific case (not all uses of cr_ngroups).
Note that this appears to have been broken by commit 7f92e57 on Jun 20, 2009.
(Prior to that, groupmember would always return failure for cr_ngroups == 0.)

I think the first step is to fix groupmember() by adding something like:

/*
 * The NFS server can use a credential with cr_ngroups == 0 when using a mapped
 * exported credential.
 */
if (cred->cr_ngroups == 0)
    return (false);

at the beginning of groupmember().
Since the code in vfs_domount_update() sets export.ex_groups = NULL, it should
be easy to find any other places that need to be fixed by running some testing.
(I will start doing that to-day.)

Making the credentials into a grouplist of one element set to GID_NOGROUP is
not the best plan, since it is possible to have a file with a gid ownership of GID_NOGROUP.
(It can happen frequently when NFSv4 is misconfigured, so the Owner_group translates
to GID_NOGROUP, since it cannot do the correct translation.)

The above change is obviously safe and can be MFC'd.

I did test the above patch along with a "hack" that set cr_groups = NULL,
so any access would cause a crash.
The only one I got was from a totally bogus line in nfsd_excred(0 which
looks like: nd->nd_cred->cr_gid = credanon->cr_gid;
just before crsetgroups(), so it is harmless noise (left over from code meant
to work for BSDen that had a separate cr_gid.)

By inspection, there are also two places in nfs_nfsdsubs.c where the comparison
against cr_gid is done before calling groupmember(). Those can easily be fixed
by either me or you.

After doing this, you can ask a mailing list if the "collective" thinks supporting the
case of a 0 groups exported mapped credential makes sense.
If the "collective" opinion is no, then you can proceed to change the code
appropriately. (Fix mountd(8) and exports(5) to not allow that case.)

I think a much less time-consuming alternative would be to sanctify the use of some special GID value as effectively marking no group membership, meaning, e.g., access control functions deny access to files having nogroup as their group, even if the passed credentials have nogroup in their GID set, i.e., treat it really specifically in the kernel.

The problem is "how do you find a value that no one will ever use?".
Since group databases are often on things like LDAP and used by multiple different
OSs, it won't be easy to find a value that no one ever uses for a group and keep it
that way.
--> "nogroup" is now used in assorted ways as a default group, for example.

Again, I think it is time that this be discussed with others on a mailing list.
Which one? I'd say any of freebsd-current@, freebsd-fs@ or maybe freebsd-stable@.
(freebsd-stable@ doesn't get read by as many developers, so I'd probably choose
one of the others.)

If I do not see a post from you in a couple of days, I'll do one and cc you.

Well, the NFS server does not use these mapped exported credentials in many ways.
As far as I can see at a quick look, groupmember(), which is called from VOP_ACCESS()/VOP_ACCESSX()
is about it.

I reached the same conclusion during my analysis.

Since the exports(5) man page suggests that this is a valid case, then I think the code should be fixed
to handle cr_ngroups == 0 for this specific case (not all uses of cr_ngroups).
Note that this appears to have been broken by commit 7f92e57 on Jun 20, 2009.
(Prior to that, groupmember would always return failure for cr_ngroups == 0.)

That this was broken by this commit confirms the view I developed above that people generally have assumed that at least one group is always present (outside NFS).

I'm not saying that this can't be changed, on the contrary I can support that (see below). It's just that, if going this way, we have to communicate clearly (comments, man page revision), and I'm then insisting that we change all direct references to cr_gid and cr_groups[0] with an accessor function performing a sanity check.

I think the first step is to fix groupmember() by adding something like:

/*
 * The NFS server can use a credential with cr_ngroups == 0 when using a mapped
 * exported credential.
 */
if (cred->cr_ngroups == 0)
    return (false);

at the beginning of groupmember().

We can easily do that, but again that's the tip of the iceberg. That said, the iceberg is probably not big, contrary to what I was thinking initially, so going to tackle it.

Making the credentials into a grouplist of one element set to GID_NOGROUP is
not the best plan, since it is possible to have a file with a gid ownership of GID_NOGROUP.
(It can happen frequently when NFSv4 is misconfigured, so the Owner_group translates
to GID_NOGROUP, since it cannot do the correct translation.)

Yes, that's why I said:

I think a much less time-consuming alternative would be to sanctify the use of some special GID value as effectively marking no group membership, meaning, e.g., access control functions deny access to files having nogroup as their group, even if the passed credentials have nogroup in their GID set, i.e., treat it really specifically in the kernel.

From the point of view of NFS, I agree this is more a hack than something really satisfying, for the reason you explained:

The problem is "how do you find a value that no one will ever use?".
Since group databases are often on things like LDAP and used by multiple different
OSs, it won't be easy to find a value that no one ever uses for a group and keep it
that way.
--> "nogroup" is now used in assorted ways as a default group, for example.

However, I don't see how the mention of LDAP in this passage, if used as a system's users and groups membership provider, is relevant, since in this case it should have a primary group per user, regardless of the OS, shouldn't it? And, to reiterate, not having at least a group for *process* credentials is not an option (POSIX, compatibility, etc.).

After doing this, you can ask a mailing list if the "collective" thinks supporting the
case of a 0 groups exported mapped credential makes sense.

It seems to me you're the best person to answer that, but we can ask if you prefer.

I think the question for the collective is rather about allowing struct ucred without groups, and acknowledging its multiple, conceptually slightly different, uses.

Again, I think it is time that this be discussed with others on a mailing list.
Which one? I'd say any of freebsd-current@, freebsd-fs@ or maybe freebsd-stable@.
(freebsd-stable@ doesn't get read by as many developers, so I'd probably choose
one of the others.)

Wouldn't freebsd-hackers@ be more appropriate? Not sure about the policy, but I probably can cross-post anyway (to a limited number of them).

If I do not see a post from you in a couple of days, I'll do one and cc you.

Sending a mail is certainly a good idea, but I think doing so right now is slightly premature, as it's probably better to evaluate further the alternatives first. Having listed uses of cr_gid and cr_groups[0], I think changing all of them is tractable (thanks in part to factored out code in some places, such as vaccess()) and may be acceptable.

So please first let me propose other revisions with alternatives, and after your feedback on these I'll send a mail requesting wider comments/advices.

This revision was not accepted when it landed; it landed in state Needs Review.Sat, Nov 2, 8:40 PM
This revision was automatically updated to reflect the committed changes.