if allow.routing is set, the jail can modify the system routing table even if
it's not a VNET jail.
Details
Diff Detail
- Repository: rG FreeBSD src repository
- Lint: Lint Skipped
- Unit: Tests Skipped
- Build Status: Buildable 63524, Build 60408: arc lint + arc unit
Event Timeline
Shouldn't such a feature allow setting the read/write permission per FIB instead of a single all-or-nothing flag?
In which application scenarios could allowing jailed processes to take command of system's routing tables be considered useful / desirable?
the intended use case (or at least, my intended use case) is running a routing daemon such as BIRD in a service jail; see the related diff D49844 which adds the svcj side of this.
is there a use case for allowing a daemon to modify one routing table but not another? since PRIV_NET_ROUTE is not fib-specific, supporting this would require a significant amount of new code in a sensitive codepath. i'm not opposed to the idea, but i wonder if the effort is worthwhile. (unless perhaps there's already an existing mechanism that could be used for this, and i'm overestimating the complexity?)
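To make the trade-off concrete, here is a minimal userland sketch contrasting the single flag with a per-FIB permission. It is only a model: `struct jail_model`, `fib_write_mask`, and both function names are hypothetical and do not exist in the FreeBSD kernel, and the 32-FIB limit is an arbitrary assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; none of these names exist in the FreeBSD kernel. */
struct jail_model {
	bool     allow_routing;   /* single all-or-nothing flag, as in this diff */
	uint32_t fib_write_mask;  /* hypothetical: bit N set = may modify FIB N */
};

/* Single flag: the FIB number plays no part in the decision. */
bool
route_write_ok_flag(const struct jail_model *j, unsigned int fib)
{
	assert(j != NULL);
	(void)fib;
	return (j->allow_routing);
}

/*
 * Per-FIB variant: the check needs the FIB number, which a privilege
 * check that is not FIB-specific would not have at hand.
 */
bool
route_write_ok_perfib(const struct jail_model *j, unsigned int fib)
{
	assert(j != NULL);
	if (fib >= 32)	/* sketch limit: the mask holds only 32 FIBs */
		return (false);
	return ((j->fib_write_mask & (UINT32_C(1) << fib)) != 0);
}
```

In this model, a jail with `allow_routing` set passes the flag check for every FIB, while a jail whose mask has only bit 1 set passes the per-FIB check for FIB 1 alone. The extra `fib` argument is the part that would have to be plumbed through the real privilege check.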
Will other routing daemons, which manage not only routing tables but also system interfaces and their addresses, work 100% happily in this new ALLOW_ROUTING jail, or is this hack BIRD-specific? What if the BIRD folks one day decide to start adding direct interface management capabilities there? Should we then follow up by punching more holes and adding ALLOW_IFADDR, ALLOW_IFUPDN, etc.? And if so, where's the boundary, where do we stop?
Let me also ask a different question: what does the jail buy here over a simple chroot (apart from the current system integration of jails)?
In a chroot, all of Marko's concerns about managing interfaces etc. are basically sorted as well.
i don't consider this a hack; it's simply giving administrators more control over their system, rather than dictating a specific set of permissions they have to take or leave as a whole.
i also don't think it's necessary to define a place to stop. as service jails become more widely used, i'm sure we'll run into more examples of things where it's useful to delegate permissions to jails which currently aren't supported. if you consider that an inherently negative thing, i'd be interested in hearing why; to me, this seems like simply providing more flexibility to users.
that said, i also don't think we need to add every single conceivable permission today. i don't use a routing daemon that manages interface addresses, so i haven't added support for that. if you wanted to do that, i certainly wouldn't object.
an svcj is not actually a chroot; they all run with path=/. adding more filesystem restrictions is something i think could be improved in future, but that's a different topic entirely. for now, chroot and svcjs are entirely orthogonal.
i am approaching this more from the position that, at least in theory, it should be possible to run all services in a service jail unless there's a clear reason why it doesn't make sense to do that. why? because this provides a unified mechanism to restrict system access and resources for any service, on the jail level. compare it to how Linux uses cgroups to run services; on Linux, it's trivial to set resource limits for any service by configuring its cgroup. with jails, we can easily do that for any jail (using rctl) and at the same time, we benefit from the additional security measures of jails (again, similar to how Linux uses namespaces to restrict services, but in FreeBSD, security and resource restrictions are both provided by jails).
in future, as jails gain more capabilities, those capabilities will be automatically available to any service via service jails. isn't this, at least in part, why svcj was added in the first place?
Ideally, an application that needs to modify routing tables/addresses but wants to reduce its own privileges down to that and only that should be written with Capsicum in mind and should limit its capabilities. Unfortunately, in the real world we have third-party applications written with a focus on Linux that do not support Capsicum. A service jail seems like a good enough solution for dealing with such applications. Note that it is indeed orthogonal to chrooting.
I am strongly in favor of this change. As Lexi mentioned before, users prefer to run all services in jails. I also wanted to do the same on my BGP routers and run OpenBGPd inside a jail. However, there was no permission to edit the main system routing table. There is a lot of software that only needs to edit the routing table and nothing else (interfaces, their IP addresses, ...), like OpenBGPd and many other routing daemons.
I would like to see this permission implemented in FreeBSD.
ideally everything would be capsicumised, but i think that's also largely orthogonal to this change. as we saw with the bhyve vulnerability, it's still possible to break out of a capsicum sandbox, in which case jails add another layer of protection. and, capsicum doesn't do resource management (since that's not its job).
as we saw with the bhyve vulnerability, it's still possible to break out of a capsicum sandbox
I support changes like this one, but do note that the kinds of kernel vulnerabilities used to escape a Capsicum sandbox (like mishandled reference counts) generally will be equally usable for escaping a jail.
oh, i was actually thinking of https://www.freebsd.org/security/advisories/FreeBSD-SA-24:16.libnv.asc rather than the bhyve vulnerability itself (my understanding is both of those were chained to get full host access).
But if we keep poking holes in the jail boundary to accommodate more and more services, it becomes very hard to make any claims about the security properties of jailing a particular service. At some point we're treating jails as resource containers just because rctl makes that convenient, but there's a marketing problem there.
I don't see any particular problems with this change and don't mean to object to it, though it does feel weird that a jailed process can modify the system routing tables. But as we add more and more escape hatches it becomes impossible to reason about the security benefits of jailing a privileged process (and to be clear, this is already a problem).
in future, as jails gain more capabilities, those capabilities will be automatically available to any service via service jails. isn't this, at least in part, why svcj was added in the first place?
The svcj documentation in rc.conf.5 doesn't say anything about why one might want to run a service in a service jail, and what benefits that confers. I think that's a bug, especially given that the feature uses the term "jail" and not "container", and the former has specific connotations relating to security, at least in FreeBSD. And frankly I'm not sure what added security is obtained from having a privileged daemon run in a jail with path=/.
i am sympathetic to this concern; since jails were originally introduced there's been an assumption that if a process is in a jail, it can do X but it can't do Y, and changing that makes it cognitively more difficult to understand what "this process is in a jail" actually means. we have already violated that somewhat with permissions like adjtime/settime, which are semantically similar to this new routing permission, i.e. they allow the jail to modify something which is usually the concern of the host system.
from a purely technical point of view, i would like to see jails become flexible enough that you can configure a jail however you like, including creating the type of "null jails" that have been floated before. this opens the door to doing a lot of interesting things, using the existing jail framework rather than implementing a copy of Solaris process contracts or Linux cgroups. and again, from a technical point of view, i don't see any reasonable objection to this.
in terms of this specific change, i think running a routing daemon in a jail (using svcj or otherwise) provides a significant security benefit because they process a lot of untrusted network data, and that alone is sufficient justification for *this specific* privilege being exposed to jails. as i said earlier, i don't intend to immediately go and add a jail flag for every existing privilege.
so my position would be that i think this change is reasonable on its own merits, but we should also think about whether we want to change (or at least clarify) terminology here going forward.
The svcj documentation in rc.conf.5 doesn't say anything about why one might want to run a service in a service jail, and what benefits that confers. I think that's a bug, especially given that the feature uses the term "jail" and not "container", and the former has specific connotations relating to security, at least in FreeBSD. And frankly I'm not sure what added security is obtained from having a privileged daemon run in a jail with path=/.
i have been mulling over how we can add more restrictions to svcj. i don't think "just use nullfs" is the answer here because that makes everything more complicated, but i don't yet have another proposal. i don't think this is impossible to fix in principle though.
I assume you refer to the path=/ part here. There is no generic, lightweight way of determining what files a given service needs, so that it is simply xxx_svcj=yes. As soon as you specify a list of files which the framework puts into its own subtree, it is not lightweight anymore and you are better off manually jailing the service. If you provide an alternate path, you did some manual work beforehand to provide a subtree, and then you are no longer lightweight in terms of the svcj design of "just do xxx_svcj=yes", and you are again IMO better off using a non-svcj jail. The whole idea of service jails comes from the fact of path=/ (while doing the tech review of MWL's jail book... a little section at the end of the book titled "Jails as Control Groups" (SVCJs are not meant as cgroups, but can be used like that)... a nice addition to service jails would be some rctl stuff (via xxx_svcj_rctl maybe)).
i think we risk getting off topic here, but what i'm thinking about is services that only need to access a relatively small number of well-defined path names. for example, many services only need to write their pidfile. a database may need to write a pidfile, a data directory and a UNIX socket. this is stuff that can easily be configured in the rc(8) script itself and then used by rc.subr to configure the jail appropriately.
As soon as you specify a list of files which the framework puts into its own subtree, it is not lightweight anymore
i don't think svcj should be creating nullfs subtrees for services, that is definitely not in the spirit of the feature. i have some other (somewhat vague) ideas of how we can do this in a better way. but this diff is not the right place to mention those :-)
I don't have any objection. I just want to observe that naming is important (and hard). :)
in terms of this specific change, i think running a routing daemon in a jail (using svcj or otherwise) provides a significant security benefit because they process a lot of untrusted network data, and that alone is sufficient justification for *this specific* privilege being exposed to jails. as i said earlier, i don't intend to immediately go and add a jail flag for every existing privilege.
What exactly is the security benefit? The routing daemon is still running as root (since we have no way for a process to drop privileges in a fine-grained manner, so that one retains only PRIV_NET_ROUTE, say) and has full access to the filesystem. Is there a threat model where being jailed makes a significant difference?
i think what i meant to say here is something like, this is a UX/UI problem rather than a technical problem. or in other words i was agreeing with you :-)
What exactly is the security benefit? The routing daemon is still running as root (since we have no way for a process to drop privileges in a fine-grained manner, so that one retains only PRIV_NET_ROUTE, say) and has full access to the filesystem. Is there a threat model where being jailed makes a significant difference?
well, although i haven't tested this, with this change it should also be possible to run the routing daemon in a normal (non-svcj, non-vnet) jail, in which case the path=/ issue isn't a concern. note that in this specific case, while it also currently works to run a routing daemon in a vnet jail (i have tested this extensively) that doesn't achieve the required result since it will modify the vnet's routing table instead of the host's. unless you actually want to modify the vnet's routing table, but the new functionality here is you can jail the routing daemon but still modify the host routing table.
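The three cases described above can be summarised in a small userland model. This is only a sketch under stated assumptions: `struct jail_cfg`, `enum route_target`, and `route_write_target()` are hypothetical names, the process is assumed to run as root inside the jail, and letting the vnet case take precedence when both settings are present is an assumption of the model, not something this thread establishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; these names do not exist in FreeBSD. */
enum route_target {
	ROUTE_DENIED,      /* jailed process cannot modify any routing table */
	ROUTE_HOST_TABLE,  /* changes land in the host's routing table */
	ROUTE_VNET_TABLE   /* changes land in the jail's own vnet routing table */
};

struct jail_cfg {
	bool vnet;           /* jail has its own network stack */
	bool allow_routing;  /* the allow.routing permission from this diff */
};

/* Which routing table a root process in the jail ends up modifying. */
enum route_target
route_write_target(const struct jail_cfg *j)
{
	assert(j != NULL);
	if (j->vnet)            /* assumption: vnet takes precedence */
		return (ROUTE_VNET_TABLE);
	if (j->allow_routing)   /* non-vnet jail with the new permission */
		return (ROUTE_HOST_TABLE);
	return (ROUTE_DENIED);  /* non-vnet jail without the permission */
}
```

Only the middle case is new with this change: a routing daemon that is jailed yet still updates the host's routing table.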