Paths

Table of Contentst

Differential D48222

syslogd: Ensure that forwarded messages are sent from port 514
ClosedPublic
Actions

Authored by markj on Dec 27 2024, 10:01 PM.

Details

Reviewers

jfree

Commits

rGae4f708f0b38: syslogd: Ensure that forwarded messages are sent from port 514

Summary

Prior to commit 4ecbee2760f7, syslogd used its listening socket(s) to
forward messages to remote hosts, when so configured. As a consequence,
they are sent from the address+port to which those sockets are bound,
typically 0.0.0.0:514.

When in capability mode, sendto() is not permitted, so we instead
pre-create sockets and connect them to the forwarding addresses, letting
the kernel pick an ephemeral source port. However, this doesn't match
syslogd's previous behaviour, breaking some setups.

So, restore the old behaviour by binding forwarding sockets to the
addresses on which syslogd is listening. Since we cannot use the same
sockets for receiving messages and also for forwarding them, use
SO_REUSEPORT to enable duplicate bindings to port 514, relying on the
existing behaviour that the first socket bound to that port is the one
that actually receives messages.

Add some regression tests to cover this and related functionality of
syslogd's -a option.

Reported by: Michael Butler <imb@protected-networks.net>
Fixes: 4ecbee2760f7 ("syslogd: Open forwarding socket descriptors")

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision.Dec 27 2024, 10:01 PM

Herald added subscribers: jonathan, imp. · View Herald TranscriptDec 27 2024, 10:01 PM

markj requested review of this revision.Dec 27 2024, 10:01 PM

Harbormaster completed remote builds in B61361: Diff 148448.Dec 27 2024, 10:02 PM

jfree added inline comments.Dec 28 2024, 5:52 PM

usr.sbin/syslogd/syslogd.c
3204	We could end up with a lot of forwarding sockets if `shead` contains a large amount of listening INET sockets. Is it safe to assume that the primary listening INET socket (the one specified with -b or the one bound to *:514 if -b is not specified) will always be able deliver the message? I cannot think of a situation where other INET sockets would be able to, but that one wouldn't. If we can assume that this primary listening INET socket can always deliver the message, then we could get away with just creating one forwarding socket (bound to that primary listening socket) instead of one forwarding socket per listening INET socket.

jfree accepted this revision.Dec 28 2024, 5:52 PM

This revision is now accepted and ready to land.Dec 28 2024, 5:52 PM

markj added inline comments.Dec 28 2024, 8:44 PM

usr.sbin/syslogd/syslogd.c
3204	Yes, this might blow up if a user has many listening addresses (i.e., many uses of -b) and many forwarding addresses. I don't see a good way to address that within the constraints of capability mode. I thought about just creating a single forwarding socket per address family, but that would be a (probably small) compatibility break. Since the capsicum changes were quite large, I'd like to keep things compatible as much as possible until we're confident that there aren't any regressions left.

jfree added inline comments.Dec 28 2024, 9:16 PM

usr.sbin/syslogd/syslogd.c
3204	The pre-capsicumized implementation would just cycle through all of the listening sockets (bound to addresses passed through -b) and attempt delivery using those sockets. If delivery succeeds on one socket, then the remaining sockets are left untouched. I guess my question from earlier is: assuming the user passed two listening addresses (using -b). In what circumstance would message delivery fail when being delivered from the socket connected to the first listening address but succeed when being delivered from the socket connected to the second listening address? Wouldn’t the socket connected to the first listening address in `shead` always be used to send the message?

markj added inline comments.Dec 28 2024, 9:30 PM

usr.sbin/syslogd/syslogd.c
3204	I think we might fall back to "secondary" listening addresses if, for example, someone updates the network configuration on the host such that the primary address is no longer available. More generally, there various reasons that in_pcbconnect_setup(), called from udp_send() for unconnected sendto(2)s, may fail. That's not very likely to happen in general, but I wouldn't be surprised if there are configurations which rely on this. Logging is critical functionality, so I don't want to assume too much about users having "sane" configurations. I don't claim that my approach is absolutely necessary: it's quite possible I'm being too paranoid and there's a good reason that the fallback behaviour doesn't need to be preserved, but I don't see one so far.

Handle the possibility of duplicate forwarding addresses.

Because we both bind and connect forwarding sockets now, we have to avoid
creating duplicate connections. This gets pretty hairy - bind and connect may
rewrite the input addresses; for example, in a classic jail, binding to 0.0.0.0
may result in binding to the jail's IP address, so a subsequent getsockaddr()
could return, say, 127.0.0.2 instead of 0.0.0.0.

This solution stores the sockaddrs used to bind and connect each forwarding
socket. When connect() fails due to EADDRINUSE, we go through the nvlist
assembled so far and try to find the socket that we're duplicating, and dup
that instead of creating a new one, avoiding the problem.

I'm not happy with the complexity and fragility of this approach, but I don't
see a better short-term solution. Having a casper service to forward messages
is also possible, that's really quite inefficient - forwarding is very common,
unlike other delivery types where we do indeed use a separate service for
logging.

It might be reasonable to have a UDP socket type which allows sendto() even in
capability mode, akin to a directory descriptor for "/" which allows access to
the filesystem in capability mode.

This revision now requires review to proceed.Dec 30 2024, 3:08 PM

Harbormaster completed remote builds in B61405: Diff 148551.Dec 30 2024, 3:08 PM

markj added a parent revision: D48250: syslogd: Fix resource leaks.Dec 30 2024, 3:08 PM

I do not like how complicated this is getting, but the code all looks good.

I cannot think of a better approach given the state of networking with Capsicum right now. I don't think it would be too difficult to implement some sort of allowed address list:

int addrlist(int flags);
int addrlist_add(int addrlist_fd, const struct sockaddr *addr, addrlen_t addrlen);

...

addrlist_fd = addrlist(0);
addrlist_add(addrlist_fd, src_addr, src_addrlen);
addrlist_add(addrlist_fd, dst_addr, dst_addrlen);

cap_enter();

bindat(addrlist_fd, sock_fd, src_addr, src_addrlen);
connectat(addrlist_fd, sock_fd, dst_addr, dst_addrlen);

This suggestion would require new system calls, which is definitely an issue, but I'm just throwing some ideas out there.

jfree accepted this revision.Jan 3 2025, 5:05 AM

This revision is now accepted and ready to land.Jan 3 2025, 5:05 AM

In D48222#1101530, @jfree wrote:

I do not like how complicated this is getting, but the code all looks good.

Yeah, this code is kind of crazy. I would really like to fix this properly before capsicumized syslogd ships in 15.0.

I cannot think of a better approach given the state of networking with Capsicum right now. I don't think it would be too difficult to implement some sort of allowed address list:
int addrlist(int flags);
int addrlist_add(int addrlist_fd, const struct sockaddr *addr, addrlen_t addrlen);

...

addrlist_fd = addrlist(0);
addrlist_add(addrlist_fd, src_addr, src_addrlen);
addrlist_add(addrlist_fd, dst_addr, dst_addrlen);

cap_enter();

bindat(addrlist_fd, sock_fd, src_addr, src_addrlen);
connectat(addrlist_fd, sock_fd, dst_addr, dst_addrlen);
This suggestion would require new system calls, which is definitely an issue, but I'm just throwing some ideas out there.

I think it makes sense. This is basically analogous to cap_ioctls_limit() and cap_fcntls_limit(). One issue is that we'd want/need some kind of limit on the number of addresses per socket, to avoid unbounded kernel memory consumption from userspace. So, syslogd might need multiple sockets in the end if the config references many forwarding destinations, but I think it wouldn't be very complicated to support. We also need to consider the performance cost of looking up the destination address in a whitelist upon each sendto() call.

Another option might be to extend connectat(2) to support IP sockets. For instance, connectat(s, {AF_INET, 0.0.0.0, 0}), disallowed in capability, could permit sendto(s) in capability mode. Or, connectat(s, {AF_INET, 192.168.1.0}, 0) permits sendto() to 192.168.1.0/24. This would allow syslogd to transmit to any sockaddr, so weakens the sandbox a bit, but it makes userspace much simpler.

In D48222#1101641, @markj wrote:

Another option might be to extend connectat(2) to support IP sockets. For instance, connectat(s, {AF_INET, 0.0.0.0, 0}), disallowed in capability, could permit sendto(s) in capability mode. Or, connectat(s, {AF_INET, 192.168.1.0}, 0) permits sendto() to 192.168.1.0/24. This would allow syslogd to transmit to any sockaddr, so weakens the sandbox a bit, but it makes userspace much simpler.

If I'm understanding correctly, calling connactat(2) would add the sockaddr passed as an argument to a whitelist of addresses attached the provided INET socket. Then using sendto(2) on the same socket with any of those addresses would be allowed in capability mode?

This doesn't require a new interface, so that makes implementation easier, but is a little unintuitive to me. I feel like there is already a convention for how *at() calls work: they perform the base operation, but relative to some anchor point. Openat(2) performs the same operation as open(2), but relative to the provided descriptor. Same idea goes for the existing PF_LOCAL implementations of connectat(2) and bindat(2); they respectively perform connect(2) and bind(2) based on their provided descriptors. This option would throw that convention out the window since the base function wouldn't actually be performed and we would be operating on the provided descriptor. It feels like we'd be shoehorning functionality into a call that was created with a different intention.

Creating a new interface solely for programs that use Capsicum might be a stretch, but I feel it is important to make things intuitive for people that are trying to Capsicumize their programs because the process is already quite tedious. In the worst case, a new interface could just be abandoned/deprecated altogether. It would be much harder to remove functionality from an interface that would continue to live on.

I think the idea itself is great, but maybe it would be better suited to its own call? It could probably be a wrapper around an ioctl since the operation would be on the socket descriptor.

In D48222#1102073, @jfree wrote:

In D48222#1101641, @markj wrote:

Another option might be to extend connectat(2) to support IP sockets. For instance, connectat(s, {AF_INET, 0.0.0.0, 0}), disallowed in capability, could permit sendto(s) in capability mode. Or, connectat(s, {AF_INET, 192.168.1.0}, 0) permits sendto() to 192.168.1.0/24. This would allow syslogd to transmit to any sockaddr, so weakens the sandbox a bit, but it makes userspace much simpler.

If I'm understanding correctly, calling connactat(2) would add the sockaddr passed as an argument to a whitelist of addresses attached the provided INET socket. Then using sendto(2) on the same socket with any of those addresses would be allowed in capability mode?

Not quite. My - poorly explained - notion was that you could (ab)use connectat() to define a mask of permissible sendto() destination addresses, so an empty mask permits unrestricted sendto().

This doesn't require a new interface, so that makes implementation easier, but is a little unintuitive to me. I feel like there is already a convention for how *at() calls work: they perform the base operation, but relative to some anchor point. Openat(2) performs the same operation as open(2), but relative to the provided descriptor. Same idea goes for the existing PF_LOCAL implementations of connectat(2) and bindat(2); they respectively perform connect(2) and bind(2) based on their provided descriptors. This option would throw that convention out the window since the base function wouldn't actually be performed and we would be operating on the provided descriptor. It feels like we'd be shoehorning functionality into a call that was created with a different intention.

This is true. But, connectat() has no meaning for PF_INET(6) sockets at the moment, so we're free to define one.

Creating a new interface solely for programs that use Capsicum might be a stretch, but I feel it is important to make things intuitive for people that are trying to Capsicumize their programs because the process is already quite tedious. In the worst case, a new interface could just be abandoned/deprecated altogether. It would be much harder to remove functionality from an interface that would continue to live on.

Well, connectat() itself is a new interface pretty much solely for programs that use Capsicum. AFAIK other unixes don't implement it.

I think the idea itself is great, but maybe it would be better suited to its own call? It could probably be a wrapper around an ioctl since the operation would be on the socket descriptor.

Yeah, that's safer from an interface perspective. I'll start a discussion on -hackers in any case so that we can get this wrapped up somehow before 15.0.

Closed by commit rGae4f708f0b38: syslogd: Ensure that forwarded messages are sent from port 514 (authored by markj). · Explain WhyJan 14 2025, 3:26 PM

This revision was automatically updated to reflect the committed changes.

markj added a commit: rGae4f708f0b38: syslogd: Ensure that forwarded messages are sent from port 514.