Details

Reviewers

Group Reviewers

network
transport

Commits

rGc7f803c71dae: inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuse

Summary

This combination doesn't make any sense. However, we shouldn't panic.
There are two options here: refuse connect(2) on a socket that has the
option set previously, or ignore (and clear) the option. We have chosen
the latter, because connect(2) is a SUS API that has a defined set of
error codes, none of which corresponds to "a socket has non-standard
incompatible socket option set". We don't want to extend connect(2) error
codes list to report a awkward misuse of a non-standard socket option.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

glebius created this revision.Thu, Feb 27, 4:10 AM

Herald added subscribers: melifaro, asomers, imp. · View Herald TranscriptThu, Feb 27, 4:10 AM

glebius requested review of this revision.Thu, Feb 27, 4:10 AM

glebius added a parent revision: D49142: inpcb: inline in_pcbconnect_setup() into in_pcbconnect().

Harbormaster completed remote builds in B62654: Diff 151560.Thu, Feb 27, 4:10 AM

glebius added a child revision: D49151: inpcb: retire two-level port hash database.Thu, Feb 27, 5:59 AM

What exactly is the panic you see? (I can't easily run the test case at this moment.)

What happens if the socket is bound before connect() is called? It would still be in the lbgroup hash table, but not visible to lookups (since listen() has not been called). Do we need to remove it from the lbgroup structure?

In D49150#1122265, @markj wrote:

What exactly is the panic you see? (I can't easily run the test case at this moment.)
What happens if the socket is bound before connect() is called? It would still be in the lbgroup hash table, but not visible to lookups (since listen() has not been called). Do we need to remove it from the lbgroup structure?

The lbgroup linkage is aliased with exact hash linkage, so a socket can't be connected one and a member of lbgroup at the same time. That of course makes sense. Actual panic message is something about queue(3) invariants violated.

In D49150#1122565, @glebius wrote:

In D49150#1122265, @markj wrote:

What exactly is the panic you see? (I can't easily run the test case at this moment.)
What happens if the socket is bound before connect() is called? It would still be in the lbgroup hash table, but not visible to lookups (since listen() has not been called). Do we need to remove it from the lbgroup structure?

The lbgroup linkage is aliased with exact hash linkage, so a socket can't be connected one and a member of lbgroup at the same time. That of course makes sense. Actual panic message is something about queue(3) invariants violated.

Then, isn't the patch incomplete? If the SO_REUSEPORT_LB socket is already bound, connect() needs to fail.

In D49150#1123226, @markj wrote:

Then, isn't the patch incomplete? If the SO_REUSEPORT_LB socket is already bound, connect() needs to fail.

connect(2) shall fail on any listening socket. Hmm, bound but not listening... I'm going to check that...

Yep, confirmed that if we bind(2), then connect(2) the panic happens and indeed patch is incomplete.

Maybe we'd better fail any connect(2) on the socket that has the flag set?

This just updates with additional test case, that would trigger a panic
that is not _yet_ covered by this revision.

Harbormaster completed remote builds in B62794: Diff 151967.Thu, Mar 6, 6:07 PM

In D49150#1123435, @glebius wrote:

Yep, confirmed that if we bind(2), then connect(2) the panic happens and indeed patch is incomplete.

Maybe we'd better fail any connect(2) on the socket that has the flag set?

Yes, I think so. It doesn't make sense to call connect(2) on a bound socket with SO_REUSEPORT_LB set. Thanks for looking at this and writing tests.

In D49150#1123639, @markj wrote:

In D49150#1123435, @glebius wrote:

Yep, confirmed that if we bind(2), then connect(2) the panic happens and indeed patch is incomplete.

Maybe we'd better fail any connect(2) on the socket that has the flag set?

Yes, I think so. It doesn't make sense to call connect(2) on a bound socket with SO_REUSEPORT_LB set. Thanks for looking at this and writing tests.

Well, I think it should be consistent with not bound socket. Do you agree? Any socket with SO_REUSEPORT_LB set should fail connect() same way as an already listening one does.

I started to write patch and found immediately two problems here :)

For AF_INET and AF_INET6 we would return different errors on connect(2) after listen(2).
The SOLISTENING() check is done without a lock.

I'll try to come up at least with some improvement here.

Fail instead of silently hiden misuse. Mostly rewrite the commit message.

Herald added a reviewer: transport. · View Herald TranscriptFri, Mar 7, 6:24 AM

Harbormaster completed remote builds in B62814: Diff 151995.Fri, Mar 7, 6:24 AM

In D49150#1123646, @glebius wrote:

In D49150#1123639, @markj wrote:

In D49150#1123435, @glebius wrote:

Yep, confirmed that if we bind(2), then connect(2) the panic happens and indeed patch is incomplete.

Maybe we'd better fail any connect(2) on the socket that has the flag set?

Yes, I think so. It doesn't make sense to call connect(2) on a bound socket with SO_REUSEPORT_LB set. Thanks for looking at this and writing tests.

Well, I think it should be consistent with not bound socket. Do you agree? Any socket with SO_REUSEPORT_LB set should fail connect() same way as an already listening one does.

Yes, you're right.

I started to write patch and found immediately two problems here :)

For AF_INET and AF_INET6 we would return different errors on connect(2) after listen(2).

The SOLISTENING() check is done without a lock.

It is done under the inpcb lock, which is sufficient to interlock with tcp_usr_listen().

sys/netinet/tcp_usrreq.c
596	Should we disallow this for UDP too, just for consistency?

Note that the added check is racy. An application with one thread bombing on setsockopt(2) and other thread doing connect(2) may trigger the panic. This is generic problem with "mislayered" setsockopts. This is how I called setsockopts that are badly named when they were introduced and pretend to belong to one socket level, while actually affecting a different one. In this particular case the option works solely on TCP level, not socket level but is declared in the latter namespace. I have some plan how to close those races, but don't want to pack it in this fairly small revision.

In D49150#1123664, @markj wrote:

It is done under the inpcb lock, which is sufficient to interlock with tcp_usr_listen().

Right, but nor the setsockopt() :(

P.S. Just updated this review.

glebius added inline comments.Fri, Mar 7, 6:32 AM

sys/netinet/tcp_usrreq.c
596	Do we have any test case or a working application for UDP and SO_REUSEPORT_LB? Let's check in as is and don't pretend UDP was really in mind at this moment :)

fixup test

Harbormaster completed remote builds in B62815: Diff 151999.Fri, Mar 7, 6:42 AM

In D49150#1123679, @glebius wrote:

In D49150#1123664, @markj wrote:

It is done under the inpcb lock, which is sufficient to interlock with tcp_usr_listen().

Right, but nor the setsockopt() :(

Hum, yes. Another potential use for pr_lock...

sys/netinet/tcp_usrreq.c
596	I believe BIND can be configured to use REUSEPORT_LB with UDP sockets. I'm ok with leaving it as it is though.

markj accepted this revision as: markj.Fri, Mar 7, 6:50 AM

This revision was not accepted when it landed; it landed in state Needs Review.Fri, Mar 7, 7:01 AM

Closed by commit rGc7f803c71dae: inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuse (authored by glebius). · Explain Why

This revision was automatically updated to reflect the committed changes.

glebius added a commit: rGc7f803c71dae: inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuse.

inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuse
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 152002

sys/netinet/tcp_usrreq.c

tests/sys/netinet/so_reuseport_lb_test.c

inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuseClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 152002

sys/netinet/tcp_usrreq.c

tests/sys/netinet/so_reuseport_lb_test.c

inpcb: fix a panic with SO_REUSEPORT_LB + connect(2) misuse
ClosedPublic
Actions

Revision Contents
Changeset List