Page MenuHomeFreeBSD

tcp: keep the inp locked while processing the ACK of the handshake
AbandonedPublic

Authored by tuexen on Jul 6 2022, 12:43 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Sep 26, 12:01 AM
Unknown Object (File)
Wed, Sep 25, 4:14 PM
Unknown Object (File)
Mon, Sep 23, 1:16 AM
Unknown Object (File)
Sat, Sep 21, 9:44 PM
Unknown Object (File)
Thu, Sep 19, 12:50 AM
Unknown Object (File)
Wed, Sep 18, 7:25 PM
Unknown Object (File)
Mon, Sep 16, 10:30 AM
Unknown Object (File)
Tue, Sep 10, 5:11 PM

Details

Reviewers
rrs
glebius
rscheff
Group Reviewers
transport
Summary

When processing the ACK (the third message) of the three way handshake, keep the inp locked to avoid races.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

With the improved socket referencing scheme, when a socket created in sonewconn() has a reference immediately, are those races fatal? Would the problem go away completely if we revert 6f3caa6d8159?

With the improved socket referencing scheme, when a socket created in sonewconn() has a reference immediately, are those races fatal? Would the problem go away completely if we revert 6f3caa6d8159?

Here is the race I have in mind (unlikely, but possible):

  • The`ACK` segment is received.
  • In the TCP syncache code sonewconn() is called, a TCP connection in created in the CLOSED state. After sonewconn() no inp lock is held.
  • Another TCP segment is received for the TCP connection, the TCP connection is in the CLOSED state, so a RST segment is sent. The peer will drop the connection.
  • The processing of the ACK segment continues and the TCP connection is moved in the ESTABLSIHED state. The peer however is in CLOSED.

The problem is that during processing of the ACK segment the inp is locked, unlocked, and locked again. The patch here avoids this.

Understood. Let me think if reverting that change that forced incomplete queue will heal such a race.

Understood. Let me think if reverting that change that forced incomplete queue will heal such a race.

OK. However, I think this is a TCP specific issue (the processing of the ACK is done in multiple steps) and therefore a TCP specific fix might be appropriate. Can you provide a pointer to the commit you want to revert? I don't have a commit message in my e-mail log for 6f3caa6d8159.

Understood. Let me think if reverting that change that forced incomplete queue will heal such a race.

OK. However, I think this is a TCP specific issue (the processing of the ACK is done in multiple steps) and therefore a TCP specific fix might be appropriate. Can you provide a pointer to the commit you want to revert? I don't have a commit message in my e-mail log for 6f3caa6d8159.

Maybe this: https://cgit.freebsd.org/src/commit/?id=6f3caa6d8159

Understood. Let me think if reverting that change that forced incomplete queue will heal such a race.

OK. However, I think this is a TCP specific issue (the processing of the ACK is done in multiple steps) and therefore a TCP specific fix might be appropriate. Can you provide a pointer to the commit you want to revert? I don't have a commit message in my e-mail log for 6f3caa6d8159.

Maybe this: https://cgit.freebsd.org/src/commit/?id=6f3caa6d8159

Yes, that is it. I didn't find it, because it was committed via svn: http://svnweb.freebsd.org/changeset/base/261242

Thanks for the pointer. I looked at the patch. I do not think it changes the race I would like to address with the patch in this review.

I think the race you described is different. The second ACK will not be able to find our CLOSED connection, because it is not yet in the pcb hash. It will find the listening socket instead. The listening socket has exclusive lock through all the processing of the previous packet, as it was a non-SYN. Now the second packet will block on the listening socket until the thread processing the first packet will unlock it after the comment Socket is created in state SYN_RECEIVED (btw, it should be INP_WUNLOCK(), my mistake). The second packet will now proceed through syncache_expand() and latter would fail and we indeed send RST after comment No syncache entry or ACK was. But write locking the new born pcb for a longer time won't help, IMHO.

I agree with your analysis. My mistake. This patch doesn't help.

Hi!

I got a patchset inspired by this revision: D36062, D36063, D36064, D36065. It does not close the discussed race with two packets! However, it definitely streamlines creation of a new socket from a listener. I have done only stability testing so far, but looking into doing benchmarks, too.

I also got a very different patch to fix the discussed race: D36066. I don't yet have a tool to instrument and reproduce the discussed race, so a help here is appreciated. Need a packetdrill guru :)

Hi!

I got a patchset inspired by this revision: D36062, D36063, D36064, D36065. It does not close the discussed race with two packets! However, it definitely streamlines creation of a new socket from a listener. I have done only stability testing so far, but looking into doing benchmarks, too.

I also got a very different patch to fix the discussed race: D36066. I don't yet have a tool to instrument and reproduce the discussed race, so a help here is appreciated. Need a packetdrill guru :)

Unfortunately packetdrill is not the right tool for testing race conditions. But I think some pf tests were triggering this kind of issue... You might ask kp@ for a way to reproduce it.