Page MenuHomeFreeBSD

tcp: Rack may still calculate long RTT on persists probes.
ClosedPublic

Authored by rrs on Nov 8 2021, 8:52 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Oct 29, 11:04 PM
Unknown Object (File)
Tue, Oct 29, 11:04 PM
Unknown Object (File)
Tue, Oct 29, 11:04 PM
Unknown Object (File)
Tue, Oct 29, 11:04 PM
Unknown Object (File)
Tue, Oct 29, 2:42 PM
Unknown Object (File)
Tue, Oct 29, 2:42 PM
Unknown Object (File)
Tue, Oct 29, 2:42 PM
Unknown Object (File)
Tue, Oct 29, 2:42 PM
Subscribers

Details

Summary

When a persists probe is lost, we will end up calculating a long
RTT based on the initial probe and when the response comes from the
second probe (or third etc). This means we have a minimum of a
confidence level of 3 on a incorrect probe. This commit will change it
so that we have one of two options
a) Just not count RTT of probes where we had a loss
<or>
b) Count them still but degrade the confidence to 0.

I have set in this the default being to just not measure them, but I am open
to having the default be otherwise.

Test Plan

We should be able to generate a packet drill script with BB logging
on and show a lost probe and how it either does not generate
a RTT estimate or it declares a confidence value of 0.

The following packet drill scripts can be used to generate the condition. You must
capture BB logging (by running tcplog_dumper) and then you can run this script
without the fix. You will find using read_bbrlog that all persists probe responses
show a conf=3 which means we are extremely confident in the probes.

If you apply the patch and leave the default setting, you will see the first
persist probe you will get the conf=3 and it will be correct. The subsequent
persist probe (after a loss) will not generate an RTT.

Now if you change the new misc. setting to allow it to still make the measurements, and
rerun the test, you will see the conf=0 in the second probe (not the first).

--ip_version=ipv4

+0.00 sysctl -w net.inet.tcp.hostcache.purgenow=1
+0.00 sysctl -w net.inet.tcp.syncookies_only=0
+0.00 sysctl -w net.inet.tcp.syncookies=1
+0.00 sysctl -w net.inet.tcp.rfc1323=1
+0.00 sysctl -w net.inet.tcp.sack.enable=1
+0.00 sysctl -w net.inet.tcp.ecn.enable=2
Create a TCP endpoint in the ESTABLISHED state.
+0.00 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.00 fcntl(3, F_GETFL) = 0x02 (flags O_RDWR)
+0.00 fcntl(3, F_SETFL, O_RDWR | O_NONBLOCK) = 0
+0.00 setsockopt(3, IPPROTO_TCP, TCP_LOG, [4], 4) = 0
+0.00 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0.00 > S 0:0(0) win 65535 <mss 1460,nop,wscale 8,sackOK,TS val 100 ecr 0>
+0.10 < S. 0:0(0) ack 1 win 10000 <mss 1440,sackOK,TS val 400 ecr 100>
+0.00 > . 1:1(0) ack 1 win 65535 <nop,nop,TS val 200 ecr 400>
Change to rack_latest
+0.00 setsockopt(3, IPPROTO_TCP, TCP_NODELAY, [1], 4) = 0
+0.00 setsockopt(3, IPPROTO_TCP, TCP_FUNCTION_BLK, {function_set_name="rack", pcbcnt=0}, 36) = 0
+0.00 send(3, ..., 1000, 0) = 1000
+0.00 > P. 1:1001(1000) ack 1 win 65535 <nop,nop,TS val 250 ecr 400>
+.10 < . 1:1(0) ack 1001 win 10000 <nop, nop, TS val 500 ecr 250>
+0.00 send(3, ..., 1000, 0) = 1000
+0.00 > P. 1001:2001(1000) ack 1 win 65535 <nop,nop,TS val 300 ecr 500>
+.10 < . 1:1(0) ack 2001 win 10000 <nop, nop, TS val 600 ecr 300>
+0.10 send(3, ..., 20000, 0) = 20000

  • > . 2001:3429(1428) ack 1 win 65535 <nop,nop,TS val 400 ecr 600>
  • > . 3429:4857(1428) ack 1 win 65535 <nop,nop,TS val 500 ecr 600>
  • > . 4857:6285(1428) ack 1 win 65535 <nop,nop,TS val 600 ecr 600>
  • > . 6285:7713(1428) ack 1 win 65535 <nop,nop,TS val 700 ecr 600>
  • > . 7713:9141(1428) ack 1 win 65535 <nop,nop,TS val 800 ecr 600>
  • > . 9141:10569(1428) ack 1 win 65535 <nop,nop,TS val 900 ecr 600>
  • > . 10569:11997(1428) ack 1 win 65535 <nop,nop,TS val 1000 ecr 600>

+.10 < . 1:1(0) ack 4857 win 7144 <nop, nop, TS val 700 ecr 500>
+.010 < . 1:1(0) ack 7713 win 4288 <nop, nop, TS val 800 ecr 700>
+.010 < . 1:1(0) ack 10569 win 1428 <nop, nop, TS val 900 ecr 900>
+.10 < . 1:1(0) ack 11997 win 0 <nop, nop, TS val 1000 ecr 1000>

  • > . 11996:11996(0) ack 1 win 65535 <nop,nop,TS val 1100 ecr 1000>

+.01 < . 1:1(0) ack 11997 win 0 <nop, nop, TS val 1100 ecr 1100>

  • > . 11996:11996(0) ack 1 win 65535 <nop,nop,TS val 1200 ecr 1100>
  • > . 11996:11996(0) ack 1 win 65535 <nop,nop,TS val 1300 ecr 1100>

+.01 < . 1:1(0) ack 11997 win 0 <nop, nop, TS val 1200 ecr 1300>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

rrs requested review of this revision.Nov 8 2021, 8:52 PM
This revision is now accepted and ready to land.Nov 10 2021, 11:49 AM

I had some stray stuff in the previous diff (from the cc changes).

This revision now requires review to proceed.Nov 10 2021, 3:52 PM
This revision is now accepted and ready to land.Nov 10 2021, 4:42 PM