The TCPCB t_rttupdated is used initially to start using the rtt measurements only, once a number of valid data points was collected. However, using a unsigned long (64 bit) was overkill, when the number of valid samples to have collected prior to use of the timing information is well below 100. Switching to a smaller type, and limit the increase to the maximum allowed by that type. Also, the general statistics around the number of rtt samples is collected independently in tcps_rttupdated.
Details
- Reviewers
rrs tuexen rgrimes freebsd_sysctl.cz cc - Group Reviewers
transport - Commits
- rG18b83b626a4f: tcp: reduce the size of t_rttupdated in tcpcb
All packetdrill RTT related testing should pass identical to prior of this change.
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 49078 Build 45967: arc lint + arc unit
Event Timeline
- only touch t_rttupdated and have portable macro to do post-increment with a ceiling
That macro will used for other variables which really only track the initial state (e.g. in dctcp and cubic: num_cong_events, in cubic ticks_since_cong)
- rename macro from INCMAX to CEILINC as we are (post)-incrementing to a ceiling depending on the type of the variable incremented
It might make sense to add gauge8_t, gauge16_t, gauge32_t and gauge64_t types and provide inline functions like
void inline gauge8_incr(gauge8_t *gauge) { if (*gauge < UINT8_MAX) *gauge++; }
Using an (inline) function gives some sort of type safety... Not doing this in the TCP code would allow this to be used also in other code, like SCTP, where gauge16_t could be used for path error counters.
sys/netinet/tcp_var.h | ||
---|---|---|
140 | Be aware this is a cache optimized data structure. | |
270 | Changing the size of and location of this value is going to undo the comment about this structure being carefully cache aligned, though I do not know if that comment is still true. | |
717 | Ah, now this explains why I could not find CEILINC on my 12.1 machine... This set of macros and usage should probably be discussed on the -arch mailling list. This is a general mechanism and should probably be considered on a grander scale and by a wider audiance than just netinet stuff |
I thought the set of macros provided can be evaluated at compile-time properly and in a portable way. However, they do break down for figuring out the signedness of char (int8_t / uint8_t) and short (int16_t / uint16_t).
A benefit of a macro would still be that it is a drop-in replacement for any post-increment operator. I think combining the macro with a new gauge type would make most sense.
An inline function approach would require the developer to adjust all these calls, when the width of a variable (type) is changed; also the function would need to read like
'''
gauge8_t inline
gauge8_incr(gauge8_t *gauge)
{
gauge8_t temp = *gauge; if (*gauge < UINT8_MAX) (*gauge)++; return(temp);
}
'''
In my testing, "*gauge++" was incrementing the pointer address after reading the value, not incrementing the value at the address. Also, at lower optimization levels, this inline function is not actually rolled up and made the simple 3 assembly operations (excluding mov's) as the Macro always is.
Yes, I'm well aware - rrs@ and other folks @NF have hand-optimized the first 3 or 4 cachelines of this struct, for the normal, in-sequence processing some time ago.
However, since that last optimization, the "/* cache line X */" comments haven't been kept up-to-date and currently don't correctly align any more.
Fields further down (cache lines 4+) are more or less fair game.
The one field I try to tackle here (around cacheline 5, maybe already 6 - or even split among both) really only needs to keep track of the first few (<10) rounds of RTT updates, and must never overflow (to bypass the processing, when that count is low again). As it's so far down in the structure, moving it into currently unused "alignment" space, and shrinking the size, the expectation is that this may actually help slightly, by shortening the TCPCB struct again.
As for
tcp_var.h:540
+ ((x) < UTYPE_MAX(x)) ? (x)++ : (x)
+
/*Ah, now this explains why I could not find CEILINC on my 12.1 machine... This set of macros and usage should probably be discussed on the -arch mailling list. This is a general mechanism and should probably be considered on a grander scale and by a wider audiance than just netinet stuff
Perhaps create a Diff for gauge types and associated accessor / utility functions and macros would be a better plan, and make this a one-time (hardcoded) change until these functions become available - without MFC?