Page MenuHomeFreeBSD

Enable M_TSTMP in Chelsio cxgbe driver by creating a mechanism that can sync the time.
ClosedPublic

Authored by rrs on Aug 23 2022, 4:55 PM.
Tags
None
Referenced Files
F107364162: D36315.id110380.diff
Mon, Jan 13, 2:19 AM
Unknown Object (File)
Mon, Jan 6, 2:10 AM
Unknown Object (File)
Thu, Dec 26, 12:36 AM
Unknown Object (File)
Sat, Dec 21, 10:11 PM
Unknown Object (File)
Thu, Dec 19, 2:21 PM
Unknown Object (File)
Thu, Dec 19, 9:22 AM
Unknown Object (File)
Wed, Dec 18, 10:40 PM
Unknown Object (File)
Tue, Dec 17, 8:28 AM
Subscribers

Details

Summary

Chelsio has always been recording a timestamp in the mbuf (rcv_tstmp) but
not setting the M_TSTMP bit in the mbuf flags. This is because the timestamp
was just the free running 60bit clock. This change fixes that so that
we keep a synchronization by periodically (every 30 seconds after startup)
getting the timestamp and the current nanosecond time. We always keep
several sets around and the current one we always keep the current pair
and the previous pair of timestamps. This allows us to setup a ratio
between the two so we can correctly translate the time. Note that
we use special care to split the timestamp into seconds (per the clock tick)
and nanoseconds otherwise 64bit math would overflow.

Test Plan

We can test this by making sure (via bb logs in rack) that the time
is correct and moving forward precisely as it should (reading the BB logs
and studying the timestamps).

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

rrs requested review of this revision.Aug 23 2022, 4:55 PM
sys/dev/cxgbe/t4_main.c
1204

Maybe make this a separate function, and mark the if () as __predict_false() ?

1668

another added line..

2083–2087

nit: extra line added

sys/dev/cxgbe/t4_sge.c
1533

Don't you need to specify 0xfffffffffffffffULL here (and below, and maybe even on the 1000000000 above)?

Take out my debug stuff that I forgot to remove .. opps.. this will now
compile properly and no longer has the calibration query or the float up
of the actual hw_timestamp (had to expand the mbuf to have that and
thats not a good idea in general except for debugging).

rrs marked 2 inline comments as done.Aug 24 2022, 8:40 AM
rrs added inline comments.
sys/dev/cxgbe/t4_main.c
1204

I am not quite sure what you mean by if and predict false.. if you mean the "dump out" stuff I need to just remove
that its more debugging goo I should get rid of.

Or do you mean something else.. your comment appears to be on the t4_calibration_start() which has
no if attached to it when called so it can't be that.. tools are so much fun :)

Address Drew's nits and also get rid of the debugging dump out printing fun.

sys/dev/cxgbe/t4_sge.c
1523

Where is this used?

sys/dev/cxgbe/t4_sge.c
1523

Hmm I think I added that when I first copied over the mlx algorithm.. They used to use
a 10 bit shift to the right to avoid overflow.. which is what I originally copied over here.
But then I realized how bad the accuracy was and devised this new scheme. I should
nuke the line :)

Fix a comment spelling error as well as reduce out a / and %.

sys/dev/cxgbe/t4_main.c
1155

This access is illegal if hw_off_limits(sc) is true here. I think the suspend/reset routines that mark the hw off limits need to stop the calibration callout and we should just put an assert here that it is safe to access the hardware. The resume code should restart calibration.

sys/dev/cxgbe/t4_sge.c
2140–2141

This comment is no longer accurate and should be removed.

It strikes me that this is somewhat similar but different to how we manage timecounters today with sbintime_t, etc. Those manage to cache the scaling fraction computed for each interval along with using fixed-point math. I wonder if it would make sense to borrow more logic from there. That is, instead of storing a timespec for each calibration, use sbinuptime and store the values as sbintime_t values. You can then simplify the math quite a bit by avoiding splitting up seconds vs non-seconds, etc. using fixed-point math. At the end you just use sbttons to get nanoseconds.

sys/dev/cxgbe/adapter.h
1005
sys/dev/cxgbe/t4_main.c
331

The second sysctl uses "normal" instead of "slow", so probably this should be something like "t4_fast_2_normal" instead to be consistent?

1141–1143
1164–1165

This could just be:

atomic_store_rel_int(&cur->gen, 0);

This allows the barrier to be encoded on the store instruction on arches like arm64.

1196
2083
2454
sys/dev/cxgbe/t4_sge.c
1526

Why c4? Should this be t4_tstmp_to_ns?

1544
1568
1574
2137–2138

This comment is now stale and needs updating.

2141–2145
rrs marked 15 inline comments as done.Sep 9 2022, 11:34 AM
rrs added inline comments.
sys/dev/cxgbe/t4_main.c
1155

The suspend/resume routines do stop and restart the callout so I will add a KASSERT here.

rrs marked an inline comment as done.Sep 9 2022, 11:39 AM

John,

Thinking about your "bintime" comment, I think this is probably a good idea, but
I suspect it needs to be applied even broader than in here. You have very similar
code in the Mlx driver since this was my original model. I have a review out on
that side as well making the more highly precision timer fix there.

I would rather see this go in so we (at NF) can start using it and then maybe
that could be a follow on project...

Addresses comments by jhb & np

Please go ahead. I'll follow up with commits that make this work with suspend/reset, when hw access is not allowed.

This revision is now accepted and ready to land.Sep 20 2022, 5:23 PM