Page MenuHomeFreeBSD

Enable M_TSTMP in Chelsio cxgbe driver by creating a mechanism that can sync the time.
ClosedPublic

Authored by rrs on Aug 23 2022, 4:55 PM.
Tags
None
Referenced Files
F102636996: D36315.diff
Fri, Nov 15, 4:00 AM
Unknown Object (File)
Wed, Nov 13, 7:30 AM
Unknown Object (File)
Wed, Nov 13, 5:11 AM
Unknown Object (File)
Fri, Oct 18, 5:16 AM
Unknown Object (File)
Oct 12 2024, 2:43 AM
Unknown Object (File)
Oct 7 2024, 3:06 AM
Unknown Object (File)
Oct 4 2024, 10:00 PM
Unknown Object (File)
Oct 2 2024, 2:02 PM
Subscribers

Details

Summary

Chelsio has always been recording a timestamp in the mbuf (rcv_tstmp) but
not setting the M_TSTMP bit in the mbuf flags. This is because the timestamp
was just the free running 60bit clock. This change fixes that so that
we keep a synchronization by periodically (every 30 seconds after startup)
getting the timestamp and the current nanosecond time. We always keep
several sets around and the current one we always keep the current pair
and the previous pair of timestamps. This allows us to setup a ratio
between the two so we can correctly translate the time. Note that
we use special care to split the timestamp into seconds (per the clock tick)
and nanoseconds otherwise 64bit math would overflow.

Test Plan

We can test this by making sure (via bb logs in rack) that the time
is correct and moving forward precisely as it should (reading the BB logs
and studying the timestamps).

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

rrs requested review of this revision.Aug 23 2022, 4:55 PM
sys/dev/cxgbe/t4_main.c
1204

Maybe make this a separate function, and mark the if () as __predict_false() ?

1670

another added line..

2088

nit: extra line added

sys/dev/cxgbe/t4_sge.c
1516

Don't you need to specify 0xfffffffffffffffULL here (and below, and maybe even on the 1000000000 above)?

Take out my debug stuff that I forgot to remove .. opps.. this will now
compile properly and no longer has the calibration query or the float up
of the actual hw_timestamp (had to expand the mbuf to have that and
thats not a good idea in general except for debugging).

rrs marked 2 inline comments as done.Aug 24 2022, 8:40 AM
rrs added inline comments.
sys/dev/cxgbe/t4_main.c
1204

I am not quite sure what you mean by if and predict false.. if you mean the "dump out" stuff I need to just remove
that its more debugging goo I should get rid of.

Or do you mean something else.. your comment appears to be on the t4_calibration_start() which has
no if attached to it when called so it can't be that.. tools are so much fun :)

Address Drew's nits and also get rid of the debugging dump out printing fun.

sys/dev/cxgbe/t4_sge.c
1506

Where is this used?

sys/dev/cxgbe/t4_sge.c
1506

Hmm I think I added that when I first copied over the mlx algorithm.. They used to use
a 10 bit shift to the right to avoid overflow.. which is what I originally copied over here.
But then I realized how bad the accuracy was and devised this new scheme. I should
nuke the line :)

Fix a comment spelling error as well as reduce out a / and %.

sys/dev/cxgbe/t4_main.c
1155

This access is illegal if hw_off_limits(sc) is true here. I think the suspend/reset routines that mark the hw off limits need to stop the calibration callout and we should just put an assert here that it is safe to access the hardware. The resume code should restart calibration.

sys/dev/cxgbe/t4_sge.c
2127

This comment is no longer accurate and should be removed.

It strikes me that this is somewhat similar but different to how we manage timecounters today with sbintime_t, etc. Those manage to cache the scaling fraction computed for each interval along with using fixed-point math. I wonder if it would make sense to borrow more logic from there. That is, instead of storing a timespec for each calibration, use sbinuptime and store the values as sbintime_t values. You can then simplify the math quite a bit by avoiding splitting up seconds vs non-seconds, etc. using fixed-point math. At the end you just use sbttons to get nanoseconds.

sys/dev/cxgbe/adapter.h
994
sys/dev/cxgbe/t4_main.c
331

The second sysctl uses "normal" instead of "slow", so probably this should be something like "t4_fast_2_normal" instead to be consistent?

1141–1143
1164–1165

This could just be:

atomic_store_rel_int(&cur->gen, 0);

This allows the barrier to be encoded on the store instruction on arches like arm64.

1196
2085
2455
sys/dev/cxgbe/t4_sge.c
1509

Why c4? Should this be t4_tstmp_to_ns?

1527
1551
1557
2122

This comment is now stale and needs updating.

2128–2129
rrs marked 15 inline comments as done.Sep 9 2022, 11:34 AM
rrs added inline comments.
sys/dev/cxgbe/t4_main.c
1155

The suspend/resume routines do stop and restart the callout so I will add a KASSERT here.

rrs marked an inline comment as done.Sep 9 2022, 11:39 AM

John,

Thinking about your "bintime" comment, I think this is probably a good idea, but
I suspect it needs to be applied even broader than in here. You have very similar
code in the Mlx driver since this was my original model. I have a review out on
that side as well making the more highly precision timer fix there.

I would rather see this go in so we (at NF) can start using it and then maybe
that could be a follow on project...

Addresses comments by jhb & np

Please go ahead. I'll follow up with commits that make this work with suspend/reset, when hw access is not allowed.

This revision is now accepted and ready to land.Sep 20 2022, 5:23 PM