Chelsio has always been recording a timestamp in the mbuf (rcv_tstmp) but
not setting the M_TSTMP bit in the mbuf flags. This is because the timestamp
was just the free running 60bit clock. This change fixes that so that
we keep a synchronization by periodically (every 30 seconds after startup)
getting the timestamp and the current nanosecond time. We always keep
several sets around and the current one we always keep the current pair
and the previous pair of timestamps. This allows us to setup a ratio
between the two so we can correctly translate the time. Note that
we use special care to split the timestamp into seconds (per the clock tick)
and nanoseconds otherwise 64bit math would overflow.
Details
We can test this by making sure (via bb logs in rack) that the time
is correct and moving forward precisely as it should (reading the BB logs
and studying the timestamps).
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
Take out my debug stuff that I forgot to remove .. opps.. this will now
compile properly and no longer has the calibration query or the float up
of the actual hw_timestamp (had to expand the mbuf to have that and
thats not a good idea in general except for debugging).
sys/dev/cxgbe/t4_main.c | ||
---|---|---|
1204 | I am not quite sure what you mean by if and predict false.. if you mean the "dump out" stuff I need to just remove Or do you mean something else.. your comment appears to be on the t4_calibration_start() which has |
sys/dev/cxgbe/t4_sge.c | ||
---|---|---|
1523 | Where is this used? |
sys/dev/cxgbe/t4_sge.c | ||
---|---|---|
1523 | Hmm I think I added that when I first copied over the mlx algorithm.. They used to use |
sys/dev/cxgbe/t4_main.c | ||
---|---|---|
1155 | This access is illegal if hw_off_limits(sc) is true here. I think the suspend/reset routines that mark the hw off limits need to stop the calibration callout and we should just put an assert here that it is safe to access the hardware. The resume code should restart calibration. | |
sys/dev/cxgbe/t4_sge.c | ||
2140–2141 | This comment is no longer accurate and should be removed. |
It strikes me that this is somewhat similar but different to how we manage timecounters today with sbintime_t, etc. Those manage to cache the scaling fraction computed for each interval along with using fixed-point math. I wonder if it would make sense to borrow more logic from there. That is, instead of storing a timespec for each calibration, use sbinuptime and store the values as sbintime_t values. You can then simplify the math quite a bit by avoiding splitting up seconds vs non-seconds, etc. using fixed-point math. At the end you just use sbttons to get nanoseconds.
sys/dev/cxgbe/adapter.h | ||
---|---|---|
1005 | ||
sys/dev/cxgbe/t4_main.c | ||
331 | The second sysctl uses "normal" instead of "slow", so probably this should be something like "t4_fast_2_normal" instead to be consistent? | |
1141–1143 | ||
1164–1165 | This could just be: atomic_store_rel_int(&cur->gen, 0); This allows the barrier to be encoded on the store instruction on arches like arm64. | |
1196 | ||
2083 | ||
2454 | ||
sys/dev/cxgbe/t4_sge.c | ||
1526 | Why c4? Should this be t4_tstmp_to_ns? | |
1544 | ||
1568 | ||
1574 | ||
2137–2138 | This comment is now stale and needs updating. | |
2141–2145 |
sys/dev/cxgbe/t4_main.c | ||
---|---|---|
1155 | The suspend/resume routines do stop and restart the callout so I will add a KASSERT here. |
John,
Thinking about your "bintime" comment, I think this is probably a good idea, but
I suspect it needs to be applied even broader than in here. You have very similar
code in the Mlx driver since this was my original model. I have a review out on
that side as well making the more highly precision timer fix there.
I would rather see this go in so we (at NF) can start using it and then maybe
that could be a follow on project...
Please go ahead. I'll follow up with commits that make this work with suspend/reset, when hw access is not allowed.