Page MenuHomeFreeBSD

mlx5 M_TSTMP accuracy looses quite a bit of precision so lets fix it.
ClosedPublic

Authored by rrs on Aug 24 2022, 9:39 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Nov 10, 2:38 AM
Unknown Object (File)
Oct 10 2024, 7:26 AM
Unknown Object (File)
Oct 10 2024, 7:26 AM
Unknown Object (File)
Oct 10 2024, 7:26 AM
Unknown Object (File)
Oct 10 2024, 7:25 AM
Unknown Object (File)
Oct 10 2024, 7:25 AM
Unknown Object (File)
Sep 30 2024, 5:26 AM
Unknown Object (File)
Sep 22 2024, 3:49 PM
Subscribers

Details

Summary

The way that the clock is synchronized between the system and the current mlx5 for the purposes of the M_TSTMP
being carried we loose a lot of precision. Instead lets change the math that calculates this to separate out
the seconds/nanoseconds and operate on the two values so we don't get overflow instead of just
shifting the value down and loosing precision.

Test Plan

One can test with BB logs to see the timestamp and I have also tested this
with added query mechanisms (not included in the patch) so that I could compare
old and new timestamps generated and all of those to the real floating point
result. This new algorithm now gets us to be within the 1/10ths of a nanosecond
where you would expect it to be for integer math.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

rrs requested review of this revision.Aug 24 2022, 9:39 AM
sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
253

Why can't you subtract the HW timestamps AS-IS, and then do the division and modulus on the result? Will save one round of expensive division and modulus?

261

Can negative values happen in res_s ?

sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
253

Good point

261

No I don't think its possible because the min time you can set in the callout is 1 second. And
then you have to have some time after that. So it should never be negative here... If you want
we could add a check for that.. maybe a KASSERT would be best.

sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
241

Is this an advanced English, or a typo in 'equivalent'?

406–409

Why the check is needed now?

sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
261

No reason to panic! Maybe a warn_once() from the LinuxKPI is better?

sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
406–409

If you can't get sync (check the lockless code) and see gen = 0, you return 0.
In such a case you *cannot* convert the timestamp so you need to *not* set the bit.

sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
241

No its my very poor spelling ;)

Fix the spelling pointed out by kib. And take Hans suggestion to cut out a / and *.

sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
258

Can hw_clk_div be zero?

274

Should remove all of this comments and commented code.

rrs marked 2 inline comments as done.Aug 30 2022, 5:49 AM
rrs added inline comments.
sys/dev/mlx5/mlx5_en/mlx5_en_rx.c
232–249

Hans

Here is the comment I was referring too about divisor.. same divisor basically except
its >> 10

258

No it cannot.. Same for the original code I believe there is a comment in the
block somewhere to that effect. The reason behind it makes sense since
you have 1 second or more (min timeout granularity for the callout is 1 second)
between clbr_hw_curr and clbr_hw_prev and the only way that could happen
is if the hw clock stopped. And I believe the comment put in originally by yall
says that that case is covered by gen being set to 0 and thus a return of
0 (which is why I added the if 0 you drop the M_TSTMP).

274

Sure I left it in for your reference.

Code looks sane to me.

sys/dev/mlx5/mlx5_en/mlx5_en_main.c
4801

Are you sure you don't need to cast to ULL or uint64_t to have a proper 64-bit multiplication here?

sys/dev/mlx5/mlx5_en/mlx5_en_main.c
4801

Let me add that, I doubt its needed but it would not hurt :)

Address Hans ULL comment for clock freq

Take out the commented out old code.

kib added inline comments.
sys/dev/mlx5/mlx5_en/mlx5_en_main.c
4801

The cast is definitely not needed.

This revision is now accepted and ready to land.Sep 19 2022, 10:56 AM

I'm good with this. @kib : Make sure to push the patch internally aswell.

This revision was automatically updated to reflect the committed changes.
rrs marked an inline comment as done.