Page MenuHomeFreeBSD

Modify lock_delay() to increase the delay time after spinning
ClosedPublic

Authored by trasz on Nov 23 2020, 1:25 PM.
Tags
None
Referenced Files
F108702392: D27331.diff
Mon, Jan 27, 10:15 AM
Unknown Object (File)
Sat, Jan 18, 9:46 PM
Unknown Object (File)
Dec 24 2024, 8:08 PM
Unknown Object (File)
Dec 14 2024, 6:15 PM
Unknown Object (File)
Nov 16 2024, 6:33 AM
Unknown Object (File)
Nov 14 2024, 4:48 PM
Unknown Object (File)
Oct 17 2024, 11:25 PM
Unknown Object (File)
Oct 14 2024, 2:58 PM

Details

Summary

Modify lock_delay() to increase the delay time after spinning,
not before. Previously we would spin at least twice instead of once.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 34973
Build 31966: arc lint + arc unit

Event Timeline

trasz requested review of this revision.Nov 23 2020, 1:25 PM

It is quite unclear if this is a good idea as is, key point being that these locks are unfair and fiddling with this can further negatively affect the achieved fairness. In particular the current "wait 2 spins" setup make fresh arrivals less likely to grab the lock from under threads which already wait. Replacing this with fair locks in a manner which does not negatively impact single-threaded performance is complicated. Any changes here should come with several benchmarks, showcasing longest time to grab the lock and perhaps some fiddling with min/max for backoff.

In D27331#610694, @mjg wrote:

Any changes here should come with several benchmarks, showcasing longest time to grab the lock and perhaps some fiddling with min/max for backoff.

I don't have specific benchmark details, but I can definitively say we found this change to have a significant impact on throughput and latency in our performance testing (when compared to the BSD10 code which calls cpu_spinwait() directly).
--ap

Sure, it is a tradeoff, helps some workloads and hurts others. Can you share the hw spec? Most notably cores/threads, cpu model and socket count?

I'll add some extra measures here and give you a different patch to test.

In D27331#610723, @mjg wrote:

Can you share the hw spec? Most notably cores/threads, cpu model and socket count?

On the specific test that saw an improvement, we ran a 2-socket, Intel Sandy Bridge, 8-core/socket, 2.1GHz.

I get what you're saying. If you have something different to test out, we can try it.
--ap

So I got myself an Ivy Bridge for testing (among some other boxes). I failed to get benefit from the patch at hand, but I did not worsen the stuff I was worried about.

However, there is still significant unfairness here and making it less of a problem will negatively affect throughput. The real thing to do is to drop backoff altogether, but that's quite a lot of work.

tl;d i think the patch can go in for the time being

This revision is now accepted and ready to land.Nov 25 2020, 9:39 PM