Page MenuHomeFreeBSD

wg: Use plain loads and stores for so_so4 and so_so6.
AbandonedPublic

Authored by jhb on Jul 29 2022, 4:57 PM.
Tags
None
Referenced Files
F108578984: D35989.diff
Sun, Jan 26, 1:50 PM
Unknown Object (File)
Sat, Jan 18, 9:30 PM
Unknown Object (File)
Sat, Jan 18, 7:05 AM
Unknown Object (File)
Nov 25 2024, 4:22 PM
Unknown Object (File)
Nov 24 2024, 9:05 AM
Unknown Object (File)
Nov 23 2024, 11:39 AM
Unknown Object (File)
Nov 22 2024, 5:08 PM
Unknown Object (File)
Nov 22 2024, 4:28 AM
Subscribers

Details

Reviewers
markj
Summary

Atomics are not needed to read the variables after NET_EPOCH_ENTER due
to the embedded acquire barrier in epoch_enter.

Atomics are also not needed to write the variables prior to
NET_EPOCH_WAIT due to the embedded memory fence in epoch_wait.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 46681
Build 43570: arc lint + arc unit

Event Timeline

jhb requested review of this revision.Jul 29 2022, 4:57 PM

Hi @jhb ,
I'm a little confused as there's no sys/dev/wg source directory in current HEAD.

In D35989#817522, @zlei.huang_gmail.com wrote:

Hi @jhb ,
I'm a little confused as there's no sys/dev/wg source directory in current HEAD.

This is a diff I've primarily uploaded for markj@ to review. I have a tree that imports the wg(4) driver from wireguard-freebsd into sys/dev/wg and this change (along with others in the stack) are further changes to the current wireguard-freebsd driver.

This revision is now accepted and ready to land.Aug 8 2022, 5:13 PM

I don't see how the fences have anything to do with it.

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions and does not get ideas about re-reading them, both of which are no longer guaranteed with the patch at hand.

In D35989#838250, @mjg wrote:

I don't see how the fences have anything to do with it.

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions and does not get ideas about re-reading them, both of which are no longer guaranteed with the patch at hand.

We always assume that a load of an aligned pointer is atomic.

In D35989#838250, @mjg wrote:

I don't see how the fences have anything to do with it.

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions and does not get ideas about re-reading them, both of which are no longer guaranteed with the patch at hand.

We always assume that a load of an aligned pointer is atomic.

That assumption is not valid, although it almost always works out.

What is a serious problem is allowing the compiler to decide to re-read these values, meaning multiple uses of 'so4' in the same func can end up with different values. While the sample above is so short I don't believe any compiler would do it, I see 0 reason to make the proposed change. If anything, it is better to add these loads if in any doubt than to skip them.

In D35989#839387, @mjg wrote:
In D35989#838250, @mjg wrote:

I don't see how the fences have anything to do with it.

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions and does not get ideas about re-reading them, both of which are no longer guaranteed with the patch at hand.

We always assume that a load of an aligned pointer is atomic.

That assumption is not valid, although it almost always works out.

It's valid for the platforms FreeBSD runs on, this assumption is pervasive in the kernel. (Try enabling KCSAN.)

Above you say that the purpose of atomic_load/store is to provide atomicity of operations, while below you say it's to ensure that values are read or written exactly once so as to satisfy assumptions of some lockfree code. A third reason is to provide sanitizers with a list of "known" unserialized memory accesses.

What is a serious problem is allowing the compiler to decide to re-read these values, meaning multiple uses of 'so4' in the same func can end up with different values. While the sample above is so short I don't believe any compiler would do it, I see 0 reason to make the proposed change. If anything, it is better to add these loads if in any doubt than to skip them.

Looking at the code again, I agree. The accesses in wg_send() should use atomic_load to ensure that the compiler does not reload a socket pointer after it's compared with NULL. The accesses in wg_socket_set() are serialized by the softc mutex, so there we are merely assuming that the stores are atomic and atomic_store does not accomplish anything except educating KCSAN. Clearly I didn't look carefully enough when I reviewed the first time, sorry about that.

We do not have a consistent policy on whether to use atomics when performing serialized stores to memory which is accessed in unserialized contexts, such as here. KCSAN (as it is implemented in FreeBSD, though it's easy to change) and Linux require both sides to use atomics (READ_ONCE/WRITE_ONCE). So, what should we do in general?

In D35989#839387, @mjg wrote:
In D35989#838250, @mjg wrote:

I don't see how the fences have anything to do with it.

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions and does not get ideas about re-reading them, both of which are no longer guaranteed with the patch at hand.

We always assume that a load of an aligned pointer is atomic.

That assumption is not valid, although it almost always works out.

It's valid for the platforms FreeBSD runs on, this assumption is pervasive in the kernel. (Try enabling KCSAN.)

I see part of the point was lost, so I'm going to quote myself:

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions

(point one of two)

So it is not about architectures getting torn loads/stores when the compiler emitted a load/store to a properly aligned var. It is about a hypothetical possibility of the compiler splitting the op -- for example generating 8 one-byte loads instead of 1 eight-byte load. While I never ran into such insanity myself, I do know it happened in the Linux land.

We do not have a consistent policy on whether to use atomics when performing serialized stores to memory which is accessed in unserialized contexts, such as here. KCSAN (as it is implemented in FreeBSD, though it's easy to change) and Linux require both sides to use atomics (READ_ONCE/WRITE_ONCE). So, what should we do in general?

What we should do is play it safe and make it a policy that unlocked loads and stores always use atomic_* primitives. To that end type checking needs to be added with atomic vars becoming opaque so that accidental direct access fails to compile.

This is easy for all types except for pointers, which will perhaps require some hackery to retain checking for the intended pointed-to-type apart from it being accessed with atomic.

In D35989#839724, @mjg wrote:
In D35989#839387, @mjg wrote:
In D35989#838250, @mjg wrote:

I don't see how the fences have anything to do with it.

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions and does not get ideas about re-reading them, both of which are no longer guaranteed with the patch at hand.

We always assume that a load of an aligned pointer is atomic.

That assumption is not valid, although it almost always works out.

It's valid for the platforms FreeBSD runs on, this assumption is pervasive in the kernel. (Try enabling KCSAN.)

I see part of the point was lost, so I'm going to quote myself:

The point of atomic load/stores is so that the compiler does not get ideas of splitting them into multiple instructions

(point one of two)

So it is not about architectures getting torn loads/stores when the compiler emitted a load/store to a properly aligned var. It is about a hypothetical possibility of the compiler splitting the op -- for example generating 8 one-byte loads instead of 1 eight-byte load. While I never ran into such insanity myself, I do know it happened in the Linux land.

We do not have a consistent policy on whether to use atomics when performing serialized stores to memory which is accessed in unserialized contexts, such as here. KCSAN (as it is implemented in FreeBSD, though it's easy to change) and Linux require both sides to use atomics (READ_ONCE/WRITE_ONCE). So, what should we do in general?

What we should do is play it safe and make it a policy that unlocked loads and stores always use atomic_* primitives. To that end type checking needs to be added with atomic vars becoming opaque so that accidental direct access fails to compile.

This is easy for all types except for pointers, which will perhaps require some hackery to retain checking for the intended pointed-to-type apart from it being accessed with atomic.

Actually, this is not hard if we just use C11 atomics and use _Atomic directly, even for pointers.

sys/dev/wg/if_wg.c
790

The atomics here do not seem to be needed. The fence in NET_EPOCH_WAIT means the compiler cannot reorder the writes of the new values into the socket later than the calls to soclose(), and the fence in NET_EPOCH_ENTER below in the read should ensure that the reading code below can't read the old values?

I can drop this diff, but these specific atomics do seem over-much.

894

READ_ONCE semantics make sense here, yes.