Page MenuHomeFreeBSD

arm64: emulate swp/swpb instructions
ClosedPublic

Authored by kevans on Apr 18 2023, 6:24 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Oct 29, 1:05 PM
Unknown Object (File)
Oct 5 2024, 3:36 AM
Unknown Object (File)
Oct 4 2024, 7:13 PM
Unknown Object (File)
Oct 2 2024, 7:36 PM
Unknown Object (File)
Oct 2 2024, 5:32 AM
Unknown Object (File)
Oct 2 2024, 4:16 AM
Unknown Object (File)
Oct 2 2024, 12:17 AM
Unknown Object (File)
Sep 30 2024, 11:17 PM

Details

Summary

Add another undefined instruction handler for compat32 and watch out for
swp/swpb instructions. The atomic swap is done in assembly for simpler
fault handling that we can detect; atomic_fcmpset_{8,32} could have been
used, but it's harder to handle a fault as we have to do an atomic load
of the old value for the compare-and-swap operation. atomic_swap_8 is
easy to implement, but it suffers from the problem that there's no way
to set a fault handler and communicate the error back properly

swp_emulate_atomic just uses ll/sc to make for a simpler implementation.

SWP/SWPB were deprecated in ARMv6 and declared obsolete in ARMv7, but
this implementation is motivated by some proprietary software that still
uses SWP/SWPB. Because it's deprecated, emulation is pushed back behind
a sysctl that defaults to OFF in GENERIC so that it doesn't potentially
adversely affect package builds; it's unknown whether software may test
for a functional swp/swpb instruction with the desire of using it later,
so we err on the side of caution to ensure we don't end up with swp/swpb
use in freebsd/arm packages (which are built on aarch64).

The EMUL_SWP config option may be used to enable emulation by default in
environments where emulation is desired and won't really be turned off.

Sponsored by: Stormshield
Sponsored by: Klara, Inc.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 51224
Build 48115: arc lint + arc unit

Event Timeline

A few suggestions, feel free to follow/ignore.

sys/arm64/arm64/support.S
66 ↗(On Diff #120582)

I think you can use w7 rather than w1 & remove this instruction (and store through x1 below)

77 ↗(On Diff #120582)

You could move this & the identical mov below after 3:

sys/arm64/arm64/undefined.c
214

We could exit early if in Thumb mode as there is no Thumb swp or swpb instructions

This revision is now accepted and ready to land.Apr 20 2023, 9:17 AM

Shouldn't we check the condition flags of the instructions (bits 31-28) ?

sys/arm64/arm64/support.S
75 ↗(On Diff #120582)

This loop may goes for a long time. It may be a good idea to schedule other waiting threads in case it fails too many times.

This comment was removed by kevans.

Shouldn't we check the condition flags of the instructions (bits 31-28) ?

Whoops, will fix; thanks!

sys/arm64/arm64/support.S
75 ↗(On Diff #120582)

It shouldn't go for that long? This is a simple swap rather than compare-and-swap; I can't imagine we'd see enough contention in writing [x0] to worry about this.

sys/arm64/arm64/support.S
75 ↗(On Diff #120582)

In normal cases yes. But this might leave room for DOS attacks, in particular on CPUs that don't implement global monitors, in particular on non-cacheable memory regions. Details are given in the ARMv7 architecture reference manual, chapter 3.4.

I didn't dig more on this but I'm aware of a Linux kernel patch specifically taking care about that : https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1c5b51dfb7b4564008e0cadec5381a69e88b0d21

sys/arm64/arm64/support.S
75 ↗(On Diff #120582)

Another thread could be changing the memory causing the loop to not exit for a long time, e.g. https://xenbits.xen.org/xsa/advisory-295.txt

sys/arm64/arm64/support.S
75 ↗(On Diff #120582)

Ah, thanks for all of the references! Will fix and update shortly (along w/ sysctl and condition code bits)

Address review comments:

  • Avoid potential DoS, maybe_yield() after a couple failures
  • Add a sysctl, default OFF, for swp emulation
  • Add a kernel config(8) option so that some can turn it on by default
  • Actually look at condition codes and do the right thing (partially obtained from NetBSD, but massaged a bit to make it easier (for me) to audit)
This revision now requires review to proceed.Apr 25 2023, 8:12 PM
This revision is now accepted and ready to land.Apr 26 2023, 7:02 AM
sys/arm64/arm64/support.S
94 ↗(On Diff #121038)

Counting down from EMUL_SWP_ATTEMPTS and using cb[n]z could save us one instruction.

96 ↗(On Diff #121038)

It may be a good idea to avoid duplicating this attempt loop with the one below ending at l106

kevans added inline comments.
sys/arm64/arm64/support.S
96 ↗(On Diff #121038)

It's not clear to me if it's worth the additional complexity we'd need to jump back the correct sequence (swp or swpb)? I may be overlooking an 'obviously simple' solution to this, though.

sys/arm64/arm64/support.S
94 ↗(On Diff #121038)

I've made this change locally, but I don't know that it's necessarily worth updating the review just for that -- it's the obvious change up at the initial w5 assignment, add -> sub and cmp+b.eq -> cbz. This will be rolled into the final commit or next update if we have more changes worth another iteration...

sys/arm64/arm64/support.S
121 ↗(On Diff #121038)

You need EXIT_USER_ACCESS and SET_FAULT_HANDLER in this path

Per discussion on IRC, D39837 adds swapuword8/swapuword32 instead that will replace swp_emulate_atomic in this one (with the added bonus that it will use LSE swp/swpb when available)

Update to use swapuword* instead; retries implemented in C instead.

This revision now requires review to proceed.Apr 26 2023, 8:07 PM

Adjust for swapuword* -> swapueword* rename

I'd like to commit this within the next couple of days, if there are no further
objections...

This revision was not accepted when it landed; it landed in state Needs Review.May 15 2023, 3:42 PM
This revision was automatically updated to reflect the committed changes.