arm64: set FPEN if we're stuck with HCR_EL2.E2H
Closed · Public

Authored by kevans on Feb 28 2023, 7:13 AM.

Details

Reviewers

andrew
manu

Group Reviewers

arm64

Commits

rGdc8616edc580: arm64: set FPEN if we're stuck with HCR_EL2.E2H

Summary

On Apple Silicon systems, E2H can't actually be cleared; we're stuck
with it. Check it again when we're setting up CPTR_EL2 and set FPEN
appropriately to avoid later trapping to EL2 on writes to SIMD
registers.
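To illustrate why the wrong CPTR_EL2 value leads to a trap, here is a minimal C sketch of the two register layouts. The bit positions (TFP at bit 10 when E2H == 0, FPEN at bits 21:20 when E2H == 1) follow the Arm documentation for CPTR_EL2; the `SK_` macro names and the helper `fp_simd_may_trap` are hypothetical and not part of the FreeBSD tree. The model is deliberately conservative: it treats any FPEN value other than 0b11 as "may trap".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions per the Arm CPTR_EL2 description. */
#define SK_CPTR_TFP        (UINT64_C(1) << 10)   /* E2H == 0 layout: trap FP/SIMD */
#define SK_CPTR_FPEN_SHIFT 20                    /* E2H == 1 layout: FPEN field */
#define SK_CPTR_FPEN_MASK  (UINT64_C(0x3) << SK_CPTR_FPEN_SHIFT)

/*
 * Conservative model: returns true if some FP/SIMD access could trap
 * under the given CPTR_EL2 value. With E2H set, only FPEN == 0b11
 * disables trapping entirely; with E2H clear, the TFP bit enables it.
 */
static inline bool
fp_simd_may_trap(uint64_t cptr_el2, bool e2h)
{
	if (e2h)
		return ((cptr_el2 & SK_CPTR_FPEN_MASK) != SK_CPTR_FPEN_MASK);
	return ((cptr_el2 & SK_CPTR_TFP) != 0);
}
```

For example, assuming an E2H == 0 style initial value of 0x33ff (an assumed RES1 pattern), `fp_simd_may_trap(0x33ff, false)` is false, but the same value interpreted with E2H stuck at 1 leaves FPEN at 0b00, so `fp_simd_may_trap(0x33ff, true)` is true.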

Diff Detail

Repository

rG FreeBSD src repository

Event Timeline

kevans created this revision. · Feb 28 2023, 7:13 AM

Herald added a reviewer: manu. · Feb 28 2023, 7:13 AM

Herald added subscribers: emaste, imp.

kevans requested review of this revision. · Feb 28 2023, 7:13 AM

Harbormaster completed remote builds in B50058: Diff 118054. · Feb 28 2023, 7:13 AM

andrew added inline comments. · Feb 28 2023, 8:26 AM

sys/arm64/arm64/locore.S
264	I think we'll need an `isb` between setting `hcr_el2` and reading it to ensure the change is visible to instructions after it (i.e. the `mrs`)
sys/arm64/include/hypervisor.h
51	We should have two sets of macros. The current ones are only valid when `HCR_EL2.E2H == 0`. `TTA` and `TCPAC` are the only macros that are shared between the two.
95	This is normally named `..._SHIFT` in this file and `armreg.h`

Address review commentary:

Add an isb
Follow convention for shift naming

I also went back to https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/CPTR-EL2--Architectural-Feature-Trap-Register--EL2-
and discovered where I misunderstood the differences (the second table depicting
HCR_EL2.E2H == 0 didn't stick out on my first read, so there was some confusion
there) -- for E2H == 1, we shouldn't be using CPTR_RES1 at all as there are no
RES1 bits.
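The resulting rule can be sketched in C as follows. This is only a model of the decision, not the code in `locore.S`. `HCR_E2H`, `CPTR_RES1` and `CPTR_FPEN` are names used in this review, but the numeric values below are assumptions: bit 34 for E2H and bits 21:20 for FPEN follow the Arm register descriptions, and 0x33ff for `CPTR_RES1` should be checked against `hypervisor.h`. The helper `cptr_el2_boot_value` is hypothetical.

```c
#include <stdint.h>

/* Values are assumptions for this sketch; check hypervisor.h. */
#define HCR_E2H          (UINT64_C(1) << 34)
#define CPTR_RES1        UINT64_C(0x000033ff)   /* E2H == 0 layout only */
#define CPTR_FPEN_SHIFT  20
#define CPTR_FPEN        (UINT64_C(0x3) << CPTR_FPEN_SHIFT)

/*
 * Pick the initial CPTR_EL2 value from the HCR_EL2 value read back
 * after attempting to clear E2H. If E2H is still set, the E2H == 1
 * layout applies: it has no RES1 bits, and FPEN = 0b11 disables
 * FP/SIMD trapping.
 */
static inline uint64_t
cptr_el2_boot_value(uint64_t hcr_el2)
{
	if ((hcr_el2 & HCR_E2H) != 0)
		return (CPTR_FPEN);
	return (CPTR_RES1);
}
```

Under these assumptions, a read-back with E2H still set yields 0x300000, and one with E2H clear yields 0x33ff.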

Harbormaster completed remote builds in B50066: Diff 118083. · Feb 28 2023, 4:19 PM

andrew accepted this revision. · Feb 28 2023, 8:39 PM

andrew added inline comments.

sys/arm64/arm64/locore.S
282–297	I think you can write this as (but might have `x3` and `x4` backwards in the `csel` instruction):
	tst x4, #HCR_E2H
	mov x3, #CPTR_RES1
	mov x4, #CPTR_FPEN
	csel x2, x3, x4, eq
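One way to check the operand order in the suggested sequence is to model the four instructions in C. This is a sketch under stated assumptions: `x4` is taken to hold the HCR_EL2 value read back earlier, the constant values are assumed as noted in the comments, and `struct regs`, `tst`, `csel_eq` and `suggested_sequence` are hypothetical helpers. The model relies on two architectural facts: `tst` sets the Z flag when the AND result is zero, and `csel Xd, Xn, Xm, eq` selects `Xn` when Z is set and `Xm` otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values are assumptions for this sketch; check hypervisor.h. */
#define HCR_E2H    (UINT64_C(1) << 34)
#define CPTR_RES1  UINT64_C(0x000033ff)
#define CPTR_FPEN  (UINT64_C(0x3) << 20)

struct regs {
	uint64_t x2, x3, x4;
	bool z;			/* Z condition flag */
};

/* tst: AND the operands, discard the result, set Z if it was zero. */
static inline void
tst(struct regs *r, uint64_t a, uint64_t imm)
{
	r->z = ((a & imm) == 0);
}

/* csel with the eq condition: first operand if Z is set, else second. */
static inline uint64_t
csel_eq(const struct regs *r, uint64_t n, uint64_t m)
{
	return (r->z ? n : m);
}

static inline uint64_t
suggested_sequence(uint64_t hcr_el2)
{
	struct regs r = { .x4 = hcr_el2 };

	tst(&r, r.x4, HCR_E2H);			/* tst  x4, #HCR_E2H */
	r.x3 = CPTR_RES1;			/* mov  x3, #CPTR_RES1 */
	r.x4 = CPTR_FPEN;			/* mov  x4, #CPTR_FPEN (flags unchanged) */
	r.x2 = csel_eq(&r, r.x3, r.x4);		/* csel x2, x3, x4, eq */
	return (r.x2);
}
```

In this model the order as written selects `CPTR_RES1` when E2H reads back clear and `CPTR_FPEN` when it is still set. Overwriting `x4` after the `tst` is harmless because `mov` does not modify the flags.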

This revision is now accepted and ready to land. · Feb 28 2023, 8:39 PM

kevans marked an inline comment as done. · Feb 28 2023, 10:08 PM

kevans added inline comments.

sys/arm64/arm64/locore.S
282–297	Ah, that's much cleaner, thanks! I've confirmed that it still works as expected on both an M1 and a Mt. Snow; I've reverted the HCR_* part of the hypervisor.h changes and will commit soon.