Implement missing logic to allow in-kernel VFP operation for ARMv7 NEON.
The implementation is strongly based on arm64 code.
It introduces a family of fpu_kern_* functions to enable the usage of VFP instructions in kernel.
Apart from that the existing VFP logic was modified, taking into account that the state of the VFP registers can now be modified in the kernel.
Details
- Reviewers
mw imp wma andrew - Commits
- rG6926e2699ae5: arm: Add support for using VFP in kernel
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
Revert some changes in ARMADA38X and std.armv7 conf files.
neon/ossl support for ARMADA38X moved to another commit: https://reviews.freebsd.org/D37441
sys/arm/arm/vfp.c | ||
---|---|---|
421 | This line is a bit too long, please break it as below: ("Mangled pcb_fpusaved %x %p %p", pcb->pcb_fpflags, pcb->pcb_fpusaved, &pcb->pcb_fpustate)); | |
sys/arm/conf/ARMADA38X | ||
60 | Please add comment above or add to a relevant group. Also align spacing to the other entries. | |
61 | Shouldn't this change go to https://reviews.freebsd.org/D37441 ? |
I'm not sure this is correct in the case we have a userspace thread calling into the kernel when the kernel uses the VFP.
When we enter the fpu state we save the userspace vfp registers to pcb->pcb_vfpstate (because pcb->pcb_vfpsaved == NULL). We then enable the VFP, however if a context switch was to occur before restoring the old vfp registers cpu_switch will also try to save the vfp registers to pcb->pcb_vfpstate. This will lead to the kernel registers trashing the userspace values.
You also need to update share/man/man9/fpu_kern.9
@andrew
Thanks for looking into it.
However, isn't that why we always pass ctx to fpu_kern_enter? In the end of that function (line 425) the vfpsaved pointer is swapped with ctx->prev which apparently should provide additional place for storing "old" vfp state. Therefore, in context switch any additional call to vfp_save_state should cause storing vfp state in temporary &ctx->state. Is it a case you were referring to?
I'm just curious as this approach works well on arm64 and I never saw issues with in-kernel vfp, however I admit I might not pushed it hard enough. Anyway, if there is a bug in this code I think it must be resolved in other platforms as well.
Therefore, in context switch any additional call to vfp_save_state should cause storing vfp state in temporary &ctx->state. Is it a case you were referring to?
The problem is cpu_switch doesn't call vfp_save_state on arm so it won't use the pcb->pcb_vfpsaved pointer.
I'm just curious as this approach works well on arm64 and I never saw issues with in-kernel vfp, however I admit I might not pushed it hard enough. Anyway, if there is a bug in this code I think it must be resolved in other platforms as well.
On arm64 we always use pcb->pcb_vfpsaved to find the correct state to use in context switch so make sure it is always valid (except in the savectx special case).
If you make sure the pcb_vfpsaved pointer is always valid, and call vfp_save_state where we are currently calling vfp_store the save side should work the same as on arm64. For restore you'll need to update vfp_bounce to handle a kernel VFP exception.
On arm64 we always use pcb->pcb_vfpsaved to find the correct state to use in context switch so make sure it is always valid (except in the savectx special case).
If you make sure the pcb_vfpsaved pointer is always valid, and call vfp_save_state where we are currently calling vfp_store the save side should work the same as on arm64. For restore you'll need to update vfp_bounce to handle a kernel VFP exception.
Yes, you're right, good point. We will follow up on this.
Fix context switch bug mentioned by @andrew.
Replace vfp_store with vfp_save_state in proper way.
vfp_bounce updated according to comment.
Update fpu_kern man
Move sys/arm/conf/ARMADA38X from https://reviews.freebsd.org/D37419
sys/arm/arm/exec_machdep.c | ||
---|---|---|
123 ↗ | (On Diff #113939) | (set/get)_vfpcontext logic will be modified to closer match aarch64 approach. Patch in progress. |
The changes to get/set_vfpcontext are a userspace ABI change. The layout of mcontext_vfp_t has changed and existing binaries don't know about the _MC_FP_VALID flag.
I also think set/get_fpcontext on arm64 is incorrect. It assumes td == curthread and the critical section could be narrowed to just calling into the vfp functions.
I've created D37994, D37998, and D38000 to clean up VFP handling on arm64. This should be updated in a similar way.
I'm also not sure how adding _MC_FP_VALID helps. It is likely a mistake to have added it on arm64, but is now part of the ABI so will be difficult to change. It's not part of the arm ABI so we sould need a good reason to break ABI when adding it.
- Fix a shallow copy bug in cpu_fork
- Adjust the logic to match the recent changes made to arm64 implementation by andrew@.
- Remove the _MC_FP_VALID, as it's not needed.
sys/arm/include/fpu.h | ||
---|---|---|
7 | We need it to get the declaration of mcontext_vfp_t. |
sys/arm/include/pcb.h | ||
---|---|---|
73 | You can remove PCB_FP_USERMASK |
On the lists, I've provided 3 small C/C++ programs that when used,
sometimes mixed with some use of gdb or lldb, get assert panics
from a debug kernel or corrupted floating point datanow. Look at the
new text in the messages indicated below to see the programs with
instructions in the comments for how to build and test:
"Assertion td == curthread failed"
(debugger backtrace generation for breakpoint hit):
https://lists.freebsd.org/archives/dev-commits-src-main/2023-February/013035.html
"Called fill_fpregs while the kernel is using the VFP"
(core dump generation with multithreading involved):
https://lists.freebsd.org/archives/dev-commits-src-main/2023-February/013050.html
Foating point data corruption across independent threads
(floating point value from wrong thread sometimes show up):
https://lists.freebsd.org/archives/dev-commits-src-main/2023-February/013061.html
As stands, mixing multi-threading and floating point use on armv7
main [so: 14] is problematical after the changes.
The reports also show examples of the failures, with backtraces for the
panic examples.
In a private E-mail, Kornel D. confirmed duplicating the first one listed above.
sys/arm/include/reg.h | ||
---|---|---|
17 ↗ | (On Diff #116496) | The fpr_r naming convention here is unlike anywhere else and is not handled by devel/libunwind . That last leads to 6000+ skipped ports from building for armv7. (No claim that there would not be a next problem in the way, however.) For reference: # grep -r "\<fpr_r\>" /usr/main-src/ | more . . . ignoring debian no such file notices . . . /usr/main-src/sys/arm/arm/machdep_kdb.c: memcpy(regs->fpr_r, pcb->pcb_vfpstate.reg, /usr/main-src/sys/arm/arm/machdep_kdb.c: sizeof(regs->fpr_r)); /usr/main-src/sys/arm/arm/machdep_kdb.c: memcpy(pcb->pcb_vfpstate.reg, regs->fpr_r, sizeof(regs->fpr_r)); /usr/main-src/sys/arm/include/reg.h: __uint64_t fpr_r[32]; Everything else uses fpr (no _r suffix). devel/libunwind expects the fpr spelling ( from http://ampere2.nyi.freebsd.org/data/main-armv7-default/pb73012d372f5_s91b2da1370/logs/errors/libunwind-20211201_1.log ): --- ptrace/_UPT_access_fpreg.lo --- ptrace/_UPT_access_fpreg.c:107:25: error: no member named 'fpr' in 'struct fpreg' memcpy(&fpreg.fpr[reg], val, sizeof(unw_fpreg_t)); ~~~~~ ^ ptrace/_UPT_access_fpreg.c:123:30: error: no member named 'fpr' in 'struct fpreg' memcpy(val, &fpreg.fpr[reg], sizeof(unw_fpreg_t)); ~~~~~ ^ 2 errors generated. *** [ptrace/_UPT_access_fpreg.lo] Error code 1 |
sys/arm/include/reg.h | ||
---|---|---|
17 ↗ | (On Diff #116496) | Sorry for the breakage. I'll take a look at this tomorrow. |
Going to https://reviews.freebsd.org/D37419 and clicking on the link shows the log just fine for me. (I just tried it again.) ampere2.nyi.freebsd.org is a well known FreeBSD builder machine.
I'll note that using https:// does not work for me: that complains about not being able to connect to the server. It has to be http:// from my context.
A list of the FreeBSD servers is maintained at: https://github.com/bdrewery/pkg-status.freebsd.org/blob/master/servers.txt
My guess here is that I have IPv6 access via my ISP (not just IPv4) and that you do not have IPv6 access. I did not find any official listing of the protocols supported but I did find material claiming IPv6-only for some of the servers as an explanation of why some folks in a 2021 discussion could vs. could-not get information from the servers of interest in that discussion.