Page MenuHomeFreeBSD

arm: Add support for using VFP in kernel
ClosedPublic

Authored by kd on Nov 17 2022, 2:47 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jan 24, 5:39 PM
Unknown Object (File)
Tue, Jan 14, 8:34 PM
Unknown Object (File)
Tue, Jan 14, 8:24 PM
Unknown Object (File)
Tue, Jan 14, 6:53 PM
Unknown Object (File)
Tue, Jan 14, 2:13 PM
Unknown Object (File)
Tue, Jan 14, 10:30 AM
Unknown Object (File)
Mon, Jan 13, 7:37 PM
Unknown Object (File)
Thu, Jan 9, 5:42 PM

Details

Summary

Implement missing logic to allow in-kernel VFP operation for ARMv7 NEON.
The implementation is strongly based on arm64 code.
It introduces a family of fpu_kern_* functions to enable the usage of VFP instructions in kernel.
Apart from that the existing VFP logic was modified, taking into account that the state of the VFP registers can now be modified in the kernel.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/arm/conf/ARMADA38X
25 ↗(On Diff #113234)

This file should be a separate review as it adds support for neon/ossl to the specific board.

85 ↗(On Diff #113234)

revert that change

113 ↗(On Diff #113234)

should be removed

sys/arm/conf/std.armv7
69 ↗(On Diff #113234)

that lines should be left intact

mkoz_semihalf.com marked 3 inline comments as done.

Revert some changes in ARMADA38X and std.armv7 conf files.
neon/ossl support for ARMADA38X moved to another commit: https://reviews.freebsd.org/D37441

mw requested changes to this revision.Nov 18 2022, 6:07 PM
mw added inline comments.
sys/arm/arm/vfp.c
423

This line is a bit too long, please break it as below:

	    ("Mangled pcb_fpusaved %x %p %p", pcb->pcb_fpflags, pcb->pcb_fpusaved,
            &pcb->pcb_fpustate));
sys/arm/conf/ARMADA38X
59 ↗(On Diff #113280)

Please add comment above or add to a relevant group. Also align spacing to the other entries.

60 ↗(On Diff #113280)

Shouldn't this change go to https://reviews.freebsd.org/D37441 ?

This revision now requires changes to proceed.Nov 18 2022, 6:07 PM
sys/arm/conf/ARMADA38X
60 ↗(On Diff #113280)

IMO it should be included in D37420.

I'm not sure this is correct in the case we have a userspace thread calling into the kernel when the kernel uses the VFP.

When we enter the fpu state we save the userspace vfp registers to pcb->pcb_vfpstate (because pcb->pcb_vfpsaved == NULL). We then enable the VFP, however if a context switch was to occur before restoring the old vfp registers cpu_switch will also try to save the vfp registers to pcb->pcb_vfpstate. This will lead to the kernel registers trashing the userspace values.

You also need to update share/man/man9/fpu_kern.9

@andrew
Thanks for looking into it.
However, isn't that why we always pass ctx to fpu_kern_enter? In the end of that function (line 425) the vfpsaved pointer is swapped with ctx->prev which apparently should provide additional place for storing "old" vfp state. Therefore, in context switch any additional call to vfp_save_state should cause storing vfp state in temporary &ctx->state. Is it a case you were referring to?

I'm just curious as this approach works well on arm64 and I never saw issues with in-kernel vfp, however I admit I might not pushed it hard enough. Anyway, if there is a bug in this code I think it must be resolved in other platforms as well.

Therefore, in context switch any additional call to vfp_save_state should cause storing vfp state in temporary &ctx->state. Is it a case you were referring to?

The problem is cpu_switch doesn't call vfp_save_state on arm so it won't use the pcb->pcb_vfpsaved pointer.

I'm just curious as this approach works well on arm64 and I never saw issues with in-kernel vfp, however I admit I might not pushed it hard enough. Anyway, if there is a bug in this code I think it must be resolved in other platforms as well.

On arm64 we always use pcb->pcb_vfpsaved to find the correct state to use in context switch so make sure it is always valid (except in the savectx special case).

If you make sure the pcb_vfpsaved pointer is always valid, and call vfp_save_state where we are currently calling vfp_store the save side should work the same as on arm64. For restore you'll need to update vfp_bounce to handle a kernel VFP exception.

On arm64 we always use pcb->pcb_vfpsaved to find the correct state to use in context switch so make sure it is always valid (except in the savectx special case).
If you make sure the pcb_vfpsaved pointer is always valid, and call vfp_save_state where we are currently calling vfp_store the save side should work the same as on arm64. For restore you'll need to update vfp_bounce to handle a kernel VFP exception.

Yes, you're right, good point. We will follow up on this.

mkoz_semihalf.com marked an inline comment as done.

Fix context switch bug mentioned by @andrew.
Replace vfp_store with vfp_save_state in proper way.
vfp_bounce updated according to comment.

mkoz_semihalf.com marked 4 inline comments as done.

Update fpu_kern man
Move sys/arm/conf/ARMADA38X from https://reviews.freebsd.org/D37419

sys/arm/arm/exec_machdep.c
107

remove empty line

109

fill_fpregs?

131

same as above

mkoz_semihalf.com marked 3 inline comments as done.

Fix KASSERT msg

The logic looks correct, just a few style issues.

sys/arm/arm/exec_machdep.c
107

Blank line

sys/arm/conf/ARMADA38X
60 ↗(On Diff #113939)

This still needs to be removed.

sys/arm/include/fpu.h
7

What is ucontext.h needed for?

sys/arm/arm/exec_machdep.c
122

(set/get)_vfpcontext logic will be modified to closer match aarch64 approach. Patch in progress.

(set/get)_vfpcontext logic modified with context flag

wma updated this revision to Diff 114066.
wma edited reviewers, added: mkoz_semihalf.com; removed: wma.

The changes to get/set_vfpcontext are a userspace ABI change. The layout of mcontext_vfp_t has changed and existing binaries don't know about the _MC_FP_VALID flag.

I also think set/get_fpcontext on arm64 is incorrect. It assumes td == curthread and the critical section could be narrowed to just calling into the vfp functions.

wma edited the summary of this revision. (Show Details)

I've created D37994, D37998, and D38000 to clean up VFP handling on arm64. This should be updated in a similar way.

I'm also not sure how adding _MC_FP_VALID helps. It is likely a mistake to have added it on arm64, but is now part of the ABI so will be difficult to change. It's not part of the arm ABI so we sould need a good reason to break ABI when adding it.

kd edited reviewers, added: wma; removed: kd.
kd retitled this revision from arm: support for VFP in kernel to arm: Add support for using VFP in kernel.
kd edited the summary of this revision. (Show Details)
kd edited reviewers, added: andrew; removed: mkoz_semihalf.com.
  • Fix a shallow copy bug in cpu_fork
  • Adjust the logic to match the recent changes made to arm64 implementation by andrew@.
  • Remove the _MC_FP_VALID, as it's not needed.
sys/arm/include/fpu.h
7

We need it to get the declaration of mcontext_vfp_t.
This is then used by get_vfpcontext(struct thread *, mcontext_vfp_t *); inside machine/vfp.h.

andrew added inline comments.
sys/arm/include/pcb.h
73

You can remove PCB_FP_USERMASK

This revision was not accepted when it landed; it landed in state Needs Review.Feb 4 2023, 7:22 PM
This revision was automatically updated to reflect the committed changes.

On the lists, I've provided 3 small C/C++ programs that when used,
sometimes mixed with some use of gdb or lldb, get assert panics
from a debug kernel or corrupted floating point datanow. Look at the
new text in the messages indicated below to see the programs with
instructions in the comments for how to build and test:

"Assertion td == curthread failed"
(debugger backtrace generation for breakpoint hit):
https://lists.freebsd.org/archives/dev-commits-src-main/2023-February/013035.html

"Called fill_fpregs while the kernel is using the VFP"
(core dump generation with multithreading involved):
https://lists.freebsd.org/archives/dev-commits-src-main/2023-February/013050.html

Foating point data corruption across independent threads
(floating point value from wrong thread sometimes show up):
https://lists.freebsd.org/archives/dev-commits-src-main/2023-February/013061.html

As stands, mixing multi-threading and floating point use on armv7
main [so: 14] is problematical after the changes.

The reports also show examples of the failures, with backtraces for the
panic examples.

In a private E-mail, Kornel D. confirmed duplicating the first one listed above.

sys/arm/include/reg.h
17

The fpr_r naming convention here is unlike anywhere else and is not handled by devel/libunwind . That last leads to 6000+ skipped ports from building for armv7. (No claim that there would not be a next problem in the way, however.)

For reference:

# grep -r "\<fpr_r\>" /usr/main-src/ | more
. . . ignoring debian no such file notices . . .
/usr/main-src/sys/arm/arm/machdep_kdb.c:        memcpy(regs->fpr_r, pcb->pcb_vfpstate.reg,
/usr/main-src/sys/arm/arm/machdep_kdb.c:            sizeof(regs->fpr_r));
/usr/main-src/sys/arm/arm/machdep_kdb.c:        memcpy(pcb->pcb_vfpstate.reg, regs->fpr_r, sizeof(regs->fpr_r));
/usr/main-src/sys/arm/include/reg.h:    __uint64_t      fpr_r[32];

Everything else uses fpr (no _r suffix).

devel/libunwind expects the fpr spelling ( from http://ampere2.nyi.freebsd.org/data/main-armv7-default/pb73012d372f5_s91b2da1370/logs/errors/libunwind-20211201_1.log ):

--- ptrace/_UPT_access_fpreg.lo ---
ptrace/_UPT_access_fpreg.c:107:25: error: no member named 'fpr' in 'struct fpreg'
memcpy(&fpreg.fpr[reg], val, sizeof(unw_fpreg_t));
~~~~~ ^
ptrace/_UPT_access_fpreg.c:123:30: error: no member named 'fpr' in 'struct fpreg'
memcpy(val, &fpreg.fpr[reg], sizeof(unw_fpreg_t));
~~~~~ ^
2 errors generated.
*** [ptrace/_UPT_access_fpreg.lo] Error code 1
sys/arm/include/reg.h
17

Sorry for the breakage. I'll take a look at this tomorrow.
I tried looking at the logs you posted, but the domain http://ampere2.nyi.freebsd.org/, can't be resolved for me.
Have you perhaps made a typo?

Going to https://reviews.freebsd.org/D37419 and clicking on the link shows the log just fine for me. (I just tried it again.) ampere2.nyi.freebsd.org is a well known FreeBSD builder machine.

I'll note that using https:// does not work for me: that complains about not being able to connect to the server. It has to be http:// from my context.

A list of the FreeBSD servers is maintained at: https://github.com/bdrewery/pkg-status.freebsd.org/blob/master/servers.txt

My guess here is that I have IPv6 access via my ISP (not just IPv4) and that you do not have IPv6 access. I did not find any official listing of the protocols supported but I did find material claiming IPv6-only for some of the servers as an explanation of why some folks in a 2021 discussion could vs. could-not get information from the servers of interest in that discussion.

A list of the FreeBSD servers is maintained at: https://github.com/bdrewery/pkg-status.freebsd.org/blob/master/servers.txt

My guess here is that I have IPv6 access via my ISP (not just IPv4) and that you do not have IPv6 access. I did not find any official listing of the protocols supported but I did find material claiming IPv6-only for some of the servers as an explanation of why some folks in a 2021 discussion could vs. could-not get information from the servers of interest in that discussion.

Oh, that explains it.
My ISP only offers IPv4...
Thanks for looking into this!

mmel added inline comments.
sys/arm/arm/exec_machdep.c
195

You should keep spare field cleared..

201

Who is responsible for setting mc_vfp_size? The entire ucontext (including mcontext) is cleared for all uses in kern_context.c and in sendsig().

Also mcp->mc_vfp_ptr is always NULL.

sys/arm/arm/machdep_kdb.c
113

Should not be PCB_FP_NOSAVE checked here ?

sys/arm/arm/vfp.c
407

You cannot simply enable VFP here. It can be in a previous state, for example in the middle of a vectorized instruction of another thread. Before enabling it, a new empty context must be loaded.

426

I'm not sure, but if this is called from an interruptible context then we may end with inconsistent VFP state, flags and ctx. Imho this should be covered by the critical section. Same should be applied also to fpu_kern_leave()

sys/arm/include/pcb.h
69

The handling of the PCB_FP_STARTED flag is clearly incorrect, it should very likely be modified inside the vfp_restore() . However, the exact meaning of the PCB_FP_* flags is not clear, some brief commentary would be useful.

Should the PCB_FP_STARTED flag be set when loading any or only kernel context?

Hi,

Thanks for the comments. I don't have an armv7 with VFP anymore. I'll try to borrow one from @mw, but it'll probably be another week before I get my hands on it and look deeper into this.