Details

sys/arm/arm/generic_timer.c
877–878	This isn't safe when different CPUs have different levels of support. You can use `get_kernel_reg` to get a common view of the CPUs, however the overhead may be large for a short delay, and the value could change when secondary CPUs are enabled. You could use the CPU feature framework in D47812 to handle this by adding a CPU_FEAT_SYSTEM check that uses `get_kernel_reg` to see if all CPUs can use the instructions.
879	What if the CPU receives an exception before the timeout? Won't that mean DELAY would return early?

andrew added inline comments.Jan 22 2025, 12:29 PM

sys/arm/arm/generic_timer.c
879	The timeout value passed to the `wfet` and `wfit` instructions is the value the virtual timer should be when a local wakeup event is generated. This means we'll need to calculate this value. The `counts` calculation in `arm_tmr_do_delay` appears to be what we need to find how far into the future to wake up. If you extract that to a new function you could do something like the following: counts = <calculate counts>; start = sc->get_cntxct(false); end = start + count; while (counts > 0) { wfet(end); last = sc->get_cntxct(sc->physical_sys); counts -= (int32_t)(last - first); first = last; }

harry.moulton_arm.com added inline comments.Jan 24 2025, 6:25 PM

sys/arm/arm/generic_timer.c
877–878	We'd need to add the `wfxt_check` and `wfxt_enable` functions. I've done this in machdep.c. I wasn't sure how to use the cpu_feat framework to check if a feature is enabled on the fly, so I looked at what feat_pan does, added a bool in machdep.c and an `is_wfxt_en()` function that can be used elsewhere in the kernel. This appears to work okay when testing.
879	What is the reason for putting the `wfet()` inside the loop, as opposed to just calculating the CNTVCT_EL0 value for the timeout? To avoid the previous comment about an exception accidentally ending the delay early?

andrew added inline comments.Jan 27 2025, 9:36 AM

sys/arm/arm/generic_timer.c
877–878	You can put them in this file so we can hide the details from the rest of the kernel that doesn't need to know about them. Because it uses a `DATA_SET` the functions and struct can be static. The DATA_SET macro creates an array in a linker section we can find and iterate over. You can also look at `ptrauth.c`. It sets a variable, but most of the details are hidden in a single file (other than `elf64_addr_mask`). If you use `CPU_FEAT_AFTER_DEV \| CPU_FEAT_SYSTEM` for the flags the check and enable functions will onlt run on a single CPU, and late enough that other CPUs are online. You can then use `get_kernel_reg` to read the common view of the ID register. It will only report wfxt support if all CPUs support it making the instruction safe to use when the function may be scheduled on different CPUs.
879	`wfet` can finish early if the CPU receives an event, e.g. another CPU executes an `sev` instruction.

Use the CPU feature framework to determine whether the WFxT instructions are available on all CPUs, and set enable_wfxt accordingly. When the timer is used for DELAY(), it'll check whether it can use wfet().

Harbormaster completed remote builds in B62089: Diff 150100.Jan 29 2025, 3:19 PM

andrew added inline comments.Jan 29 2025, 3:58 PM

sys/arm/arm/generic_timer.c
841–850	`last` the last value read in the loop. The above change should be ok to cause `wfet` to wait with the correct timeout.
885	We need to use `get_kernel_reg` to get the common view of the ID registers. If one CPU has the extension and another doesn't then we shouldn't enable it as `DELAY` may be scheduled on either.

Amended arm_tmr_do_delay() to use end for wfet(), used get_kernel_reg() rather than READ_SPECIALREG(), and added CPU_FEAT_AFTER_DEV flag, so support is checked after secondary cpus are brought up.