Page MenuHomeFreeBSD

riscv vmm: add SSTC check
ClosedPublic

Authored by br on Dec 12 2024, 4:52 PM.
Tags
None
Referenced Files
F107495175: D48058.diff
Wed, Jan 15, 12:19 AM
Unknown Object (File)
Sun, Dec 29, 12:56 AM
Unknown Object (File)
Sat, Dec 28, 3:14 AM
Unknown Object (File)
Fri, Dec 27, 12:03 PM
Unknown Object (File)
Fri, Dec 27, 10:53 AM
Unknown Object (File)
Fri, Dec 27, 10:38 AM
Unknown Object (File)
Fri, Dec 27, 10:20 AM
Unknown Object (File)
Fri, Dec 27, 7:21 AM

Details

Summary

Check for SSTC extension support and pass to the guest.

This is needed for EIC7700 that does not have SSTC.

The check is done using standard VM_GET_CAPABILITY feature

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

br requested review of this revision.Dec 12 2024, 4:52 PM

Do we really want one vm_cap_type entry per extension?

Also, please upload with full context

Do we really want one vm_cap_type entry per extension?

not sure but since the vm_cap_type is completely empty may be fine ?

usr.sbin/bhyve/riscv/fdt.c
92

Would it be a bit cleaner to compute this string bhyve_init_platform() and pass it directly? If we grow additional optional extensions, this will get unwieldy.

Per @markj suggestion, construct the ISA string in the bhyve_init_platform() and then pass ready-to-use string to FDT code

usr.sbin/bhyve/riscv/fdt.c
131

This should be const char *isa everywhere.

add const qualifier to char *isa

markj added inline comments.
usr.sbin/bhyve/riscv/bhyverun_machdep.c
329

I'd rather not have these unchecked buffer writes, even though it's safe for now.

Maybe something like (truncation checking omitted):

assert(error == 0);
if (retval == 1)
    snprintf(isa, sizeof(isa), "%s%s", "rv64imafdc", retval == 1 ? "_sstc" : "");

or use an sbuf or use asprintf().

This revision is now accepted and ready to land.Mon, Dec 16, 5:04 PM

use snprintf() to ensure we don't overwrite buffer available

This revision now requires review to proceed.Mon, Dec 16, 5:22 PM

remove extra retval check

I have no objection to the current revision, although we can say a few things about how the code may look/change in the future.

Obviously, the feature-to-ISA-string code will continue to expand/balloon, as we will want to advertise the availability of the many unprivileged ISA extensions to the guest as well.

The problem is that we have no mechanism for reporting extension presence from the kernel to userspace, therefore you are using vm_cap_type here. I am working on something "official" for this, most likely it will be a sysarch exporting a bitmap, with bit definitions compatible to what can be found in Linux. Therefore one call can obtain the full set of supported extensions (sanitized), and we will be able to use this in bhyve to construct the ISA string for the guest.

So I don't think we will need to rely on vm_cap_type in the long run. We also don't need to be precious about the ABI, given the experimental state of RISC-V vmm and hypervisor hardware, and since I intend for this extension reporting stuff to be resolved before the release of 15.0. So this VM_CAP_SSTC can easily be removed later.

Here's a list of what the Linux kernel tracks internally, for reference. You can see the sheer number of extensions already.
https://github.com/torvalds/linux/blob/master/arch/riscv/include/asm/hwcap.h


Some meta-commentary on the larger situation:

I've commented in a few places about the underbaked status of RISC-V ISA, the extension reporting mechanisms being just one of aspect of this. The choice not to define ISA feature registers which are available to supervisor or user mode (even emulated) has certainly made things more complex on the software side.

Instead, the spec defines an official format for the ISA string, which gets encoded in the device-tree as the source of truth. Fine, this allows some greater flexibility, but leads to the situation we have here:

To determine the set of available ISA features on a given hart, firmware must parse this string. OpenSBI and u-boot do this independently. Next, we get to the kernel, where we must do the string parsing again, turn it into whatever kernel state makes sense, and export it to userspace in a new representation (bitmap). Then, in order to provide the features to the hypervisor guest, we translate it back into an ISA string, only for the guest kernel to do the parsing once more.

So much software is built upon this type of excessive/redundant translation of data from one format to another, but the insanity of this feels especially evident in this case. With all the functionality that has been stuffed into the SBI so far, I am really surprised that an "ISA features" function is not one of them, as this would allow the kernel (real or virtual) to query the information from the firmware/hypervisor and be more-or-less done with it. No error-prone string parsing required. Sadly, it is a bit late to add such a thing now, as we will still be stuck supporting the legacy case for a long time.

This revision is now accepted and ready to land.Mon, Dec 16, 6:22 PM

I have no objection to the current revision, although we can say a few things about how the code may look/change in the future.

Obviously, the feature-to-ISA-string code will continue to expand/balloon, as we will want to advertise the availability of the many unprivileged ISA extensions to the guest as well.

The problem is that we have no mechanism for reporting extension presence from the kernel to userspace, therefore you are using vm_cap_type here. I am working on something "official" for this, most likely it will be a sysarch exporting a bitmap, with bit definitions compatible to what can be found in Linux. Therefore one call can obtain the full set of supported extensions (sanitized), and we will be able to use this in bhyve to construct the ISA string for the guest.

So I don't think we will need to rely on vm_cap_type in the long run. We also don't need to be precious about the ABI, given the experimental state of RISC-V vmm and hypervisor hardware, and since I intend for this extension reporting stuff to be resolved before the release of 15.0. So this VM_CAP_SSTC can easily be removed later.

I'll just comment that we will probably still want some kind of indirection through vmm to fetch the set of supported extensions, as vmm might want to "mask" some extensions, e.g., because they require some specific support in the hypervisor.

usr.sbin/bhyve/riscv/bhyverun_machdep.c
329

We should still check for truncation.

I have no objection to the current revision, although we can say a few things about how the code may look/change in the future.

Obviously, the feature-to-ISA-string code will continue to expand/balloon, as we will want to advertise the availability of the many unprivileged ISA extensions to the guest as well.

The problem is that we have no mechanism for reporting extension presence from the kernel to userspace, therefore you are using vm_cap_type here. I am working on something "official" for this, most likely it will be a sysarch exporting a bitmap, with bit definitions compatible to what can be found in Linux. Therefore one call can obtain the full set of supported extensions (sanitized), and we will be able to use this in bhyve to construct the ISA string for the guest.

So I don't think we will need to rely on vm_cap_type in the long run. We also don't need to be precious about the ABI, given the experimental state of RISC-V vmm and hypervisor hardware, and since I intend for this extension reporting stuff to be resolved before the release of 15.0. So this VM_CAP_SSTC can easily be removed later.

I'll just comment that we will probably still want some kind of indirection through vmm to fetch the set of supported extensions, as vmm might want to "mask" some extensions, e.g., because they require some specific support in the hypervisor.

Indeed. But if bhyve itself can't be trusted to do this masking, we should just move the string generation into the vmm module, where it already has full access to required kernel state.

I have no objection to the current revision, although we can say a few things about how the code may look/change in the future.

Obviously, the feature-to-ISA-string code will continue to expand/balloon, as we will want to advertise the availability of the many unprivileged ISA extensions to the guest as well.

The problem is that we have no mechanism for reporting extension presence from the kernel to userspace, therefore you are using vm_cap_type here. I am working on something "official" for this, most likely it will be a sysarch exporting a bitmap, with bit definitions compatible to what can be found in Linux. Therefore one call can obtain the full set of supported extensions (sanitized), and we will be able to use this in bhyve to construct the ISA string for the guest.

I'd prefer to export the newer standard __riscv(_vendor)_feature_bits as specified in https://github.com/riscv-non-isa/riscv-c-api-doc/blob/main/src/c-api.adoc#function-multi-version and then shim the Linux hwprobe API for compatibility if needed, though ideally we'd not need that and move software over to the standard. Otherwise we can never add an extension that Linux lacks support for, since it won't have an encoding (and with CHERI's RISC-V standardisation on the horizon and Linux support being many years behind FreeBSD that's going to be a very real problem). It doesn't cover everything that Linux's hwprobe does, but we can ignore or special-case those, or improve the standard API spec if it's something software widely needs.

So I don't think we will need to rely on vm_cap_type in the long run. We also don't need to be precious about the ABI, given the experimental state of RISC-V vmm and hypervisor hardware, and since I intend for this extension reporting stuff to be resolved before the release of 15.0. So this VM_CAP_SSTC can easily be removed later.

If we're happy to break ABI down the road here then I am less concerned about the shortcut.

If you want to discuss extension reporting APIs in more detail some time in the new year I'd be more than happy to. It's something I'd have liked to have done a while ago but have never had the time to put my thoughts into code.

I've commented in a few places about the underbaked status of RISC-V ISA, the extension reporting mechanisms being just one of aspect of this. The choice not to define ISA feature registers which are available to supervisor or user mode (even emulated) has certainly made things more complex on the software side.

Instead, the spec defines an official format for the ISA string, which gets encoded in the device-tree as the source of truth. Fine, this allows some greater flexibility, but leads to the situation we have here:

To determine the set of available ISA features on a given hart, firmware must parse this string. OpenSBI and u-boot do this independently. Next, we get to the kernel, where we must do the string parsing again, turn it into whatever kernel state makes sense, and export it to userspace in a new representation (bitmap). Then, in order to provide the features to the hypervisor guest, we translate it back into an ISA string, only for the guest kernel to do the parsing once more.

So much software is built upon this type of excessive/redundant translation of data from one format to another, but the insanity of this feels especially evident in this case. With all the functionality that has been stuffed into the SBI so far, I am really surprised that an "ISA features" function is not one of them, as this would allow the kernel (real or virtual) to query the information from the firmware/hypervisor and be more-or-less done with it. No error-prone string parsing required. Sadly, it is a bit late to add such a thing now, as we will still be stuck supporting the legacy case for a long time.

I won't add my own criticisms of RISC-V... but yes, this is one of the big deficiencies in it. SBI calls might be nicer, though they're still not ideal, and have a history of being used to paper over poor architecture design, so I'm not sure they're "the" solution.

This revision was automatically updated to reflect the committed changes.