pci open unnecessarily
ClosedPublic
Actions

Authored by markj on Feb 10 2025, 3:18 AM.

Details

Reviewers

corvink
jhb

Group Reviewers

bhyve

Commits

rGbec64e75a5a8: bhyve: Avoid holding /dev/pci open unnecessarily
rG649a910e34b0: bhyve: Avoid holding /dev/pci open unnecessarily

Summary

Some device models will call pci_host_read_config() when probing for
devices. Currently this results in pcifd_init() opening /dev/pci, and
thus bhyve holds the fd open even when it's not needed.

Modify pci_host_{read,write}_config() to use the global pcifd only if
it's already open, otherwise just open the device and close it
immediately after we're done. This works fine so long as bhyve is still
initializing and isn't yet in capability mode, which appears to be the
case for existing calls that aren't in the passthru code.

Fixes: 563fd2240e13 ("bhyve: export funcs for read/write pci config")

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision.Feb 10 2025, 3:18 AM

Herald added a reviewer: bhyve. · View Herald TranscriptFeb 10 2025, 3:18 AM

Herald added subscribers: bcran, rgrimes, imp. · View Herald Transcript

markj requested review of this revision.Feb 10 2025, 3:18 AM

Harbormaster completed remote builds in B62326: Diff 150747.Feb 10 2025, 3:18 AM

Would it be simpler to just avoid using pcifd at all?

In D48908#1115607, @jhb wrote:

Would it be simpler to just avoid using pcifd at all?

It would be simpler, but would give these interfaces a weird quirk where they don't work after bhyve enters capability mode. I'm not sure whether that would matter today, but in general it seems like a surprising property.

In D48908#1115608, @markj wrote:

In D48908#1115607, @jhb wrote:

Would it be simpler to just avoid using pcifd at all?

It would be simpler, but would give these interfaces a weird quirk where they don't work after bhyve enters capability mode. I'm not sure whether that would matter today, but in general it seems like a surprising property.

I guess I'm not quite understanding why we want this change then? That is, what are the use cases where this leaks (I guess a device model that uses this but there isn't a pptX device for the guest) and what is the case that benefits from using pcifd (a non-passthrough device model that uses these routines outside of initialization in which case this change will break that device model if a pptX device isn't also active in the guest?) It's more surprising to me that whether or not this works post-initialization depends on whether or not a passthrough device is also present in the guest. A device model that uses this either only needs it during initialization, or it needs it post-initialization always regardless of if a passthrough device is present?

In D48908#1115686, @jhb wrote:

In D48908#1115608, @markj wrote:

In D48908#1115607, @jhb wrote:

Would it be simpler to just avoid using pcifd at all?

It would be simpler, but would give these interfaces a weird quirk where they don't work after bhyve enters capability mode. I'm not sure whether that would matter today, but in general it seems like a surprising property.

I guess I'm not quite understanding why we want this change then? That is, what are the use cases where this leaks (I guess a device model that uses this but there isn't a pptX device for the guest) and what is the case that benefits from using pcifd (a non-passthrough device model that uses these routines outside of initialization in which case this change will break that device model if a pptX device isn't also active in the guest?) It's more surprising to me that whether or not this works post-initialization depends on whether or not a passthrough device is also present in the guest. A device model that uses this either only needs it during initialization, or it needs it post-initialization always regardless of if a passthrough device is present?

The problem is that there are some non-passthrough device models which use these routines during initialization (but not after). Specifically, pci_lpc_init() via pci_lpc_get_sel(). It does that since commit f4ceaff56ddaa, since we apparently need the LPC device IDs when passing through an intel GPU. However, the implementation means that we always leave /dev/pci open, including when there's no passthrough device, and I'd like to fix that.

I had thought there were other device models which do this, but it's just LPC. We could instead modify it to use the host IDs only when a GPU is passed through, but that's kind of fragile: any call to pci_host_config_read() means that the pcifd is leaked. This way, we keep the global pcifd open only so long as passthru_init() is called.

In D48908#1115686, @jhb wrote:

In D48908#1115608, @markj wrote:

In D48908#1115607, @jhb wrote:

Would it be simpler to just avoid using pcifd at all?

It would be simpler, but would give these interfaces a weird quirk where they don't work after bhyve enters capability mode. I'm not sure whether that would matter today, but in general it seems like a surprising property.

I guess I'm not quite understanding why we want this change then? That is, what are the use cases where this leaks (I guess a device model that uses this but there isn't a pptX device for the guest) and what is the case that benefits from using pcifd (a non-passthrough device model that uses these routines outside of initialization in which case this change will break that device model if a pptX device isn't also active in the guest?) It's more surprising to me that whether or not this works post-initialization depends on whether or not a passthrough device is also present in the guest. A device model that uses this either only needs it during initialization, or it needs it post-initialization always regardless of if a passthrough device is present?

The problem is that there are some non-passthrough device models which use these routines during initialization (but not after). Specifically, pci_lpc_init() via pci_lpc_get_sel(). It does that since commit f4ceaff56ddaa, since we apparently need the LPC device IDs when passing through an intel GPU. However, the implementation means that we always leave /dev/pci open, including when there's no passthrough device, and I'd like to fix that.

I had thought there were other device models which do this, but it's just LPC. We could instead modify it to use the host IDs only when a GPU is passed through, but that's kind of fragile: any call to pci_host_config_read() means that the pcifd is leaked. This way, we keep the global pcifd open only so long as passthru_init() is called.

That is the case I was aware of (LPC), but it is why I wonder why we can't just always open /dev/pci in these routines and never use pcifd? The idea would be that they only ever work during initialization. If we ever needed them to work post-init, then we would revert that change so that they always use the global pcifd. Mixing the two is what I find a bit odd in that it provides inconsistent semantics.

Provide two wrappers for pci_host_{read,write}_config(): one which
is externally visible and always opens the pci device node, intended
for use only during initialization, and one for internal use in the
passthru code which uses the global pcifd.

Harbormaster completed remote builds in B62370: Diff 150845.Feb 11 2025, 3:07 PM

In D48908#1116148, @jhb wrote:

In D48908#1115686, @jhb wrote:

In D48908#1115608, @markj wrote:

In D48908#1115607, @jhb wrote:

Would it be simpler to just avoid using pcifd at all?

It would be simpler, but would give these interfaces a weird quirk where they don't work after bhyve enters capability mode. I'm not sure whether that would matter today, but in general it seems like a surprising property.

I guess I'm not quite understanding why we want this change then? That is, what are the use cases where this leaks (I guess a device model that uses this but there isn't a pptX device for the guest) and what is the case that benefits from using pcifd (a non-passthrough device model that uses these routines outside of initialization in which case this change will break that device model if a pptX device isn't also active in the guest?) It's more surprising to me that whether or not this works post-initialization depends on whether or not a passthrough device is also present in the guest. A device model that uses this either only needs it during initialization, or it needs it post-initialization always regardless of if a passthrough device is present?

The problem is that there are some non-passthrough device models which use these routines during initialization (but not after). Specifically, pci_lpc_init() via pci_lpc_get_sel(). It does that since commit f4ceaff56ddaa, since we apparently need the LPC device IDs when passing through an intel GPU. However, the implementation means that we always leave /dev/pci open, including when there's no passthrough device, and I'd like to fix that.

I had thought there were other device models which do this, but it's just LPC. We could instead modify it to use the host IDs only when a GPU is passed through, but that's kind of fragile: any call to pci_host_config_read() means that the pcifd is leaked. This way, we keep the global pcifd open only so long as passthru_init() is called.

That is the case I was aware of (LPC), but it is why I wonder why we can't just always open /dev/pci in these routines and never use pcifd? The idea would be that they only ever work during initialization. If we ever needed them to work post-init, then we would revert that change so that they always use the global pcifd. Mixing the two is what I find a bit odd in that it provides inconsistent semantics.

I did it that way since the passthru code also uses these routines internally. My latest patch introduces separate helpers for the two cases, so we get consistent semantics at the expense of a bit more complexity.

Ah, ok, I guess I missed the detail about the dual use of the API.

This revision is now accepted and ready to land.Feb 11 2025, 4:28 PM

Closed by commit rG649a910e34b0: bhyve: Avoid holding /dev/pci open unnecessarily (authored by markj). · Explain WhyFeb 14 2025, 3:34 PM

This revision was automatically updated to reflect the committed changes.

markj added a commit: rG649a910e34b0: bhyve: Avoid holding /dev/pci open unnecessarily.

markj added a commit: rGbec64e75a5a8: bhyve: Avoid holding /dev/pci open unnecessarily.Mar 8 2025, 8:14 AM