APIC ID 255 and above require x2APIC and DMAR interrupt remapping.
We were asked to take a quick look at 13.2 and 14.0 failing on a system that was to be deployed in production. 13.2 fails for a few reasons, while 14.0 boots successfully with hw.dmar.enable set to 1 (and fails to find its NVMe devices with hw.dmar.enable at the default of 0).
This would cause a huge (~10x) hit on workloads that use busdma intensively, like network drivers.
To be practical, DMAR should be enabled with busdma translation disabled and interrupt remapping (IR) enabled.
It looks like it would be sufficient to make hw.iommu.dma default to 0 instead of 1, is that right?
It does occur to me (after uploading this update) that I'm enabling Intel-specific DMAR by default, but the hw.iommu.dma that I'm switching to default off is global (moved out of DMAR by @br in f2b2f31707bce25e3fdee9fdfcb75ddbd1ff3338). At a minimum I'll need to note this in the commit message.
I would probably split this up into two commits (even though they are so tiny), since disabling IOMMU for bus_dma by default is an MI setting.
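For anyone who wants to try the combination discussed above without changing any defaults, the tunables could be set explicitly in /boot/loader.conf. This is only a sketch and has not been tested: hw.dmar.enable and hw.iommu.dma are the names used in this review, while hw.dmar.ir is assumed here to be the interrupt remapping tunable and should be checked against the tree being tested.

```conf
# /boot/loader.conf (untested sketch)

# Enable the Intel DMAR driver (tunable named in this review).
hw.dmar.enable="1"

# Keep busdma from going through the IOMMU, to avoid the DMA translation cost
# (tunable named in this review; note that it is global, not DMAR-specific).
hw.iommu.dma="0"

# Enable interrupt remapping. The tunable name is an assumption, not taken
# from this review.
hw.dmar.ir="1"
```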
This crashes my laptop with a null pointer dereference unless I set hw.dmar.enable=0 in the boot loader.
trap
dmar_match_by_path()+0x20 <---- (null pointer dereference on unit, which is somehow NULL)
dmar_find()
iommu_get_dma_tag()
acpi_pci_get_dma_tag()
xhci_init()
...
Hacking dmar_match_by_path to return if unit == NULL results in a system where the interrupts fail in various ways:
CPU0: local APIC error 0x40
nvme0: System interrupt issues?
CPU0: local APIC error 0x40
<repeats 100s of times>
So this isn't fully baked.