bhyvectl --get-stat should output IPI counters like this:
ipis sent to vcpu[0]                    	221856
ipis sent to vcpu[1]                    	209804
ipis sent to vcpu[2]                    	1168683
ipis sent to vcpu[3]                    	211410
ipis sent to vcpu[4]                    	147687
ipis sent to vcpu[5]                    	224145
ipis sent to vcpu[6]                    	154270
ipis sent to vcpu[7]                    	134241
ipis sent to vcpu[8]                    	0
ipis sent to vcpu[9]                    	0
ipis sent to vcpu[10]                   	0
ipis sent to vcpu[11]                   	0
ipis sent to vcpu[12]                   	0
ipis sent to vcpu[13]                   	0
ipis sent to vcpu[14]                   	0
ipis sent to vcpu[15]                   	0
ipis sent to vcpu[16]                   	0
ipis sent to vcpu[17]                   	0
ipis sent to vcpu[18]                   	0
ipis sent to vcpu[19]                   	0
ipis sent to vcpu[20]                   	0
ipis sent to vcpu[21]                   	0
ipis sent to vcpu[22]                   	0
ipis sent to vcpu[23]                   	0
ipis sent to vcpu[24]                   	0
ipis sent to vcpu[25]                   	0
ipis sent to vcpu[26]                   	0
ipis sent to vcpu[27]                   	0
ipis sent to vcpu[28]                   	0
ipis sent to vcpu[29]                   	0
ipis sent to vcpu[30]                   	0
ipis sent to vcpu[31]                   	0
However, the IPI counters are currently missing from bhyvectl's output:
bhyvectl --get-stat --vm=$VM | grep ipi prints nothing because of this change:
diff --git a/sys/amd64/vmm/vmm_stat.c b/sys/amd64/vmm/vmm_stat.c
index 168a380b221b..2750982185aa 100644
--- a/sys/amd64/vmm/vmm_stat.c
+++ b/sys/amd64/vmm/vmm_stat.c
@@ -71,7 +71,7 @@ vmm_stat_register(void *arg)
 		return;
 	if (vst->nelems == VMM_STAT_NELEMS_VCPU)
-		vst->nelems = VM_MAXCPU;
+		vst->nelems = vm_maxcpu;
vm_maxcpu is 0 here because vmm_stat_register() is executed before vmm_init().
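The ordering problem can be reproduced outside the kernel. The following is only a minimal userland sketch, not the vmm code: the sim_* names and struct sim_stat_type are hypothetical stand-ins, and the sentinel value used for VMM_STAT_NELEMS_VCPU is an assumption made for illustration.

```c
#include <assert.h>

/* Assumed sentinel meaning "one element per vCPU" (illustrative value). */
#define VMM_STAT_NELEMS_VCPU	(-1)

/* Run-time tunable; stays 0 until sim_vmm_init() has run. */
static unsigned int vm_maxcpu;

/* Hypothetical, reduced stand-in for a statistic type descriptor. */
struct sim_stat_type {
	int nelems;
};

/* Stand-in for vmm_stat_register(): resolves the sentinel right away. */
static void
sim_stat_register(struct sim_stat_type *vst)
{
	if (vst->nelems == VMM_STAT_NELEMS_VCPU)
		vst->nelems = (int)vm_maxcpu;
}

/* Stand-in for vmm_init(): only here does vm_maxcpu get its value. */
static void
sim_vmm_init(unsigned int ncpus)
{
	assert(ncpus > 0);
	vm_maxcpu = ncpus;
}

/*
 * Register first, initialize second, and return the element count
 * that the statistic ended up with.
 */
static int
sim_register_before_init(unsigned int ncpus)
{
	struct sim_stat_type ipis = { .nelems = VMM_STAT_NELEMS_VCPU };

	vm_maxcpu = 0;
	sim_stat_register(&ipis);
	sim_vmm_init(ncpus);
	return (ipis.nelems);
}
```

In this sketch sim_register_before_init(32) returns 0 even though vm_maxcpu is 32 afterwards: the statistic was sized from the not-yet-initialized value, so it has no elements to report.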
Instead of fixing the initialization order directly, there is a better solution in illumos:
65a3bc83734e5fb0fc2c19df3e5112b87dcdc3f8 "12921 bhyve IPI statistics should not be a matrix"
That commit replaced the matrix statistic with two per-vCPU counters: "ipis sent from vcpu" and "ipis received by vcpu".
Indeed:
- The matrix statistic becomes huge with 32 or more vCPUs (see above).
- The matrix is a problem because the MAX_VMM_STAT_ELEMS limit is reached even with 32 vCPUs.
- MAX_VM_STATS would also have to be increased somehow.
- Two counters are enough in most cases. If something goes wrong, DTrace can be used to get counters per {sent-vcpu; recv-vcpu} pair.
- The matrix wastes memory. A VM may use only 1 vCPU, but vm_maxcpu * sizeof (uint64_t) extra bytes are allocated for each allocated vCPU, of which only 8 bytes are used. If vm_maxcpu is 64, a VM with 1 vCPU allocates an extra 512 bytes (8 bytes used), a VM with 8 vCPUs allocates an extra 4KB (64 bytes actually used), etc.
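The memory figures and the two-counter scheme from the list above can be sketched as follows. This is an illustrative userland model only: struct ipi_counters, count_ipi() and the matrix_* helpers are hypothetical names, not the identifiers used in the illumos or FreeBSD sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes allocated for the matrix: one row of maxcpu counters per vCPU. */
static size_t
matrix_bytes_allocated(unsigned int ncpus, unsigned int maxcpu)
{
	assert(ncpus <= maxcpu);
	return ((size_t)ncpus * maxcpu * sizeof(uint64_t));
}

/* Bytes used per the figures above: one uint64_t per allocated vCPU. */
static size_t
matrix_bytes_used(unsigned int ncpus)
{
	return ((size_t)ncpus * sizeof(uint64_t));
}

/* Hypothetical replacement: two counters per vCPU instead of a row. */
struct ipi_counters {
	uint64_t sent;		/* "ipis sent from vcpu" */
	uint64_t received;	/* "ipis received by vcpu" */
};

/* Account for one IPI sent from vCPU 'from' to vCPU 'to'. */
static void
count_ipi(struct ipi_counters *c, unsigned int from, unsigned int to)
{
	c[from].sent++;
	c[to].received++;
}
```

With vm_maxcpu set to 64, matrix_bytes_allocated(1, 64) is 512 and matrix_bytes_allocated(8, 64) is 4096, against 8 and 64 bytes from matrix_bytes_used(). The two-counter layout costs a fixed 16 bytes per vCPU regardless of vm_maxcpu, at the price of losing the per-pair breakdown.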
So cherry-pick that illumos commit, with some cleanups.
Sponsored by: vStack