ERROR GT0: IOV: Unable to confirm version 1.4 #186

Open
Aliang-code opened this issue Jul 28, 2024 · 25 comments

Comments

@Aliang-code

kernel: 6.8.8-3-pve
cpu: i5-12600T (UHD Graphics 770)

My system log shows the following output. It looks like 3 virtualized graphics cards were created, but I'm not sure whether this is correct.

0000:00:02.0 VGA compatible controller: Intel Corporation Alder Lake-S GT1 [UHD Graphics 770] (rev 0c)
0000:00:02.1 VGA compatible controller: Intel Corporation Alder Lake-S GT1 [UHD Graphics 770] (rev 0c)
0000:00:02.2 VGA compatible controller: Intel Corporation Alder Lake-S GT1 [UHD Graphics 770] (rev 0c)
0000:00:02.3 VGA compatible controller: Intel Corporation Alder Lake-S GT1 [UHD Graphics 770] (rev 0c)
Jul 28 19:50:58 pve kernel: pci 0000:00:02.1: [8086:4690] type 00 class 0x030000 PCIe Root Complex Integrated Endpoint
Jul 28 19:50:58 pve kernel: pci 0000:00:02.1: DMAR: Skip IOMMU disabling for graphics
Jul 28 19:50:58 pve kernel: pci 0000:00:02.1: Adding to iommu group 42
Jul 28 19:50:58 pve kernel: pci 0000:00:02.1: vgaarb: bridge control possible
Jul 28 19:50:58 pve kernel: pci 0000:00:02.1: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
Jul 28 19:50:58 pve kernel: i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=io+mem
Jul 28 19:50:58 pve kernel: xe 0000:00:02.1: Your graphics device 4690 is not officially supported
by xe driver in this kernel version. To force Xe probe,
use xe.force_probe='4690' and i915.force_probe='!4690'
module parameters or CONFIG_DRM_XE_FORCE_PROBE='4690' and
CONFIG_DRM_I915_FORCE_PROBE='!4690' configuration options.
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: enabling device (0000 -> 0002)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: Running in SR-IOV VF mode
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] VT-d active for gfx access
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] Using Transparent Hugepages
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: HuC firmware PRELOADED
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: [drm] PMU not supported for this GPU.
Jul 28 19:50:58 pve kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.1 on minor 0
Jul 28 19:50:58 pve kernel: pci 0000:00:02.2: [8086:4690] type 00 class 0x030000 PCIe Root Complex Integrated Endpoint
Jul 28 19:50:58 pve kernel: pci 0000:00:02.2: DMAR: Skip IOMMU disabling for graphics
Jul 28 19:50:58 pve kernel: pci 0000:00:02.2: Adding to iommu group 43
Jul 28 19:50:58 pve kernel: pci 0000:00:02.2: vgaarb: bridge control possible
Jul 28 19:50:58 pve kernel: pci 0000:00:02.2: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
Jul 28 19:50:58 pve kernel: i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
Jul 28 19:50:58 pve kernel: xe 0000:00:02.2: Your graphics device 4690 is not officially supported
by xe driver in this kernel version. To force Xe probe,
use xe.force_probe='4690' and i915.force_probe='!4690'
module parameters or CONFIG_DRM_XE_FORCE_PROBE='4690' and
CONFIG_DRM_I915_FORCE_PROBE='!4690' configuration options.
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: enabling device (0000 -> 0002)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: Running in SR-IOV VF mode
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] VT-d active for gfx access
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] Using Transparent Hugepages
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: HuC firmware PRELOADED
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] Protected Xe Path (PXP) protected content support initialized
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: [drm] PMU not supported for this GPU.
Jul 28 19:50:58 pve kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.2 on minor 2
Jul 28 19:50:58 pve kernel: pci 0000:00:02.3: [8086:4690] type 00 class 0x030000 PCIe Root Complex Integrated Endpoint
Jul 28 19:50:58 pve kernel: pci 0000:00:02.3: DMAR: Skip IOMMU disabling for graphics
Jul 28 19:50:58 pve kernel: pci 0000:00:02.3: Adding to iommu group 44
Jul 28 19:50:58 pve kernel: pci 0000:00:02.3: vgaarb: bridge control possible
Jul 28 19:50:58 pve kernel: pci 0000:00:02.3: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
Jul 28 19:50:58 pve kernel: i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
Jul 28 19:50:58 pve kernel: i915 0000:00:02.1: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
Jul 28 19:50:58 pve kernel: i915 0000:00:02.2: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
Jul 28 19:50:58 pve kernel: xe 0000:00:02.3: Your graphics device 4690 is not officially supported
by xe driver in this kernel version. To force Xe probe,
use xe.force_probe='4690' and i915.force_probe='!4690'
module parameters or CONFIG_DRM_XE_FORCE_PROBE='4690' and
CONFIG_DRM_I915_FORCE_PROBE='!4690' configuration options.
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: enabling device (0000 -> 0002)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: Running in SR-IOV VF mode
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] VT-d active for gfx access
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] Using Transparent Hugepages
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: HuC firmware PRELOADED
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] Protected Xe Path (PXP) protected content support initialized
Jul 28 19:50:58 pve kernel: i915 0000:00:02.3: [drm] PMU not supported for this GPU.
Jul 28 19:50:58 pve kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.3 on minor 3
Jul 28 19:50:58 pve kernel: i915 0000:00:02.0: Enabled 3 VFs
@pasbec
Contributor

pasbec commented Jul 28, 2024

The firmware version in the code has recently been bumped to 1.9 since 1.4 is quite old.

But my experience is that the dkms module seems to "work" even if the firmware version does not match. Still, I'm wondering why Proxmox VE is not providing you with the latest firmware. I guess the pve-firmware package is a dependency of the kernel anyway. On my hosts, dpkg-query -L pve-firmware | grep i915 gives the following output:

/lib/firmware/i915
/lib/firmware/i915/adlp_dmc.bin
/lib/firmware/i915/adlp_dmc_ver2_12.bin
/lib/firmware/i915/adlp_dmc_ver2_16.bin
/lib/firmware/i915/adlp_guc_62.0.3.bin
/lib/firmware/i915/adlp_guc_69.0.3.bin
/lib/firmware/i915/adlp_guc_70.1.1.bin
/lib/firmware/i915/adlp_guc_70.bin
/lib/firmware/i915/adls_dmc_ver2_01.bin
/lib/firmware/i915/bxt_dmc_ver1_07.bin
/lib/firmware/i915/bxt_guc_62.0.0.bin
/lib/firmware/i915/bxt_guc_70.1.1.bin
/lib/firmware/i915/bxt_huc_2.0.0.bin
/lib/firmware/i915/cml_guc_62.0.0.bin
/lib/firmware/i915/cml_guc_70.1.1.bin
/lib/firmware/i915/cml_huc_4.0.0.bin
/lib/firmware/i915/dg1_dmc_ver2_02.bin
/lib/firmware/i915/dg1_guc_70.bin
/lib/firmware/i915/dg1_huc.bin
/lib/firmware/i915/dg2_dmc_ver2_08.bin
/lib/firmware/i915/dg2_guc_70.bin
/lib/firmware/i915/dg2_huc_gsc.bin
/lib/firmware/i915/ehl_guc_62.0.0.bin
/lib/firmware/i915/ehl_guc_70.1.1.bin
/lib/firmware/i915/ehl_huc_9.0.0.bin
/lib/firmware/i915/glk_dmc_ver1_04.bin
/lib/firmware/i915/glk_guc_62.0.0.bin
/lib/firmware/i915/glk_guc_70.1.1.bin
/lib/firmware/i915/glk_huc_4.0.0.bin
/lib/firmware/i915/icl_dmc_ver1_09.bin
/lib/firmware/i915/icl_guc_62.0.0.bin
/lib/firmware/i915/icl_guc_70.1.1.bin
/lib/firmware/i915/icl_huc_9.0.0.bin
/lib/firmware/i915/kbl_dmc_ver1_04.bin
/lib/firmware/i915/kbl_guc_62.0.0.bin
/lib/firmware/i915/kbl_guc_70.1.1.bin
/lib/firmware/i915/kbl_huc_4.0.0.bin
/lib/firmware/i915/mtl_dmc.bin
/lib/firmware/i915/mtl_gsc_1.bin
/lib/firmware/i915/mtl_guc_70.bin
/lib/firmware/i915/mtl_huc_gsc.bin
/lib/firmware/i915/rkl_dmc_ver2_03.bin
/lib/firmware/i915/skl_dmc_ver1_27.bin
/lib/firmware/i915/skl_guc_62.0.0.bin
/lib/firmware/i915/skl_guc_70.1.1.bin
/lib/firmware/i915/skl_huc_2.0.0.bin
/lib/firmware/i915/tgl_dmc_ver2_12.bin
/lib/firmware/i915/tgl_guc_62.0.0.bin
/lib/firmware/i915/tgl_guc_69.0.3.bin
/lib/firmware/i915/tgl_guc_70.1.1.bin
/lib/firmware/i915/tgl_guc_70.bin
/lib/firmware/i915/tgl_huc.bin
/usr/share/doc/pve-firmware/licenses/LICENSE.i915
/lib/firmware/i915/tgl_huc_7.9.3.bin

Is this the same for you, and do you also have pve-firmware version 3.12-1 installed (check with apt policy pve-firmware)?
Btw, how did you get the kernel 6.8.8-3-pve? I can only see 6.8.8-2-pve.

If you want to test the previously allowed GuC firmware minor version 4 instead of the updated minor version 9, you could run the following in the root of your i915-sriov-dkms repo:

sed -i 's/GUCFIRMWARE_MINOR:-9/GUCFIRMWARE_MINOR:-4/' Makefile
rm -rf /usr/src/i915-sriov-dkms-* /var/lib/dkms/i915-sriov-dkms
dkms add .
dkms install -m i915-sriov-dkms -v $(cat VERSION) -k $(uname -r) --force

This will change GUCFIRMWARE_MINOR=9 to GUCFIRMWARE_MINOR=4 in the Makefile.

Just for testing, it would be enough to run the following:

GUCFIRMWARE_MINOR=4 dkms install -m i915-sriov-dkms -v $(cat VERSION) -k $(uname -r) --force

But this would not survive system updates in the future.
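
The one-off override works through shell-style default expansion. The sketch below assumes the build reads the value as ${GUCFIRMWARE_MINOR:-9}, which the sed pattern above suggests but which has not been checked against the Makefile here. show_minor is a hypothetical stand-in for the build step, not part of the repo.

```shell
#!/bin/sh
# Hypothetical stand-in for the build step: prints the minor version it would
# use, assuming the real Makefile reads it as ${GUCFIRMWARE_MINOR:-9}.
show_minor='printf "%s\n" "${GUCFIRMWARE_MINOR:-9}"'

unset GUCFIRMWARE_MINOR
default_minor=$(sh -c "$show_minor")                       # nothing set: default
override_minor=$(GUCFIRMWARE_MINOR=4 sh -c "$show_minor")  # one-off override
after_minor=$(sh -c "$show_minor")                         # override is gone again

echo "default=$default_minor override=$override_minor after=$after_minor"
# prints: default=9 override=4 after=9
```

The override only exists for the single command it prefixes, which is why a later rebuild triggered by a system update would fall back to the default.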

@Aliang-code
Author

Aliang-code commented Jul 28, 2024

I just updated via apt update. Here is my output, which shows no difference:

root@pve:~# dpkg-query -L pve-firmware | grep i915
/lib/firmware/i915
/lib/firmware/i915/adlp_dmc.bin
/lib/firmware/i915/adlp_dmc_ver2_12.bin
/lib/firmware/i915/adlp_dmc_ver2_16.bin
/lib/firmware/i915/adlp_guc_62.0.3.bin
/lib/firmware/i915/adlp_guc_69.0.3.bin
/lib/firmware/i915/adlp_guc_70.1.1.bin
/lib/firmware/i915/adlp_guc_70.bin
/lib/firmware/i915/adls_dmc_ver2_01.bin
/lib/firmware/i915/bxt_dmc_ver1_07.bin
/lib/firmware/i915/bxt_guc_62.0.0.bin
/lib/firmware/i915/bxt_guc_70.1.1.bin
/lib/firmware/i915/bxt_huc_2.0.0.bin
/lib/firmware/i915/cml_guc_62.0.0.bin
/lib/firmware/i915/cml_guc_70.1.1.bin
/lib/firmware/i915/cml_huc_4.0.0.bin
/lib/firmware/i915/dg1_dmc_ver2_02.bin
/lib/firmware/i915/dg1_guc_70.bin
/lib/firmware/i915/dg1_huc.bin
/lib/firmware/i915/dg2_dmc_ver2_08.bin
/lib/firmware/i915/dg2_guc_70.bin
/lib/firmware/i915/dg2_huc_gsc.bin
/lib/firmware/i915/ehl_guc_62.0.0.bin
/lib/firmware/i915/ehl_guc_70.1.1.bin
/lib/firmware/i915/ehl_huc_9.0.0.bin
/lib/firmware/i915/glk_dmc_ver1_04.bin
/lib/firmware/i915/glk_guc_62.0.0.bin
/lib/firmware/i915/glk_guc_70.1.1.bin
/lib/firmware/i915/glk_huc_4.0.0.bin
/lib/firmware/i915/icl_dmc_ver1_09.bin
/lib/firmware/i915/icl_guc_62.0.0.bin
/lib/firmware/i915/icl_guc_70.1.1.bin
/lib/firmware/i915/icl_huc_9.0.0.bin
/lib/firmware/i915/kbl_dmc_ver1_04.bin
/lib/firmware/i915/kbl_guc_62.0.0.bin
/lib/firmware/i915/kbl_guc_70.1.1.bin
/lib/firmware/i915/kbl_huc_4.0.0.bin
/lib/firmware/i915/mtl_dmc.bin
/lib/firmware/i915/mtl_gsc_1.bin
/lib/firmware/i915/mtl_guc_70.bin
/lib/firmware/i915/mtl_huc_gsc.bin
/lib/firmware/i915/rkl_dmc_ver2_03.bin
/lib/firmware/i915/skl_dmc_ver1_27.bin
/lib/firmware/i915/skl_guc_62.0.0.bin
/lib/firmware/i915/skl_guc_70.1.1.bin
/lib/firmware/i915/skl_huc_2.0.0.bin
/lib/firmware/i915/tgl_dmc_ver2_12.bin
/lib/firmware/i915/tgl_guc_62.0.0.bin
/lib/firmware/i915/tgl_guc_69.0.3.bin
/lib/firmware/i915/tgl_guc_70.1.1.bin
/lib/firmware/i915/tgl_guc_70.bin
/lib/firmware/i915/tgl_huc.bin
/usr/share/doc/pve-firmware/licenses/LICENSE.i915
/lib/firmware/i915/tgl_huc_7.9.3.bin

@pasbec
Contributor

pasbec commented Jul 30, 2024

The file list seems similar but the exact version is not shown.

What is the version of your installed pve-firmware? You can find it with apt policy pve-firmware
Have you tried changing GUCFIRMWARE_MINOR to 4 as explained above?

@linux40

linux40 commented Sep 1, 2024

I encounter the same error, and my desktop on the host does not post. Will try setting GUCFIRMWARE_MINOR later.

@scyto

scyto commented Sep 18, 2024

I also get the error, these are my listed huc, guc and dmc firmwares - what version numbers should these have to avoid the error?

[    4.368984] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adlp_dmc.bin (v2.20)
[    4.380938] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.13.1
[    4.380942] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3

@scyto

scyto commented Sep 18, 2024

What is the version of your installed pve-firmware?

root@pve1:~# apt policy pve-firmware
pve-firmware:
  Installed: 3.13-1
  Candidate: 3.13-2
  Version table:
     3.13-2 500
        500 http://download.proxmox.com/debian/pve bookworm/pve-no-subscription amd64 Packages
 *** 3.13-1 500
        500 http://download.proxmox.com/debian/pve bookworm/pve-no-subscription amd64 Packages
        100 /var/lib/dpkg/status
     3.12-1 500
        500 http://download.proxmox.com/debian/pve bookworm/pve-no-subscription amd64 Packages
 etc

Edit: I updated to 3.13-2 and it did not help. I don't think the 1.4 in the error refers to the DMC, GuC or HuC minor version at all.

@scyto

scyto commented Sep 18, 2024

umm maybe i am a numbnuts.... but is the issue the format?

[    5.005904] i915 0000:00:02.6: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
[    5.005917] i915 0000:00:02.6: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1

I.e. it's looking for 1.4 but it got 0.1.4.1. I don't think that's older than 1.4; I think it is newer and is now prefixed with a 0?
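
To make the format question concrete, here is a small sketch that splits the reported string into named fields. It assumes the four numbers are branch.major.minor.patch, which fits how the two messages read but is not confirmed anywhere in this thread. split_iface_version is a hypothetical helper.

```shell
#!/bin/sh
# split_iface_version VERSION
# Labels the four dot-separated fields of a string such as 0.1.4.1.
# The field names (branch, major, minor, patch) are an assumption.
split_iface_version() {
    printf '%s\n' "$1" |
        awk -F. '{ printf "branch=%s major=%s minor=%s patch=%s\n", $1, $2, $3, $4 }'
}

split_iface_version 0.1.4.1    # prints: branch=0 major=1 minor=4 patch=1
split_iface_version 0.1.13.4   # prints: branch=0 major=1 minor=13 patch=4
```

Read this way, 0.1.4.1 would carry the same major.minor (1.4) that the first message asks for.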

@MacharaStormwing

So do I understand this correctly:
The drivers have a certain interface version (mine 0.1.13.4) but the firmware has a different one and that is the issue?

I don't remember having the issue before updating just today (kernel 6.8.12-2) or I did not notice it.

Is there a solution to this?

[ 5.534849] i915 0000:00:02.1: [drm] ERROR GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[ 5.534868] i915 0000:00:02.1: [drm] ERROR GT0: IOV: Found interface version 0.1.13.4

That's the whole log: https://pastebin.com/Ra9C5HUD

@scyto

scyto commented Sep 22, 2024

I don't think that's the issue, the number mismatch issue is the same for different versions. My thesis is 1.13 and 0.1.13.4 are the same, and the error likely indicates a bug in assessing the number format for some reason. Given this is a non blocking bug, ignore it for now.

@scyto

scyto commented Sep 22, 2024

Confirmed: all that is needed is the firmware version command. This is also needed if you have version 1.13.4 (with the number 13 instead of 4). This isn't about the driver not being there, just the difference in version strings, I think.

So this fixed it for me:

GUCFIRMWARE_MINOR=4 dkms install -m i915-sriov-dkms -v 2024.08.09 --force --kernelsourcedir /usr/src/linux-headers-6.8.12-2-pve/

As I was getting "looking for 1.4, found 0.1.4.1".

---Edit---

Copying the latest firmwares made ZERO difference to the interface version reported in the error, even after copying firmware from kernel.org using:
wget -r -nd -e robots=no -A '*.bin' --accept-regex '/plain/' https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915/

[    4.330630] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adlp_dmc.bin (v2.20)
[    4.340183] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.13.1
[    4.340188] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[    4.942723] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[    4.943225] i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.4 (0000000000000000)
[    4.943260] i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Found interface version 0.1.4.1

Going one step further, the difference between 0.1.4.1 and 0.1.13.4 is whether your machine loads i915/adls_dmc_ver2_01.bin or i915/adlp_dmc.bin. My 13th-gen NUC loads the latter even if the former is in the firmware directory.

As such, I would argue that the notion that 1.4 has been superseded by 1.9 or 1.13 is wrong. They are all valid, and that is why upgrading to a later version of the Proxmox firmware, pulling from Intel's GitHub repo, or pulling from kernel.org doesn't help.

@pasbec
Contributor

pasbec commented Sep 23, 2024

@MacharaStormwing: Have you tried to use GUCFIRMWARE_MINOR=13 for the 1.13.x firmware?

@johntdavis84

johntdavis84 commented Sep 23, 2024 via email

@bolzerrr

@MacharaStormwing: Have you tried to use GUCFIRMWARE_MINOR=13 for the 1.13.x firmware?

I run into the same issue. Tried:

GUCFIRMWARE_MINOR=13 dkms install -m i915-sriov-dkms -v $(cat VERSION) --force
Module i915-sriov-dkms-2024.09.21 for kernel 6.8.12-1-pve (x86_64).
Before uninstall, this module version was ACTIVE on this kernel.

i915.ko:

  • Uninstallation
    • Deleting from: /lib/modules/6.8.12-1-pve/updates/dkms/
  • Original module
    • No original module was found for this module on this kernel.
    • Use the dkms install command to reinstall any previous module version.
      depmod....

i915.ko:
Running module version sanity check.

  • Original module
  • Installation
    • Installing to /lib/modules/6.8.12-1-pve/updates/dkms/
      depmod...

But still:

[ 10.214385] i915 0000:00:02.3: [drm] ERROR GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[ 10.214425] i915 0000:00:02.3: [drm] ERROR GT0: IOV: Found interface version 0.1.13.4
[ 10.215489] i915 0000:00:02.3: [drm] VT-d active for gfx access
[ 10.215526] i915 0000:00:02.3: [drm] Using Transparent Hugepages
[ 10.216493] i915 0000:00:02.3: [drm] ERROR GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[ 10.216538] i915 0000:00:02.3: [drm] ERROR GT0: IOV: Found interface version 0.1.13.4
[ 10.217056] i915 0000:00:02.3: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
[ 10.217065] i915 0000:00:02.3: HuC firmware PRELOADED
[ 10.217516] i915 0000:00:02.3: [drm] ERROR GT0: GUC: mmio request 0x4100: failure 201/0
[ 10.217523] i915 0000:00:02.3: [drm] ERROR GT0: Failed to retrieve hwconfig table: -ENOENT
[ 10.219976] i915 0000:00:02.3: [drm] Protected Xe Path (PXP) protected content support initialized
[ 10.219990] i915 0000:00:02.3: [drm] PMU not supported for this GPU.

@scyto

scyto commented Sep 24, 2024

Before uninstall, this module version was ACTIVE on this kernel.

Did you do both uninstall and remove before creating the new one?
After installing the new one, did you run these commands:

update-grub 
update-initramfs -u 
proxmox-boot-tool refresh
reboot now

for reference this was my command for the error with 1.4

GUCFIRMWARE_MINOR=4 dkms install -m i915-sriov-dkms -v 2024.08.09 --force --kernelsourcedir /usr/src/linux-headers-6.8.12-2-pve/

I needed to do the explicit kernel source path, not sure why

@pasbec
Contributor

pasbec commented Oct 3, 2024

Here are some more findings:

The GuC firmware version can be retrieved with dmesg | grep "GT0: GuC firmware" | awk '{print $NF}' (reboot required after firmware update).
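
A slightly more defensive variant, as a sketch: it reads log lines on stdin so it can be tried on a saved log, and it prints only the last field of the first matching line. guc_version_from_log is a hypothetical name, and the sample line is copied from a log earlier in this thread.

```shell
#!/bin/sh
# Print the version from the first "GT0: GuC firmware ... version X.Y.Z"
# line found on stdin, then stop reading.
guc_version_from_log() {
    awk '/GT0: GuC firmware/ { print $NF; exit }'
}

# Sample line taken from a log posted earlier in this thread.
sample='[    4.380938] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.13.1'
printf '%s\n' "$sample" | guc_version_from_log   # prints: 70.13.1
```

On a live host the same function could be fed with dmesg, for example dmesg | guc_version_from_log. The VF lines ("GuC firmware PRELOADED version ...") do not contain "GT0: GuC firmware" and are therefore skipped.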

For Proxmox VE and pve-firmware this probably means the following from comparing the submodule update dates:

  • pve-firmware >= 3.13-2: GUCFIRMWARE_MAJOR=1, GUCFIRMWARE_MINOR=13
  • pve-firmware >= 3.10-1: GUCFIRMWARE_MAJOR=1, GUCFIRMWARE_MINOR=9
  • pve-firmware >= 3.8-4: GUCFIRMWARE_MAJOR=1, GUCFIRMWARE_MINOR=4
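
The tentative list above can be written as a lookup. This is only a sketch of that list, not something the repo ships: version_ge and minor_for_pve_firmware are hypothetical helpers, and the comparison uses GNU sort -V instead of dpkg's own version rules, which should be close enough for these simple version strings.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B (GNU sort -V ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the GUCFIRMWARE_MINOR that the list above suggests for a given
# pve-firmware version; fail for versions older than anything in the list.
minor_for_pve_firmware() {
    if   version_ge "$1" 3.13-2; then echo 13
    elif version_ge "$1" 3.10-1; then echo 9
    elif version_ge "$1" 3.8-4;  then echo 4
    else return 1
    fi
}

minor_for_pve_firmware 3.13-1   # prints: 9 (below 3.13-2, so the 3.10-1 bucket)
```

The installed version to pass in is the one shown on the "Installed:" line of apt policy pve-firmware.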

Please note that calling e.g. GUCFIRMWARE_MINOR=13 dkms install ... is only for testing and won't survive any updates via apt. To make the change persistent, either change the Makefile or edit dkms.conf after cloning the repo. Also be aware that after each change in the repo you need to remove /usr/src/i915-sriov-dkms-* and add the repo again with dkms add ..., or edit the files in /usr/src/i915-sriov-dkms-* directly.

I've modified the Makefile in my own fork (branch) with an attempt to autodetect the correct version. But it currently requires a reboot after each firmware upgrade to work which I don't like very much. Have a look at the modifications if you are interested.

Result with latest firmware: [screenshot]

@bbaa-bbaa
Contributor

bbaa-bbaa commented Oct 4, 2024

I've modified the Makefile in my own fork (branch) with an attempt to autodetect the correct version. But it currently requires a reboot after each firmware upgrade to work which I don't like very much.

In fact, Intel's upstream repo only compares the MAJOR version, so maybe we can safely remove the check.
https://github.com/intel/linux-intel-lts/blame/5404e6dd8524e9cc698099c8780a6889e24ecbfd/drivers/gpu/drm/i915/gt/iov/intel_iov_query.c#L97-L112
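
As a toy model of what relaxing the check would mean (this is not the driver's code), the sketch below contrasts a strict major.minor match with a major-only match. It assumes the reported string is branch.major.minor.patch; both function names are hypothetical.

```shell
#!/bin/sh
# strict_match FOUND WANTED: major and minor must both agree.
# FOUND looks like 0.1.13.4 (assumed branch.major.minor.patch), WANTED like 1.9.
strict_match() {
    [ "$(printf '%s\n' "$1" | cut -d. -f2,3)" = "$2" ]
}

# major_only_match FOUND WANTED: only the major number must agree.
major_only_match() {
    found_major=$(printf '%s\n' "$1" | cut -d. -f2)
    wanted_major=$(printf '%s\n' "$2" | cut -d. -f1)
    [ "$found_major" = "$wanted_major" ]
}

for found in 0.1.4.1 0.1.13.4; do
    if strict_match "$found" 1.9; then s=ok; else s=mismatch; fi
    if major_only_match "$found" 1.9; then m=ok; else m=mismatch; fi
    echo "$found vs wanted 1.9: strict=$s major-only=$m"
done
```

With a wanted version of 1.9, both interface versions reported in this thread fail the strict comparison but pass the major-only one.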

I fear that to get rid of this problem the whole repo would have to be updated from a more recent version of Intel's LTS repo.

Porting this dkms module to the most recent 6.6 kernel branch means that we need to remove support for kernels before 6.6 and rebuild the 6.8~6.11 (Intel's lts kernel seems to be backported based on 6.8) adaptation code. I may try that when I have time.

EDIT: The i915 driver in the linux-intel-lts 6.6 branch depends on a backport from a newer version of the kernel. Migrating to the 6.6 branch is difficult.

@pasbec
Contributor

pasbec commented Oct 5, 2024

I guess there is still enough hope the new xe driver will support SR-IOV at some point.

At the moment I'm running stable on my i9-14900K (GPU id a780)

               use xe.force_probe='a780' and i915.force_probe='!a780'
[   32.971076] i915 0000:00:02.0: Running in SR-IOV PF mode
[   32.972270] i915 0000:00:02.0: [drm] VT-d active for gfx access
[   32.972918] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[   32.973626] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=io+mem
[   32.974226] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
[   32.979801] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adls_dmc_ver2_01.bin (v2.1)
[   32.983420] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.29.2
[   32.983526] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[   32.989282] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads!
[   32.989802] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[   32.989909] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[   32.990620] i915 0000:00:02.0: [drm] GuC RC: enabled
[   32.991451] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[   32.991782] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[   32.992672] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 2
[   32.994483] i915 0000:00:02.0: 7 VFs could be associated with this PF
[   32.994932] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[   32.995972] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[   34.439249] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
               use xe.force_probe='a780' and i915.force_probe='!a780'
[   34.442422] i915 0000:00:02.1: enabling device (0000 -> 0002)
[   34.442936] i915 0000:00:02.1: Running in SR-IOV VF mode
[   34.444407] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.13.4
[   34.445733] i915 0000:00:02.1: [drm] VT-d active for gfx access
[   34.446527] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[   34.447912] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.13.4
[   34.449753] i915 0000:00:02.1: GuC firmware PRELOADED version 1.13 submission:SR-IOV VF
[   34.450116] i915 0000:00:02.1: HuC firmware PRELOADED
[   34.452736] i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
[   34.453124] i915 0000:00:02.1: [drm] PMU not supported for this GPU.
[   34.453962] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.1 on minor 3
[   34.457075] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[   34.457555] i915 0000:00:02.1: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none

But I see some errors on my N100 (GPU id 46d1) with the latest firmware:

               use xe.force_probe='46d1' and i915.force_probe='!46d1'
[    7.605500] i915 0000:00:02.0: Running in SR-IOV PF mode
[    7.608067] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    7.625413] i915 0000:00:02.0: vgaarb: deactivate vga console
[    7.625596] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    7.626127] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    7.634587] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adlp_dmc.bin (v2.20)
[    7.641087] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.29.2
[    7.641107] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[    7.669675] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads!
[    7.671535] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[    7.671539] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[    7.672100] i915 0000:00:02.0: [drm] GuC RC: enabled
[    7.672200] i915 0000:00:02.0: [drm] *ERROR* GT0: GUC: mmio request 0x4100: failure 201/0
[    7.672206] i915 0000:00:02.0: [drm] *ERROR* GT0: Failed to retrieve hwconfig table: -ENOENT
[    7.674541] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[    7.697202] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 1
[    7.699852] i915 0000:00:02.0: 7 VFs could be associated with this PF
[    7.728000] fbcon: i915drmfb (fb0) is primary device
[    7.800611] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[   11.146822] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=io+mem
               use xe.force_probe='46d1' and i915.force_probe='!46d1'
[   11.155042] i915 0000:00:02.1: enabling device (0000 -> 0002)
[   11.156436] i915 0000:00:02.1: Running in SR-IOV VF mode
[   11.158592] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.13.4
[   11.161534] i915 0000:00:02.1: [drm] VT-d active for gfx access
[   11.162950] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[   11.165338] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.13.4
[   11.167754] i915 0000:00:02.1: GuC firmware PRELOADED version 1.13 submission:SR-IOV VF
[   11.169076] i915 0000:00:02.1: HuC firmware PRELOADED
[   11.170931] i915 0000:00:02.1: [drm] *ERROR* GT0: GUC: mmio request 0x4100: failure 201/0
[   11.172232] i915 0000:00:02.1: [drm] *ERROR* GT0: Failed to retrieve hwconfig table: -ENOENT
[   11.176686] i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
[   11.178071] i915 0000:00:02.1: [drm] PMU not supported for this GPU.
[   11.178457] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.1 on minor 0
[   11.188994] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[   11.190181] i915 0000:00:02.1: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
...
[123672.791627] i915 0000:00:02.0: [drm] *ERROR* AUX USBC1/DDI TC1/PHY TC1: did not complete or timeout within 10ms (status 0xad4002ff)
[123672.802626] i915 0000:00:02.0: [drm] *ERROR* AUX USBC1/DDI TC1/PHY TC1: did not complete or timeout within 10ms (status 0xad4002ff)
[123672.813627] i915 0000:00:02.0: [drm] *ERROR* AUX USBC1/DDI TC1/PHY TC1: did not complete or timeout within 10ms (status 0xad4002ff)
[123672.824625] i915 0000:00:02.0: [drm] *ERROR* AUX USBC1/DDI TC1/PHY TC1: did not complete or timeout within 10ms (status 0xad4002ff)
[123672.835625] i915 0000:00:02.0: [drm] *ERROR* AUX USBC1/DDI TC1/PHY TC1: did not complete or timeout within 10ms (status 0xad4002ff)
[123672.835641] i915 0000:00:02.0: [drm] *ERROR* AUX USBC1/DDI TC1/PHY TC1: not done (status 0xad4002ff)
[123672.845646] i915 0000:00:02.0: AUX USBC1/DDI TC1/PHY TC1: not started (status 0xad4002ff)
[123672.845683] WARNING: CPU: 0 PID: 401263 at /var/lib/dkms/i915-sriov-dkms/2024.09.21/build/drivers/gpu/drm/i915/display/intel_dp_aux.c:246 intel_dp_aux_xfer+0x616/0x690 [i915]
[123672.845854] Modules linked in: cmac nls_utf8 cifs cifs_arc4 nls_ucs2_utils rdma_cm iw_cm ib_cm ib_core cifs_md4 netfs ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter ipmi_devintf ipmi_msghandler nf_tables nvme_fabrics 8021q garp mrp iptable_nat xt_REDIRECT nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp bonding tls qrtr softdog sunrpc nfnetlink_log binfmt_misc nfnetlink intel_rapl_msr intel_rapl_common i915(O) x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm xe crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 aesni_intel drm_gpuvm drm_exec crypto_simd gpu_sched drm_buddy drm_suballoc_helper cryptd cmdlinepart drm_ttm_helper rapl spi_nor ttm intel_cstate pcspkr wmi_bmof drm_display_helper mtd mei_me 8250_dw cec rc_core i2c_algo_bit mei igen6_edac intel_pmc_core intel_vsec pmt_telemetry acpi_pad pmt_class acpi_tad joydev input_leds mac_hid vhost_net vhost vhost_iotlb tap vfio_pci vfio_pci_core irqbypass
[123672.846095] RIP: 0010:intel_dp_aux_xfer+0x616/0x690 [i915]
[123672.851412]  ? intel_dp_aux_xfer+0x616/0x690 [i915]
[123672.855046]  ? intel_dp_aux_xfer+0x616/0x690 [i915]
[123672.855859]  ? intel_dp_aux_xfer+0x616/0x690 [i915]
[123672.856675]  intel_dp_aux_transfer+0x12f/0x330 [i915]

But still both 3D acceleration and video encoding/decoding seems to work ok-ish for the N100 via remote desktop (despite the errors above):
grafik

I agree that removing the major/minor version check could be an option (or testing it against multiple known minor versions). Funnily enough, even with the wrong major version (firmware mismatch), VA-API still seems to remain functional for the virtual functions...

@johntdavis84

johntdavis84 commented Oct 5, 2024 via email

@bbaa-bbaa
Contributor

bbaa-bbaa commented Oct 8, 2024

I may try that when I have time.

Done. Anyone interested can try PR #207.

@haoyouab

haoyouab commented Oct 13, 2024

Fedora 40 running on i7-14700k confirmed working with the following setup:

sed -i 's/GUCFIRMWARE_MINOR:-9/GUCFIRMWARE_MINOR:-13/' Makefile
dkms remove i915-sriov-dkms/2024.09.21 --all
dkms install -m i915-sriov-dkms -v $(cat VERSION) -k $(uname -r) --force
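
One pitfall with the sed step above is that it exits successfully even when the pattern is not found, for example if a later release of the Makefile changes the default. A minimal sketch of a guarded variant follows; the function name `patch_guc_minor` is made up for illustration, and it assumes the Makefile still contains the literal text `GUCFIRMWARE_MINOR:-9`:

```shell
#!/bin/sh
# patch_guc_minor MAKEFILE NEW_MINOR
# Hypothetical helper: replace the default GuC firmware minor version in the
# DKMS Makefile, but fail loudly if the expected text is not there.
patch_guc_minor() {
    makefile=$1
    new_minor=$2
    if ! grep -q 'GUCFIRMWARE_MINOR:-9' "$makefile"; then
        echo "pattern GUCFIRMWARE_MINOR:-9 not found in $makefile" >&2
        return 1
    fi
    # Write to a temporary file first so a failed sed cannot truncate the Makefile.
    tmp=$(mktemp) || return 1
    sed "s/GUCFIRMWARE_MINOR:-9/GUCFIRMWARE_MINOR:-$new_minor/" "$makefile" > "$tmp" &&
        cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Run from the source directory, `patch_guc_minor Makefile 13` would stand in for the sed line before the dkms commands.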

Add the following to kernel cmdline:

intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7
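
After the reboot it may be worth confirming that these parameters actually reached the running kernel. A small sketch, where `missing_params` is a hypothetical helper name and the command line is passed in as a string:

```shell
#!/bin/sh
# missing_params CMDLINE PARAM...
# Print every PARAM that does not appear as a whole word in CMDLINE
# (for example the contents of /proc/cmdline). Prints nothing if all are set.
missing_params() {
    cmdline=" $1 "
    shift
    for param in "$@"; do
        case "$cmdline" in
            *" $param "*) ;;                  # present as a separate word
            *) printf '%s\n' "$param" ;;      # absent: report it
        esac
    done
}
```

For example: `missing_params "$(cat /proc/cmdline)" intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7`. An empty result means all four parameters are present.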

Reboot and then:

echo 2 > /sys/devices/pci0000\:00/0000\:00\:02.0/sriov_numvfs
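
The echo above can be wrapped with a couple of sanity checks. The sketch below uses a hypothetical function name, `set_vfs`, and relies on the standard PCI sysfs attributes `sriov_totalvfs` and `sriov_numvfs`. It resets the count to 0 first because the kernel rejects changing one non-zero VF count to another; note that the reset removes any VFs currently in use by guests.

```shell
#!/bin/sh
# set_vfs DEVICE_DIR COUNT
# Hypothetical helper: enable COUNT virtual functions on the PCI device whose
# sysfs directory is DEVICE_DIR, after checking the supported maximum.
set_vfs() {
    dev=$1
    want=$2
    if [ ! -r "$dev/sriov_totalvfs" ]; then
        echo "no sriov_totalvfs under $dev (SR-IOV not exposed?)" >&2
        return 1
    fi
    total=$(cat "$dev/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs but only $total are supported" >&2
        return 1
    fi
    # A non-zero count cannot be changed directly, so disable the VFs first.
    echo 0 > "$dev/sriov_numvfs" || return 1
    echo "$want" > "$dev/sriov_numvfs"
}
```

Run as root, `set_vfs /sys/devices/pci0000:00/0000:00:02.0 2` would correspond to the command above.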

On the host:
image

lspci output:
image

Latest intel_gpu_top output on the host:
image

I got error code 43 on the Windows guest at first. After I did the steps below it worked smoothly:

  1. add <vendor_id state='on' value='GenuineIntel'/> using virsh edit
  2. remove the line <feature policy='disable' name='hypervisor'/> from <cpu> section
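
To check whether a guest definition already has both changes, the domain XML can be scanned with grep. This is only a sketch: `check_code43_tweaks` is a made-up name, and it assumes the XML uses single-quoted attributes in the same order as written in the two steps above.

```shell
#!/bin/sh
# check_code43_tweaks
# Hypothetical helper: read libvirt domain XML on stdin and return 0 only if
# the vendor_id element is switched on and the hypervisor feature is not
# disabled.
check_code43_tweaks() {
    xml=$(cat)
    result=0
    if ! printf '%s\n' "$xml" | grep -q "<vendor_id state='on'"; then
        echo "missing: <vendor_id state='on' .../>" >&2
        result=1
    fi
    if printf '%s\n' "$xml" | grep -q "<feature policy='disable' name='hypervisor'/>"; then
        echo "still present: hypervisor feature disabled in <cpu>" >&2
        result=1
    fi
    return "$result"
}
```

Usage would be along the lines of `virsh dumpxml GUEST | check_code43_tweaks`, where GUEST is a placeholder for the domain name.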

On the Windows guest:
image

@hmoffatt

But still:

This is what I see on the N100 (Alder Lake) also, with the latest Proxmox 8.3 kernel (6.8.12-4) and pve-firmware (3.14-1):

[   10.121376] i915 0000:00:02.7: Running in SR-IOV VF mode
[   10.121576] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[   10.121613] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[   10.121999] i915 0000:00:02.7: [drm] VT-d active for gfx access
[   10.122023] i915 0000:00:02.7: [drm] Using Transparent Hugepages
[   10.122489] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[   10.122526] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[   10.122840] i915 0000:00:02.7: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
[   10.122844] i915 0000:00:02.7: HuC firmware PRELOADED
[   10.123124] i915 0000:00:02.7: [drm] *ERROR* GT0: GUC: mmio request 0x4100: failure 201/0
[   10.123129] i915 0000:00:02.7: [drm] *ERROR* GT0: Failed to retrieve hwconfig table: -ENOENT
[   10.124798] i915 0000:00:02.7: [drm] Protected Xe Path (PXP) protected content support initialized
[   10.124808] i915 0000:00:02.7: [drm] PMU not supported for this GPU.
[   10.124895] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.7 on minor 7
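
For comparing reports like this one, the two version strings can be pulled out of the kernel log. A rough sketch, with a hypothetical function name `iov_versions`, that relies only on the wording of the two IOV messages shown above:

```shell
#!/bin/sh
# iov_versions
# Hypothetical helper: read kernel log text on stdin and print the version
# the VF driver tried to confirm next to the interface version it found.
# If several VFs log these messages, the last occurrence wins.
iov_versions() {
    awk '
        /IOV: Unable to confirm version/ {
            # The requested version is the word right after "version".
            for (i = 1; i <= NF; i++) if ($i == "version") wanted = $(i + 1)
        }
        /IOV: Found interface version/ {
            found = $NF
        }
        END {
            if (wanted != "" || found != "")
                printf "requested=%s found=%s\n", wanted, found
        }
    '
}
```

It could be fed from the live log with `dmesg | iov_versions`.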

In a Linux guest I see non-stop timeout errors:

[    4.371366] [drm:fw_domains_get_with_fallback [i915]] *ERROR* gt: timed out waiting for forcewake ack request.
[    4.371486] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[    4.972254] [drm:fw_domains_get_with_fallback [i915]] *ERROR* gt: timed out waiting for forcewake ack request.
[    4.972370] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]

@stefyelan

I have the same issue. My device info is as below:

pveversion:
pve-manager/8.2.8/a577cfa684c7476d (running kernel: 6.8.12-1-pve)
Linux Kernel:
6.8.12-1-pve
pve-firmware:

  Installed: 3.14-1
  Candidate: 3.14-1

CPU:
13th Gen Intel(R) Core(TM) i9-13900H
GPU:
Intel® HD Graphics:Intel Corporation Raptor Lake-P [UHD Graphics] [8086:a720]

Enabled 3 VFs:

  00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-P [UHD Graphics] (rev 04)
  00:02.1 VGA compatible controller: Intel Corporation Raptor Lake-P [UHD Graphics] (rev 04)
  00:02.2 VGA compatible controller: Intel Corporation Raptor Lake-P [UHD Graphics] (rev 04)
  00:02.3 VGA compatible controller: Intel Corporation Raptor Lake-P [UHD Graphics] (rev 04)

dmesg message info:


[    2.578692] xe 0000:00:02.0: Your graphics device a720 is not officially supported
               by xe driver in this kernel version. To force Xe probe,
               use xe.force_probe='a720' and i915.force_probe='!a720'
               module parameters or CONFIG_DRM_XE_FORCE_PROBE='a720' and
               CONFIG_DRM_I915_FORCE_PROBE='!a720' configuration options.

[    4.286746] xe 0000:00:02.1: Your graphics device a720 is not officially supported
               by xe driver in this kernel version. To force Xe probe,
               use xe.force_probe='a720' and i915.force_probe='!a720'
               module parameters or CONFIG_DRM_XE_FORCE_PROBE='a720' and
               CONFIG_DRM_I915_FORCE_PROBE='!a720' configuration options.
[    4.286782] i915 0000:00:02.1: enabling device (0000 -> 0002)
[    4.286806] i915 0000:00:02.1: Running in SR-IOV VF mode
[    4.287528] i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.287562] i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[    4.288244] i915 0000:00:02.1: [drm] VT-d active for gfx access
[    4.288268] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[    4.288929] i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.288963] i915 0000:00:02.1: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[    4.289430] i915 0000:00:02.1: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
[    4.289431] i915 0000:00:02.1: HuC firmware PRELOADED
[    4.292607] i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
[    4.292610] i915 0000:00:02.1: [drm] PMU not supported for this GPU.

I'm not sure if this is correct..

@stefyelan

@haoyouab Which Linux kernel version do you use, please?
I get the same problem on Fedora. Many thanks.

@haoyouab

Fedora Workstation 40. Kernel 6.10.12-200.fc40.x86_64

@sacredx72

@hmoffatt @pasbec
Did you manage to install the driver without errors? I tried different builds and the result is the same; these errors are still present in the log:
[ 10.123124] i915 0000:00:02.7: [drm] *ERROR* GT0: GUC: mmio request 0x4100: failure 201/0
[ 10.123129] i915 0000:00:02.7: [drm] *ERROR* GT0: Failed to retrieve hwconfig table: -ENOENT
On PVE the Intel driver is installed, and intel_gpu_top produces output.

However, when I start Plex or a Windows VM, the reported load never changes and stays at 0 the whole time, so I am sure the video core is not being used correctly. Have you made any progress on the "hwconfig table: -ENOENT" and "GUC: mmio request 0x4100: failure 201/0" errors?
