i915 support on Intel Ultra CPU #195

Open
aznablegs opened this issue Aug 18, 2024 · 17 comments

Comments

@aznablegs

Hi there,

Thanks for the driver and the guide, which are very helpful. I applied them to a 12th-gen CPU and it works like a charm. I recently bought a new mini PC with an Intel Ultra CPU, and the DKMS driver doesn't seem to work on it.

After digging through some documentation, I found that Intel says this CPU does not support SR-IOV, only ATS. Any idea whether I can still use your driver for this? Thanks.

https://www.intel.com/content/www/us/en/support/articles/000093216/graphics/processor-graphics.html

@wolfcar

wolfcar commented Sep 11, 2024

There is a post on the official Intel forum at https://www.intel.com/content/www/us/en/support/articles/000008563/ethernet-products.html. I don't have an Ultra PC; could you try it and see whether it works?

@aznablegs
Author

Thanks, but the scan doesn't output anything. After some troubleshooting, I think the driver is still not fully supported on Proxmox with kernel 6.8...

@tristan-k

I tried but failed. See my comment here.

@tristan-k

tristan-k commented Sep 20, 2024

I gave it another try but the guests are still not able to use the VFs of the iGPU.

Here is what I did. See this comment for further explanations.

On the host:

$ uname -r
6.8.12-2-pve
$ apt install build-essential dkms sysfsutils
$ cd ~
$ git clone https://github.com/strongtz/i915-sriov-dkms.git
$ mv i915-sriov-dkms/ /usr/src/i915-sriov-dkms-2024.08.09
$ cd /usr/src/i915-sriov-dkms-2024.08.09
$ dkms add .
$ dkms install -m i915-sriov-dkms -v 2024.08.09 --force --kernelsourcedir /usr/src/linux-headers-6.8.12-2-pve/
$ echo "devices/pci0000:00/0000:00:02.0/sriov_numvfs = 7" > /etc/sysfs.conf
$ cat /etc/default/grub | grep GRUB_CMDLINE_LINUX_DEFAULT
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on i915.force_probe=7d55 i915.enable_guc=3 i915.max_vfs=7"
$ update-grub
$ update-initramfs -u -k $(uname -r)
$ mkdir firmware
$ cd firmware
$ wget -r -nd -e robots=no -A '*.bin' --accept-regex '/plain/' https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915/
$ mv *.bin /lib/firmware/i915/
$ cp /lib/firmware/i915/mtl_guc_70.bin /lib/firmware/i915/mtl_guc_70.6.4.bin
$ reboot
$ dkms status
i915-sriov-dkms/2024.08.09, 6.8.12-2-pve, x86_64: installed
$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
00:02.1 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
00:02.2 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
00:02.3 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
00:02.4 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
00:02.5 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
00:02.6 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
00:02.7 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
$ inxi -G | grep Device-1
  Device-1: Intel Meteor Lake-P [Intel Arc Graphics] driver: i915 v: kernel
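
The host steps above can be cross-checked against sysfs rather than by counting `lspci` lines. The following is a minimal sketch, assuming the iGPU sits at `0000:00:02.0` as in the output above; the function name `sriov_status` and its optional second argument (an alternative sysfs root, handy for dry runs) are hypothetical and not part of the driver or guide.

```shell
# Sketch: report how many VFs a PCI device has enabled versus its maximum.
# $1: PCI address (default 0000:00:02.0, taken from the lspci output above)
# $2: sysfs root (default /sys/bus/pci/devices; hypothetical override for testing)
sriov_status() {
    local dev="${1:-0000:00:02.0}"
    local root="${2:-/sys/bus/pci/devices}"
    local path="$root/$dev"

    # Devices without the SR-IOV capability do not expose these attributes.
    if [ ! -r "$path/sriov_numvfs" ]; then
        echo "$dev: no SR-IOV capability exposed"
        return 1
    fi

    local num total
    num=$(cat "$path/sriov_numvfs")
    total=$(cat "$path/sriov_totalvfs")
    echo "$dev: $num of $total VFs enabled"
}
```

Calling `sriov_status` with no arguments on the host reads the real sysfs attributes for `0000:00:02.0`.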

On the guest (Ubuntu 24.04). Currently only works with Kernel 6.8.0-41.

$ apt install linux-image-6.8.0-41-generic linux-headers-6.8.0-41-generic linux-modules-extra-6.8.0-41-generic
$ cat /etc/default/grub | grep GRUB_DEFAULT
GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 6.8.0-41-generic"
$ reboot
$ uname -r
6.8.0-41-generic
$ cd ~
$ git clone https://github.com/strongtz/i915-sriov-dkms.git
$ mv i915-sriov-dkms/ /usr/src/i915-sriov-dkms-2024.08.09
$ cd /usr/src/i915-sriov-dkms-2024.08.09
$ dkms add .
$ dkms install -m i915-sriov-dkms -v 2024.08.09 --force --kernelsourcedir /usr/src/linux-headers-6.8.0-41-generic/
$ cat /etc/default/grub | grep GRUB_CMDLINE_LINUX_DEFAULT
GRUB_CMDLINE_LINUX_DEFAULT="noquiet nosplash i915.force_probe=7d55 i915.enable_guc=3"
$ update-grub
$ update-initramfs -u -k $(uname -r)
$ reboot
$ dkms status
i915-sriov-dkms/2024.08.09, 6.8.0-41-generic, x86_64: installed
$ lspci | grep VGA
06:10.0 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
$ inxi -G | grep Device-1
  Device-1: Intel Meteor Lake-P [Intel Arc Graphics] driver: N/A
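
The `driver: N/A` result from `inxi` can also be confirmed with `lspci -k`, which prints a `Kernel driver in use:` line only when a driver is bound. A minimal sketch, assuming the VF appears at `06:10.0` in the guest as shown above; the helper name `bound_driver` is hypothetical.

```shell
# Sketch: read `lspci -k` output on stdin and print the bound kernel driver,
# or "none" when no "Kernel driver in use:" line is present.
bound_driver() {
    awk -F': ' '
        /Kernel driver in use/ { print $2; found = 1; exit }
        END { if (!found) print "none" }
    '
}
```

On the guest this would be run as `lspci -k -s 06:10.0 | bound_driver`.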

I guess progress for Core Ultra devices can be followed here. Currently it's not possible to use SR-IOV with Meteor Lake unless I did something wrong.

On host:

$ dmesg | grep i915
[    4.479634] i915 0000:00:02.6: [drm] Beware, driver is using hardcoded IPVER values!
[    4.479677] i915 0000:00:02.6: [drm] GT0: Incompatible option enable_guc=3 - HuC is not supported!
[    4.479807] i915 0000:00:02.6: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.479841] i915 0000:00:02.6: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[    4.480415] i915 0000:00:02.6: [drm] *ERROR* GT1: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.480449] i915 0000:00:02.6: [drm] *ERROR* GT1: IOV: Found interface version 0.1.13.4
[    4.480868] i915 0000:00:02.6: [drm] VT-d active for gfx access
[    4.480879] i915 0000:00:02.6: [drm] Using Transparent Hugepages
[    4.481212] i915 0000:00:02.6: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.481247] i915 0000:00:02.6: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[    4.481613] i915 0000:00:02.6: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
[    4.481614] i915 0000:00:02.6: HuC firmware N/A
[    4.483635] i915 0000:00:02.6: [drm] *ERROR* GT1: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.483648] i915 0000:00:02.6: [drm] *ERROR* GT1: IOV: Found interface version 0.1.13.4
[    4.483883] i915 0000:00:02.6: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
[    4.483884] i915 0000:00:02.6: HuC firmware DISABLED
[    4.485760] i915 0000:00:02.6: [drm] PMU not supported for this GPU.
[    4.485799] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.6 on minor 6
[    4.486037] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[    4.486040] i915 0000:00:02.1: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[    4.486043] i915 0000:00:02.2: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[    4.486045] i915 0000:00:02.3: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[    4.486048] i915 0000:00:02.4: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[    4.486051] i915 0000:00:02.5: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=none
[    4.486054] i915 0000:00:02.6: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
               use xe.force_probe='7d55' and i915.force_probe='!7d55'
[    4.486101] i915 0000:00:02.7: enabling device (0000 -> 0002)
[    4.486110] i915 0000:00:02.7: Running in SR-IOV VF mode
[    4.486111] i915 0000:00:02.7: [drm] Beware, driver is using hardcoded IPVER values!
[    4.486150] i915 0000:00:02.7: [drm] GT0: Incompatible option enable_guc=3 - HuC is not supported!
[    4.486270] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.486304] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[    4.486835] i915 0000:00:02.7: [drm] *ERROR* GT1: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.486870] i915 0000:00:02.7: [drm] *ERROR* GT1: IOV: Found interface version 0.1.13.4
[    4.487371] i915 0000:00:02.7: [drm] VT-d active for gfx access
[    4.487382] i915 0000:00:02.7: [drm] Using Transparent Hugepages
[    4.487709] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.487744] i915 0000:00:02.7: [drm] *ERROR* GT0: IOV: Found interface version 0.1.13.4
[    4.488105] i915 0000:00:02.7: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
[    4.488107] i915 0000:00:02.7: HuC firmware N/A
[    4.489849] i915 0000:00:02.7: [drm] *ERROR* GT1: IOV: Unable to confirm version 1.13 (0000000000000000)
[    4.489885] i915 0000:00:02.7: [drm] *ERROR* GT1: IOV: Found interface version 0.1.13.4
[    4.490144] i915 0000:00:02.7: GuC firmware PRELOADED version 0.0 submission:SR-IOV VF
[    4.490145] i915 0000:00:02.7: HuC firmware DISABLED
[    4.491817] i915 0000:00:02.7: [drm] PMU not supported for this GPU.
[    4.491854] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.7 on minor 7
[    4.491973] i915 0000:00:02.0: Enabled 7 VFs
[ 1079.344911] i915 0000:00:02.0: Using 41-bit DMA addresses
[ 2492.496269] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF1 FLR notification
[ 2492.496276] i915 0000:00:02.0: VF1 FLR
[ 2494.741808] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF1 FLR notification
[ 2494.741911] i915 0000:00:02.0: VF1 FLR
[ 2494.860194] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF1 FLR notification
[ 2494.860273] i915 0000:00:02.0: VF1 FLR
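
To see at a glance which VFs are being reset, the repeated FLR messages in a log like the one above can be tallied per VF. This is a rough sketch that only matches the `Unexpected VFn FLR notification` wording seen in this thread; the function name `flr_counts` is hypothetical.

```shell
# Sketch: read dmesg output on stdin and print "VFn count" for every VF that
# logged an "Unexpected VFn FLR notification" line. The shorter "VFn FLR"
# lines are ignored so each reset is counted once.
flr_counts() {
    awk '
        /IOV: Unexpected VF[0-9]+ FLR notification/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^VF[0-9]+$/) count[$i]++
        }
        END { for (vf in count) print vf, count[vf] }
    ' | sort
}
```

Typical use would be `dmesg | flr_counts` on the host.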

On guest:

$ dmesg | grep i915
               use xe.force_probe='7d55' and i915.force_probe='!7d55'
[    3.316820] i915: loading out-of-tree module taints kernel.
[    3.316868] i915: module verification failed: signature and/or required key missing - tainting kernel
[    3.530334] i915 0000:06:10.0: Your graphics device 7d55 is not properly supported by i915 in this
               kernel version. To force driver probe anyway, use i915.force_probe=7d55

For Windows 11 guest:

  • add args: -cpu 'host,vendor=GenuineIntel' to /etc/pve/qemu-server/VM-ID.conf
  • Change VirtIO-GPU in Display to None

Error Code 43 is still in the device manager. Not sure what to do next.

@aznablegs
Author

From the Intel site, I can only see that the latest Ultra CPU doesn't support SR-IOV but something called ATS. Still, I can't find any relevant documentation on this topic.

https://www.intel.com/content/www/us/en/support/articles/000093216/graphics/processor-graphics.html

@celesrenata

> From the Intel site, I can only see that the latest Ultra CPU doesn't support SR-IOV but something called ATS. Still, I can't find any relevant documentation on this topic.
>
> https://www.intel.com/content/www/us/en/support/articles/000093216/graphics/processor-graphics.html

Meteor Lake supports GPU SR-IOV. I'm running it on Bee-Link GTi14s right now. However, i915's start is as you mentioned, not graceful.

[celes@gremlin-1:~]$ sudo dmesg | grep i915
[    0.000000] Command line: initrd=\EFI\nixos\0b62fra1gwz1mpmxbjn19y0x9d4ck3w1-initrd-linux-6.10.13-initrd.efi init=/nix/store/niqp9zww020c12k0wqjpdmzd9vb85418-nixos-system-gremlin-1-24.11pre690827.5633bcff0c61/init intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7 i915.force_probe=7d55 boot.shell_on_fail hugepagesz=1G hugepages=2 hugepagesz=2M hugepages=512 root=fstab loglevel=4
[    0.146116] Kernel command line: initrd=\EFI\nixos\0b62fra1gwz1mpmxbjn19y0x9d4ck3w1-initrd-linux-6.10.13-initrd.efi init=/nix/store/niqp9zww020c12k0wqjpdmzd9vb85418-nixos-system-gremlin-1-24.11pre690827.5633bcff0c61/init intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7 i915.force_probe=7d55 boot.shell_on_fail hugepagesz=1G hugepages=2 hugepagesz=2M hugepages=512 root=fstab loglevel=4
[    3.929375] i915: loading out-of-tree module taints kernel.
[    4.414028] i915 0000:00:02.0: Running in SR-IOV PF mode
[    4.414800] i915 0000:00:02.0: [drm] GT0: Incompatible option enable_guc=3 - HuC is not supported!
[    4.415605] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    4.415608] i915 0000:00:02.0: vgaarb: deactivate vga console
[    4.415633] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    4.415837] i915 0000:00:02.0: Port F asks to use VBT vswing/preemph tables
[    4.415856] WARNING: CPU: 3 PID: 568 at /build/source/drivers/gpu/drm/i915/display/intel_bios.c:2708 intel_bios_init+0x1b86/0x20a0 [i915]
[    4.416096] Modules linked in: snd_seq_device(+) videodev(+) snd_soc_acpi(+) x86_pkg_temp_thermal(+) mac80211(+) soundwire_bus intel_powerclamp videobuf2_common snd_soc_core mc i915(O+) libarc4 coretemp crc32_pclmul polyval_clmulni polyval_generic snd_compress ac97_bus snd_pcm_dmaengine iwlwifi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec btusb gf128mul ghash_clmulni_intel sha512_ssse3 btrtl sha256_ssse3 sha1_ssse3 btintel snd_hda_core aesni_intel btbcm cmdlinepart crypto_simd btmtk mei_gsc_proxy snd_hwdep iTCO_wdt hid_generic cryptd drm_buddy bluetooth sd_mod spi_nor igc snd_pcm cfg80211 usbhid intel_pmc_bxt ttm rapl mtd watchdog 8250_dw snd_timer mei_me evdev hid uas ptp wmi_bmof mac_hid intel_cstate drm_display_helper crc16 intel_lpss_pci pps_core intel_uncore i2c_i801 snd intel_lpss mei cec led_class rfkill spi_intel_pci idma64 intel_vpu i2c_mux soundcore spi_intel i2c_smbus virt_dma igen6_edac intel_gtt intel_ipu6 i2c_algo_bit edac_core thermal video fan ipu_bridge intel_pmc_core wmi intel_vsec
[    4.416196] RIP: 0010:intel_bios_init+0x1b86/0x20a0 [i915]
[    4.416437]  ? intel_bios_init+0x1b86/0x20a0 [i915]
[    4.416663]  ? intel_bios_init+0x1b86/0x20a0 [i915]
[    4.416844]  ? intel_bios_init+0x1b85/0x20a0 [i915]
[    4.417042]  intel_modeset_init_noirq+0x39/0x250 [i915]
[    4.417259]  i915_driver_probe+0x6c0/0xd90 [i915]
[    4.417483]  i915_init+0x22/0xc0 [i915]
[    4.417656]  ? __pfx_i915_init+0x10/0x10 [i915]
[    4.430734] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    4.446008] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/mtl_dmc.bin (v2.23)
[    4.454141] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.6.4.bin version 70.6.4
[    4.465013] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[    4.465015] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[    4.465334] i915 0000:00:02.0: [drm] GuC RC: enabled
[    4.470275] mei_gsc_proxy 0000:00:16.0-0f73db04-97ab-4125-b893-e904ad0d5464: bound 0000:00:02.0 (ops i915_gsc_proxy_component_ops [i915])
[    4.470621] i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.6.4.bin version 70.6.4
[    4.470623] i915 0000:00:02.0: [drm] GT1: HuC firmware i915/mtl_huc_8.4.3_gsc.bin version 8.4.3
[    4.497649] i915 0000:00:02.0: [drm] GT1: HuC: authenticated for clear media!
[    4.498216] i915 0000:00:02.0: [drm] GT1: GUC: submission enabled
[    4.498218] i915 0000:00:02.0: [drm] GT1: GUC: SLPC enabled
[    4.498346] i915 0000:00:02.0: [drm] GuC RC: enabled
[    4.499013] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[    4.512941] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    4.516151] i915 0000:00:02.0: 7 VFs could be associated with this PF
[    4.516602] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[    4.516606] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    4.517481] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[    4.517916] i915 0000:00:02.0: [drm] Cannot find any crtc or sizes
[    5.016668] i915 0000:00:02.0: [drm] *ERROR* Request submission for GSC load failed (-62)
[    5.016679] i915 0000:00:02.0: [drm] *ERROR* GT1: Failed to load GSC firmware i915/mtl_gsc_102.0.0.1511.bin -ETIME
[    5.745241] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=io+mem
[    5.745440] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[    5.745575] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[    5.745737] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[    5.745921] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[    5.746088] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[    5.746295] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=none:owns=io+mem
[    5.746413] i915 0000:00:02.0: Enabled 7 VFs
[    8.841292] i915 0000:00:02.0: [drm] *ERROR* GT1: GUC: Engine reset failed on 5:6 (gsc0) because 0x00000000
[    8.851211] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:0:00000000
[    8.851337] i915 0000:00:02.0: [drm] Resetting chip for GuC failed to reset engine mask=0x4000000
[    8.851488] i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.6.4.bin version 70.6.4
[    8.851492] i915 0000:00:02.0: [drm] GT1: HuC firmware i915/mtl_huc_8.4.3_gsc.bin version 8.4.3
[    8.870792] i915 0000:00:02.0: [drm] GT1: GUC: submission enabled
[    8.870797] i915 0000:00:02.0: [drm] GT1: GUC: SLPC enabled
[ 1479.181995] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF6 FLR notification
[ 1479.182151] i915 0000:00:02.0: VF6 FLR
[ 1479.299381] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF6 FLR notification
[ 1479.299722] i915 0000:00:02.0: VF6 FLR
[ 1643.799741] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 1643.799903] i915 0000:00:02.0: VF5 FLR
[ 1643.916533] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 1643.916755] i915 0000:00:02.0: VF5 FLR
[ 2150.196131] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 2150.196360] i915 0000:00:02.0: VF5 FLR
[ 2150.529756] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 2150.529984] i915 0000:00:02.0: VF5 FLR
[ 2359.631695] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 2359.631878] i915 0000:00:02.0: VF5 FLR
[ 2359.933601] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 2359.933802] i915 0000:00:02.0: VF5 FLR
[ 3968.114362] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 3968.114623] i915 0000:00:02.0: VF5 FLR
[ 3968.376868] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 3968.377148] i915 0000:00:02.0: VF5 FLR
[ 4182.503517] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 4182.503649] i915 0000:00:02.0: VF5 FLR
[ 4182.779505] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 4182.779858] i915 0000:00:02.0: VF5 FLR
[ 4922.648797] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 4922.649045] i915 0000:00:02.0: VF5 FLR
[ 4922.925597] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF5 FLR notification
[ 4922.925733] i915 0000:00:02.0: VF5 FLR
[ 5488.255838] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF4 FLR notification
[ 5488.256102] i915 0000:00:02.0: VF4 FLR
[ 5488.379197] i915 0000:00:02.0: [drm] GT1: IOV: Unexpected VF4 FLR notification
[ 5488.379300] i915 0000:00:02.0: VF4 FLR

Despite all of that, I am still able to transcode with Plex via Kubernetes and run Ollama via the vGPUs in KubeVirt, and I validated that it was using the vGPUs. Pics below to confirm.

[Screenshots: nixos-gremlin-1, nixos-gremlin-1-test]

@mm2293

mm2293 commented Oct 16, 2024

> Meteor Lake supports GPU SR-IOV. I'm running it on Bee-Link GTi14s right now. However, i915's start is as you mentioned, not graceful.

Interesting... Could you please make a test with Windows 10 or 11?

@celesrenata

Way ahead of you. I'm still fighting to get it to work.

I've got QEMU to report the PCI card at 0000:02.00.
I have all the settings the Intel drivers expect in there, and then some.
But I'm up against error 43. You can restart the device and it claims it works, but I'm still fighting that.

As for successes, I have Ubuntu 24 successfully running LLM tasks, as well as transcoding jobs from pods.
Next, I am working on Arch and NixOS configs to run inside the VM, to give Linux headless features to the project.

My work towards making everything work lives here: https://github.com/celesrenata/nixos-k3s-configs

You can explore the settings required for Windows 11 in this thread I am on: kubevirt/kubevirt#11338 (comment). Windows 10 should already work, but I haven't tested it since it is so close to EOL.

@mm2293

mm2293 commented Oct 16, 2024

> Way ahead of you. I'm still fighting to get it to work.
>
> I've got QEMU to report the PCI card at 0000:02.00. I have all the settings the Intel drivers expect in there, and then some. But I'm up against error 43. You can restart the device and it claims it works, but I'm still fighting that.
>
> As for successes, I have Ubuntu 24 successfully running LLM tasks, as well as transcoding jobs from pods. Next, I am working on Arch and NixOS configs to run inside the VM, to give Linux headless features to the project.
>
> My work towards making everything work lives here: https://github.com/celesrenata/nixos-k3s-configs
>
> You can explore the settings required for Windows 11 in this thread I am on: kubevirt/kubevirt#11338 (comment). Windows 10 should already work, but I haven't tested it since it is so close to EOL.

So you think the problem is that the PCI card is reported at the wrong address? Do I understand that right? As far as I could see in your kubevirt link, you override the address so that it gets reported at 0000:02.00. Why do you think it'll work on Windows 10 but not on Windows 11?

@jeeftor

jeeftor commented Oct 16, 2024

This is your intel config?

https://github.com/celesrenata/nixos-k3s-configs/blob/main/nixos-kube-configs/gremlin-1/overlays/i915-sriov-dkms.nix

Your Nix skill is impressive ... I always get so confused with Nix :) (even though I'm using nix-darwin on most of my Macs)

@celesrenata

> > Way ahead of you. I'm still fighting to get it to work.
> > I've got QEMU to report the PCI card at 0000:02.00. I have all the settings the Intel drivers expect in there, and then some. But I'm up against error 43. You can restart the device and it claims it works, but I'm still fighting that.
> > As for successes, I have Ubuntu 24 successfully running LLM tasks, as well as transcoding jobs from pods. Next, I am working on Arch and NixOS configs to run inside the VM, to give Linux headless features to the project.
> > My work towards making everything work lives here: https://github.com/celesrenata/nixos-k3s-configs
> > You can explore the settings required for Windows 11 in this thread I am on: kubevirt/kubevirt#11338 (comment). Windows 10 should already work, but I haven't tested it since it is so close to EOL.
>
> So you think the problem is that the PCI card is reported at the wrong address? Do I understand that right? As far as I could see in your kubevirt link, you override the address so that it gets reported at 0000:02.00. Why do you think it'll work on Windows 10 but not on Windows 11?

I'll spin up windows 10 to verify this weekend.

@celesrenata

> This is your Intel config?
>
> https://github.com/celesrenata/nixos-k3s-configs/blob/main/nixos-kube-configs/gremlin-1/overlays/i915-sriov-dkms.nix
>
> Your Nix skill is impressive ... I always get so confused with Nix :) (even though I'm using nix-darwin on most of my Macs)

https://github.com/celesrenata/nixos-k3s-configs/blob/main/nixos-kube-configs/gremlin-1/overlays/i915-sriov-dkms.nix is the main module from this repo that I import.
https://github.com/celesrenata/nixos-k3s-configs/blob/main/nixos-kube-configs/gremlin-1/overlays/intel-gfx-sriov.nix is borrowed from the original Intel SR-IOV project to create the VFs.
https://github.com/celesrenata/nixos-k3s-configs/blob/main/nixos-kube-configs/gremlin-1/overlays/intel-firmware.nix hunts down the specific firmware files.
https://github.com/celesrenata/nixos-k3s-configs/blob/main/nixos-kube-configs/gremlin-1/overlays/kernel.nix adds the custom kernel config values before compiling.

@celesrenata

> I'll spin up Windows 10 to verify this weekend.

Nope, win10 also fails.

@mm2293

mm2293 commented Oct 21, 2024

What makes you think that the driver fully supports sriov for meteor lake?

@ccoles146

This is what Intel says: https://dgpu-docs.intel.com/driver/kernel-driver-types.html

The following features are only available in the out-of-tree i915 kernel module:
- GPU debugging
- Single Root I/O Virtualization (SR-IOV)
- Virtual Memory Binding (VM_BIND) and Ultra Low Latency Submission (ULLS)

Although I tried to get it working with the Xe drivers, it seems it is not supported. So far I haven't got the dkms driver working either.

@celesrenata

> What makes you think that the driver fully supports sriov for meteor lake?

I thought I'd give it a try as I am consistently able to get LLMs via SR-IOV running in parallel in Ubuntu 24.04 KubeVirts with this driver.

@celesrenata

https://lore.kernel.org/lkml/CAPM=9txbfH8vf-YjwTXEYL729a6r2eeLBxCJc3MSD-t5jXVA-w@mail.gmail.com/ looks like 6.13 has some sriov work, we might see it then.
