Hibernation/Suspension gets stuck when iGPU VFs are enabled #41

Open
raldone01 opened this issue Mar 3, 2023 · 2 comments


raldone01 commented Mar 3, 2023

I compiled the kernel at TAG: lts-v6.1.8-linux-230201T082419Z

  • CPU: 11th Gen Intel i7-1165G7 (8) @ 4.700GHz
  • GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics]

I created 7 VFs with:
sudo bash -c "echo 7 > /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs"

Running sudo systemctl hibernate then freezes the machine with the display turned off.
I have to force it off and boot again.

If I run
sudo bash -c "echo 0 > /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs"
before sudo systemctl hibernate the hibernation works without issue.

I checked dmesg. The last entry is
Mar 03 09:48:13 Elizabeth kernel: PM: hibernation: hibernation entry
and no obvious error is present.

I just tested sudo systemctl suspend, and it exhibits the same issue.

suspend.dmesg.txt

raldone01 changed the title from "Hibernation gets stuck when iGPU VFs are enabled" to "Hibernation/Suspension gets stuck when iGPU VFs are enabled" Mar 3, 2023
sys-oak pushed a commit that referenced this issue May 5, 2023
Use local_irq_{enable, disable}_full() call forms to update the
interrupt state in the #DB handler. Issue caught by a kernel splat
running gdb on an application with CONFIG_DEBUG_DOVETAIL enabled:

[   52.097079] WARNING: CPU: 2 PID: 1318 at ../kernel/irq/pipeline.c:316 inband_irq_enable+0x10/0x20
[   52.097079] Modules linked in: 9p
[   52.097080] CPU: 2 PID: 1318 Comm: latency Not tainted 5.10.19+ #41
[   52.097080] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   52.097080] IRQ stage: Linux
[   52.097081] RIP: 0010:inband_irq_enable+0x10/0x20
[   52.097081] Code: 00 00 00 01 75 ee e8 cf fa ff ff 53 9d 5b c3 66 66 2e 0f 1f 84 00 00 00 00 00 80 3d 9a 38 b3 02 00 75 09 9c 58 f6 c4 02 75 02 <0f> 0b eb 8c 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48
[   52.097081] RSP: 0000:ffffc90000783f20 EFLAGS: 00010046
[   52.097082] RAX: 0000000000000046 RBX: ffffc90000783f58 RCX: 0000000000000000
[   52.097082] RDX: ffffc90000783ef0 RSI: ffffffff8109e600 RDI: ffffffff81d4eee2
[   52.097082] RBP: ffff888006e70000 R08: 0000000000000000 R09: 0000000000000000
[   52.097083] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000004000
[   52.097083] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   52.097083] FS:  00007ffff7fe6640(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000
[   52.097084] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   52.097084] CR2: 00007ffff7243610 CR3: 00000000070c6001 CR4: 0000000000370ee0
[   52.097084] Call Trace:
[   52.097084]  noist_exc_debug+0xf7/0x180
[   52.097085]  ? asm_exc_debug+0x23/0x30
[   52.097085]  asm_exc_debug+0x2b/0x30
[   52.097085] RIP: 0033:0x401df3
[   52.097086] Code: 00 00 e9 b0 fb ff ff ff 25 62 44 20 00 68 44 00 00 00 e9 a0 fb ff ff ff 25 5a 44 20 00 68 45 00 00 00 e9 90 fb ff ff 31 ed 90 <e8> f9 30 01 00 48 8d 65 d8 5b 41 5c 41 5d 41 70 44 40 00 48 c7 c1
[   52.097086] RSP: 002b:00007fffffffe1c0 EFLAGS: 00000346
[   52.097086] RAX: 00007ffff7ffe0e0 RBX: 00007ffff7ffe0e0 RCX: 00007ffff7df23c7
[   52.097087] RDX: 0000103e00000000 RSI: 0000000000000000 RDI: 0000000000000000
[   52.097087] RBP: 00007fffffffe3a0 R08: 00007ffff6e8f008 R09: 0000000000000009
[   52.097087] R10: 00007ffff7ffd990 R11: 0000000000000206 R12: 0000000000000000
[   52.097087] R13: 00007ffff7ffe110 R14: 00007ffff7ffe110 R15: 00007ffff7fe6640
[   52.097088] irq event stamp: 0
[   52.097088] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[   52.097088] hardirqs last disabled at (0): [<ffffffff8106c648>] copy_process+0x718/0x1cd0
[   52.097089] softirqs last  enabled at (0): [<ffffffff8106c648>] copy_process+0x718/0x1cd0
[   52.097089] softirqs last disabled at (0): [<0000000000000000>] 0x0
[   52.097089] ---[ end trace b07496576d3779dc ]---

See https://xenomai.org/pipermail/xenomai/2021-March/044662.html.

Reported-by: Jan Kiszka <[email protected]>
Signed-off-by: Philippe Gerum <[email protected]>
@krispan-intel

fecdf03 i915: drm/i915/iov: Fix suspend/hibernate when VFs are enabled

@raldone01

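I compiled the kernel at the specified commit. Creating the VFs now triggers a kernel NULL pointer dereference in wa_list_apply() while the first VF (0000:00:02.1) is probed.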

sudo bash -c "echo 7 > /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs"

uname: 6.1.38-1-intel-lts-sriov-02393-gfecdf03f88d0 #1 SMP PREEMPT_DYNAMIC Tue, 07 Nov 2023 04:00:29 +0000 x86_64 GNU/Linux

cmdline: root=/dev/mapper/main-root rootflags=subvol=/@ rw loglevel=3 quiet video=efifb:nobgrt resume=/dev/mapper/main-swap initrd=intel-ucode.img initrd=initramfs-%v.img intel_iommu=on i915.enable_guc=3 i915.enable_fbc=1 i915.max_vfs=7

dmesg -wH

[Nov 7 10:19] pci 0000:00:02.1: [8086:9a49] type 00 class 0x030000
[  +0.000026] pci 0000:00:02.1: DMAR: Skip IOMMU disabling for graphics
[  +0.000056] pci 0000:00:02.1: Adding to iommu group 17
[  +0.000097] pci 0000:00:02.1: vgaarb: bridge control possible
[  +0.000001] pci 0000:00:02.1: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[  +0.000004] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[  +0.000055] i915 0000:00:02.1: enabling device (0000 -> 0002)
[  +0.000035] i915 0000:00:02.1: Running in SR-IOV VF mode
[  +0.000155] i915 0000:00:02.1: [drm] GT0: GUC: interface version 0.1.0.0
[  +0.000536] i915 0000:00:02.1: [drm] VT-d active for gfx access
[  +0.000036] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[  +0.000329] BUG: kernel NULL pointer dereference, address: 0000000000000018
[  +0.000003] #PF: supervisor read access in kernel mode
[  +0.000002] #PF: error_code(0x0000) - not-present page
[  +0.000001] PGD 0 P4D 0
[  +0.000003] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  +0.000002] CPU: 0 PID: 3909 Comm: bash Tainted: G     U             6.1.38-1-intel-lts-sriov-02393-gfecdf03f88d0 #1 79fabecb2b7d53307a9f46e7ad50bc0319ab8e27
[  +0.000004] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 950QDB/NP950QDB-KC2DE, BIOS P11AKG.027.231017.MK 10/17/2023
[  +0.000001] RIP: 0010:wa_list_apply+0x32/0x190 [i915]
[  +0.000095] Code: 56 41 55 41 54 55 53 48 83 ec 18 4c 8b 2f 8b 77 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 48 c7 44 24 08 00 00 00 00 <49> 8b 6d 18 85 f6 75 27 48 8b 44 24 10 65 48 2b 04 25 28 00 00 00
[  +0.000002] RSP: 0018:ffffaa47898f7978 EFLAGS: 00010246
[  +0.000002] RAX: 0000000000000000 RBX: ffff90548815a608 RCX: 0000000000000018
[  +0.000002] RDX: 00001cf57a769568 RSI: 0000000000000000 RDI: ffff90548815b420
[  +0.000001] RBP: ffff90548815a608 R08: 0000000000000001 R09: ffff90548815b480
[  +0.000001] R10: 0000000000000001 R11: 0000000000000000 R12: ffff905488159dc0
[  +0.000001] R13: 0000000000000000 R14: ffff90548815b240 R15: ffff905281f22000
[  +0.000002] FS:  00007f290b767740(0000) GS:ffff9055fb600000(0000) knlGS:0000000000000000
[  +0.000001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000002] CR2: 0000000000000018 CR3: 0000000370614003 CR4: 0000000000f70ef0
[  +0.000001] PKRU: 55555554
[  +0.000002] Call Trace:
[  +0.000003]  <TASK>
[  +0.000002]  ? __die_body.cold+0x1a/0x1f
[  +0.000005]  ? page_fault_oops+0x15a/0x2d0
[  +0.000004]  ? exc_page_fault+0x7c/0x180
[  +0.000004]  ? asm_exc_page_fault+0x26/0x30
[  +0.000005]  ? wa_list_apply+0x32/0x190 [i915 9aa588a8a5901ef21a5553680c43be0e1b35b9a2]
[  +0.000080]  intel_gt_init_hw+0x84/0x220 [i915 9aa588a8a5901ef21a5553680c43be0e1b35b9a2]
[  +0.000080]  intel_gt_resume+0xbf/0x220 [i915 9aa588a8a5901ef21a5553680c43be0e1b35b9a2]
[  +0.000072]  intel_gt_init+0x17b/0x320 [i915 9aa588a8a5901ef21a5553680c43be0e1b35b9a2]
[  +0.000075]  i915_gem_init+0x1a4/0x240 [i915 9aa588a8a5901ef21a5553680c43be0e1b35b9a2]
[  +0.000080]  i915_driver_probe+0x6c5/0xbe0 [i915 9aa588a8a5901ef21a5553680c43be0e1b35b9a2]
[  +0.000071]  local_pci_probe+0x42/0x80
[  +0.000003]  pci_device_probe+0xc1/0x250
[  +0.000002]  ? sysfs_do_create_link_sd+0x6e/0xe0
[  +0.000003]  really_probe+0xdb/0x380
[  +0.000002]  ? pm_runtime_barrier+0x54/0x90
[  +0.000002]  __driver_probe_device+0x78/0x120
[  +0.000001]  driver_probe_device+0x1f/0x90
[  +0.000002]  __device_attach_driver+0x89/0x110
[  +0.000001]  ? driver_allows_async_probing+0x70/0x70
[  +0.000002]  bus_for_each_drv+0x8c/0xe0
[  +0.000002]  __device_attach+0xb2/0x1e0
[  +0.000002]  pci_bus_add_device+0x4e/0x70
[  +0.000002]  pci_iov_add_virtfn+0x2ee/0x330
[  +0.000003]  sriov_enable+0x1e9/0x3a0
[  +0.000002]  i915_sriov_pf_enable_vfs+0x14e/0x1d0 [i915 9aa588a8a5901ef21a5553680c43be0e1b35b9a2]
[  +0.000080]  sriov_numvfs_store+0xc6/0x160
[  +0.000003]  kernfs_fop_write_iter+0x133/0x1d0
[  +0.000003]  vfs_write+0x236/0x3f0
[  +0.000003]  ksys_write+0x6f/0xf0
[  +0.000003]  do_syscall_64+0x5d/0x90
[  +0.000003]  ? do_user_addr_fault+0x237/0x580
[  +0.000003]  ? exc_page_fault+0x7c/0x180
[  +0.000003]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  +0.000003] RIP: 0033:0x7f290b8e5034
[  +0.000030] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[  +0.000002] RSP: 002b:00007fff68c4a8e8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[  +0.000002] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f290b8e5034
[  +0.000001] RDX: 0000000000000002 RSI: 00005628cbdd7380 RDI: 0000000000000001
[  +0.000002] RBP: 00005628cbdd7380 R08: 0000000000000000 R09: 0000000000000001
[  +0.000001] R10: 0000000000000004 R11: 0000000000000202 R12: 0000000000000002
[  +0.000001] R13: 00007f290b9ba5c0 R14: 00007f290b9b7f20 R15: 0000000000000000
[  +0.000002]  </TASK>
[  +0.000001] Modules linked in: ccm uinput snd_seq_dummy snd_hrtimer snd_seq rfcomm snd_seq_device cmac algif_hash algif_skcipher af_alg snd_ctl_led snd_soc_skl_hda_dsp snd_soc_intel_hda_dsp_common snd_soc_hdac_hdmi snd_sof_probes snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio xt_CHECKSUM xt_MASQUERADE bridge stp llc rtsx_usb_ms memstick bnep uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 btusb videobuf2_common btrtl btbcm btintel videodev btmtk mc bluetooth ecdh_generic crc16 hid_sensor_custom_intel_hinge hid_sensor_gyro_3d hid_sensor_als hid_sensor_accel_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_custom snd_soc_dmic snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus
[  +0.000044]  snd_soc_core snd_compress iwlmvm ac97_bus snd_pcm_dmaengine snd_hda_intel hid_sensor_hub intel_tcc_cooling snd_intel_dspcfg mac80211 x86_pkg_temp_thermal intel_powerclamp snd_intel_sdw_acpi libarc4 coretemp snd_hda_codec intel_ishtp_hid joydev snd_hda_core iwlwifi hid_multitouch i915 snd_hwdep mousedev kvm_intel snd_pcm iTCO_wdt drm_buddy intel_pmc_bxt i2c_algo_bit intel_rapl_msr mei_pxp mei_hdcp iTCO_vendor_support snd_timer kvm cfg80211 irqbypass i2c_i801 mei_me snd processor_thermal_device_pci_legacy processor_thermal_device gpio_keys rapl intel_cstate ttm ucsi_acpi spi_nor processor_thermal_rfim typec_ucsi pcspkr intel_lpss_pci intel_ish_ipc mtd processor_thermal_mbox drm_display_helper typec intel_lpss rfkill mei soundcore i2c_smbus intel_uncore igen6_edac cec processor_thermal_rapl idma64 intel_rapl_common roles vfat intel_ishtp ip6t_REJECT fat intel_soc_dts_iosf i2c_hid_acpi wmi_bmof xt_hl intel_gtt i2c_hid ip6_tables int3403_thermal int340x_thermal_zone acpi_tad
[  +0.000045]  intel_hid ip6t_rt acpi_pad int3400_thermal acpi_thermal_rel sparse_keymap soc_button_array ipt_REJECT xt_LOG nf_log_syslog mac_hid nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nft_compat nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_masq nft_ct nft_chain_nat nf_nat pkcs8_key_parser nf_conntrack dm_multipath nf_defrag_ipv6 i2c_dev nf_defrag_ipv4 nf_tables nfnetlink fuse loop ip_tables x_tables dm_crypt cbc encrypted_keys trusted asn1_encoder tee rtsx_usb_sdmmc mmc_core rtsx_usb btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq crypto_user dm_mod crct10dif_pclmul serio_raw crc32_pclmul atkbd crc32c_intel polyval_clmulni polyval_generic libps2 gf128mul ghash_clmulni_intel sha512_ssse3 vivaldi_fmap aesni_intel crypto_simd nvme cryptd spi_intel_pci nvme_core spi_intel xhci_pci nvme_common xhci_pci_renesas i8042 video serio wmi thunderbolt usbhid
[  +0.000035] CR2: 0000000000000018
[  +0.000002] ---[ end trace 0000000000000000 ]---
[  +0.000001] RIP: 0010:wa_list_apply+0x32/0x190 [i915]
[  +0.000056] Code: 56 41 55 41 54 55 53 48 83 ec 18 4c 8b 2f 8b 77 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 48 c7 44 24 08 00 00 00 00 <49> 8b 6d 18 85 f6 75 27 48 8b 44 24 10 65 48 2b 04 25 28 00 00 00
[  +0.000001] RSP: 0018:ffffaa47898f7978 EFLAGS: 00010246
[  +0.000002] RAX: 0000000000000000 RBX: ffff90548815a608 RCX: 0000000000000018
[  +0.000000] RDX: 00001cf57a769568 RSI: 0000000000000000 RDI: ffff90548815b420
[  +0.000001] RBP: ffff90548815a608 R08: 0000000000000001 R09: ffff90548815b480
[  +0.000001] R10: 0000000000000001 R11: 0000000000000000 R12: ffff905488159dc0
[  +0.000000] R13: 0000000000000000 R14: ffff90548815b240 R15: ffff905281f22000
[  +0.000001] FS:  00007f290b767740(0000) GS:ffff9055fb600000(0000) knlGS:0000000000000000
[  +0.000001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000001] CR2: 0000000000000018 CR3: 0000000370614003 CR4: 0000000000f70ef0
[  +0.000001] PKRU: 55555554
[  +0.000001] note: bash[3909] exited with irqs disabled
sudo bash -c "echo 7 > /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs"

sys-oak pushed a commit that referenced this issue Dec 6, 2023
…cArray[entry].opts1

[ Upstream commit dcf75a0 ]

KCSAN reported the following data-race:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169

race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0xb0000042 -> 0x00000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

The read side is in

drivers/net/ethernet/realtek/r8169_main.c
=========================================
   4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
   4356                    int budget)
   4357 {
   4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
   4359         struct sk_buff *skb;
   4360
   4361         dirty_tx = tp->dirty_tx;
   4362
   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
   4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
   4365                 u32 status;
   4366
 → 4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
   4368                 if (status & DescOwn)
   4369                         break;
   4370
   4371                 skb = tp->tx_skb[entry].skb;
   4372                 rtl8169_unmap_tx_skb(tp, entry);
   4373
   4374                 if (skb) {
   4375                         pkts_compl++;
   4376                         bytes_compl += skb->len;
   4377                         napi_consume_skb(skb, budget);
   4378                 }
   4379                 dirty_tx++;
   4380         }
   4381
   4382         if (tp->dirty_tx != dirty_tx) {
   4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
   4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
   4385
   4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
   4387                                               rtl_tx_slots_avail(tp),
   4388                                               R8169_TX_START_THRS);
   4389                 /*
   4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
   4391                  * too close. Let's kick an extra TxPoll request when a burst
   4392                  * of start_xmit activity is detected (if it is not detected,
   4393                  * it is slow enough). -- FR
   4394                  * If skb is NULL then we come here again once a tx irq is
   4395                  * triggered after the last fragment is marked transmitted.
   4396                  */
   4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);
   4399         }
   4400 }

KCSAN reports a data race on tp->TxDescArray[entry].opts1, and reading it with
READ_ONCE() fixes this KCSAN warning.

   4366
 → 4367                 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
   4368                 if (status & DescOwn)
   4369                         break;
   4370
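
A minimal user-space sketch of the pattern this fix relies on, not the driver code itself. READ_ONCE_SKETCH, DESC_OWN_SKETCH, NUM_DESC_SKETCH, struct tx_desc_sketch and reclaim_sketch() are hypothetical names introduced for illustration. The macro assumes GCC or Clang (__typeof__), and le32_to_cpu() is left out on the assumption that the ring is in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's READ_ONCE(): the volatile access
 * makes the compiler emit exactly one load each time it is evaluated. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

#define DESC_OWN_SKETCH 0x80000000u /* illustrative "device owns it" bit */
#define NUM_DESC_SKETCH 4u          /* illustrative ring size */

struct tx_desc_sketch {
    uint32_t opts1; /* status word, written by the device */
    uint32_t opts2;
};

/* Count descriptors the device has handed back, walking from dirty towards
 * cur and stopping at the first descriptor the device still owns. */
static unsigned int reclaim_sketch(struct tx_desc_sketch *ring,
                                   unsigned int dirty, unsigned int cur)
{
    unsigned int done = 0;

    assert(ring != NULL);
    while (cur != dirty) {
        unsigned int entry = dirty % NUM_DESC_SKETCH;
        /* Take one snapshot of opts1; every later test uses the snapshot. */
        uint32_t status = READ_ONCE_SKETCH(ring[entry].opts1);

        if (status & DESC_OWN_SKETCH)
            break;
        dirty++;
        done++;
    }
    return done;
}
```

With a four-entry ring in which only entry 2 still has the ownership bit set, reclaim_sketch(ring, 0, 4) stops at entry 2 and returns 2.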

Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Fixes: 1da177e ("Linux-2.6.12-rc2")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
sys-oak pushed a commit that referenced this issue Dec 6, 2023
…>opts1

[ Upstream commit f97eee4 ]

KCSAN reported the following data-race bug:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169

race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0x80003fff -> 0x3402805f

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

drivers/net/ethernet/realtek/r8169_main.c:
==========================================
   4429
 → 4430                 status = le32_to_cpu(desc->opts1);
   4431                 if (status & DescOwn)
   4432                         break;
   4433
   4434                 /* This barrier is needed to keep us from reading
   4435                  * any other fields out of the Rx descriptor until
   4436                  * we know the status of DescOwn
   4437                  */
   4438                 dma_rmb();
   4439
   4440                 if (unlikely(status & RxRES)) {
   4441                         if (net_ratelimit())
   4442                                 netdev_warn(dev, "Rx ERROR. status = %08x\n",

Marco Elver explained that dma_rmb() doesn't prevent the compiler from tearing the access
to desc->opts1, which can be written to concurrently. READ_ONCE() should prevent that from
happening:

   4429
 → 4430                 status = le32_to_cpu(READ_ONCE(desc->opts1));
   4431                 if (status & DescOwn)
   4432                         break;
   4433
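
The ordering described above can be sketched in portable C11 as an analogy only. Here a relaxed atomic load plays the role of READ_ONCE(), and atomic_thread_fence(memory_order_acquire) stands in for dma_rmb(); the fence orders against other threads, not against device DMA, so this is an assumption-laden model rather than driver code. All names ending in _sketch or _SKETCH, and the length mask, are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_OWN_SKETCH      0x80000000u /* illustrative ownership bit */
#define RX_LEN_MASK_SKETCH 0x00003fffu /* illustrative length field */

struct rx_desc_sketch {
    _Atomic uint32_t opts1; /* status word, published last by the producer */
    uint32_t opts2;         /* only meaningful once the OWN bit is clear */
};

/* Returns true and fills *len and *opts2 when the descriptor was handed
 * back; returns false and leaves the outputs untouched otherwise. */
static bool rx_poll_sketch(struct rx_desc_sketch *desc,
                           uint32_t *len, uint32_t *opts2)
{
    uint32_t status;

    assert(desc != NULL && len != NULL && opts2 != NULL);

    /* One untorn load of the status word (READ_ONCE() analogue). */
    status = atomic_load_explicit(&desc->opts1, memory_order_relaxed);
    if (status & RX_OWN_SKETCH)
        return false;

    /* Keep the reads below from being ordered before the status load
     * (dma_rmb() analogue). */
    atomic_thread_fence(memory_order_acquire);

    *len = status & RX_LEN_MASK_SKETCH;
    *opts2 = desc->opts2;
    return true;
}
```

The barrier orders the later field reads after the status read, while the single marked load is what keeps the status word itself from being read in pieces or re-read; the two mechanisms address different problems.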

As a consequence of this fix, the KCSAN warning was eliminated.

Fixes: 6202806 ("r8169: drop member opts1_mask from struct rtl8169_private")
Suggested-by: Marco Elver <[email protected]>
Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
sys-oak pushed a commit that referenced this issue Dec 21, 2023
…>cur_tx

[ Upstream commit c1c0ce3 ]

KCSAN reported the following data-race:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll [r8169] / rtl8169_start_xmit [r8169]

write (marked) to 0xffff888102474b74 of 4 bytes by task 5358 on cpu 29:
rtl8169_start_xmit (drivers/net/ethernet/realtek/r8169_main.c:4254) r8169
dev_hard_start_xmit (./include/linux/netdevice.h:4889 ./include/linux/netdevice.h:4903 net/core/dev.c:3544 net/core/dev.c:3560)
sch_direct_xmit (net/sched/sch_generic.c:342)
__dev_queue_xmit (net/core/dev.c:3817 net/core/dev.c:4306)
ip_finish_output2 (./include/linux/netdevice.h:3082 ./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233)
__ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:293)
ip_finish_output (net/ipv4/ip_output.c:328)
ip_output (net/ipv4/ip_output.c:435)
ip_send_skb (./include/net/dst.h:458 net/ipv4/ip_output.c:127 net/ipv4/ip_output.c:1486)
udp_send_skb (net/ipv4/udp.c:963)
udp_sendmsg (net/ipv4/udp.c:1246)
inet_sendmsg (net/ipv4/af_inet.c:840 (discriminator 4))
sock_sendmsg (net/socket.c:730 net/socket.c:753)
__sys_sendto (net/socket.c:2177)
__x64_sys_sendto (net/socket.c:2185)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

read to 0xffff888102474b74 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4397 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0x002f4815 -> 0x002f4816

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

The write side of drivers/net/ethernet/realtek/r8169_main.c is:
==================
   4251         /* rtl_tx needs to see descriptor changes before updated tp->cur_tx */
   4252         smp_wmb();
   4253
 → 4254         WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1);
   4255
   4256         stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp),
   4257                                                 R8169_TX_STOP_THRS,
   4258                                                 R8169_TX_START_THRS);

The read side is the function rtl_tx():

   4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
   4356                    int budget)
   4357 {
   4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
   4359         struct sk_buff *skb;
   4360
   4361         dirty_tx = tp->dirty_tx;
   4362
   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
   4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
   4365                 u32 status;
   4366
   4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
   4368                 if (status & DescOwn)
   4369                         break;
   4370
   4371                 skb = tp->tx_skb[entry].skb;
   4372                 rtl8169_unmap_tx_skb(tp, entry);
   4373
   4374                 if (skb) {
   4375                         pkts_compl++;
   4376                         bytes_compl += skb->len;
   4377                         napi_consume_skb(skb, budget);
   4378                 }
   4379                 dirty_tx++;
   4380         }
   4381
   4382         if (tp->dirty_tx != dirty_tx) {
   4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
   4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
   4385
   4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
   4387                                               rtl_tx_slots_avail(tp),
   4388                                               R8169_TX_START_THRS);
   4389                 /*
   4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
   4391                  * too close. Let's kick an extra TxPoll request when a burst
   4392                  * of start_xmit activity is detected (if it is not detected,
   4393                  * it is slow enough). -- FR
   4394                  * If skb is NULL then we come here again once a tx irq is
   4395                  * triggered after the last fragment is marked transmitted.
   4396                  */
 → 4397                 if (tp->cur_tx != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);
   4399         }
   4400 }

As the code shows, an earlier detected data race on tp->cur_tx was fixed in
line 4363:

   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {

but the same solution is required for protecting the other access to tp->cur_tx:

 → 4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);

The write in line 4254 is protected with WRITE_ONCE(), but the read in line 4397
might have suffered read tearing under some compiler optimisations.

The fix eliminated the KCSAN data-race report for this bug.

It is yet to be evaluated what happens if tp->cur_tx changes between the test in line 4363
and the test in line 4397. The compiler should certainly not keep the value cached in a
register for that long, since asynchronous writes to tp->cur_tx in line 4254 might have
occurred in the meantime.
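
A simplified, single-threaded sketch of the paired accesses discussed above, assuming GCC or Clang for __typeof__. The macros and the names struct tx_ring_sketch, xmit_sketch() and tx_complete_sketch() are hypothetical; the point is only that every read of the shared index on the completion side goes through a marked access, so each test performs its own load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct tx_ring_sketch {
    unsigned int cur_tx;   /* advanced by the transmit path */
    unsigned int dirty_tx; /* advanced by the completion path */
};

/* Transmit side: publish a packet made of one head plus frags fragments. */
static void xmit_sketch(struct tx_ring_sketch *tp, unsigned int frags)
{
    assert(tp != NULL);
    WRITE_ONCE_SKETCH(tp->cur_tx, tp->cur_tx + frags + 1);
}

/* Completion side: retire up to completed descriptors, then report whether
 * work is still queued (the condition guarding the extra doorbell). */
static bool tx_complete_sketch(struct tx_ring_sketch *tp,
                               unsigned int completed)
{
    unsigned int dirty_tx;

    assert(tp != NULL);
    dirty_tx = tp->dirty_tx;

    /* First test: fresh load of cur_tx on every iteration. */
    while (completed > 0 && READ_ONCE_SKETCH(tp->cur_tx) != dirty_tx) {
        dirty_tx++;
        completed--;
    }
    WRITE_ONCE_SKETCH(tp->dirty_tx, dirty_tx);

    /* Second test: another fresh load, not a value cached from the loop. */
    return READ_ONCE_SKETCH(tp->cur_tx) != dirty_tx;
}
```

Starting from an empty ring, xmit_sketch(&tp, 2) moves cur_tx to 3; retiring two descriptors leaves one outstanding, so tx_complete_sketch(&tp, 2) returns true, and a later call that retires the rest returns false.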

Fixes: 94d8a98 ("r8169: reduce number of workaround doorbell rings")
Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
sys-oak pushed a commit that referenced this issue Dec 21, 2023
…cArray[entry].opts1

[ Upstream commit dcf75a0 ]

KCSAN reported the following data-race:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169

race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0xb0000042 -> 0x00000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

The read side is in

drivers/net/ethernet/realtek/r8169_main.c
=========================================
   4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
   4356                    int budget)
   4357 {
   4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
   4359         struct sk_buff *skb;
   4360
   4361         dirty_tx = tp->dirty_tx;
   4362
   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
   4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
   4365                 u32 status;
   4366
 → 4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
   4368                 if (status & DescOwn)
   4369                         break;
   4370
   4371                 skb = tp->tx_skb[entry].skb;
   4372                 rtl8169_unmap_tx_skb(tp, entry);
   4373
   4374                 if (skb) {
   4375                         pkts_compl++;
   4376                         bytes_compl += skb->len;
   4377                         napi_consume_skb(skb, budget);
   4378                 }
   4379                 dirty_tx++;
   4380         }
   4381
   4382         if (tp->dirty_tx != dirty_tx) {
   4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
   4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
   4385
   4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
   4387                                               rtl_tx_slots_avail(tp),
   4388                                               R8169_TX_START_THRS);
   4389                 /*
   4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
   4391                  * too close. Let's kick an extra TxPoll request when a burst
   4392                  * of start_xmit activity is detected (if it is not detected,
   4393                  * it is slow enough). -- FR
   4394                  * If skb is NULL then we come here again once a tx irq is
   4395                  * triggered after the last fragment is marked transmitted.
   4396                  */
   4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);
   4399         }
   4400 }

KCSAN reports a data race on tp->TxDescArray[entry].opts1. Reading the field with
READ_ONCE() fixes this KCSAN warning:

   4366
 → 4367                 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
   4368                 if (status & DescOwn)
   4369                         break;
   4370
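
The pattern in the fix above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the driver's code: `READ_ONCE_U32`, `DESC_OWN`, `struct tx_desc`, and both helper functions are hypothetical stand-ins introduced for illustration, the ownership bit is assumed to be bit 31, and the `le32_to_cpu()` conversion is omitted on the assumption of a little-endian host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's READ_ONCE(): the
 * volatile-qualified access asks the compiler for exactly one load of the
 * full 32-bit value, so it may not split, repeat, or re-fetch the read. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Assumed ownership bit: set while the device still owns the descriptor. */
#define DESC_OWN (UINT32_C(1) << 31)

/* Simplified descriptor layout, for illustration only. */
struct tx_desc {
	uint32_t opts1;
	uint32_t opts2;
	uint64_t addr;
};

/* Take one snapshot of opts1 and test the ownership bit on that snapshot. */
static bool desc_owned_by_device(const struct tx_desc *desc)
{
	uint32_t status = READ_ONCE_U32(desc->opts1);

	return (status & DESC_OWN) != 0;
}

/* Walk the ring from dirty to cur and count descriptors that the device
 * has released, stopping at the first one it still owns. */
static unsigned int count_completed(const struct tx_desc *ring,
				    unsigned int ring_size,
				    unsigned int dirty, unsigned int cur)
{
	unsigned int done = 0;

	while (dirty != cur) {
		if (desc_owned_by_device(&ring[dirty % ring_size]))
			break;
		dirty++;
		done++;
	}
	return done;
}
```

The loop mirrors the shape of the completion loop in rtl_tx(): each descriptor's status word is sampled once, and the ownership decision is made on that single sample.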

Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Fixes: 1da177e ("Linux-2.6.12-rc2")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
sys-oak pushed a commit that referenced this issue Dec 21, 2023
…>opts1

[ Upstream commit f97eee4 ]

KCSAN reported the following data-race bug:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169

race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0x80003fff -> 0x3402805f

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

drivers/net/ethernet/realtek/r8169_main.c:
==========================================
   4429
 → 4430                 status = le32_to_cpu(desc->opts1);
   4431                 if (status & DescOwn)
   4432                         break;
   4433
   4434                 /* This barrier is needed to keep us from reading
   4435                  * any other fields out of the Rx descriptor until
   4436                  * we know the status of DescOwn
   4437                  */
   4438                 dma_rmb();
   4439
   4440                 if (unlikely(status & RxRES)) {
   4441                         if (net_ratelimit())
   4442                                 netdev_warn(dev, "Rx ERROR. status = %08x\n",

Marco Elver explained that dma_rmb() doesn't prevent the compiler from tearing the access to
desc->opts1, which can be written to concurrently. READ_ONCE() should prevent that from
happening:

   4429
 → 4430                 status = le32_to_cpu(READ_ONCE(desc->opts1));
   4431                 if (status & DescOwn)
   4432                         break;
   4433

As a consequence of this fix, this KCSAN warning was eliminated.
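
The division of labour between READ_ONCE() and dma_rmb() can be sketched in user space. This is an illustration under stated assumptions, not the driver's code: `READ_ONCE_U32`, `DESC_OWN`, `LEN_MASK`, `struct rx_desc`, `struct rx_result`, and `poll_rx_desc()` are hypothetical names, the C11 acquire fence is only a rough stand-in for dma_rmb(), and the length mask is an assumed field layout.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's READ_ONCE(): one untorn load. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

#define DESC_OWN (UINT32_C(1) << 31)  /* assumed ownership bit */
#define LEN_MASK UINT32_C(0x3fff)     /* assumed length field, illustration only */

/* Simplified descriptor layout, for illustration only. */
struct rx_desc {
	uint32_t opts1;
	uint32_t opts2;
	uint64_t addr;
};

struct rx_result {
	bool ready;      /* false while the device still owns the descriptor */
	uint32_t len;    /* taken from the same opts1 snapshot as the check */
	uint32_t opts2;  /* read only after the ordering fence */
};

static struct rx_result poll_rx_desc(const struct rx_desc *desc)
{
	struct rx_result res = { false, 0, 0 };

	/* Single snapshot: the ownership test and the length below both use
	 * this one value, so a concurrent writer cannot make them disagree. */
	uint32_t status = READ_ONCE_U32(desc->opts1);

	if (status & DESC_OWN)
		return res;

	/* Rough stand-in for dma_rmb(): keep the reads of the remaining
	 * fields from being ordered before the status read. The fence
	 * orders accesses; it does not make the status read itself untorn. */
	atomic_thread_fence(memory_order_acquire);

	res.ready = true;
	res.len = status & LEN_MASK;
	res.opts2 = desc->opts2;
	return res;
}
```

The two mechanisms address different problems: the barrier constrains the order of the status read relative to later reads, while the single marked load constrains how the status read itself is performed.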

Fixes: 6202806 ("r8169: drop member opts1_mask from struct rtl8169_private")
Suggested-by: Marco Elver <[email protected]>
Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>