This repository has been archived by the owner on Jan 25, 2023. It is now read-only.
forked from intel/linux-intel-lts
-
Notifications
You must be signed in to change notification settings - Fork 6
DO NOT MERGE!! Debug HDMI audio issue #5
Open
AjanZhong
wants to merge
37
commits into
projectceladon:master
Choose a base branch
from
AjanZhong:master
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
re-implement security patch 594cc25 make 'user_access_begin()' do 'access_ok()' Change-Id: I71f657c5c0c8a2b1c4084d5ab327095aed82bad2 Tracked-On: OAM-79419 Signed-off-by: Junxiao Chang <[email protected]>
Upstream commit: efa9ace In dlpar_parse_cc_property(), 'prop->name' is allocated by kstrdup(). kstrdup() may return NULL, so it should be checked and handle error. And prop should be freed if 'prop->name' is NULL. Change-Id: I8614c66bb539b58f68894c6d8b3ac9b4da207bd5 Tracked-On: PKT-2296 Signed-off-by: Gen Zhang <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
upstream commit: 86e5aca7fa2927060839f3e3b40c8bd65a7e8d1e <commit found in scsi.git for 5.3 inclusion> In _ctl_ioctl_main(), 'ioctl_header' is fetched the first time from userspace. 'ioctl_header.ioc_number' is then checked. The legal result is saved to 'ioc'. Then, in condition MPT3COMMAND, the whole struct is fetched again from the userspace. Then _ctl_do_mpt_command() is called, 'ioc' and 'karg' as inputs. However, a malicious user can change the 'ioc_number' between the two fetches, which will cause a potential security issues. Moreover, a malicious user can provide a valid 'ioc_number' to pass the check in first fetch, and then modify it in the second fetch. To fix this, we need to recheck the 'ioc_number' in the second fetch. Change-Id: I2260545d47b85a0fd50d4606a5d21b00a072466f Tracked-On: PKT-2289 Signed-off-by: Gen Zhang <[email protected]> Acked-by: Suganath Prabu S <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
upstream commit: 95baa60 In function ip6_ra_control(), the pointer new_ra is allocated a memory space via kmalloc(). And it is used in the following codes. However, when there is a memory allocation error, kmalloc() fails. Thus null pointer dereference may happen. And it will cause the kernel to crash. Therefore, we should check the return value and handle the error. Change-Id: I435ba31f7657877fa8d4273ed8365f3ab2fee92f Tracked-On: PKT-2291 Signed-off-by: Gen Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
upstream commit: 4e78921 The old_memmap flow in efi_call_phys_prolog() performs numerous memory allocations, and either does not check for failure at all, or it does but fails to propagate it back to the caller, which may end up calling into the firmware with an incomplete 1:1 mapping. So let's fix this by returning NULL from efi_call_phys_prolog() on memory allocation failures only, and by handling this condition in the caller. Also, clean up any half baked sets of page tables that we may have created before returning with a NULL return value. Note that any failure at this level will trigger a panic() two levels up, so none of this makes a huge difference, but it is a nice cleanup nonetheless. [ardb: update commit log, add efi_call_phys_epilog() call on error path] Change-Id: I8bb466fa40b543ab882e4cd0a06872417309b6c9 Tracked-On: PKT-2249 Signed-off-by: Gen Zhang <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Bradford <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
upstream commit: 425aa0e In function ip_ra_control(), the pointer new_ra is allocated a memory space via kmalloc(). And it is used in the following codes. However, when there is a memory allocation error, kmalloc() fails. Thus null pointer dereference may happen. And it will cause the kernel to crash. Therefore, we should check the return value and handle the error. Change-Id: I4f364e0dbfb1c26b45dd7112f64dae1acbaea0af Tracked-On: PKT-2252 Signed-off-by: Gen Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
…erycap upstream commit: 5d2e73a SyzKaller hit the null pointer deref while reading from uninitialized udev->product in zr364xx_vidioc_querycap(). ================================================================== BUG: KASAN: null-ptr-deref in read_word_at_a_time+0xe/0x20 include/linux/compiler.h:274 Read of size 1 at addr 0000000000000000 by task v4l_id/5287 CPU: 1 PID: 5287 Comm: v4l_id Not tainted 5.1.0-rc3-319004-g43151d6 #6 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xe8/0x16e lib/dump_stack.c:113 kasan_report.cold+0x5/0x3c mm/kasan/report.c:321 read_word_at_a_time+0xe/0x20 include/linux/compiler.h:274 strscpy+0x8a/0x280 lib/string.c:207 zr364xx_vidioc_querycap+0xb5/0x210 drivers/media/usb/zr364xx/zr364xx.c:706 v4l_querycap+0x12b/0x340 drivers/media/v4l2-core/v4l2-ioctl.c:1062 __video_do_ioctl+0x5bb/0xb40 drivers/media/v4l2-core/v4l2-ioctl.c:2874 video_usercopy+0x44e/0xf00 drivers/media/v4l2-core/v4l2-ioctl.c:3056 v4l2_ioctl+0x14e/0x1a0 drivers/media/v4l2-core/v4l2-dev.c:364 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0xced/0x12f0 fs/ioctl.c:696 ksys_ioctl+0xa0/0xc0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:718 do_syscall_64+0xcf/0x4f0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f3b56d8b347 Code: 90 90 90 48 8b 05 f1 fa 2a 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 90 90 90 90 90 90 90 90 90 90 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c1 fa 2a 00 31 d2 48 29 c2 64 RSP: 002b:00007ffe005d5d68 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f3b56d8b347 RDX: 00007ffe005d5d70 RSI: 0000000080685600 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400884 R13: 00007ffe005d5ec0 R14: 0000000000000000 R15: 0000000000000000 ================================================================== For this device udev->product is not initialized and accessing it causes a NULL pointer deref. The fix is to check for NULL before strscpy() and copy empty string, if product is NULL Change-Id: Iec5ace2ba75ccf88f75aeac6ed3512ed5adf472c Reported-by: [email protected] Signed-off-by: Vandana BN <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
sk_forward_alloc's updating is also done on rx path, but to be consistent we change to use sk_mem_charge() in sctp_skb_set_owner_r(). In sctp_eat_data(), it's not enough to check sctp_memory_pressure only, which doesn't work for mem_cgroup_sockets_enabled, so we change to use sk_under_memory_pressure(). When it's under memory pressure, sk_mem_reclaim() and sk_rmem_schedule() should be called on both RENEGE or CHUNK DELIVERY path exit the memory pressure status as soon as possible. Note that sk_rmem_schedule() is using datalen to make things easy there. Change-Id: Ib62867e9b7b9c6084e7f25babfb50afbfa0f7f8e Tracked-On: PKT-2203 Reported-by: Matteo Croce <[email protected]> Tested-by: Matteo Croce <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
Now when sending packets, sk_mem_charge() and sk_mem_uncharge() have been used to set sk_forward_alloc. We just need to call sk_wmem_schedule() to check if the allocated should be raised, and call sk_mem_reclaim() to check if the allocated should be reduced when it's under memory pressure. If sk_wmem_schedule() returns false, which means no memory is allowed to allocate, it will block and wait for memory to become available. Note different from tcp, sctp wait_for_buf happens before allocating any skb, so memory accounting check is done with the whole msg_len before it too. Reported-by: Matteo Croce <[email protected]> Tested-by: Matteo Croce <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
In acp_hw_init there are some allocations that needs to be released in case of failure: 1- adev->acp.acp_genpd should be released if any allocation attemp for adev->acp.acp_cell, adev->acp.acp_res or i2s_pdata fails. 2- all of those allocations should be released if mfd_add_hotplug_devices or pm_genpd_add_device fail. 3- Release is needed in case of time out values expire. Change-Id: I4aa762c93ac8e3121087b7d3691036c131df9a74 Reviewed-by: Christian König <[email protected]> Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
In rtl_usb_probe if allocation for usb_data fails the allocated hw should be released. In addition the allocated rtlpriv->usb_data should be released on error handling path. Change-Id: I8186853e63176cb2c2fc5665bd8fbd330b1e78ed Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
In alloc_sgtable if alloc_page fails, the alocated table should be released. Change-Id: Ifda837dee2b933004f9725b068b0efa8c0a5f11a Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
In qrtr_tun_write_iter the allocated kbuf should be release in case of error or success return. v2 Update: Thanks to David Miller for pointing out the release on success path as well. Change-Id: Ida957e3598429b5af74f107e6f85040b664b0d32 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: David S. Miller <[email protected]>
In bfad_im_get_stats if bfa_port_get_stats fails, allocated memory needs to be released. Change-Id: I5b8ef34b28b1a4a9fc2c3b86a9974402b7b4a5a4 Signed-off-by: Navid Emamdoost <[email protected]>
In iwl_pcie_ctxt_info_gen3_init there are cases that the allocated dma memory is leaked in case of error. DMA memories prph_scratch, prph_info, and ctxt_info_gen3 are allocated and initialized to be later assigned to trans_pcie. But in any error case before such assignment the allocated memories should be released. First of such error cases happens when iwl_pcie_init_fw_sec fails. Current implementation correctly releases prph_scratch. But in two sunsequent error cases where dma_alloc_coherent may fail, such releases are missing. This commit adds release for prph_scratch when allocation for prph_info fails, and adds releases for prph_scratch and prph_info when allocation for ctxt_info_gen3 fails. Change-Id: Idec086df6611d2d61d73d3252bbcd4a47b3d473e Fixes: 2ee8240 ("iwlwifi: pcie: support context information for 22560 devices") Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
In mwifiex_pcie_init_evt_ring, a new skb is allocated which should be released if mwifiex_map_pci_memory() fails. The release for skb and card->evtbd_ring_vbase is added. Change-Id: I8f759f5bf146c25276d01eb3f6394bda37f624ef Tracked-On: PKT-2836 Fixes: 0732484 ("mwifiex: separate ring initialization and ring creation routines") Signed-off-by: Navid Emamdoost <[email protected]> Acked-by: Ganapathi Bhat <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
In rsi_send_beacon, if rsi_prepare_beacon fails the allocated skb should be released. Change-Id: I8f3f7e56138ffa6569c26d7998967223cd83faa1 Tracked-On: PKT-2839 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
In predicate_parse, there is an error path that is not going to out_free instead it returns directly which leads to a memory leak. Link: http://lkml.kernel.org/r/[email protected] Change-Id: Ib53900d9b6efb1e616fad32cc4841e8e58a46bc6 Tracked-On: PKT-2838 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
In crypto_report, a new skb is created via nlmsg_new(). This skb should be released if crypto_report_alg() fails. Change-Id: Icf51d9de421067657572f61d0a22960c2444c1d0 Tracked-On: PKT-2840 Fixes: a38f790 ("crypto: Add userspace configuration API") Cc: <[email protected]> Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
In rtl8xxxu_submit_int_urb if usb_submit_urb fails the allocated urb should be released. Change-Id: I5fc9156321b9b8b26b27b7321f359e5fca41e590 Tracked-On: PKT-2837 Signed-off-by: Navid Emamdoost <[email protected]> Reviewed-by: Chris Chiu <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
In cx23888_ir_probe if kfifo_alloc fails the allocated memory for state should be released. Change-Id: I2d54093f9b13cf760a5d32ac4b616e1e100843a2 Tracked-On: PKT-2833 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Sean Young <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Release all allocated memory if sha type is invalid: In ccp_run_sha_cmd, if the type of sha is invalid, the allocated hmac_buf should be released. v2: fix the goto. Change-Id: I4c090ea6085e9d37c38acb9b391d2a8b3b6d459f Tracked-On: PKT-2819 Signed-off-by: Navid Emamdoost <[email protected]> Acked-by: Gary R Hook <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
In mwifiex_pcie_alloc_cmdrsp_buf, a new skb is allocated which should be released if mwifiex_map_pci_memory() fails. The release is added. Change-Id: Idb4d88d135e8393e825d0e1e6849eaed5af3eda4 Tracked-On: PKT-2835 Fixes: fc33146 ("mwifiex: use pci_alloc/free_consistent APIs for PCIe") Signed-off-by: Navid Emamdoost <[email protected]> Acked-by: Ganapathi Bhat <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
In af9005_identify_state when returning -EIO the allocated buffer should be released. Replace the "return -EIO" with assignment into ret and move deb_info() under a check. Change-Id: Icf4f5f8e07cca6310bc902a2dc90b460c5cba4a7 Tracked-On: PKT-2822 Fixes: af4e067 ("V4L/DVB (5625): Add support for the AF9005 demodulator from Afatech") Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
In the implementation of aa_audit_rule_init(), when aa_label_parse() fails the allocated memory for rule is released using aa_audit_rule_free(). But after this release, the return statement tries to access the label field of the rule which results in use-after-free. Before releasing the rule, copy errNo and return it after release. Change-Id: Iebbc691a0098504f5637c29ec1b10f9ce679f7c0 Tracked-On: PKT-2823 Fixes: 52e8c38 ("apparmor: Fix memory leak of rule on error exit path") Signed-off-by: Navid Emamdoost <[email protected]> Reviewed-by: Tyler Hicks <[email protected]>
In the impelementation of __ipmi_bmc_register() the allocated memory for bmc should be released in case ida_simple_get() fails. Tracked-On: PKT-2829 Fixes: 68e7e50 ("ipmi: Don't use BMC product/dev ids in the BMC name") Signed-off-by: Navid Emamdoost <[email protected]> Message-Id: <[email protected]> Signed-off-by: Corey Minyard <[email protected]>
In ath9k_wmi_cmd, the allocated network buffer needs to be released if timeout happens. Otherwise memory will be leaked. Tracked-On: PKT-2831 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
In ath10k_usb_hif_tx_sg the allocated urb should be released if usb_submit_urb fails. Tracked-On: PKT-2827 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
In bnxt_re_create_srq(), when ib_copy_to_udata() fails allocated memory should be released by goto fail. Tracked-On: PKT-2826 Fixes: 37cb11a ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Navid Emamdoost <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
In the implementation of i2400m_op_rfkill_sw_toggle() the allocated buffer for cmd should be released before returning. The documentation for i2400m_msg_to_dev() says when it returns the buffer can be reused. Meaning cmd should be released in either case. Move kfree(cmd) before return to be reached by all execution paths. Change-Id: I6d8d53f30d6c7c7fcd65ae3da0f9dbefc8081cc9 Tracked-On: PKT-2858 Fixes: 2507e6a ("wimax: i2400: fix memory leak") Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: David S. Miller <[email protected]>
In htc_config_pipe_credits, htc_setup_complete, and htc_connect_service if time out happens, the allocated buffer needs to be released. Otherwise there will be memory leak. Tracked-On: PKT-2825 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
"f->fmt.sdr.reserved" is uninitialized. As other peer drivers like msi2500 and airspy do, the fix initializes it to avoid memory disclosures. Change-Id: Ia5eb7c1077b56aa7aa4c6b88bc5a4ac0a2e0dc79 Tracked-On: PKT-2817 Signed-off-by: Kangjie Lu <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]>
The driver needs an isochronous endpoint to be present. It will oops in its absence. Add checking for it. Change-Id: If59d222591d8d60a624a8e96f086e353a09ad12b Tracked-On: PKT-2815 Reported-by: [email protected] Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Sean Young <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
In dcn*_clock_source_create when dcn20_clk_src_construct fails allocated clk_src needs release. Change-Id: I38aa202bc72ba6acd0c755b2d811a9e050304af4 Tracked-On: PKT-2856 Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
In dcn*_create_resource_pool the allocated memory should be released if construct pool fails. Change-Id: I24a83555dd34f362bba3a5bd93073c589f980e81 Tracked-On: PKT-2828 Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
When ashmem file is being mmapped the resulting vma->vm_file points to the backing shmem file with the generic fops that do not check ashmem permissions like fops of ashmem do. Fix that by disallowing mapping operation for backing shmem file. Change-Id: I5270b92fadab001e9177a05252ba02a7a58ad8a0 Tracked-On: PKT-2924 Bug: 142903466 Reported-by: Jann Horn <[email protected]> Signed-off-by: Suren Baghdasaryan <[email protected]>
sysopenci
reviewed
Dec 30, 2019
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Autobuild started from pull-request-changes on this PR.
FAILURE: CheckBug Bad comments/Bugs
For more information, see: /absp/builders/celadon-autobuild/builds/1513
swaroopbalan
pushed a commit
that referenced
this pull request
Jan 28, 2020
[ Upstream commit f00b342 ] A hang was observed in the fcport delete path when the device was responding slow and an issue-lip path (results in session termination) was taken. Fix this by issuing logo requests unconditionally. PID: 19491 TASK: ffff8e23e67bb150 CPU: 0 COMMAND: "kworker/0:0" #0 [ffff8e2370297bf8] __schedule at ffffffffb4f7dbb0 #1 [ffff8e2370297c88] schedule at ffffffffb4f7e199 #2 [ffff8e2370297c98] schedule_timeout at ffffffffb4f7ba68 #3 [ffff8e2370297d40] msleep at ffffffffb48ad9ff #4 [ffff8e2370297d58] qlt_free_session_done at ffffffffc0c32052 [qla2xxx] #5 [ffff8e2370297e20] process_one_work at ffffffffb48bcfdf #6 [ffff8e2370297e68] worker_thread at ffffffffb48bdca6 #7 [ffff8e2370297ec8] kthread at ffffffffb48c4f81 Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
swaroopbalan
pushed a commit
that referenced
this pull request
Jan 28, 2020
…_clear_bit [ Upstream commit fadcbd2 ] We need to move "spin_lock_irq(&bitmap->counts.lock)" before unmap previous storage, otherwise panic like belows could happen as follows. [ 902.353802] sdl: detected capacity change from 1077936128 to 3221225472 [ 902.616948] general protection fault: 0000 [#1] SMP [snip] [ 902.618588] CPU: 12 PID: 33698 Comm: md0_raid1 Tainted: G O 4.14.144-1-pserver #4.14.144-1.1~deb10 [ 902.618870] Hardware name: Supermicro SBA-7142G-T4/BHQGE, BIOS 3.00 10/24/2012 [ 902.619120] task: ffff9ae1860fc600 task.stack: ffffb52e4c704000 [ 902.619301] RIP: 0010:bitmap_file_clear_bit+0x90/0xd0 [md_mod] [ 902.619464] RSP: 0018:ffffb52e4c707d28 EFLAGS: 00010087 [ 902.619626] RAX: ffe8008b0d061000 RBX: ffff9ad078c87300 RCX: 0000000000000000 [ 902.619792] RDX: ffff9ad986341868 RSI: 0000000000000803 RDI: ffff9ad078c87300 [ 902.619986] RBP: ffff9ad0ed7a8000 R08: 0000000000000000 R09: 0000000000000000 [ 902.620154] R10: ffffb52e4c707ec0 R11: ffff9ad987d1ed44 R12: ffff9ad0ed7a8360 [ 902.620320] R13: 0000000000000003 R14: 0000000000060000 R15: 0000000000000800 [ 902.620487] FS: 0000000000000000(0000) GS:ffff9ad987d00000(0000) knlGS:0000000000000000 [ 902.620738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 902.620901] CR2: 000055ff12aecec0 CR3: 0000001005207000 CR4: 00000000000406e0 [ 902.621068] Call Trace: [ 902.621256] bitmap_daemon_work+0x2dd/0x360 [md_mod] [ 902.621429] ? find_pers+0x70/0x70 [md_mod] [ 902.621597] md_check_recovery+0x51/0x540 [md_mod] [ 902.621762] raid1d+0x5c/0xeb0 [raid1] [ 902.621939] ? try_to_del_timer_sync+0x4d/0x80 [ 902.622102] ? del_timer_sync+0x35/0x40 [ 902.622265] ? schedule_timeout+0x177/0x360 [ 902.622453] ? call_timer_fn+0x130/0x130 [ 902.622623] ? find_pers+0x70/0x70 [md_mod] [ 902.622794] ? md_thread+0x94/0x150 [md_mod] [ 902.622959] md_thread+0x94/0x150 [md_mod] [ 902.623121] ? wait_woken+0x80/0x80 [ 902.623280] kthread+0x119/0x130 [ 902.623437] ? kthread_create_on_node+0x60/0x60 [ 902.623600] ret_from_fork+0x22/0x40 [ 902.624225] RIP: bitmap_file_clear_bit+0x90/0xd0 [md_mod] RSP: ffffb52e4c707d28 Because mdadm was running on another cpu to do resize, so bitmap_resize was called to replace bitmap as below shows. PID: 38801 TASK: ffff9ad074a90e00 CPU: 0 COMMAND: "mdadm" [exception RIP: queued_spin_lock_slowpath+56] [snip] -- <NMI exception stack> -- #5 [ffffb52e60f17c58] queued_spin_lock_slowpath at ffffffff9c0b27b8 #6 [ffffb52e60f17c58] bitmap_resize at ffffffffc0399877 [md_mod] #7 [ffffb52e60f17d30] raid1_resize at ffffffffc0285bf9 [raid1] #8 [ffffb52e60f17d50] update_size at ffffffffc038a31a [md_mod] #9 [ffffb52e60f17d70] md_ioctl at ffffffffc0395ca4 [md_mod] And the procedure to keep resize bitmap safe is allocate new storage space, then quiesce, copy bits, replace bitmap, and re-start. However the daemon (bitmap_daemon_work) could happen even the array is quiesced, which means when bitmap_file_clear_bit is triggered by raid1d, then it thinks it should be fine to access store->filemap since counts->lock is held, but resize could change the storage without the protection of the lock. Cc: Jack Wang <[email protected]> Cc: NeilBrown <[email protected]> Signed-off-by: Guoqing Jiang <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 5eed6f1 ] Commit 9b6f7e1 ("mm: rework memcg kernel stack accounting") will result in fork failing if allocating a kernel stack for a task in dup_task_struct exceeds the kernel memory allowance for that cgroup. Unfortunately, it also results in a crash. This is due to the code jumping to free_stack and calling free_thread_stack when the memcg kernel stack charge fails, but without tsk->stack pointing at the freshly allocated stack. This in turn results in the vfree_atomic in free_thread_stack oopsing with a backtrace like this: #5 [ffffc900244efc88] die at ffffffff8101f0ab #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86 #7 [ffffc900244efce0] general_protection at ffffffff818ff082 [exception RIP: llist_add_batch+7] RIP: ffffffff8150d487 RSP: ffffc900244efd98 RFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88085ef55980 RCX: 0000000000000000 RDX: ffff88085ef55980 RSI: 343834343531203a RDI: 343834343531203a RBP: ffffc900244efd98 R8: 0000000000000001 R9: ffff8808578c3600 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88029f6c21c0 R13: 0000000000000286 R14: ffff880147759b00 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37 intel#10 [ffffc900244efe98] _do_fork at ffffffff810884e0 intel#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff intel#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43 RIP: 000000000049b948 RSP: 00007ffcdb307830 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000896030 RCX: 000000000049b948 RDX: 0000000000000000 RSI: 00007ffcdb307790 RDI: 00000000005d7421 RBP: 000000000067370f R8: 00007ffcdb3077b0 R9: 000000000001ed00 R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000040 R13: 000000000000000f R14: 0000000000000000 R15: 000000000088d018 ORIG_RAX: 000000000000003a CS: 0033 SS: 002b The simplest fix is to assign tsk->stack right where it is allocated. Link: http://lkml.kernel.org/r/[email protected] Fixes: 9b6f7e1 ("mm: rework memcg kernel stack accounting") Signed-off-by: Rik van Riel <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 96bf313ecb33567af4cb53928b0c951254a02759 ] There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 #3 [ffffc900297bb708] __lock_page at ffffffff811f145b #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02 ("Btrfs: Add writepages support") CC: [email protected] # 4.4+ Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit 35df429 upstream. EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by KCSAN, BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4] write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127: ext4_write_end+0x4e3/0x750 [ext4] ext4_update_i_disksize at fs/ext4/ext4.h:3032 (inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046 (inlined by) ext4_write_end at fs/ext4/inode.c:1287 generic_perform_write+0x208/0x2a0 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb47 entry_SYSCALL_64_after_hwframe+0x49/0xbe read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37: ext4_writepages+0x10ac/0x1d00 [ext4] mpage_map_and_submit_extent at fs/ext4/inode.c:2468 (inlined by) ext4_writepages at fs/ext4/inode.c:2772 do_writepages+0x5e/0x130 __writeback_single_inode+0xeb/0xb20 writeback_sb_inodes+0x429/0x900 __writeback_inodes_wb+0xc4/0x150 wb_writeback+0x4bd/0x870 wb_workfn+0x6b4/0x960 process_one_work+0x54c/0xbe0 worker_thread+0x80/0x650 kthread+0x1e0/0x200 ret_from_fork+0x27/0x50 Reported by Kernel Concurrency Sanitizer on: CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 Workqueue: writeback wb_workfn (flush-7:0) Since only the read is operating as lockless (outside of the "i_data_sem"), load tearing could introduce a logic bug. Fix it by adding READ_ONCE() for the read and WRITE_ONCE() for the write. Signed-off-by: Qian Cai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]> Cc: [email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.