From: Vivek Kasireddy
To: linux-mm@kvack.org
Cc: Vivek Kasireddy, syzbot+a504cb5bae4fe117ba94@syzkaller.appspotmail.com,
    Steve Sistare, Muchun Song, David Hildenbrand, Andrew Morton
Subject: [PATCH] mm/memfd: reserve hugetlb folios before allocation
Date: Mon, 6 Jan 2025 23:25:17 -0800
Message-ID: <20250107072517.2089633-1-vivek.kasireddy@intel.com>

There are cases when we try to pin a folio but discover that it has
not been faulted-in. So, we try to allocate it in memfd_alloc_folio()
but there is a chance that we might encounter a crash/failure
(VM_BUG_ON(!h->resv_huge_pages)) if there are no active reservations
at that instant. This issue was reported by syzbot:

kernel BUG at mm/hugetlb.c:2403!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 5315 Comm: syz.0.0 Not tainted
6.13.0-rc5-syzkaller-00161-g63676eefb7a0 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:alloc_hugetlb_folio_reserve+0xbc/0xc0 mm/hugetlb.c:2403
Code: 1f eb 05 e8 56 18 a0 ff 48 c7 c7 40 56 61 8e e8 ba 21 cc 09 4c
89 f0 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc e8 35 18 a0 ff 90 <0f> 0b
66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f
RSP: 0018:ffffc9000d3d77f8 EFLAGS: 00010087
RAX: ffffffff81ff6beb RBX: 0000000000000000 RCX: 0000000000100000
RDX: ffffc9000e51a000 RSI: 00000000000003ec RDI: 00000000000003ed
RBP: 1ffffffff34810d9 R08: ffffffff81ff6ba3 R09: 1ffffd4000093005
R10: dffffc0000000000 R11: fffff94000093006 R12: dffffc0000000000
R13: dffffc0000000000 R14: ffffea0000498000 R15: ffffffff9a4086c8
FS:  00007f77ac12e6c0(0000) GS:ffff88801fc00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f77ab54b170 CR3: 0000000040b70000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 memfd_alloc_folio+0x1bd/0x370 mm/memfd.c:88
 memfd_pin_folios+0xf10/0x1570 mm/gup.c:3750
 udmabuf_pin_folios drivers/dma-buf/udmabuf.c:346 [inline]
 udmabuf_create+0x70e/0x10c0 drivers/dma-buf/udmabuf.c:443
 udmabuf_ioctl_create drivers/dma-buf/udmabuf.c:495 [inline]
 udmabuf_ioctl+0x301/0x4e0 drivers/dma-buf/udmabuf.c:526
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:906 [inline]
 __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Therefore, to avoid this situation and fix this issue, we just need
to make a reservation before we try to allocate the folio. While at
it, also remove the VM_BUG_ON() as there is no need to crash the
system in this scenario and instead we could just fail the allocation.

Fixes: 26a8ea80929c ("mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak")
Reported-by: syzbot+a504cb5bae4fe117ba94@syzkaller.appspotmail.com
Signed-off-by: Vivek Kasireddy
Cc: Steve Sistare
Cc: Muchun Song
Cc: David Hildenbrand
Cc: Andrew Morton
---
 mm/hugetlb.c | 9 ++++++---
 mm/memfd.c   | 5 +++++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c498874a7170..e46c461210a4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2397,12 +2397,15 @@ struct folio *alloc_hugetlb_folio_reserve(struct hstate *h, int preferred_nid,
 	struct folio *folio;
 
 	spin_lock_irq(&hugetlb_lock);
+	if (!h->resv_huge_pages) {
+		spin_unlock_irq(&hugetlb_lock);
+		return NULL;
+	}
+
 	folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, preferred_nid,
 					       nmask);
-	if (folio) {
-		VM_BUG_ON(!h->resv_huge_pages);
+	if (folio)
 		h->resv_huge_pages--;
-	}
 
 	spin_unlock_irq(&hugetlb_lock);
 	return folio;
diff --git a/mm/memfd.c b/mm/memfd.c
index 35a370d75c9a..a3012c444285 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -85,6 +85,10 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 		gfp_mask &= ~(__GFP_HIGHMEM | __GFP_MOVABLE);
 		idx >>= huge_page_order(h);
 
+		if (!hugetlb_reserve_pages(file_inode(memfd),
+					   idx, idx + 1, NULL, 0))
+			return ERR_PTR(-ENOMEM);
+
 		folio = alloc_hugetlb_folio_reserve(h,
 						    numa_node_id(),
 						    NULL,
@@ -100,6 +104,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 			folio_unlock(folio);
 			return folio;
 		}
+		hugetlb_unreserve_pages(file_inode(memfd), idx, idx + 1, 1);
 		return ERR_PTR(-ENOMEM);
 	}
 #endif
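
Illustration only, not part of the patch: the ordering the diff above
establishes (reserve, then consume the reservation, then drop it again
if the allocation fails) can be sketched as a small userspace model.
Everything named toy_* below is hypothetical and exists only for this
sketch. The model assumes a pool described by just two counters and
ignores locking, NUMA placement, surplus pages and subpool accounting,
all of which the real hugetlb code has to handle.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct hstate: only the two counters that matter here. */
struct toy_pool {
	unsigned long free_pages;	/* pages available on the free list */
	unsigned long resv_pages;	/* pages promised but not yet handed out */
};

/* Promise one page; fails if every free page is already promised. */
static bool toy_reserve(struct toy_pool *p)
{
	if (p->resv_pages >= p->free_pages)
		return false;
	p->resv_pages++;
	return true;
}

/* Give back a promise that was never consumed. */
static void toy_unreserve(struct toy_pool *p)
{
	assert(p->resv_pages > 0);
	p->resv_pages--;
}

/*
 * Consume a reservation. As in the patched check, a missing reservation
 * fails the allocation instead of triggering an assertion.
 */
static bool toy_alloc_reserved(struct toy_pool *p)
{
	if (!p->resv_pages)
		return false;
	if (!p->free_pages)
		return false;	/* nothing to dequeue; caller drops the reservation */
	p->free_pages--;
	p->resv_pages--;
	return true;
}

/* Caller-side ordering: reserve, allocate, unreserve on failure. */
static int toy_get_page(struct toy_pool *p)
{
	if (!toy_reserve(p))
		return -1;
	if (toy_alloc_reserved(p))
		return 0;
	toy_unreserve(p);
	return -1;
}
```

In this model, calling toy_alloc_reserved() on a pool with no
outstanding reservation simply returns false, which corresponds to the
NULL return that replaces the VM_BUG_ON(). toy_get_page() leaves
resv_pages unchanged on every failure path, which is the property the
hugetlb_unreserve_pages() call in the error path is meant to preserve.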