From patchwork Wed Nov 13 16:01:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vlastimil Babka X-Patchwork-Id: 13873977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id F00AED462CA for ; Wed, 13 Nov 2024 16:01:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 65AA76B010F; Wed, 13 Nov 2024 11:01:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 60A5A6B0110; Wed, 13 Nov 2024 11:01:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4D2826B0111; Wed, 13 Nov 2024 11:01:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 292026B010F for ; Wed, 13 Nov 2024 11:01:22 -0500 (EST) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 84142120A06 for ; Wed, 13 Nov 2024 16:01:21 +0000 (UTC) X-FDA: 82781535144.12.288BC4C Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) by imf27.hostedemail.com (Postfix) with ESMTP id F12494000F for ; Wed, 13 Nov 2024 16:00:34 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=tK9+b4Dc; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=X6c9wmVL; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=tK9+b4Dc; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=X6c9wmVL; spf=pass (imf27.hostedemail.com: domain of vbabka@suse.cz designates 195.135.223.131 as permitted sender) smtp.mailfrom=vbabka@suse.cz; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; 
s=arc-20220608; t=1731513505; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:references:dkim-signature; bh=XZslpWj8E+Fl0ziezYbWMfvy4qaW5tQNPAKAFXkDZIM=; b=KhKB6mcgmaYIvzz88lHgI5Ib3QTwOzum9FNerD5dPzfcmxMZAa+rTyufb9V2L4yS0NV25r rqK/9ayV3oVW3pjqqtvQ3OFgOyWgLCOTtB3aXbViLjfFQNNgZRljE7wXzevjbT2BhDOJ17 a69DluBrTDCasQ9cMFa3FhAd+bso7Xk= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=tK9+b4Dc; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=X6c9wmVL; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=tK9+b4Dc; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=X6c9wmVL; spf=pass (imf27.hostedemail.com: domain of vbabka@suse.cz designates 195.135.223.131 as permitted sender) smtp.mailfrom=vbabka@suse.cz; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731513505; a=rsa-sha256; cv=none; b=ZBnWIImuKMpugU/ARlNJzAF13hD7cJNHRghuADm5H4TwdHJqqRSOVTp9acUu3FKHzVGmCG AIIHaoWaYQ4crbcaI5RIZHdHGq3c9gJ9WXY4pzWgLKbWM/hvO0ZLsW+78N6S/xT0nUmKub MWJlobbxVLg5wd+VSO5QZkOHRHAT6f8= Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 0BF131F38C; Wed, 13 Nov 2024 16:01:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1731513677; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=XZslpWj8E+Fl0ziezYbWMfvy4qaW5tQNPAKAFXkDZIM=; b=tK9+b4Dc+M6H32Dmr/0GeJDukqfEcUmLJQRQlk663WiHHywnxVrL8LCodJr4CtvDpNVU8H ySg37JdB5j96X/HINQZ6+nl4dB7wjgkciCguUTiKRSMN4PVrjhh/vrBPW6bpzwFEwt6/vg 
6ZuqiJJensTM1rYdyupzEM+04Bb5hn0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1731513677; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=XZslpWj8E+Fl0ziezYbWMfvy4qaW5tQNPAKAFXkDZIM=; b=X6c9wmVLUEvX1eZeGkgu4u5ktknWpSowfIGXfx1aYG5Jv97aQUTL+c0Okd3b4fqkvBysvo c8phidNl4eVqUIDg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1731513677; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=XZslpWj8E+Fl0ziezYbWMfvy4qaW5tQNPAKAFXkDZIM=; b=tK9+b4Dc+M6H32Dmr/0GeJDukqfEcUmLJQRQlk663WiHHywnxVrL8LCodJr4CtvDpNVU8H ySg37JdB5j96X/HINQZ6+nl4dB7wjgkciCguUTiKRSMN4PVrjhh/vrBPW6bpzwFEwt6/vg 6ZuqiJJensTM1rYdyupzEM+04Bb5hn0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1731513677; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=XZslpWj8E+Fl0ziezYbWMfvy4qaW5tQNPAKAFXkDZIM=; b=X6c9wmVLUEvX1eZeGkgu4u5ktknWpSowfIGXfx1aYG5Jv97aQUTL+c0Okd3b4fqkvBysvo c8phidNl4eVqUIDg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 00B1613A6E; Wed, 13 Nov 2024 16:01:16 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id WOIfO0zNNGfAKQAAD6G6ig (envelope-from ); Wed, 13 Nov 2024 16:01:16 +0000 From: Vlastimil Babka To: linux-mm@kvack.org Cc: Vlastimil Babka , Peter Xu , David Hildenbrand Subject: [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during 
 fork()
Date: Wed, 13 Nov 2024 17:01:04 +0100
Message-ID: <20241113160103.48943-2-vbabka@suse.cz>
X-Mailer: git-send-email 2.47.0
MIME-Version: 1.0
Sender: owner-linux-mm@kvack.org
Precedence: bulk

When a combination of unfortunate factors occurs, we might BUG in fork():

dup_mmap()
  copy_page_range()
    copy_***_range()
      copy_present_pte()
        copy_present_page()
          page_add_new_anon_rmap()
            __page_set_anon_rmap()
              BUG_ON(!anon_vma);

The
factors are:
- source vma is VM_MIXEDMAP, otherwise copy_page_range() would bail out
  when !src_vma->anon_vma (I think this was due to gpfs, but it can happen
  in-tree as well)
- is_cow_mapping() is true because of VM_MAYWRITE (even though the vma was
  a read-only mapping of a .so file)
- MMF_HAS_PINNED is true, thus some actual pinning has happened
- page_maybe_dma_pinned() is true as a false positive, because mapcount
  and thus refcount is >1024

That makes us reach page_needs_cow_for_dma() in copy_present_page(),
evaluate it as true, attempt to CoW a file page, and hit the BUG_ON()
because we never had a reason to instantiate anon_vma for the source vma.

AFAICS this was fixed inadvertently in 5.19 by commit fb3d824d1a46
("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and
page_try_dup_anon_rmap()") or another commit in that series. What caught my
attention is this part of the changelog:

    We really only care about pins on anonymous pages, because they are
    prone to getting replaced in the COW handler once mapped R/O. For !anon
    pages in cow-mappings (!VM_SHARED && VM_MAYWRITE) we shouldn't really
    care about that, at least not that I could come up with an example.

And as part of that commit, a PageAnon() test is added in
copy_present_pte(). But the code is already refactored a lot, so this is an
attempt at a minimal fix for LTS kernels by placing the PageAnon() check
in copy_present_page().

Fixes: 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes")
Cc: Peter Xu
Cc: David Hildenbrand
Signed-off-by: Vlastimil Babka
---
Hi,

we've seen this in our 5.14 based kernel and it involved the out of tree
gpfs module, but I believe the same thing can happen in the 5.10 and 5.15
LTS kernels without out of tree modules as well. So I'd like your opinion
on this fix before I propose it to stable as a non-standard
version-specific fix (I don't think we'd want to backport fb3d824d1a46 with
its prerequisites).

Thanks.
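
To make the combination of factors easier to follow, here is a minimal
userspace sketch of the decision, not kernel code. The model_* names and
both structs are hypothetical stand-ins introduced only for illustration.
It assumes that the pin heuristic for small pages is a plain
"refcount >= 1024" comparison (GUP_PIN_COUNTING_BIAS) and that the early
copy is the only step that needs an anon_vma.

```c
#include <assert.h>
#include <stdbool.h>

/* Each pin of a small page adds this bias to its refcount (1024 in the kernel). */
#define GUP_PIN_COUNTING_BIAS 1024

/* Hypothetical stand-in for struct page, reduced to what the model needs. */
struct model_page {
	int refcount;	/* one reference per mapping, plus the bias per pin */
	bool anon;	/* what PageAnon() would report */
};

/* Hypothetical stand-in for the source vma and its mm. */
struct model_vma {
	bool cow_mapping;	/* is_cow_mapping(): !VM_SHARED && VM_MAYWRITE */
	bool mm_has_pinned;	/* MMF_HAS_PINNED is set on the mm */
	bool has_anon_vma;	/* src_vma->anon_vma != NULL */
};

/* Heuristic only: a page mapped 1024 or more times looks pinned as well. */
static bool model_maybe_dma_pinned(const struct model_page *page)
{
	return page->refcount >= GUP_PIN_COUNTING_BIAS;
}

/* Models the three checks described for page_needs_cow_for_dma(). */
static bool model_needs_cow_for_dma(const struct model_vma *vma,
				    const struct model_page *page)
{
	return vma->cow_mapping && vma->mm_has_pinned &&
	       model_maybe_dma_pinned(page);
}

/* True if fork() would try to copy the page early for the child. */
static bool model_wants_early_copy(const struct model_vma *vma,
				   const struct model_page *page,
				   bool with_fix)
{
	/* The proposed fix: file pages never take the early-copy path. */
	if (with_fix && !page->anon)
		return false;
	return model_needs_cow_for_dma(vma, page);
}

/* The early copy adds a new anon rmap, which requires an anon_vma. */
static bool model_would_hit_bug(const struct model_vma *vma,
				const struct model_page *page,
				bool with_fix)
{
	return model_wants_early_copy(vma, page, with_fix) &&
	       !vma->has_anon_vma;
}

/* Self-check for the scenario above: a heavily mapped .so file page. */
static void model_self_check(void)
{
	struct model_vma so_vma = {
		.cow_mapping = true,
		.mm_has_pinned = true,
		.has_anon_vma = false,
	};
	struct model_page hot_file_page = { .refcount = 1500, .anon = false };

	assert(model_would_hit_bug(&so_vma, &hot_file_page, false));
	assert(!model_would_hit_bug(&so_vma, &hot_file_page, true));
}
```

Calling model_self_check() from a small main() exercises the reported case:
without the PageAnon() check the model takes the early-copy path on a vma
that has no anon_vma, and with the check it does not. Anonymous pages that
look pinned still take the early-copy path in both variants.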
 mm/memory.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 6d058973a97e..73871bac0e4c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -887,6 +887,10 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 {
 	struct page *new_page;
 
+	/* We only care about pins on anonymous pages */
+	if (!PageAnon(page))
+		return 1;
+
 	/*
 	 * What we want to do is to check whether this page may
 	 * have been pinned by the parent process. If so,