From patchwork Mon Nov 7 17:05:53 2022
X-Patchwork-Submitter: Vlastimil Babka
X-Patchwork-Id: 13034902
From: Vlastimil Babka
To: Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg,
 Joel Fernandes
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Roman Gushchin, Matthew Wilcox,
 paulmck@kernel.org, rcu@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org,
 patches@lists.linux.dev, Vlastimil Babka, kernel test robot
Subject: [PATCH v2 2/3] mm/migrate: make isolate_movable_page() skip slab pages
Date: Mon, 7 Nov 2022 18:05:53 +0100
Message-Id: <20221107170554.7869-3-vbabka@suse.cz>
In-Reply-To: <20221107170554.7869-1-vbabka@suse.cz>
References: <20221107170554.7869-1-vbabka@suse.cz>

In the next commit we want to rearrange struct slab fields to allow a larger
rcu_head. Afterwards, the page->mapping field will overlap with SLUB's
"struct list_head slab_list", where the value of the prev pointer can become
LIST_POISON2, which is 0x122 + POISON_POINTER_DELTA. Unfortunately, bit 1
being set can make PageMovable() report a false positive and cause a GPF, as
reported by lkp [1].

To fix this, make isolate_movable_page() skip pages with the PageSlab flag
set. This is a bit tricky, as we need to add memory barriers to SLAB's and
SLUB's page allocation and freeing paths, and their counterparts to
isolate_movable_page().

Based on my RFC from [2]. Added a comment update from Matthew's variant in
[3] and, as done there, moved the PageSlab checks to happen before trying to
take the page lock.
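As an aside for reviewers (illustration only, not part of the patch): the
false positive arises because __PageMovable() tests only the low two bits of
page->mapping, and LIST_POISON2's low bits happen to equal
PAGE_MAPPING_MOVABLE. A minimal userspace sketch, with the constants
mirrored from include/linux/poison.h and include/linux/page-flags.h and
POISON_POINTER_DELTA assumed to be 0 (i.e. no CONFIG_ILLEGAL_POINTER_VALUE):

#include <stdio.h>

#define POISON_POINTER_DELTA	0x0UL	/* assumption: not set by the config */
#define LIST_POISON2		(0x122UL + POISON_POINTER_DELTA)

#define PAGE_MAPPING_ANON	0x1UL
#define PAGE_MAPPING_MOVABLE	0x2UL
#define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

int main(void)
{
	/* the slab_list prev poison, read back through page->mapping */
	unsigned long mapping = LIST_POISON2;

	/* the same test __PageMovable() performs */
	if ((mapping & PAGE_MAPPING_FLAGS) == PAGE_MAPPING_MOVABLE)
		printf("%#lx passes the movable check: false positive\n",
		       mapping);
	return 0;
}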
[1] https://lore.kernel.org/all/208c1757-5edd-fd42-67d4-1940cc43b50f@intel.com/
[2] https://lore.kernel.org/all/aec59f53-0e53-1736-5932-25407125d4d4@suse.cz/
[3] https://lore.kernel.org/all/YzsVM8eToHUeTP75@casper.infradead.org/

Reported-by: kernel test robot
Signed-off-by: Vlastimil Babka
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/migrate.c | 15 ++++++++++++---
 mm/slab.c    |  6 +++++-
 mm/slub.c    |  6 +++++-
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 1379e1912772..959c99cff814 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -74,13 +74,22 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode)
 	if (unlikely(!get_page_unless_zero(page)))
 		goto out;
 
+	if (unlikely(PageSlab(page)))
+		goto out_putpage;
+	/* Pairs with smp_wmb() in slab freeing, e.g. SLUB's __free_slab() */
+	smp_rmb();
 	/*
-	 * Check PageMovable before holding a PG_lock because page's owner
-	 * assumes anybody doesn't touch PG_lock of newly allocated page
-	 * so unconditionally grabbing the lock ruins page's owner side.
+	 * Check movable flag before taking the page lock because
+	 * we use non-atomic bitops on newly allocated page flags so
+	 * unconditionally grabbing the lock ruins page's owner side.
 	 */
 	if (unlikely(!__PageMovable(page)))
 		goto out_putpage;
+	/* Pairs with smp_wmb() in slab allocation, e.g. SLUB's alloc_slab_page() */
+	smp_rmb();
+	if (unlikely(PageSlab(page)))
+		goto out_putpage;
+
 	/*
 	 * As movable pages are not isolated from LRU lists, concurrent
 	 * compaction threads can race against page migration functions
diff --git a/mm/slab.c b/mm/slab.c
index 59c8e28f7b6a..219beb48588e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1370,6 +1370,8 @@ static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
 
 	account_slab(slab, cachep->gfporder, cachep, flags);
 	__folio_set_slab(folio);
+	/* Make the flag visible before any changes to folio->mapping */
+	smp_wmb();
 	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
 	if (sk_memalloc_socks() && page_is_pfmemalloc(folio_page(folio, 0)))
 		slab_set_pfmemalloc(slab);
@@ -1387,9 +1389,11 @@ static void kmem_freepages(struct kmem_cache *cachep, struct slab *slab)
 
 	BUG_ON(!folio_test_slab(folio));
 	__slab_clear_pfmemalloc(slab);
-	__folio_clear_slab(folio);
 	page_mapcount_reset(folio_page(folio, 0));
 	folio->mapping = NULL;
+	/* Make the mapping reset visible before clearing the flag */
+	smp_wmb();
+	__folio_clear_slab(folio);
 
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += 1 << order;
diff --git a/mm/slub.c b/mm/slub.c
index 99ba865afc4a..5e6519d5169c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1800,6 +1800,8 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
 
 	slab = folio_slab(folio);
 	__folio_set_slab(folio);
+	/* Make the flag visible before any changes to folio->mapping */
+	smp_wmb();
 	if (page_is_pfmemalloc(folio_page(folio, 0)))
 		slab_set_pfmemalloc(slab);
 
@@ -2000,8 +2002,10 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
 	int pages = 1 << order;
 
 	__slab_clear_pfmemalloc(slab);
-	__folio_clear_slab(folio);
 	folio->mapping = NULL;
+	/* Make the mapping reset visible before clearing the flag */
+	smp_wmb();
+	__folio_clear_slab(folio);
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += pages;
 	unaccount_slab(slab, order, s);
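
For completeness, a hedged userspace model of the allocation-side pairing
above (not kernel code; all names here are invented for the demo). PG_slab
is modeled by an atomic flag, folio->mapping by an atomic word, and
smp_wmb()/smp_rmb() are approximated with C11 release/acquire fences. If the
isolate side observes the poisoned mapping value written after the wmb, the
rmb guarantees it also observes the slab flag, so the page is skipped.
Build with e.g. "cc -pthread demo.c"; since this is a race demo, it may
print nothing when the isolator runs first:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_ulong mapping;	/* models folio->mapping */
static atomic_bool slab_flag;	/* models PG_slab */

#define PAGE_MAPPING_FLAGS	0x3UL
#define PAGE_MAPPING_MOVABLE	0x2UL
#define LIST_POISON2_VAL	0x122UL

/* allocation side, as in alloc_slab_page()/kmem_getpages() */
static void *allocator(void *arg)
{
	(void)arg;
	atomic_store_explicit(&slab_flag, 1, memory_order_relaxed);
	/* models the new smp_wmb(): flag visible before mapping reuse */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&mapping, LIST_POISON2_VAL, memory_order_relaxed);
	return NULL;
}

/* reader side, as in isolate_movable_page() */
static void *isolator(void *arg)
{
	unsigned long m;

	(void)arg;
	m = atomic_load_explicit(&mapping, memory_order_relaxed);
	if ((m & PAGE_MAPPING_FLAGS) == PAGE_MAPPING_MOVABLE) {
		/* models the new smp_rmb(): if we saw the poisoned
		 * mapping, we must also see the slab flag */
		atomic_thread_fence(memory_order_acquire);
		if (atomic_load_explicit(&slab_flag, memory_order_relaxed))
			printf("poisoned mapping but PG_slab set: skipped\n");
	}
	return NULL;
}

int main(void)
{
	pthread_t a, i;

	pthread_create(&a, NULL, allocator, NULL);
	pthread_create(&i, NULL, isolator, NULL);
	pthread_join(a, NULL);
	pthread_join(i, NULL);
	return 0;
}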