From patchwork Wed Mar 6 04:08:38 2024
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 13583364
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Zi Yan, Mike Kravetz, "Huang, Ying",
 David Hildenbrand, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 07/10] mm: page_alloc: close migratetype race between freeing and stealing
Date: Tue, 5 Mar 2024 23:08:38 -0500
Message-ID: <20240306041526.892167-8-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240306041526.892167-1-hannes@cmpxchg.org>
References: <20240306041526.892167-1-hannes@cmpxchg.org>
MIME-Version: 1.0
There are several freeing paths that read the page's migratetype
optimistically before grabbing the zone lock. When this races with
block stealing, those pages go on the wrong freelist.

The paths in question are:
- when freeing >costly orders that aren't THP
- when freeing pages to the buddy upon pcp lock contention
- when freeing pages that are isolated
- when freeing pages initially during boot
- when freeing the remainder in alloc_pages_exact()
- when "accepting" unaccepted VM host memory before first use
- when freeing pages during unpoisoning

None of these paths is so hot that it would need this optimization at
the cost of hampering defrag efforts. Especially when contrasted with
the fact that the most common buddy freeing path - free_pcppages_bulk -
is checking the migratetype under the zone->lock just fine.

In addition, isolated pages need to look up the migratetype under the
lock anyway, which adds branches to the locked section, and results in
a double lookup when the pages are in fact isolated.

Move the lookups into the lock.
Reported-by: Vlastimil Babka
Signed-off-by: Johannes Weiner
---
 mm/page_alloc.c | 44 +++++++++++++++++---------------------------
 1 file changed, 17 insertions(+), 27 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9cf7ed0c4cd6..82e6c4068647 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1227,18 +1227,15 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
-static void free_one_page(struct zone *zone,
-			  struct page *page, unsigned long pfn,
-			  unsigned int order,
-			  int migratetype, fpi_t fpi_flags)
+static void free_one_page(struct zone *zone, struct page *page,
+			  unsigned long pfn, unsigned int order,
+			  fpi_t fpi_flags)
 {
 	unsigned long flags;
+	int migratetype;
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (unlikely(has_isolate_pageblock(zone) ||
-		     is_migrate_isolate(migratetype))) {
-		migratetype = get_pfnblock_migratetype(page, pfn);
-	}
+	migratetype = get_pfnblock_migratetype(page, pfn);
 	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags);
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
@@ -1246,21 +1243,13 @@ static void free_one_page(struct zone *zone,
 static void __free_pages_ok(struct page *page, unsigned int order,
 			    fpi_t fpi_flags)
 {
-	int migratetype;
 	unsigned long pfn = page_to_pfn(page);
 	struct zone *zone = page_zone(page);
 
 	if (!free_pages_prepare(page, order))
 		return;
 
-	/*
-	 * Calling get_pfnblock_migratetype() without spin_lock_irqsave() here
-	 * is used to avoid calling get_pfnblock_migratetype() under the lock.
-	 * This will reduce the lock holding time.
-	 */
-	migratetype = get_pfnblock_migratetype(page, pfn);
-
-	free_one_page(zone, page, pfn, order, migratetype, fpi_flags);
+	free_one_page(zone, page, pfn, order, fpi_flags);
 
 	__count_vm_events(PGFREE, 1 << order);
 }
@@ -2530,7 +2519,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	struct per_cpu_pages *pcp;
 	struct zone *zone;
 	unsigned long pfn = page_to_pfn(page);
-	int migratetype, pcpmigratetype;
+	int migratetype;
 
 	if (!free_pages_prepare(page, order))
 		return;
@@ -2542,23 +2531,23 @@ void free_unref_page(struct page *page, unsigned int order)
 	 * get those areas back if necessary. Otherwise, we may have to free
 	 * excessively into the page allocator
 	 */
-	migratetype = pcpmigratetype = get_pfnblock_migratetype(page, pfn);
+	migratetype = get_pfnblock_migratetype(page, pfn);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
+			free_one_page(page_zone(page), page, pfn, order, FPI_NONE);
 			return;
 		}
-		pcpmigratetype = MIGRATE_MOVABLE;
+		migratetype = MIGRATE_MOVABLE;
 	}
 
 	zone = page_zone(page);
 	pcp_trylock_prepare(UP_flags);
 	pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 	if (pcp) {
-		free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
+		free_unref_page_commit(zone, pcp, page, migratetype, order);
 		pcp_spin_unlock(pcp);
 	} else {
-		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
+		free_one_page(zone, page, pfn, order, FPI_NONE);
 	}
 	pcp_trylock_finish(UP_flags);
 }
@@ -2615,7 +2604,7 @@ void free_unref_folios(struct folio_batch *folios)
 		 */
 		if (is_migrate_isolate(migratetype)) {
 			free_one_page(zone, &folio->page, pfn,
-				      order, migratetype, FPI_NONE);
+				      order, FPI_NONE);
 			continue;
 		}
 
@@ -2628,7 +2617,7 @@ void free_unref_folios(struct folio_batch *folios)
 			if (unlikely(!pcp)) {
 				pcp_trylock_finish(UP_flags);
 				free_one_page(zone, &folio->page, pfn,
-					      order, migratetype, FPI_NONE);
+					      order, FPI_NONE);
 				continue;
 			}
 			locked_zone = zone;
@@ -6796,13 +6785,14 @@ bool take_page_off_buddy(struct page *page)
 bool put_page_back_buddy(struct page *page)
 {
 	struct zone *zone = page_zone(page);
-	unsigned long pfn = page_to_pfn(page);
 	unsigned long flags;
-	int migratetype = get_pfnblock_migratetype(page, pfn);
 	bool ret = false;
 
 	spin_lock_irqsave(&zone->lock, flags);
 	if (put_page_testzero(page)) {
+		unsigned long pfn = page_to_pfn(page);
+		int migratetype = get_pfnblock_migratetype(page, pfn);
+
 		ClearPageHWPoisonTakenOff(page);
 		__free_one_page(page, pfn, zone, 0, migratetype, FPI_NONE);
 		if (TestClearPageHWPoison(page)) {