From patchwork Mon Apr 24 13:45:39 2023
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13222192
From: Baolin Wang
To: akpm@linux-foundation.org
Cc: rppt@kernel.org, ying.huang@intel.com, mgorman@techsingularity.net,
 vbabka@suse.cz, mhocko@suse.com, david@redhat.com,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v3 1/2] mm/page_alloc: drop the unnecessary pfn_valid() for start pfn
Date: Mon, 24 Apr 2023 21:45:39 +0800

__pageblock_pfn_to_page() currently performs both a pfn_valid() check and
a pfn_to_online_page() check on the start pfn. The former is redundant
because the latter is a stronger check. Drop pfn_valid() for the start pfn.

Signed-off-by: Baolin Wang
Reviewed-by: David Hildenbrand
Reviewed-by: "Huang, Ying"
Acked-by: Michal Hocko
---
Changes from v2:
 - Update the commit message per Michal. Thanks.
 - Add acked tag from Michal.
Changes from v1:
 - Collect reviewed tags. Thanks David and Ying.
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9de2a18519a1..6457b64fe562 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1512,7 +1512,7 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 	/* end_pfn is one past the range we are checking */
 	end_pfn--;
 
-	if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
+	if (!pfn_valid(end_pfn))
 		return NULL;
 
 	start_page = pfn_to_online_page(start_pfn);

From patchwork Mon Apr 24 13:45:40 2023
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13222193
From: Baolin Wang
To: akpm@linux-foundation.org
Cc: rppt@kernel.org, ying.huang@intel.com, mgorman@techsingularity.net,
 vbabka@suse.cz, mhocko@suse.com, david@redhat.com,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v3 2/2] mm/page_alloc: add some comments to explain the possible hole in __pageblock_pfn_to_page()
Date: Mon, 24 Apr 2023 21:45:40 +0800
Message-Id: <50b5e05dbb007e3a969ac946bc9ee0b2b77b185f.1682342634.git.baolin.wang@linux.alibaba.com>

Currently, __pageblock_pfn_to_page() is used by set_zone_contiguous(),
which checks whether the given zone contains
holes, and uses pfn_to_online_page() to validate that the start pfn is
online and valid, as well as pfn_valid() to validate the end pfn.

However, __pageblock_pfn_to_page() may return non-NULL even if the end
pfn of a pageblock is in a memory hole in some situations. For example,
if the pageblock order is MAX_ORDER, the pageblock spans 2 sub-sections,
and the end pfn of the pageblock may be in a hole even though the start
pfn is online and valid.

See the memory layout below as an example, and suppose the pageblock
order is MAX_ORDER.

[ 0.000000] Zone ranges:
[ 0.000000]   DMA [mem 0x0000000040000000-0x00000000ffffffff]
[ 0.000000]   DMA32 empty
[ 0.000000]   Normal [mem 0x0000000100000000-0x0000001fa7ffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000]   node 0: [mem 0x0000000040000000-0x0000001fa3c7ffff]
[ 0.000000]   node 0: [mem 0x0000001fa3c80000-0x0000001fa3ffffff]
[ 0.000000]   node 0: [mem 0x0000001fa4000000-0x0000001fa402ffff]
[ 0.000000]   node 0: [mem 0x0000001fa4030000-0x0000001fa40effff]
[ 0.000000]   node 0: [mem 0x0000001fa40f0000-0x0000001fa73cffff]
[ 0.000000]   node 0: [mem 0x0000001fa73d0000-0x0000001fa745ffff]
[ 0.000000]   node 0: [mem 0x0000001fa7460000-0x0000001fa746ffff]
[ 0.000000]   node 0: [mem 0x0000001fa7470000-0x0000001fa758ffff]
[ 0.000000]   node 0: [mem 0x0000001fa7590000-0x0000001fa7dfffff]

Focus on the last memory range, [mem 0x0000001fa7590000-0x0000001fa7dfffff].
It ends at 0x1fa7dfffff, below the end of the Normal zone (0x1fa7ffffff),
so there is a hole after it. The last pageblock covers the range from
0x1fa7c00000 to 0x1fa7ffffff, since pageblocks must be 4M aligned, and it
falls into 2 sub-sections (the sub-section size is 2M). The 1st
sub-section (address range 0x1fa7c00000 - 0x1fa7dfffff) of this pageblock
is marked valid by subsection_map_init() in free_area_init(), but the 2nd
sub-section (address range 0x1fa7e00000 - 0x1fa7ffffff) is not valid.
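
The address arithmetic above can be reproduced with a small userspace
sketch. This is not kernel code: the demo_* names are hypothetical, the 4M
pageblock and 2M sub-section sizes are simply the values assumed in this
example, and "backed" is simplified to "starts at or below the last byte of
the final memory range".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sizes assumed from the example above; a real kernel derives them from
 * its configuration rather than from constants like these.
 */
#define DEMO_PAGEBLOCK_BYTES	(4ULL << 20)	/* 4M pageblock */
#define DEMO_SUBSECTION_BYTES	(2ULL << 20)	/* 2M sub-section */

/* Last byte of the final "Early memory node ranges" entry. */
#define DEMO_LAST_BACKED_BYTE	0x1fa7dfffffULL

/* The mask trick in demo_align_down() needs a power-of-two size. */
static_assert((DEMO_PAGEBLOCK_BYTES & (DEMO_PAGEBLOCK_BYTES - 1)) == 0,
	      "pageblock size must be a power of two");

/* Round addr down to a power-of-two boundary. */
static inline uint64_t demo_align_down(uint64_t addr, uint64_t size)
{
	return addr & ~(size - 1);
}

/* First byte of the pageblock that contains addr. */
static inline uint64_t demo_pageblock_start(uint64_t addr)
{
	return demo_align_down(addr, DEMO_PAGEBLOCK_BYTES);
}

/* Last byte of the pageblock that contains addr. */
static inline uint64_t demo_pageblock_end(uint64_t addr)
{
	return demo_pageblock_start(addr) + DEMO_PAGEBLOCK_BYTES - 1;
}

/*
 * Simplified model: a sub-section counts as backed if it starts at or
 * below the last backed byte of the layout above.
 */
static inline bool demo_subsection_backed(uint64_t subsection_start)
{
	return subsection_start <= DEMO_LAST_BACKED_BYTE;
}
```

Under these assumptions, the pageblock containing 0x1fa7dfffff starts at
0x1fa7c00000 and ends at 0x1fa7ffffff; demo_subsection_backed() is true for
the sub-section starting at 0x1fa7c00000 and false for the one starting at
0x1fa7e00000.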

This has not broken anything so far, but zone contiguity is fragile in
this scenario. So, as discussed previously [1], it is better to add some
comments explaining this possible issue, in case future pfn walkers rely
on it.

[1] https://lore.kernel.org/all/87r0sdsmr6.fsf@yhuang6-desk2.ccr.corp.intel.com/

Signed-off-by: Baolin Wang
Acked-by: Michal Hocko
---
Changes from v2:
 - Update the commit log and comments per Michal, thanks.
Changes from v1:
 - Update the comments per Ying and Mike, thanks.

Note, I did not add Huang Ying's reviewed tag, since there are some
updates per Michal's suggestion. Ying, please review the v3. Thanks.
---
 mm/page_alloc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6457b64fe562..bd124390c79b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1502,6 +1502,15 @@ void __free_pages_core(struct page *page, unsigned int order)
  * interleaving within a single pageblock. It is therefore sufficient to check
  * the first and last page of a pageblock and avoid checking each individual
  * page in a pageblock.
+ *
+ * Note: the function may return non-NULL struct page even for a page block
+ * which contains a memory hole (i.e. there is no physical memory for a subset
+ * of the pfn range). For example, if the pageblock order is MAX_ORDER, which
+ * will fall into 2 sub-sections, and the end pfn of the pageblock may be hole
+ * even though the start pfn is online and valid. This should be safe most of
+ * the time because struct pages are still zero pre-filled and pfn walkers
+ * shouldn't touch any physical memory range for which they do not recognize
+ * any specific metadata in struct pages.
  */
 struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 				     unsigned long end_pfn, struct zone *zone)