From patchwork Mon Aug 15 07:02:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Wang, Haiyue"
X-Patchwork-Id: 12943177
From: Haiyue Wang
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, david@redhat.com, apopple@nvidia.com,
 linmiaohe@huawei.com, ying.huang@intel.com, songmuchun@bytedance.com,
 naoya.horiguchi@linux.dev, alex.sierra@amd.com, Haiyue Wang
Subject: [PATCH v5 1/2] mm: migration: fix the FOLL_GET failure on following huge page
Date: Mon, 15 Aug 2022 15:02:39 +0800
Message-Id: <20220815070240.470469-2-haiyue.wang@intel.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220815070240.470469-1-haiyue.wang@intel.com>
References: <20220812084921.409142-1-haiyue.wang@intel.com>
 <20220815070240.470469-1-haiyue.wang@intel.com>

Not all huge page APIs support the FOLL_GET option, so the move_pages()
syscall fails to get the page node information for some huge pages.

For example, on x86 with Linux 5.19, the 1GB huge page API
follow_huge_pud() returns a NULL page for FOLL_GET. When move_pages() is
called with a NULL 'nodes' parameter, the corresponding entry in the
'status' array is set to '-2' (-ENOENT).

Note: follow_huge_pud() now supports FOLL_GET in Linux 6.0.
      Link: https://lore.kernel.org/all/20220714042420.1847125-3-naoya.horiguchi@linux.dev

But these huge page APIs don't support FOLL_GET:
  1. follow_huge_pud() in arch/s390/mm/hugetlbpage.c
  2. follow_huge_addr() in arch/ia64/mm/hugetlbpage.c
     It will cause WARN_ON_ONCE for FOLL_GET.
  3. follow_huge_pgd() in mm/hugetlb.c

This is a temporary solution to mitigate the side effect of the race
condition fix, which calls follow_page() with FOLL_GET set for huge
pages. Once all huge page follow APIs support FOLL_GET, this fix can be
reverted safely.

Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang
Reviewed-by: "Huang, Ying"
---
 mm/migrate.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..581dfaad9257 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 	for (i = 0; i < nr_pages; i++) {
 		unsigned long addr = (unsigned long)(*pages);
+		unsigned int foll_flags = FOLL_DUMP;
 		struct vm_area_struct *vma;
 		struct page *page;
 		int err = -EFAULT;
@@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (!vma)
 			goto set_status;
 
+		/* Not all huge page follow APIs support 'FOLL_GET' */
+		if (!is_vm_hugetlb_page(vma))
+			foll_flags |= FOLL_GET;
+
 		/* FOLL_DUMP to ignore special (like zero) pages */
-		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+		page = follow_page(vma, addr, foll_flags);
 
 		err = PTR_ERR(page);
 		if (IS_ERR(page))
@@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 		if (page && !is_zone_device_page(page)) {
 			err = page_to_nid(page);
-			put_page(page);
+			if (foll_flags & FOLL_GET)
+				put_page(page);
 		} else {
 			err = -ENOENT;
 		}

From patchwork Mon Aug 15 07:02:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Wang, Haiyue"
X-Patchwork-Id: 12943178
From: Haiyue Wang
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, david@redhat.com, apopple@nvidia.com,
 linmiaohe@huawei.com, ying.huang@intel.com, songmuchun@bytedance.com,
 naoya.horiguchi@linux.dev, alex.sierra@amd.com, Haiyue Wang,
 Felix Kuehling
Subject: [PATCH v5 2/2] mm: fix the handling Non-LRU pages returned by follow_page
Date: Mon, 15 Aug 2022 15:02:40 +0800
Message-Id: <20220815070240.470469-3-haiyue.wang@intel.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220815070240.470469-1-haiyue.wang@intel.com>
References: <20220812084921.409142-1-haiyue.wang@intel.com>
 <20220815070240.470469-1-haiyue.wang@intel.com>

The handling of non-LRU pages returned by follow_page() jumps out
directly without calling put_page() to drop the reference count, even
though the 'FOLL_GET' flag makes follow_page() call get_page(). Fix the
zone device page check by handling the page reference count correctly
before returning.

And as David reviewed, "device pages are never PageKsm pages", so drop
the zone device page check for break_ksm().

Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
Signed-off-by: Haiyue Wang
Reviewed-by: "Huang, Ying"
Reviewed-by: Felix Kuehling
---
 mm/huge_memory.c |  4 ++--
 mm/ksm.c         | 12 +++++++++---
 mm/migrate.c     | 10 +++++++---
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8a7c1b344abe..b2ba17c3dcd7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		/* FOLL_DUMP to ignore special (like zero) pages */
 		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
 
-		if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+		if (IS_ERR_OR_NULL(page))
 			continue;
 
-		if (!is_transparent_hugepage(page))
+		if (is_zone_device_page(page) || !is_transparent_hugepage(page))
 			goto next;
 
 		total++;
diff --git a/mm/ksm.c b/mm/ksm.c
index 42ab153335a2..e26f57fc1f0e 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 		cond_resched();
 		page = follow_page(vma, addr,
 				FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
-		if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+		if (IS_ERR_OR_NULL(page))
 			break;
 		if (PageKsm(page))
 			ret = handle_mm_fault(vma, addr,
@@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
 		goto out;
 
 	page = follow_page(vma, addr, FOLL_GET);
-	if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+	if (IS_ERR_OR_NULL(page))
 		goto out;
+	if (is_zone_device_page(page))
+		goto out_putpage;
 	if (PageAnon(page)) {
 		flush_anon_page(vma, page, addr);
 		flush_dcache_page(page);
 	} else {
+out_putpage:
 		put_page(page);
 out:
 		page = NULL;
@@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 			if (ksm_test_exit(mm))
 				break;
 			*page = follow_page(vma, ksm_scan.address, FOLL_GET);
-			if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
+			if (IS_ERR_OR_NULL(*page)) {
 				ksm_scan.address += PAGE_SIZE;
 				cond_resched();
 				continue;
 			}
+			if (is_zone_device_page(*page))
+				goto next_page;
 			if (PageAnon(*page)) {
 				flush_anon_page(vma, *page, ksm_scan.address);
 				flush_dcache_page(*page);
@@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 				mmap_read_unlock(mm);
 				return rmap_item;
 			}
+next_page:
 			put_page(*page);
 			ksm_scan.address += PAGE_SIZE;
 			cond_resched();
diff --git a/mm/migrate.c b/mm/migrate.c
index 581dfaad9257..fee12cd2f294 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
 		goto out;
 
 	err = -ENOENT;
-	if (!page || is_zone_device_page(page))
+	if (!page)
 		goto out;
 
+	if (is_zone_device_page(page))
+		goto out_putpage;
+
 	err = 0;
 	if (page_to_nid(page) == node)
 		goto out_putpage;
@@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (IS_ERR(page))
 			goto set_status;
 
-		if (page && !is_zone_device_page(page)) {
-			err = page_to_nid(page);
+		if (page) {
+			err = !is_zone_device_page(page) ? page_to_nid(page)
+							 : -ENOENT;
 			if (foll_flags & FOLL_GET)
 				put_page(page);
 		} else {
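
For readers following the reference counting, here is a minimal userspace
sketch (not kernel code) of the per-page logic in do_pages_stat_array()
once both patches are applied. The names mock_page, mock_follow_page(),
mock_stat_one_page() and the MOCK_FOLL_* values are hypothetical
stand-ins invented for illustration. Only the control flow is meant to
mirror the diff: a reference is taken only when FOLL_GET is requested,
and every path that took one drops it again, including the zone device
page path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; this is not the kernel type. */
struct mock_page {
	int refcount;		/* models the page reference count */
	bool zone_device;	/* models is_zone_device_page() */
	int nid;		/* models page_to_nid() */
};

/* Illustrative flag values only; the real FOLL_* values differ. */
#define MOCK_FOLL_GET	0x1u
#define MOCK_FOLL_DUMP	0x2u

/* Models follow_page(): takes a reference only if MOCK_FOLL_GET is set. */
static struct mock_page *mock_follow_page(struct mock_page *page,
					  unsigned int flags)
{
	if (page && (flags & MOCK_FOLL_GET))
		page->refcount++;
	return page;
}

/*
 * Models one loop iteration of do_pages_stat_array() after both patches.
 * Returns the node id, or -ENOENT for a missing or zone device page.
 */
static int mock_stat_one_page(struct mock_page *mapped, bool vma_is_hugetlb)
{
	unsigned int foll_flags = MOCK_FOLL_DUMP;
	struct mock_page *page;
	int err;

	/* Patch 1/2: skip FOLL_GET for hugetlb VMAs. */
	if (!vma_is_hugetlb)
		foll_flags |= MOCK_FOLL_GET;

	page = mock_follow_page(mapped, foll_flags);
	if (!page)
		return -ENOENT;

	/* Patch 2/2: decide the status first, then always drop the ref. */
	err = !page->zone_device ? page->nid : -ENOENT;
	if (foll_flags & MOCK_FOLL_GET)
		page->refcount--;	/* models put_page() */

	return err;
}
```

Calling mock_stat_one_page() on a page with zone_device set returns
-ENOENT and leaves refcount unchanged. In this model that corresponds to
the leak the second patch closes.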