From patchwork Tue Aug 16 02:21:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Wang, Haiyue"
X-Patchwork-Id: 12944222
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
From: Haiyue Wang
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, david@redhat.com, apopple@nvidia.com,
 linmiaohe@huawei.com, ying.huang@intel.com, songmuchun@bytedance.com,
 naoya.horiguchi@linux.dev, alex.sierra@amd.com, Haiyue Wang
Subject: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following
 huge page
Date: Tue, 16 Aug 2022 10:21:00 +0800
Message-Id: <20220816022102.582865-2-haiyue.wang@intel.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220816022102.582865-1-haiyue.wang@intel.com>
References: <20220812084921.409142-1-haiyue.wang@intel.com>
 <20220816022102.582865-1-haiyue.wang@intel.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Not all huge page APIs support the FOLL_GET option, so the move_pages()
syscall fails to get the page node information for some huge pages.

For example, on x86 with Linux 5.19, the 1GB huge page API
follow_huge_pud() returns a NULL page for FOLL_GET. When move_pages() is
called with a NULL 'nodes' parameter, the corresponding entry in the
'status' array is set to -2 (-ENOENT).

Note: follow_huge_pud() supports FOLL_GET as of Linux 6.0.
      Link: https://lore.kernel.org/all/20220714042420.1847125-3-naoya.horiguchi@linux.dev

But these huge page APIs still don't support FOLL_GET:
  1. follow_huge_pud() in arch/s390/mm/hugetlbpage.c
  2. follow_huge_addr() in arch/ia64/mm/hugetlbpage.c
     It will cause WARN_ON_ONCE for FOLL_GET.
  3. follow_huge_pgd() in mm/hugetlb.c

This is a temporary solution to mitigate the side effect of the race
condition fix, which calls follow_page() with FOLL_GET set even for huge
pages. Once FOLL_GET is supported for following huge pages, this fix can
be safely reverted.

Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang
Reviewed-by: "Huang, Ying"
Reviewed-by: Baolin Wang
Reviewed-by: Miaohe Lin
Signed-off-by: Mike Kravetz
Signed-off-by: Gerald Schaefer
---
 mm/migrate.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..581dfaad9257 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 	for (i = 0; i < nr_pages; i++) {
 		unsigned long addr = (unsigned long)(*pages);
+		unsigned int foll_flags = FOLL_DUMP;
 		struct vm_area_struct *vma;
 		struct page *page;
 		int err = -EFAULT;
@@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (!vma)
 			goto set_status;
 
+		/* Not all huge page follow APIs support 'FOLL_GET' */
+		if (!is_vm_hugetlb_page(vma))
+			foll_flags |= FOLL_GET;
+
 		/* FOLL_DUMP to ignore special (like zero) pages */
-		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+		page = follow_page(vma, addr, foll_flags);
 
 		err = PTR_ERR(page);
 		if (IS_ERR(page))
@@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 		if (page && !is_zone_device_page(page)) {
 			err = page_to_nid(page);
-			put_page(page);
+			if (foll_flags & FOLL_GET)
+				put_page(page);
 		} else {
 			err = -ENOENT;
 		}

From patchwork Tue Aug 16 02:21:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Wang, Haiyue"
X-Patchwork-Id: 12944221
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
From: Haiyue Wang
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, david@redhat.com, apopple@nvidia.com,
 linmiaohe@huawei.com, ying.huang@intel.com, songmuchun@bytedance.com,
 naoya.horiguchi@linux.dev, alex.sierra@amd.com, Haiyue Wang,
 Felix Kuehling
Subject: [PATCH v6 2/2] mm: fix the handling Non-LRU pages returned by
 follow_page
Date: Tue, 16 Aug 2022 10:21:01 +0800
Message-Id: <20220816022102.582865-3-haiyue.wang@intel.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220816022102.582865-1-haiyue.wang@intel.com>
References: <20220812084921.409142-1-haiyue.wang@intel.com>
 <20220816022102.582865-1-haiyue.wang@intel.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

The handling of Non-LRU pages returned by follow_page() jumps out
directly without calling put_page() to drop the reference count, even
though follow_page() with the 'FOLL_GET' flag has already called
get_page(). Fix the zone device page check by handling the page
reference count correctly before returning.

And as David reviewed, "device pages are never PageKsm pages". Drop this
zone device page check for break_ksm().

Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
Signed-off-by: Haiyue Wang
Reviewed-by: "Huang, Ying"
Reviewed-by: Felix Kuehling
Reviewed-by: Alistair Popple
Reviewed-by: Miaohe Lin
Acked-by: David Hildenbrand
---
 mm/huge_memory.c |  4 ++--
 mm/ksm.c         | 12 +++++++++---
 mm/migrate.c     | 19 ++++++++++++-------
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8a7c1b344abe..b2ba17c3dcd7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		/* FOLL_DUMP to ignore special (like zero) pages */
 		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
 
-		if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+		if (IS_ERR_OR_NULL(page))
 			continue;
 
-		if (!is_transparent_hugepage(page))
+		if (is_zone_device_page(page) || !is_transparent_hugepage(page))
 			goto next;
 
 		total++;
diff --git a/mm/ksm.c b/mm/ksm.c
index 42ab153335a2..e26f57fc1f0e 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 		cond_resched();
 		page = follow_page(vma, addr,
 				FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
-		if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+		if (IS_ERR_OR_NULL(page))
 			break;
 		if (PageKsm(page))
 			ret = handle_mm_fault(vma, addr,
@@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
 		goto out;
 
 	page = follow_page(vma, addr, FOLL_GET);
-	if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+	if (IS_ERR_OR_NULL(page))
 		goto out;
+	if (is_zone_device_page(page))
+		goto out_putpage;
 	if (PageAnon(page)) {
 		flush_anon_page(vma, page, addr);
 		flush_dcache_page(page);
 	} else {
+out_putpage:
 		put_page(page);
 out:
 		page = NULL;
@@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 			if (ksm_test_exit(mm))
 				break;
 			*page = follow_page(vma, ksm_scan.address, FOLL_GET);
-			if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
+			if (IS_ERR_OR_NULL(*page)) {
 				ksm_scan.address += PAGE_SIZE;
 				cond_resched();
 				continue;
 			}
+			if (is_zone_device_page(*page))
+				goto next_page;
 			if (PageAnon(*page)) {
 				flush_anon_page(vma, *page, ksm_scan.address);
 				flush_dcache_page(*page);
@@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 					mmap_read_unlock(mm);
 					return rmap_item;
 				}
+next_page:
 				put_page(*page);
 				ksm_scan.address += PAGE_SIZE;
 				cond_resched();
diff --git a/mm/migrate.c b/mm/migrate.c
index 581dfaad9257..44e05ce41d49 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
 		goto out;
 
 	err = -ENOENT;
-	if (!page || is_zone_device_page(page))
+	if (!page)
 		goto out;
 
+	if (is_zone_device_page(page))
+		goto out_putpage;
+
 	err = 0;
 	if (page_to_nid(page) == node)
 		goto out_putpage;
@@ -1868,13 +1871,15 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (IS_ERR(page))
 			goto set_status;
 
-		if (page && !is_zone_device_page(page)) {
+		err = -ENOENT;
+		if (!page)
+			goto set_status;
+
+		if (!is_zone_device_page(page))
 			err = page_to_nid(page);
-			if (foll_flags & FOLL_GET)
-				put_page(page);
-		} else {
-			err = -ENOENT;
-		}
+
+		if (foll_flags & FOLL_GET)
+			put_page(page);
 set_status:
 		*status = err;
 
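
The reference-count rule that the do_pages_stat_array() changes enforce
can be modelled in userspace. What follows is only a sketch under stated
assumptions, not kernel code: struct mock_page, mock_follow_page(),
mock_put_page(), stat_one_page() and the MOCK_FOLL_* values are
hypothetical stand-ins invented for illustration. Only the control flow
is meant to mirror the final hunk in mm/migrate.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the real FOLL_* constants live in the kernel. */
#define MOCK_FOLL_GET	0x1u
#define MOCK_FOLL_DUMP	0x2u

/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct mock_page {
	int refcount;
	int nid;
	bool zone_device;
};

/* Models follow_page(): takes a reference only when MOCK_FOLL_GET is set. */
static struct mock_page *mock_follow_page(struct mock_page *mapped,
					  unsigned int foll_flags)
{
	if (mapped && (foll_flags & MOCK_FOLL_GET))
		mapped->refcount++;
	return mapped;
}

/* Models put_page(): dropping a reference that was never taken is a bug. */
static void mock_put_page(struct mock_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Mirrors the control flow for one address after both patches: hugetlb
 * mappings are looked up without MOCK_FOLL_GET, and every reference that
 * was taken is dropped again, including on the zone device path.
 */
int stat_one_page(struct mock_page *mapped, bool vma_is_hugetlb)
{
	unsigned int foll_flags = MOCK_FOLL_DUMP;
	struct mock_page *page;
	int err;

	if (!vma_is_hugetlb)
		foll_flags |= MOCK_FOLL_GET;

	page = mock_follow_page(mapped, foll_flags);

	err = -ENOENT;
	if (!page)
		return err;

	if (!page->zone_device)
		err = page->nid;

	if (foll_flags & MOCK_FOLL_GET)
		mock_put_page(page);

	return err;
}
```

Calling stat_one_page() on a normal page, a zone device page and a
hugetlb mapping should leave each mock refcount where it started. In this
model, skipping mock_put_page() on the zone device path corresponds to
the reference leak described in patch 2/2.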