From patchwork Mon Aug 15 01:59:08 2022
X-Patchwork-Submitter: "Wang, Haiyue" <haiyue.wang@intel.com>
X-Patchwork-Id: 12943035
From: Haiyue Wang <haiyue.wang@intel.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com,
 ying.huang@intel.com, songmuchun@bytedance.com, naoya.horiguchi@linux.dev,
 alex.sierra@amd.com, Haiyue Wang <haiyue.wang@intel.com>
Subject: [PATCH v4 1/2] mm: migration: fix the FOLL_GET failure on following huge page
Date: Mon, 15 Aug 2022 09:59:08 +0800
Message-Id: <20220815015909.439623-2-haiyue.wang@intel.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220815015909.439623-1-haiyue.wang@intel.com>
References: <20220812084921.409142-1-haiyue.wang@intel.com>
 <20220815015909.439623-1-haiyue.wang@intel.com>

Not all huge page APIs support the FOLL_GET option, so the __NR_move_pages
syscall will fail to get the page node information for huge pages.

This is a temporary solution to mitigate the racing fix. Once following a
huge page with FOLL_GET is supported, this fix can be reverted safely.
Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
---
 mm/migrate.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..581dfaad9257 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 	for (i = 0; i < nr_pages; i++) {
 		unsigned long addr = (unsigned long)(*pages);
+		unsigned int foll_flags = FOLL_DUMP;
 		struct vm_area_struct *vma;
 		struct page *page;
 		int err = -EFAULT;
@@ -1856,8 +1857,12 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (!vma)
 			goto set_status;
 
+		/* Not all huge page follow APIs support 'FOLL_GET' */
+		if (!is_vm_hugetlb_page(vma))
+			foll_flags |= FOLL_GET;
+
 		/* FOLL_DUMP to ignore special (like zero) pages */
-		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+		page = follow_page(vma, addr, foll_flags);
 
 		err = PTR_ERR(page);
 		if (IS_ERR(page))
@@ -1865,7 +1870,8 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 		if (page && !is_zone_device_page(page)) {
 			err = page_to_nid(page);
-			put_page(page);
+			if (foll_flags & FOLL_GET)
+				put_page(page);
 		} else {
 			err = -ENOENT;
 		}
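
For reference, the failing path is reachable from userspace through move_pages(2)
called with a NULL nodes array, which is served by do_pages_stat_array(). The
sketch below is illustrative only and not part of the patch: the 2MB huge page
size, the anonymous MAP_HUGETLB mapping, and the single-page query are
assumptions (it also assumes hugetlb pages are reserved and links with -lnuma).
On the unfixed kernel this code path reports -ENOENT in status[0] for the
hugetlb mapping; with the fix it reports the page's NUMA node.

/*
 * Illustrative userspace sketch (not part of the patch): query the NUMA
 * node of a hugetlb page via move_pages() with nodes == NULL, the path
 * handled by do_pages_stat_array() in the kernel.
 */
#include <numaif.h>		/* move_pages(), link with -lnuma */
#include <stdio.h>
#include <sys/mman.h>

#define HPAGE_LEN (2UL * 1024 * 1024)	/* assumed 2MB huge page size */

int main(void)
{
	void *addr = mmap(NULL, HPAGE_LEN, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");		/* needs reserved hugetlb pages */
		return 1;
	}
	*(volatile char *)addr = 1;	/* fault the huge page in */

	void *pages[1] = { addr };
	int status[1];

	/* nodes == NULL asks for the node each page currently resides on */
	if (move_pages(0, 1, pages, NULL, status, 0) < 0) {
		perror("move_pages");
		return 1;
	}
	printf("page node (or -errno): %d\n", status[0]);
	return 0;
}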

From patchwork Mon Aug 15 01:59:09 2022
X-Patchwork-Submitter: "Wang, Haiyue" <haiyue.wang@intel.com>
X-Patchwork-Id: 12943036
From: Haiyue Wang <haiyue.wang@intel.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com,
 ying.huang@intel.com, songmuchun@bytedance.com, naoya.horiguchi@linux.dev,
 alex.sierra@amd.com, Haiyue Wang <haiyue.wang@intel.com>,
 Alistair Popple, Felix Kuehling
Subject: [PATCH v4 2/2] mm: fix the handling Non-LRU pages returned by follow_page
Date: Mon, 15 Aug 2022 09:59:09 +0800
Message-Id: <20220815015909.439623-3-haiyue.wang@intel.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20220815015909.439623-1-haiyue.wang@intel.com>
References: <20220812084921.409142-1-haiyue.wang@intel.com>
 <20220815015909.439623-1-haiyue.wang@intel.com>

The handling of Non-LRU pages returned by follow_page() jumps out directly
without calling put_page() to drop the reference count, even though the
'FOLL_GET' flag passed to follow_page() means get_page() has been called.

Fix the zone device page checks by handling the page reference count
correctly before returning.

And as David reviewed, "device pages are never PageKsm pages", so drop the
zone device page check from break_ksm().
Fixes: 3218f8712d6b ("mm: handling Non-LRU pages returned by vm_normal_pages")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
---
 mm/huge_memory.c |  4 ++--
 mm/ksm.c         | 12 +++++++++---
 mm/migrate.c     | 10 +++++++---
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8a7c1b344abe..b2ba17c3dcd7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2963,10 +2963,10 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		/* FOLL_DUMP to ignore special (like zero) pages */
 		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
 
-		if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+		if (IS_ERR_OR_NULL(page))
 			continue;
 
-		if (!is_transparent_hugepage(page))
+		if (is_zone_device_page(page) || !is_transparent_hugepage(page))
 			goto next;
 
 		total++;
diff --git a/mm/ksm.c b/mm/ksm.c
index 42ab153335a2..e26f57fc1f0e 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -475,7 +475,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
 		cond_resched();
 		page = follow_page(vma, addr,
 				FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
-		if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+		if (IS_ERR_OR_NULL(page))
 			break;
 		if (PageKsm(page))
 			ret = handle_mm_fault(vma, addr,
@@ -560,12 +560,15 @@ static struct page *get_mergeable_page(struct rmap_item *rmap_item)
 		goto out;
 
 	page = follow_page(vma, addr, FOLL_GET);
-	if (IS_ERR_OR_NULL(page) || is_zone_device_page(page))
+	if (IS_ERR_OR_NULL(page))
 		goto out;
+	if (is_zone_device_page(page))
+		goto out_putpage;
 	if (PageAnon(page)) {
 		flush_anon_page(vma, page, addr);
 		flush_dcache_page(page);
 	} else {
+out_putpage:
 		put_page(page);
 out:
 		page = NULL;
@@ -2308,11 +2311,13 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 			if (ksm_test_exit(mm))
 				break;
 			*page = follow_page(vma, ksm_scan.address, FOLL_GET);
-			if (IS_ERR_OR_NULL(*page) || is_zone_device_page(*page)) {
+			if (IS_ERR_OR_NULL(*page)) {
 				ksm_scan.address += PAGE_SIZE;
 				cond_resched();
 				continue;
 			}
+			if (is_zone_device_page(*page))
+				goto next_page;
 			if (PageAnon(*page)) {
 				flush_anon_page(vma, *page, ksm_scan.address);
 				flush_dcache_page(*page);
@@ -2327,6 +2332,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 				mmap_read_unlock(mm);
 				return rmap_item;
 			}
+next_page:
 			put_page(*page);
 			ksm_scan.address += PAGE_SIZE;
 			cond_resched();
diff --git a/mm/migrate.c b/mm/migrate.c
index 581dfaad9257..fee12cd2f294 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1672,9 +1672,12 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
 		goto out;
 
 	err = -ENOENT;
-	if (!page || is_zone_device_page(page))
+	if (!page)
 		goto out;
 
+	if (is_zone_device_page(page))
+		goto out_putpage;
+
 	err = 0;
 	if (page_to_nid(page) == node)
 		goto out_putpage;
@@ -1868,8 +1871,9 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (IS_ERR(page))
 			goto set_status;
 
-		if (page && !is_zone_device_page(page)) {
-			err = page_to_nid(page);
+		if (page) {
+			err = !is_zone_device_page(page) ? page_to_nid(page)
+							 : -ENOENT;
 			if (foll_flags & FOLL_GET)
 				put_page(page);
 		} else {
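
The common rule behind the call sites fixed above can be condensed into one
sketch. This is illustrative only and not code from the tree;
example_walk_one_page() is a hypothetical helper, while follow_page(),
is_zone_device_page(), page_to_nid() and put_page() are the interfaces the
patched sites actually use. The point: once follow_page() succeeds with
FOLL_GET it returns the page with an extra reference, so every exit path,
including the one that merely skips zone device (non-LRU) pages, must drop
that reference with put_page().

/*
 * Hypothetical kernel-context sketch of the reference-counting rule
 * enforced above; not a literal call site from the patch.
 */
#include <linux/err.h>
#include <linux/memremap.h>	/* is_zone_device_page() */
#include <linux/mm.h>		/* follow_page(), FOLL_GET, FOLL_DUMP, put_page() */

static int example_walk_one_page(struct vm_area_struct *vma, unsigned long addr)
{
	struct page *page;
	int err;

	/* FOLL_GET means a successful lookup holds a page reference */
	page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
	if (IS_ERR(page))
		return PTR_ERR(page);
	if (!page)
		return -ENOENT;

	if (is_zone_device_page(page)) {
		/* Non-LRU page: skip it, but still drop the FOLL_GET reference */
		err = -ENOENT;
		goto out_put;
	}

	err = page_to_nid(page);

out_put:
	put_page(page);
	return err;
}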