From patchwork Wed Jul 12 06:01:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13309582
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, yuzhao@google.com, willy@infradead.org,
 david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 1/3] mm: add functions folio_in_range() and
 folio_within_vma()
Date: Wed, 12 Jul 2023 14:01:42 +0800
Message-Id: <20230712060144.3006358-2-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230712060144.3006358-1-fengwei.yin@intel.com>
References: <20230712060144.3006358-1-fengwei.yin@intel.com>

Add folio_in_range(), which will be used to check whether a folio is
mapped to a specific VMA and whether the mapped address of the folio
is within a given range.

Also add a helper function, folio_within_vma(), built on
folio_in_range(), to check whether the folio is within the range of
the VMA.

Signed-off-by: Yin Fengwei
Reviewed-by: Yu Zhao
---
 mm/internal.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/mm/internal.h b/mm/internal.h
index 483add0bfb289..c7dd15d8de3ef 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -585,6 +585,38 @@ extern long faultin_vma_page_range(struct vm_area_struct *vma,
 				   bool write, int *locked);
 extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
 			       unsigned long bytes);
+
+static inline bool
+folio_in_range(struct folio *folio, struct vm_area_struct *vma,
+		unsigned long start, unsigned long end)
+{
+	pgoff_t pgoff, addr;
+	unsigned long vma_pglen = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+
+	VM_WARN_ON_FOLIO(folio_test_ksm(folio), folio);
+	if (start < vma->vm_start)
+		start = vma->vm_start;
+
+	if (end > vma->vm_end)
+		end = vma->vm_end;
+
+	pgoff = folio_pgoff(folio);
+
+	/* if folio start address is not in vma range */
+	if (pgoff < vma->vm_pgoff || pgoff > vma->vm_pgoff + vma_pglen)
+		return false;
+
+	addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+
+	return ((addr >= start) && (addr + folio_size(folio) <= end));
+}
+
+static inline bool
+folio_within_vma(struct folio *folio, struct vm_area_struct *vma)
+{
+	return folio_in_range(folio, vma, vma->vm_start, vma->vm_end);
+}
+
 /*
  * mlock_vma_folio() and munlock_vma_folio():
  * should be called with vma's mmap_lock held for read or write,

From patchwork Wed Jul 12 06:01:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13309583
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, yuzhao@google.com, willy@infradead.org,
 david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 2/3] mm: handle large folio when large folio in
 VM_LOCKED VMA range
Date: Wed, 12 Jul 2023 14:01:43 +0800
Message-Id: <20230712060144.3006358-3-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230712060144.3006358-1-fengwei.yin@intel.com>
References: <20230712060144.3006358-1-fengwei.yin@intel.com>

If a large folio is within the range of a VM_LOCKED VMA, it should be
mlocked so that page reclaim does not pick it. Otherwise reclaim may
split the large folio and then mlock each page again. Mlock this kind
of large folio to prevent it from being picked by page reclaim.

For a large folio that crosses the boundary of a VM_LOCKED VMA, it is
better not to mlock it. Then, if the system is under memory pressure,
this kind of large folio will be split and the pages outside the
VM_LOCKED VMA can be reclaimed.

Signed-off-by: Yin Fengwei
---
 mm/internal.h | 11 ++++++++---
 mm/rmap.c     | 34 +++++++++++++++++++++++++++-------
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index c7dd15d8de3ef..776141de2797a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -643,7 +643,8 @@ static inline void mlock_vma_folio(struct folio *folio,
 	 *    still be set while VM_SPECIAL bits are added: so ignore it then.
 	 */
 	if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED) &&
-	    (compound || !folio_test_large(folio)))
+	    (compound || !folio_test_large(folio) ||
+	    folio_in_range(folio, vma, vma->vm_start, vma->vm_end)))
 		mlock_folio(folio);
 }

@@ -651,8 +652,12 @@ void munlock_folio(struct folio *folio);
 static inline void munlock_vma_folio(struct folio *folio,
 			struct vm_area_struct *vma, bool compound)
 {
-	if (unlikely(vma->vm_flags & VM_LOCKED) &&
-	    (compound || !folio_test_large(folio)))
+	/*
+	 * To handle the case that a mlocked large folio is unmapped from VMA
+	 * piece by piece, allow munlock the large folio which is partially
+	 * mapped to VMA.
+	 */
+	if (unlikely(vma->vm_flags & VM_LOCKED))
 		munlock_folio(folio);
 }

diff --git a/mm/rmap.c b/mm/rmap.c
index 2668f5ea35342..455f415d8d9ca 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -803,6 +803,14 @@ struct folio_referenced_arg {
 	unsigned long vm_flags;
 	struct mem_cgroup *memcg;
 };
+
+static inline bool should_restore_mlock(struct folio *folio,
+		struct vm_area_struct *vma, bool pmd_mapped)
+{
+	return !folio_test_large(folio) ||
+			pmd_mapped || folio_within_vma(folio, vma);
+}
+
 /*
  * arg: folio_referenced_arg will be passed
  */
@@ -816,13 +824,25 @@ static bool folio_referenced_one(struct folio *folio,
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;

-		if ((vma->vm_flags & VM_LOCKED) &&
-		    (!folio_test_large(folio) || !pvmw.pte)) {
-			/* Restore the mlock which got missed */
-			mlock_vma_folio(folio, vma, !pvmw.pte);
-			page_vma_mapped_walk_done(&pvmw);
-			pra->vm_flags |= VM_LOCKED;
-			return false; /* To break the loop */
+		if (vma->vm_flags & VM_LOCKED) {
+			if (should_restore_mlock(folio, vma, !pvmw.pte)) {
+				/* Restore the mlock which got missed */
+				mlock_vma_folio(folio, vma, !pvmw.pte);
+				page_vma_mapped_walk_done(&pvmw);
+				pra->vm_flags |= VM_LOCKED;
+				return false; /* To break the loop */
+			} else {
+				/*
+				 * For large folio cross VMA boundaries, it's
+				 * expected to be picked by page reclaim. But
+				 * should skip reference of pages which are in
+				 * the range of VM_LOCKED vma. As page reclaim
+				 * should just count the reference of pages out
+				 * the range of VM_LOCKED vma.
+				 */
+				pra->mapcount--;
+				continue;
+			}
 		}

 		if (pvmw.pte) {

From patchwork Wed Jul 12 06:01:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13309584
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, yuzhao@google.com, willy@infradead.org,
 david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 3/3] mm: mlock: update mlock_pte_range to handle
 large folio
Date: Wed, 12 Jul 2023 14:01:44 +0800
Message-Id: <20230712060144.3006358-4-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230712060144.3006358-1-fengwei.yin@intel.com>
References: <20230712060144.3006358-1-fengwei.yin@intel.com>

The current kernel only locks base-size folios during the mlock
syscall. Add large folio support with the following rules:
  - Only mlock a large folio when it is within the VM_LOCKED VMA range.

  - If there is a COW folio, mlock the COW folio as well, since the COW
    folio is also in the VM_LOCKED VMA range.

  - munlock applies to a large folio that is within the VMA range or
    crosses the VMA boundary.

The last rule handles the case where a large folio is mlocked, the VMA
is later split in the middle of the large folio, and the large folio
then crosses the VMA boundary.

Signed-off-by: Yin Fengwei
---
 mm/mlock.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 99 insertions(+), 5 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 0a0c996c5c214..f49e079066870 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -305,6 +305,95 @@ void munlock_folio(struct folio *folio)
 	local_unlock(&mlock_fbatch.lock);
 }

+static inline bool should_mlock_folio(struct folio *folio,
+					struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & VM_LOCKED)
+		return (!folio_test_large(folio) ||
+				folio_within_vma(folio, vma));
+
+	/*
+	 * For unlock, allow munlock large folio which is partially
+	 * mapped to VMA. As it's possible that large folio is
+	 * mlocked and VMA is split later.
+	 *
+	 * During memory pressure, such kind of large folio can
+	 * be split. And the pages are not in VM_LOCKed VMA
+	 * can be reclaimed.
+	 */
+
+	return true;
+}
+
+static inline unsigned int get_folio_mlock_step(struct folio *folio,
+			pte_t pte, unsigned long addr, unsigned long end)
+{
+	unsigned int nr;
+
+	nr = folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte);
+	return min_t(unsigned int, nr, (end - addr) >> PAGE_SHIFT);
+}
+
+void mlock_folio_range(struct folio *folio, struct vm_area_struct *vma,
+		pte_t *pte, unsigned long addr, unsigned int nr)
+{
+	struct folio *cow_folio;
+	unsigned int step = 1;
+
+	mlock_folio(folio);
+	if (nr == 1)
+		return;
+
+	for (; nr > 0; pte += step, addr += (step << PAGE_SHIFT), nr -= step) {
+		pte_t ptent;
+
+		step = 1;
+		ptent = ptep_get(pte);
+
+		if (!pte_present(ptent))
+			continue;
+
+		cow_folio = vm_normal_folio(vma, addr, ptent);
+		if (!cow_folio || cow_folio == folio) {
+			continue;
+		}
+
+		mlock_folio(cow_folio);
+		step = get_folio_mlock_step(folio, ptent,
+				addr, addr + (nr << PAGE_SHIFT));
+	}
+}
+
+void munlock_folio_range(struct folio *folio, struct vm_area_struct *vma,
+		pte_t *pte, unsigned long addr, unsigned int nr)
+{
+	struct folio *cow_folio;
+	unsigned int step = 1;
+
+	munlock_folio(folio);
+	if (nr == 1)
+		return;
+
+	for (; nr > 0; pte += step, addr += (step << PAGE_SHIFT), nr -= step) {
+		pte_t ptent;
+
+		step = 1;
+		ptent = ptep_get(pte);
+
+		if (!pte_present(ptent))
+			continue;
+
+		cow_folio = vm_normal_folio(vma, addr, ptent);
+		if (!cow_folio || cow_folio == folio) {
+			continue;
+		}
+
+		munlock_folio(cow_folio);
+		step = get_folio_mlock_step(folio, ptent,
+				addr, addr + (nr << PAGE_SHIFT));
+	}
+}
+
 static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
 			   unsigned long end, struct mm_walk *walk)

@@ -314,6 +403,7 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
 	pte_t *start_pte, *pte;
 	pte_t ptent;
 	struct folio *folio;
+	unsigned int step = 1;

 	ptl = pmd_trans_huge_lock(pmd, vma);
 	if (ptl) {
@@ -329,24 +419,28 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
 		goto out;
 	}

-	start_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	pte = start_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (!start_pte) {
 		walk->action = ACTION_AGAIN;
 		return 0;
 	}
-	for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) {
+
+	for (; addr != end; pte += step, addr += (step << PAGE_SHIFT)) {
+		step = 1;
 		ptent = ptep_get(pte);
 		if (!pte_present(ptent))
 			continue;
 		folio = vm_normal_folio(vma, addr, ptent);
 		if (!folio || folio_is_zone_device(folio))
 			continue;
-		if (folio_test_large(folio))
+		if (!should_mlock_folio(folio, vma))
 			continue;
+
+		step = get_folio_mlock_step(folio, ptent, addr, end);
 		if (vma->vm_flags & VM_LOCKED)
-			mlock_folio(folio);
+			mlock_folio_range(folio, vma, pte, addr, step);
 		else
-			munlock_folio(folio);
+			munlock_folio_range(folio, vma, pte, addr, step);
 	}
 	pte_unmap(start_pte);
 out:
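
The range check in folio_in_range() and the step computation in
get_folio_mlock_step() are plain address arithmetic, so they can be
tried out in userspace. What follows is only a rough model, not kernel
code: struct vma_model, struct folio_model, the model_* function names
and the 4 KiB page size (PAGE_SHIFT of 12) are assumptions made for
illustration, and the model takes the PTE's pfn directly instead of a
pte_t.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12UL	/* assumption: 4 KiB base pages */

/* Hypothetical stand-ins for the kernel structures; only the fields used. */
struct vma_model {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* one past the last byte of the mapping */
	unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

struct folio_model {
	unsigned long pgoff;	/* offset of the folio's first page, in pages */
	unsigned long pfn;	/* page frame number of the first page */
	unsigned long nr_pages;	/* number of base pages in the folio */
};

/* Models the checks made by folio_in_range() in patch 1/3. */
static bool model_folio_in_range(const struct folio_model *folio,
				 const struct vma_model *vma,
				 unsigned long start, unsigned long end)
{
	unsigned long vma_pglen = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
	unsigned long addr;

	/* Clamp the requested range to the VMA. */
	if (start < vma->vm_start)
		start = vma->vm_start;
	if (end > vma->vm_end)
		end = vma->vm_end;

	/* Reject folios whose first page offset is outside the VMA. */
	if (folio->pgoff < vma->vm_pgoff ||
	    folio->pgoff > vma->vm_pgoff + vma_pglen)
		return false;

	/* Virtual address at which the folio's first page is mapped. */
	addr = vma->vm_start + ((folio->pgoff - vma->vm_pgoff) << PAGE_SHIFT);

	return addr >= start &&
	       addr + (folio->nr_pages << PAGE_SHIFT) <= end;
}

/* Models folio_within_vma(): the range is the whole VMA. */
static bool model_folio_within_vma(const struct folio_model *folio,
				   const struct vma_model *vma)
{
	return model_folio_in_range(folio, vma, vma->vm_start, vma->vm_end);
}

/*
 * Models get_folio_mlock_step() in patch 3/3: the number of PTEs to
 * advance is the pages left in the folio counted from pte_pfn, capped
 * by the pages left in [addr, end).
 */
static unsigned int model_mlock_step(const struct folio_model *folio,
				     unsigned long pte_pfn,
				     unsigned long addr, unsigned long end)
{
	unsigned long in_folio = folio->pfn + folio->nr_pages - pte_pfn;
	unsigned long in_range = (end - addr) >> PAGE_SHIFT;

	return (unsigned int)(in_folio < in_range ? in_folio : in_range);
}
```

In this model, a 16-page folio that starts 8 pages into a 16-page VMA
is reported as not within the VMA, which corresponds to the "crosses
the VMA boundary" case that the series leaves un-mlocked.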