From patchwork Thu Sep 23 17:53:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12513447
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Michal Hocko, Oscar Salvador, Zi Yan,
	Muchun Song, Naoya Horiguchi, David Rientjes,
	"Aneesh Kumar K . V", Andrew Morton, Mike Kravetz
Subject: [PATCH v2 1/4] hugetlb: add demote hugetlb page sysfs interfaces
Date: Thu, 23 Sep 2021 10:53:44 -0700
Message-Id: <20210923175347.10727-2-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210923175347.10727-1-mike.kravetz@oracle.com>
References: <20210923175347.10727-1-mike.kravetz@oracle.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Two new sysfs files are added to demote hugetlb pages. These files are
both per-hugetlb page size and per node. Files are:
  demote_size - The size in kB that pages are demoted to. (read-write)
  demote - The number of huge pages to demote. (write-only)

By default, demote_size is the next smaller huge page size. Any valid
huge page size smaller than the current huge page size may be written to
this file. When huge pages are demoted, they are demoted to this size.

Writing a value to demote will result in an attempt to demote that
number of hugetlb pages to an appropriate number of demote_size pages.

NOTE: Demote interfaces are only provided for huge page sizes if there
is a smaller target demote huge page size.
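As a rough illustration of that rule (a userspace sketch, not code from
this series), the default demote target can be modelled as the largest
order below the page's own order, with 0 meaning that no demote
interfaces are created. The helper names default_demote_order() and
pages_per_demote() are hypothetical, and the x86-64 orders used in the
note below (9 for 2MB and 18 for 1GB, with 4kB base pages) are
assumptions made for the example:

```c
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code: return the largest
 * order in orders[] that is smaller than orders[self], or 0 if there is
 * none. A result of 0 corresponds to "no demote interfaces".
 */
static unsigned int default_demote_order(const unsigned int *orders,
					 size_t nr, size_t self)
{
	unsigned int demote_order = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (i == self)
			continue;
		if (orders[i] < orders[self] && orders[i] > demote_order)
			demote_order = orders[i];
	}
	return demote_order;
}

/* Number of demote_size pages produced by demoting one huge page. */
static unsigned long pages_per_demote(unsigned int order,
				      unsigned int demote_order)
{
	return 1UL << (order - demote_order);
}
```

With orders { 9, 18 }, the 1GB entry gets demote order 9 and one
demotion yields 512 pages of 2MB, while the 2MB entry gets 0 and
therefore no demote files.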
For example, on x86 1GB huge pages will have demote interfaces. 2MB
huge pages will not have demote interfaces.

This patch does not provide full demote functionality. It only provides
the sysfs interfaces.

It also provides documentation for the new interfaces.

Signed-off-by: Mike Kravetz
---
 Documentation/admin-guide/mm/hugetlbpage.rst |  30 +++-
 include/linux/hugetlb.h                      |   1 +
 mm/hugetlb.c                                 | 155 ++++++++++++++++++-
 3 files changed, 183 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 8abaeb144e44..0e123a347e1e 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -234,8 +234,12 @@ will exist, of the form::
 
 	hugepages-${size}kB
 
-Inside each of these directories, the same set of files will exist::
+Inside each of these directories, the set of files contained in ``/proc``
+will exist. In addition, two additional interfaces for demoting huge
+pages may exist::
 
+	demote
+	demote_size
 	nr_hugepages
 	nr_hugepages_mempolicy
 	nr_overcommit_hugepages
@@ -243,7 +247,29 @@ Inside each of these directories, the same set of files will exist::
 	resv_hugepages
 	surplus_hugepages
 
-which function as described above for the default huge page-sized case.
+The demote interfaces provide the ability to split a huge page into
+smaller huge pages. For example, the x86 architecture supports both
+1GB and 2MB huge page sizes. A 1GB huge page can be split into 512
+2MB huge pages. Demote interfaces are not available for the smallest
+huge page size. The demote interfaces are:
+
+demote_size
+	is the size of demoted pages. When a page is demoted a corresponding
+	number of huge pages of demote_size will be created. By default,
+	demote_size is set to the next smaller huge page size. If there are
+	multiple smaller huge page sizes, demote_size can be set to any of
+	these smaller sizes. Only huge page sizes less than the current huge
+	page size are allowed.
+
+demote
+	is used to demote a number of huge pages. A user with root privileges
+	can write to this file. It may not be possible to demote the
+	requested number of huge pages. To determine how many pages were
+	actually demoted, compare the value of nr_hugepages before and after
+	writing to the demote interface. demote is a write only interface.
+
+The interfaces which are the same as in ``/proc`` (all except demote and
+demote_size) function as described above for the default huge page-sized case.
 
 .. _mem_policy_and_hp_alloc:
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 1faebe1cd0ed..f2c3979efd69 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -596,6 +596,7 @@ struct hstate {
 	int next_nid_to_alloc;
 	int next_nid_to_free;
 	unsigned int order;
+	unsigned int demote_order;
 	unsigned long mask;
 	unsigned long max_huge_pages;
 	unsigned long nr_huge_pages;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6378c1066459..c76ee0bd6374 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2986,7 +2986,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 static void __init hugetlb_init_hstates(void)
 {
-	struct hstate *h;
+	struct hstate *h, *h2;
 
 	for_each_hstate(h) {
 		if (minimum_order > huge_page_order(h))
@@ -2995,6 +2995,17 @@ static void __init hugetlb_init_hstates(void)
 		/* oversize hugepages were init'ed in early boot */
 		if (!hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
+
+		/*
+		 * Set demote order for each hstate. Note that
+		 * h->demote_order is initially 0.
+		 */
+		for_each_hstate(h2) {
+			if (h2 == h)
+				continue;
+			if (h2->order < h->order && h2->order > h->demote_order)
+				h->demote_order = h2->order;
+		}
 	}
 	VM_BUG_ON(minimum_order == UINT_MAX);
 }
@@ -3235,9 +3246,29 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	return 0;
 }
 
+static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
+	__must_hold(&hugetlb_lock)
+{
+	int rc = 0;
+
+	lockdep_assert_held(&hugetlb_lock);
+
+	/* We should never get here if no demote order */
+	if (!h->demote_order)
+		return rc;
+
+	/*
+	 * TODO - demote functionality will be added in subsequent patch
+	 */
+	return rc;
+}
+
 #define HSTATE_ATTR_RO(_name) \
 	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
 
+#define HSTATE_ATTR_WO(_name) \
+	static struct kobj_attribute _name##_attr = __ATTR_WO(_name)
+
 #define HSTATE_ATTR(_name) \
 	static struct kobj_attribute _name##_attr = \
 		__ATTR(_name, 0644, _name##_show, _name##_store)
@@ -3433,6 +3464,112 @@ static ssize_t surplus_hugepages_show(struct kobject *kobj,
 }
 HSTATE_ATTR_RO(surplus_hugepages);
 
+static ssize_t demote_store(struct kobject *kobj,
+	       struct kobj_attribute *attr, const char *buf, size_t len)
+{
+	unsigned long nr_demote;
+	unsigned long nr_available;
+	nodemask_t nodes_allowed, *n_mask;
+	struct hstate *h;
+	int err;
+	int nid;
+
+	err = kstrtoul(buf, 10, &nr_demote);
+	if (err)
+		return err;
+	h = kobj_to_hstate(kobj, &nid);
+
+	/* Synchronize with other sysfs operations modifying huge pages */
+	mutex_lock(&h->resize_lock);
+
+	spin_lock_irq(&hugetlb_lock);
+	if (nid != NUMA_NO_NODE) {
+		nr_available = h->free_huge_pages_node[nid];
+		init_nodemask_of_node(&nodes_allowed, nid);
+		n_mask = &nodes_allowed;
+	} else {
+		nr_available = h->free_huge_pages;
+		n_mask = &node_states[N_MEMORY];
+	}
+	nr_available -= h->resv_huge_pages;
+	if (nr_available <= 0)
+		goto out;
+	nr_demote = min(nr_available, nr_demote);
+
+	while (nr_demote) {
+		if (!demote_pool_huge_page(h, n_mask))
+			break;
+
+		/*
+		 * We may have dropped the lock in the routines to
+		 * demote/free a page. Recompute nr_demote as counts could
+		 * have changed and we want to make sure we do not demote
+		 * a reserved huge page.
+		 */
+		nr_demote--;
+		if (nid != NUMA_NO_NODE)
+			nr_available = h->free_huge_pages_node[nid];
+		else
+			nr_available = h->free_huge_pages;
+		nr_available -= h->resv_huge_pages;
+		if (nr_available <= 0)
+			nr_demote = 0;
+		else
+			nr_demote = min(nr_available, nr_demote);
+	}
+
+out:
+	spin_unlock_irq(&hugetlb_lock);
+	mutex_unlock(&h->resize_lock);
+
+	return len;
+}
+HSTATE_ATTR_WO(demote);
+
+static ssize_t demote_size_show(struct kobject *kobj,
+					struct kobj_attribute *attr, char *buf)
+{
+	struct hstate *h;
+	unsigned long demote_size;
+	int nid;
+
+	h = kobj_to_hstate(kobj, &nid);
+	demote_size = h->demote_order;
+
+	return sysfs_emit(buf, "%lukB\n",
+			(unsigned long)(PAGE_SIZE << h->demote_order) / SZ_1K);
+}
+
+static ssize_t demote_size_store(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct hstate *h, *t_hstate;
+	unsigned long demote_size;
+	unsigned int demote_order;
+	int nid;
+
+	demote_size = (unsigned long)memparse(buf, NULL);
+
+	t_hstate = size_to_hstate(demote_size);
+	if (!t_hstate)
+		return -EINVAL;
+	demote_order = t_hstate->order;
+
+	/* demote order must be smaller hstate order */
+	h = kobj_to_hstate(kobj, &nid);
+	if (demote_order >= h->order)
+		return -EINVAL;
+
+	/* resize_lock synchronizes access to demote size and writes */
+	mutex_lock(&h->resize_lock);
+	h->demote_order = demote_order;
+	mutex_unlock(&h->resize_lock);
+
+	return count;
+}
+HSTATE_ATTR(demote_size);
+
 static struct attribute *hstate_attrs[] = {
 	&nr_hugepages_attr.attr,
 	&nr_overcommit_hugepages_attr.attr,
@@ -3449,6 +3586,16 @@ static const struct attribute_group hstate_attr_group = {
 	.attrs = hstate_attrs,
 };
 
+static struct attribute *hstate_demote_attrs[] = {
+	&demote_size_attr.attr,
+	&demote_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group hstate_demote_attr_group = {
+	.attrs = hstate_demote_attrs,
+};
+
 static int hugetlb_sysfs_add_hstate(struct hstate *h, struct kobject *parent,
 				    struct kobject **hstate_kobjs,
 				    const struct attribute_group *hstate_attr_group)
@@ -3466,6 +3613,12 @@ static int hugetlb_sysfs_add_hstate(struct hstate *h, struct kobject *parent,
 		hstate_kobjs[hi] = NULL;
 	}
 
+	if (h->demote_order) {
+		if (sysfs_create_group(hstate_kobjs[hi],
+					&hstate_demote_attr_group))
+			pr_warn("HugeTLB unable to create demote interfaces for %s\n", h->name);
+	}
+
 	return retval;
 }
 

From patchwork Thu Sep 23 17:53:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12513445
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, Michal Hocko, Oscar Salvador, Zi Yan,
	Muchun Song, Naoya Horiguchi, David Rientjes,
	"Aneesh Kumar K . V", Andrew Morton, Mike Kravetz
Subject: [PATCH v2 2/4] hugetlb: add HPageCma flag and code to free non-gigantic pages in CMA
Date: Thu, 23 Sep 2021 10:53:45 -0700
Message-Id: <20210923175347.10727-3-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210923175347.10727-1-mike.kravetz@oracle.com>
References: <20210923175347.10727-1-mike.kravetz@oracle.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

When huge page demotion is fully implemented, gigantic pages can be
demoted to a smaller huge page size. For example, on x86 a 1G page
can be demoted to 512 2M pages. However, gigantic pages can potentially
be allocated from CMA. If a gigantic page which was allocated from CMA
is demoted, the corresponding demoted pages need to be returned to CMA.

In order to track hugetlb pages that need to be returned to CMA, add the
hugetlb specific flag HPageCma. The flag is set when a huge page is
allocated from CMA and is transferred to any demoted pages. Non-gigantic
huge page freeing code checks for the flag and takes appropriate action.

This also requires a change to CMA reservations for gigantic pages.
Currently, the 'order_per_bit' is set to the gigantic page size. However,
if gigantic pages can be demoted, this needs to be set to the order of the
smallest huge page. At CMA reservation time we do not know the smallest
huge page size, so use HUGETLB_PAGE_ORDER. Also, prohibit demotion to
huge page sizes smaller than HUGETLB_PAGE_ORDER.

Signed-off-by: Mike Kravetz
---
 include/linux/hugetlb.h |  7 +++++
 mm/hugetlb.c            | 64 +++++++++++++++++++++++++++++------------
 2 files changed, 53 insertions(+), 18 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index f2c3979efd69..08668b9f5f71 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -533,6 +533,11 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
  * HPG_freed - Set when page is on the free lists.
  *	Synchronization: hugetlb_lock held for examination and modification.
  * HPG_vmemmap_optimized - Set when the vmemmap pages of the page are freed.
+ * HPG_cma - Set if huge page was directly allocated from CMA area via
+ *	cma_alloc. Initially set for gigantic page cma allocations, but can
+ *	be set in non-gigantic pages if gigantic pages are demoted.
+ *	Synchronization: Only accessed or modified when there is only one
+ *	reference to the page at allocation, free or demote time.
  */
 enum hugetlb_page_flags {
 	HPG_restore_reserve = 0,
@@ -540,6 +545,7 @@ enum hugetlb_page_flags {
 	HPG_temporary,
 	HPG_freed,
 	HPG_vmemmap_optimized,
+	HPG_cma,
 	__NR_HPAGEFLAGS,
 };
 
@@ -586,6 +592,7 @@ HPAGEFLAG(Migratable, migratable)
 HPAGEFLAG(Temporary, temporary)
 HPAGEFLAG(Freed, freed)
 HPAGEFLAG(VmemmapOptimized, vmemmap_optimized)
+HPAGEFLAG(Cma, cma)
 
 #ifdef CONFIG_HUGETLB_PAGE
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c76ee0bd6374..c3f7da8f0c68 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1272,6 +1272,7 @@ static void destroy_compound_gigantic_page(struct page *page,
 	atomic_set(compound_pincount_ptr(page), 0);
 
 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
+		p->mapping = NULL;
 		clear_compound_head(p);
 		set_page_refcounted(p);
 	}
@@ -1283,16 +1284,12 @@ static void destroy_compound_gigantic_page(struct page *page,
 
 static void free_gigantic_page(struct page *page, unsigned int order)
 {
-	/*
-	 * If the page isn't allocated using the cma allocator,
-	 * cma_release() returns false.
-	 */
 #ifdef CONFIG_CMA
-	if (cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
-		return;
+	if (HPageCma(page))
+		cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order);
+	else
 #endif
-
-	free_contig_range(page_to_pfn(page), 1 << order);
+		free_contig_range(page_to_pfn(page), 1 << order);
 }
 
 #ifdef CONFIG_CONTIG_ALLOC
@@ -1311,8 +1308,10 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 		if (hugetlb_cma[nid]) {
 			page = cma_alloc(hugetlb_cma[nid], nr_pages,
 					huge_page_order(h), true);
-			if (page)
+			if (page) {
+				SetHPageCma(page);
 				return page;
+			}
 		}
 
 		if (!(gfp_mask & __GFP_THISNODE)) {
@@ -1322,8 +1321,10 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 
 				page = cma_alloc(hugetlb_cma[node], nr_pages,
 						huge_page_order(h), true);
-				if (page)
+				if (page) {
+					SetHPageCma(page);
 					return page;
+				}
 			}
 		}
 	}
@@ -1480,6 +1481,20 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 		destroy_compound_gigantic_page(page, huge_page_order(h));
 		free_gigantic_page(page, huge_page_order(h));
 	} else {
+#ifdef CONFIG_CMA
+		/*
+		 * Could be a page that was demoted from a gigantic page
+		 * which was allocated in a CMA area.
+		 */
+		if (HPageCma(page)) {
+			destroy_compound_gigantic_page(page,
+					huge_page_order(h));
+			if (!cma_release(hugetlb_cma[page_to_nid(page)], page,
+					1 << huge_page_order(h)))
+				VM_BUG_ON_PAGE(1, page);
+			return;
+		}
+#endif
 		__free_pages(page, huge_page_order(h));
 	}
 }
@@ -2997,14 +3012,19 @@ static void __init hugetlb_init_hstates(void)
 			hugetlb_hstate_alloc_pages(h);
 
 		/*
-		 * Set demote order for each hstate. Note that
-		 * h->demote_order is initially 0.
+		 * Set demote order for each hstate. hstates are not ordered,
+		 * so this is brute force. Note that h->demote_order is
+		 * initially 0. If cma is used for gigantic pages, the smallest
+		 * demote size is HUGETLB_PAGE_ORDER.
 		 */
-		for_each_hstate(h2) {
-			if (h2 == h)
-				continue;
-			if (h2->order < h->order && h2->order > h->demote_order)
-				h->demote_order = h2->order;
+		if (!hugetlb_cma_size || !(h->order <= HUGETLB_PAGE_ORDER)) {
+			for_each_hstate(h2) {
+				if (h2 == h)
+					continue;
+				if (h2->order < h->order &&
+				    h2->order > h->demote_order)
+					h->demote_order = h2->order;
+			}
 		}
 	}
 	VM_BUG_ON(minimum_order == UINT_MAX);
@@ -3555,6 +3575,8 @@ static ssize_t demote_size_store(struct kobject *kobj,
 	if (!t_hstate)
 		return -EINVAL;
 	demote_order = t_hstate->order;
+	if (demote_order < HUGETLB_PAGE_ORDER)
+		return -EINVAL;
 
 	/* demote order must be smaller hstate order */
 	h = kobj_to_hstate(kobj, &nid);
@@ -6563,7 +6585,13 @@ void __init hugetlb_cma_reserve(int order)
 		size = round_up(size, PAGE_SIZE << order);
 
 		snprintf(name, sizeof(name), "hugetlb%d", nid);
-		res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order,
+		/*
+		 * Note that 'order per bit' is based on smallest size that
+		 * may be returned to CMA allocator in the case of
+		 * huge page demotion.
+		 */
+		res = cma_declare_contiguous_nid(0, size, 0,
+						 PAGE_SIZE << HUGETLB_PAGE_ORDER,
 						 0, false, name,
 						 &hugetlb_cma[nid], nid);
 		if (res) {

From patchwork Thu Sep 23 17:53:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12513451
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MSGID_FROM_MTA_HEADER,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05B6EC433F5 for ; Thu, 23 Sep 2021 17:54:26 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9531360FDC for ; Thu, 23 Sep 2021 17:54:25 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9531360FDC
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org
Received: by kanga.kvack.org (Postfix) id F134D6B0078; Thu, 23 Sep 2021 13:54:16 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id EC3696B007B; Thu, 23 Sep 2021 13:54:16 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id C5655900002; Thu, 23 Sep 2021 13:54:16 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0169.hostedemail.com [216.40.44.169]) by kanga.kvack.org (Postfix) with ESMTP id B0F336B007B for ; Thu, 23 Sep 2021 13:54:16 -0400 (EDT)
Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com
(Postfix) with ESMTP id 7700818190F95 for ; Thu, 23 Sep 2021 17:54:16 +0000 (UTC) X-FDA: 78619587312.03.87FEFED Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf18.hostedemail.com (Postfix) with ESMTP id 17F694002088 for ; Thu, 23 Sep 2021 17:54:15 +0000 (UTC) Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 18NHmH4h012795; Thu, 23 Sep 2021 17:54:14 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=gZlwb0fG90xtuqOiVDH4l+3Zw7FgGU7tq10bRf/yJpQ=; b=hH4JsUQoJxupbss6WYwdaYyEHXbnhYB6a9rJ4lOott3jQoVTkf36ld5J9+sX0EU39dih ABMle3Xl/QRo7o9pbsz4ufSi2aQbMxFLyXA7/RHTRkBA9/jz3LFq6A7zoEijSA/II9Mu TB+FeTlvGzMpFhibJmY7IP4lraGzfph8kT8vfImEjSdlvdhqXatuzsec8LUvjTe+oIXY W1ibmwXzEwlkazr5tNFPQR5XBnOwxuP9vM0/LpHQ9EwlYm5uhIHSGW89boX66X4w2dTE zhw26X+tS07dINTkr25ITbCnV0ddR7qAmScwL4hL/nNuKNPD8OxAw5aE9peSfex+/fVI Rw== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3b8qkrayy7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 23 Sep 2021 17:54:12 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 18NHpO9Y078953; Thu, 23 Sep 2021 17:54:10 GMT Received: from nam02-bn1-obe.outbound.protection.outlook.com (mail-bn1nam07lp2048.outbound.protection.outlook.com [104.47.51.48]) by aserp3030.oracle.com with ESMTP id 3b7q5e3pwk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 23 Sep 2021 17:54:10 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=Fyo93/X262Yj8ZmeO7TlWJaK0bFmw3FlwyGa7bfMyumEVDrRefhvp5UgEZ21z6DOsRtDEaohZ4qeFSUxjWChaR0Q2nurDos8yYJAUmKXQ5wlDAlKEpEkGPtUb2T0j704WBvce52vtqboN7KZFLQXJjkBJIm1br1UIrkOFB23DydRMWbzI+l5XFsEEIHOJFJ4HVoU3v9jFTfEQx+tN51ZlDT9CD5pfhFWBtnGmtCh9hCcfKkWfaI2qQtdFtbx5vOAWOB3wnHptMeFbpnzF0If8IguPMhksSce4VZ/V7eQqc6q1JdXC2kggafgGbm8EZY2hTlHupWrcjFZ7HOhm9h40Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=gZlwb0fG90xtuqOiVDH4l+3Zw7FgGU7tq10bRf/yJpQ=; b=oMUj7WiDfBV8K4lL7ytyb9jVxk2xy34rOWMWx4hAZoyuxYJsOZnrjvmxeBIzMbPZaVPwZxKavsUtXbdmJlVtjWZXDu3GxeQW4ZJeqI7w4rVFMorPG72h4/StJVWEJ9Y8XOgCKaHRHYS8TKvrvurf0iRjUjTcwCXaZ98MaMupCqjwIxZXHgcDyrLWHAs5qTZ0nSMektxfTh/evTPnOhxxtEO3Rvw4gEwIuJBZP5X1jDi7RN9LCmM3esJakiQwj5gxkswc7gDwyp3nJa66OK/796gEAijoFdKbUs3LYlNgr4jOOhj2Z/twtN3r9YiwQ9CchDi5txDsdq0Q/P7KxI5uog== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gZlwb0fG90xtuqOiVDH4l+3Zw7FgGU7tq10bRf/yJpQ=; b=vxQ6RmTfNk9uimG9MnDEnp56oiY3W181dNncCdUPlqbJDpMQh9HidbIf3z1C7UB1LvDGEP9CHr5NkgIJAez5Np0d8aWnRaaFPm5nBE2Tiid3v7pxH/yEfiFyD68xDP4F6EBeMFu/yYGGU+d8PfRZCw5a+gs6811uRnC5nJ557GU= Received: from BY5PR10MB4196.namprd10.prod.outlook.com (2603:10b6:a03:20d::23) by BYAPR10MB2998.namprd10.prod.outlook.com (2603:10b6:a03:84::32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4523.18; Thu, 23 Sep 2021 17:54:06 +0000 Received: from BY5PR10MB4196.namprd10.prod.outlook.com ([fe80::a4a2:56de:e8db:9f2b]) by BY5PR10MB4196.namprd10.prod.outlook.com ([fe80::a4a2:56de:e8db:9f2b%9]) with mapi id 15.20.4544.015; Thu, 23 Sep 2021 17:54:06 
+0000 From: Mike Kravetz To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Michal Hocko , Oscar Salvador , Zi Yan , Muchun Song , Naoya Horiguchi , David Rientjes , "Aneesh Kumar K . V" , Andrew Morton , Mike Kravetz Subject: [PATCH v2 3/4] hugetlb: add demote bool to gigantic page routines Date: Thu, 23 Sep 2021 10:53:46 -0700 Message-Id: <20210923175347.10727-4-mike.kravetz@oracle.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210923175347.10727-1-mike.kravetz@oracle.com> References: <20210923175347.10727-1-mike.kravetz@oracle.com> X-ClientProxiedBy: MW4PR04CA0164.namprd04.prod.outlook.com (2603:10b6:303:85::19) To BY5PR10MB4196.namprd10.prod.outlook.com (2603:10b6:a03:20d::23) MIME-Version: 1.0 Received: from monkey.oracle.com (50.38.35.18) by MW4PR04CA0164.namprd04.prod.outlook.com (2603:10b6:303:85::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4544.15 via Frontend Transport; Thu, 23 Sep 2021 17:54:05 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 2cbd9b85-fded-4dc8-9e46-08d97ebb231e X-MS-TrafficTypeDiagnostic: BYAPR10MB2998: X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:9508; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR10MB2998
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10116 signatures=668682
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 phishscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 bulkscore=0 mlxscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109200000 definitions=main-2109230107
X-Proofpoint-ORIG-GUID: wEBoui576XMAK-pQKVKbtVj8zwPWunv9
X-Proofpoint-GUID: wEBoui576XMAK-pQKVKbtVj8zwPWunv9
X-Rspamd-Server: rspam02
X-Rspamd-Queue-Id: 17F694002088
X-Stat-Signature: 5ww6a9z4zdn11947pgdt67ta5wi1a46a
Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2021-07-09 header.b=hH4JsUQo; dkim=pass header.d=oracle.onmicrosoft.com header.s=selector2-oracle-onmicrosoft-com header.b=vxQ6RmTf; spf=none (imf18.hostedemail.com: domain of mike.kravetz@oracle.com has no SPF policy when checking 205.220.177.32) smtp.mailfrom=mike.kravetz@oracle.com; dmarc=pass (policy=none) header.from=oracle.com
X-HE-Tag: 1632419655-157943
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

The routines remove_hugetlb_page and destroy_compound_gigantic_page
will remove a gigantic page and make the set of base pages ready to be
returned to a lower level allocator. In the process of doing this, they
make all base pages reference counted.

The routine prep_compound_gigantic_page creates a gigantic page from a
set of base pages. It assumes that all these base pages are reference
counted.

During demotion, a gigantic page will be split into huge pages of a
smaller size. This logically involves use of the routines
remove_hugetlb_page and destroy_compound_gigantic_page, followed by
prep_compound*_page for each smaller huge page.

When pages are reference counted (ref count > 0), additional speculative
ref counts could be taken.
This could result in errors while demoting a huge page. Quite a bit of
code would need to be created to handle all possible issues.

Instead of dealing with the possibility of speculative ref counts, avoid
the possibility by keeping ref counts at zero during the demote process.
Add a boolean 'demote' to the routines remove_hugetlb_page,
destroy_compound_gigantic_page and prep_compound_gigantic_page. If the
boolean is set, the remove and destroy routines will not reference count
pages and the prep routine will not expect reference counted pages.

'*_for_demote' wrappers of the routines will be added in a subsequent
patch where this functionality is used.

Signed-off-by: Mike Kravetz
---
 mm/hugetlb.c | 54 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 11 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c3f7da8f0c68..2317d411243d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1261,8 +1261,8 @@ static int hstate_next_node_to_free(struct hstate *h, nodemask_t *nodes_allowed)
 		nr_nodes--)
 
 #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static void destroy_compound_gigantic_page(struct page *page,
-					unsigned int order)
+static void __destroy_compound_gigantic_page(struct page *page,
+					unsigned int order, bool demote)
 {
 	int i;
 	int nr_pages = 1 << order;
@@ -1274,7 +1274,8 @@ static void destroy_compound_gigantic_page(struct page *page,
 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
 		p->mapping = NULL;
 		clear_compound_head(p);
-		set_page_refcounted(p);
+		if (!demote)
+			set_page_refcounted(p);
 	}
 
 	set_compound_order(page, 0);
@@ -1282,6 +1283,12 @@ static void destroy_compound_gigantic_page(struct page *page,
 	__ClearPageHead(page);
 }
 
+static void destroy_compound_gigantic_page(struct page *page,
+					unsigned int order)
+{
+	__destroy_compound_gigantic_page(page, order, false);
+}
+
 static void free_gigantic_page(struct page *page, unsigned int order)
 {
 #ifdef CONFIG_CMA
@@ -1354,12 +1361,15 @@ static inline void
destroy_compound_gigantic_page(struct page *page,
 
 /*
  * Remove hugetlb page from lists, and update dtor so that page appears
- * as just a compound page. A reference is held on the page.
+ * as just a compound page.
+ *
+ * A reference is held on the page, except in the case of demote.
  *
  * Must be called with hugetlb lock held.
  */
-static void remove_hugetlb_page(struct hstate *h, struct page *page,
-							bool adjust_surplus)
+static void __remove_hugetlb_page(struct hstate *h, struct page *page,
+							bool adjust_surplus,
+							bool demote)
 {
 	int nid = page_to_nid(page);
 
@@ -1397,8 +1407,12 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
 	 *
 	 * This handles the case where more than one ref is held when and
 	 * after update_and_free_page is called.
+	 *
+	 * In the case of demote we do not ref count the page as it will soon
+	 * be turned into a page of smaller size.
 	 */
-	set_page_refcounted(page);
+	if (!demote)
+		set_page_refcounted(page);
 	if (hstate_is_gigantic(h))
 		set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
 	else
@@ -1408,6 +1422,12 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
 	h->nr_huge_pages_node[nid]--;
 }
 
+static void remove_hugetlb_page(struct hstate *h, struct page *page,
+							bool adjust_surplus)
+{
+	__remove_hugetlb_page(h, page, adjust_surplus, false);
+}
+
 static void add_hugetlb_page(struct hstate *h, struct page *page,
 					     bool adjust_surplus)
 {
@@ -1679,7 +1699,8 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	spin_unlock_irq(&hugetlb_lock);
 }
 
-static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
+static bool __prep_compound_gigantic_page(struct page *page, unsigned int order,
+								bool demote)
 {
 	int i, j;
 	int nr_pages = 1 << order;
@@ -1717,10 +1738,16 @@ static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
 		 * the set of pages can not be converted to a gigantic page.
 		 * The caller who allocated the pages should then discard the
 		 * pages using the appropriate free interface.
+		 *
+		 * In the case of demote, the ref count will be zero.
 		 */
-		if (!page_ref_freeze(p, 1)) {
-			pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
-			goto out_error;
+		if (!demote) {
+			if (!page_ref_freeze(p, 1)) {
+				pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
+				goto out_error;
+			}
+		} else {
+			VM_BUG_ON_PAGE(page_count(p), p);
 		}
 		set_page_count(p, 0);
 		set_compound_head(p, page);
@@ -1745,6 +1772,11 @@ static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
 	return false;
 }
 
+static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
+{
+	return __prep_compound_gigantic_page(page, order, false);
+}
+
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for normal or
  * transparent huge pages. See the PageTransHuge() documentation for more

From patchwork Thu Sep 23 17:53:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 12513449
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MSGID_FROM_MTA_HEADER,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C4D3C433F5 for ; Thu, 23 Sep 2021 17:54:18 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 017CC60FDC for ; Thu, 23 Sep 2021 17:54:17 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 017CC60FDC
Authentication-Results:
mail.kernel.org; dmarc=fail (p=none dis=none) header.from=oracle.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id AF3E76B0075; Thu, 23 Sep 2021 13:54:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A7BD96B0078; Thu, 23 Sep 2021 13:54:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8CEC46B007B; Thu, 23 Sep 2021 13:54:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0123.hostedemail.com [216.40.44.123]) by kanga.kvack.org (Postfix) with ESMTP id 78CE66B0075 for ; Thu, 23 Sep 2021 13:54:16 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 186B882499A8 for ; Thu, 23 Sep 2021 17:54:16 +0000 (UTC) X-FDA: 78619587312.15.91C6E41 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf13.hostedemail.com (Postfix) with ESMTP id 9EBFF13DC96A for ; Thu, 23 Sep 2021 17:54:15 +0000 (UTC) Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 18NHmG2b012761; Thu, 23 Sep 2021 17:54:13 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=huwv76S06oC6GgrDsbSUKD9Mfu7EDsj+USSFwjbVOqw=; b=MDeENHIGavqLvMIO8NEYxBsMPFOiD11RWVpWy62mrjQwwFgJom65W2izyAw0seMvaiuc eRoP6AHfEfEuyuM4vgdpB14Z5CBIgByuuTDsO+NTtscKjjJ2suoLMWJ8PN5QftK9Ksbt g2zLoteIow8+Kzc2qzAHcbtxVLa96w/sOTPOa8nn+Ygbs51xeuLF6j5VoAk1ftISF1XK ocC75aei9OVHsKmj6KoaCtVE393MIGRxK2ESr1hNQYpKCTWkl9/6+8LrKHsIqgoPkiwd TH/C0m0OXckV9N0OWcP0uQVtN4D/njPg2q2ZoFjXButFPcwf9a+fFlsgUSGSlNdniao9 oA== Received: from aserp3030.oracle.com (aserp3030.oracle.com 
[141.146.126.71]) by mx0b-00069f02.pphosted.com with ESMTP id 3b8qkrayyd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 23 Sep 2021 17:54:13 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 18NHpO9Z078953; Thu, 23 Sep 2021 17:54:11 GMT Received: from nam02-bn1-obe.outbound.protection.outlook.com (mail-bn1nam07lp2048.outbound.protection.outlook.com [104.47.51.48]) by aserp3030.oracle.com with ESMTP id 3b7q5e3pwk-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 23 Sep 2021 17:54:11 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=LgUhaO5oBCUBQlxhEE01+jHrpXEmEnxR5cjXDL+3bs+QabiUEHmv5viXNVWuOPBkeDzCItl8iRZ6gLKVW9usF/PrSqOVHbnlMdzkVG/cKp9BPv4X7Xm2MLX1DefXeXnImFz28a3VanhHJEWr1fPcQeIgu8xoVu5gbhEuNyBgdf9GgKRoS5/Gtl68HDMngv1SyuMltGIJhaaAXgLjf4SbbHDHnOVJELOKlYeiOkoAhhLx7MSak2rR2XaE2AvSoxFyjKmEQ5jPXHJlX6FzWV5ukH7MbDSHf7HmnT2IhQI1xo9Bft3x9cyunPSXH2h/VJajrdTpip/puS62MoOeLiJZ7A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=huwv76S06oC6GgrDsbSUKD9Mfu7EDsj+USSFwjbVOqw=; b=lZ+Uyfx6c7+KQAZUgGyGJsYu4iNaI6Rc4pSyI+KDgDA4bYNaUCKUAtwPbfwVntCXL1FiDowuRo1imQl8zkwBF0MwtXmwUhXvbSE1TOSWypXPFDNz5fPSvSuHNJzaMpD/VvN7LBc5n46XReNQ5IxF/Tmmt54yQODnYglN2x4yDomdFaUOhwcPaCIZvHJEJ8bSLaWxxiwvdaAoIisfec2Tow0+9LoMTybNwuwx8en+0tN5MDngJwPWKI5yJK3MMkoAuT6sMrV1MN/jKccc5zhfj8jEjunkMRbnIoR6B2f2cdyUZXKv1zSf/mZNHYe41fdYEAfJrP+p9nv0GZcp5W+NTw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; 
bh=huwv76S06oC6GgrDsbSUKD9Mfu7EDsj+USSFwjbVOqw=; b=ZOUmBdgQoelhmk2gKJXyrU7vBK4gclV01ARDCHgh8UQjldWBNcTuZ9ych452XuNmfCyIUhdMvIkNkLtekaEBHBxS4wMkyVh8ViTYpgiT5wxwAqh9LwW/IqxVeiIgjBSc0qFnlLNvotfYxD5BsEeFVLewyoyQwMoPZKabokZCbSw= Received: from BY5PR10MB4196.namprd10.prod.outlook.com (2603:10b6:a03:20d::23) by BYAPR10MB2998.namprd10.prod.outlook.com (2603:10b6:a03:84::32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4523.18; Thu, 23 Sep 2021 17:54:08 +0000 Received: from BY5PR10MB4196.namprd10.prod.outlook.com ([fe80::a4a2:56de:e8db:9f2b]) by BY5PR10MB4196.namprd10.prod.outlook.com ([fe80::a4a2:56de:e8db:9f2b%9]) with mapi id 15.20.4544.015; Thu, 23 Sep 2021 17:54:08 +0000 From: Mike Kravetz To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: David Hildenbrand , Michal Hocko , Oscar Salvador , Zi Yan , Muchun Song , Naoya Horiguchi , David Rientjes , "Aneesh Kumar K . V" , Andrew Morton , Mike Kravetz Subject: [PATCH v2 4/4] hugetlb: add hugetlb demote page support Date: Thu, 23 Sep 2021 10:53:47 -0700 Message-Id: <20210923175347.10727-5-mike.kravetz@oracle.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210923175347.10727-1-mike.kravetz@oracle.com> References: <20210923175347.10727-1-mike.kravetz@oracle.com> X-ClientProxiedBy: MW4PR04CA0164.namprd04.prod.outlook.com (2603:10b6:303:85::19) To BY5PR10MB4196.namprd10.prod.outlook.com (2603:10b6:a03:20d::23) MIME-Version: 1.0 Received: from monkey.oracle.com (50.38.35.18) by MW4PR04CA0164.namprd04.prod.outlook.com (2603:10b6:303:85::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4544.15 via Frontend Transport; Thu, 23 Sep 2021 17:54:07 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 442e2ced-ac2f-47d4-83f5-08d97ebb2413 X-MS-TrafficTypeDiagnostic: BYAPR10MB2998: X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:6108; 
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR10MB2998
X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10116 signatures=668682
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 phishscore=0 malwarescore=0 mlxlogscore=999 suspectscore=0 bulkscore=0 mlxscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109200000 definitions=main-2109230107
X-Proofpoint-ORIG-GUID: jjb-B-MvH75By613x3GRP2EgaZlDO_mN
X-Proofpoint-GUID: jjb-B-MvH75By613x3GRP2EgaZlDO_mN
X-Rspamd-Queue-Id: 9EBFF13DC96A
X-Stat-Signature: 8hh14teksokriuf8ffq7ht95jfse8b44
Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2021-07-09 header.b=MDeENHIG; dkim=pass header.d=oracle.onmicrosoft.com header.s=selector2-oracle-onmicrosoft-com header.b=ZOUmBdgQ; spf=none (imf13.hostedemail.com: domain of mike.kravetz@oracle.com has no SPF policy when checking 205.220.177.32) smtp.mailfrom=mike.kravetz@oracle.com; dmarc=pass (policy=none) header.from=oracle.com
X-Rspamd-Server: rspam06
X-HE-Tag: 1632419655-433148
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Demote page functionality will split a huge page into a number of huge
pages of a smaller size.  For example, on x86 a 1GB huge page can be
demoted into 512 2MB huge pages.  Demotion is done 'in place' by simply
splitting the huge page.

Add '*_for_demote' wrappers for remove_hugetlb_page,
destroy_compound_gigantic_page and prep_compound_gigantic_page for use
by demote code.

Signed-off-by: Mike Kravetz
---
 mm/hugetlb.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 75 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2317d411243d..ab7bd0434057 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1260,7 +1260,7 @@ static int hstate_next_node_to_free(struct hstate *h, nodemask_t *nodes_allowed)
 		((node = hstate_next_node_to_free(hs, mask)) || 1);	\
 		nr_nodes--)
 
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
+/* used to demote non-gigantic_huge pages as well */
 static void __destroy_compound_gigantic_page(struct page *page,
 					unsigned int order, bool demote)
 {
@@ -1283,6 +1283,13 @@ static void __destroy_compound_gigantic_page(struct page *page,
 	__ClearPageHead(page);
 }
 
+static void destroy_compound_gigantic_page_for_demote(struct page *page,
+					unsigned int order)
+{
+	__destroy_compound_gigantic_page(page, order, true);
+}
+
+#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
 static void destroy_compound_gigantic_page(struct page *page,
 					unsigned int order)
 {
@@ -1428,6 +1435,12 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
 	__remove_hugetlb_page(h, page, adjust_surplus, false);
 }
 
+static void remove_hugetlb_page_for_demote(struct hstate *h, struct page *page,
+							bool adjust_surplus)
+{
+	__remove_hugetlb_page(h, page, adjust_surplus, true);
+}
+
 static void add_hugetlb_page(struct hstate *h, struct page *page,
 			     bool adjust_surplus)
 {
@@ -1777,6 +1790,12 @@ static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
 	return __prep_compound_gigantic_page(page, order, false);
 }
 
+static bool prep_compound_gigantic_page_for_demote(struct page *page,
+							unsigned int order)
+{
+	return __prep_compound_gigantic_page(page, order, true);
+}
+
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for normal or
  * transparent huge pages.  See the PageTransHuge() documentation for more
@@ -3298,9 +3317,55 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	return 0;
 }
 
+static int demote_free_huge_page(struct hstate *h, struct page *page)
+{
+	int i, nid = page_to_nid(page);
+	struct hstate *target_hstate;
+	bool cma_page = HPageCma(page);
+
+	target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order);
+
+	remove_hugetlb_page_for_demote(h, page, false);
+	spin_unlock_irq(&hugetlb_lock);
+
+	if (alloc_huge_page_vmemmap(h, page)) {
+		/* Allocation of vmemmap failed, we can not demote page */
+		spin_lock_irq(&hugetlb_lock);
+		set_page_refcounted(page);
+		add_hugetlb_page(h, page, false);
+		return 1;
+	}
+
+	/*
+	 * Use destroy_compound_gigantic_page_for_demote for all huge page
+	 * sizes as it will not ref count pages.
+	 */
+	destroy_compound_gigantic_page_for_demote(page, huge_page_order(h));
+
+	for (i = 0; i < pages_per_huge_page(h);
+				i += pages_per_huge_page(target_hstate)) {
+		if (hstate_is_gigantic(target_hstate))
+			prep_compound_gigantic_page_for_demote(page + i,
+							target_hstate->order);
+		else
+			prep_compound_page(page + i, target_hstate->order);
+		set_page_private(page + i, 0);
+		set_page_refcounted(page + i);
+		prep_new_huge_page(target_hstate, page + i, nid);
+		if (cma_page)
+			SetHPageCma(page + i);
+		put_page(page + i);
+	}
+
+	spin_lock_irq(&hugetlb_lock);
+	return 0;
+}
+
 static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
 	__must_hold(&hugetlb_lock)
 {
+	int nr_nodes, node;
+	struct page *page;
 	int rc = 0;
 
 	lockdep_assert_held(&hugetlb_lock);
@@ -3309,9 +3374,15 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
 	if (!h->demote_order)
 		return rc;
 
-	/*
-	 * TODO - demote fucntionality will be added in subsequent patch
-	 */
+	for_each_node_mask_to_free(h, nr_nodes, node, nodes_allowed) {
+		if (!list_empty(&h->hugepage_freelists[node])) {
+			page = list_entry(h->hugepage_freelists[node].next,
+					struct page, lru);
+			rc = !demote_free_huge_page(h, page);
+			break;
+		}
+	}
+
 	return rc;
 }
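
Note on the split arithmetic: the stride loop in demote_free_huge_page()
can be illustrated with a small userspace model. This is only a sketch
and not kernel code. The model_* names are hypothetical and exist only
for this illustration, and the orders in the usage note assume 4 KiB
base pages, so that a 1GB huge page has order 18 and a 2MB huge page has
order 9.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the demote stride loop; not kernel code. */

/* Base pages in a huge page of the given order (cf. pages_per_huge_page()). */
static unsigned long model_pages_per_huge_page(unsigned int order)
{
	return 1UL << order;
}

/* Number of smaller huge pages produced by demoting one larger huge page. */
static unsigned long model_demote_count(unsigned int src_order,
					unsigned int dst_order)
{
	assert(dst_order < src_order);
	return 1UL << (src_order - dst_order);
}

/*
 * Walk the source page in strides of the target size, in the same way as
 * the for loop in demote_free_huge_page(), and record the base-page offset
 * of each new head page.  At most 'max' offsets are stored.  Returns the
 * number of offsets written.
 */
static size_t model_demote_offsets(unsigned int src_order,
				   unsigned int dst_order,
				   unsigned long *offsets, size_t max)
{
	unsigned long i;
	size_t n = 0;

	assert(dst_order < src_order);
	for (i = 0; i < model_pages_per_huge_page(src_order);
	     i += model_pages_per_huge_page(dst_order)) {
		if (n == max)
			break;
		offsets[n++] = i;
	}
	return n;
}
```

Under those assumptions, model_demote_count(18, 9) gives the 512 pages
quoted in the changelog, and the new head pages sit at base-page offsets
0, 512, 1024 and so on from the original head, which corresponds to the
'page + i' values passed to prep_new_huge_page() in the patch.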