From patchwork Mon Sep 25 08:21:10 2023
X-Patchwork-Submitter: Hugh Dickins
X-Patchwork-Id: 13397475
Date: Mon, 25 Sep 2023 01:21:10 -0700 (PDT)
From: Hugh Dickins
To: Andrew Morton
cc: Andi Kleen, Christoph Lameter, Matthew Wilcox, Mike Kravetz,
    David Hildenbrand, Suren Baghdasaryan, Yang Shi, Sidhartha Kumar,
    Vishal Moola, Kefeng Wang, Greg Kroah-Hartman, Tejun Heo, Mel Gorman,
    Michal Hocko, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 01/12] hugetlbfs: drop shared NUMA mempolicy pretence
In-Reply-To: <2d872cef-7787-a7ca-10e-9d45a64c80b4@google.com>
Message-ID: <47a562a-6998-4dc6-5df-3834d2f2f411@google.com>
References: <2d872cef-7787-a7ca-10e-9d45a64c80b4@google.com>

hugetlbfs_fallocate() goes through the motions of pasting a shared NUMA
mempolicy onto its pseudo-vma, but how could there ever be a shared NUMA
mempolicy for this file? hugetlb_vm_ops has never offered a set_policy
method, and hugetlbfs_parse_param() has never supported any mpol options
for a mount-wide default policy.

It's just an illusion: clean it away so as not to confuse others, giving
us more freedom to adjust shmem's set_policy/get_policy implementation.
But hugetlbfs_inode_info is still required, just to accommodate seals.

Yes, shared NUMA mempolicy support could be added to hugetlbfs, with a
set_policy method and/or mpol mount option (Andi's first posting did
include an admitted-unsatisfactory hugetlb_set_policy()); but it seems
that nobody has bothered to add that in the nineteen years since v2.6.7
made it possible, and there is at least one company that has invested
enough into hugetlbfs, that I guess they have learnt well enough how to
manage its NUMA, without needing shared mempolicy.

Signed-off-by: Hugh Dickins
Reviewed-by: Matthew Wilcox (Oracle)
---
 fs/hugetlbfs/inode.c    | 41 +----------------------------------------
 include/linux/hugetlb.h |  2 --
 2 files changed, 1 insertion(+), 42 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 316c4cebd3f3..ffee27b10d42 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -83,29 +83,6 @@ static const struct fs_parameter_spec hugetlb_fs_parameters[] = {
 	{}
 };
 
-#ifdef CONFIG_NUMA
-static inline void hugetlb_set_vma_policy(struct vm_area_struct *vma,
-					struct inode *inode, pgoff_t index)
-{
-	vma->vm_policy = mpol_shared_policy_lookup(&HUGETLBFS_I(inode)->policy,
-							index);
-}
-
-static inline void hugetlb_drop_vma_policy(struct vm_area_struct *vma)
-{
-	mpol_cond_put(vma->vm_policy);
-}
-#else
-static inline void hugetlb_set_vma_policy(struct vm_area_struct *vma,
-					struct inode *inode, pgoff_t index)
-{
-}
-
-static inline void hugetlb_drop_vma_policy(struct vm_area_struct *vma)
-{
-}
-#endif
-
 /*
  * Mask used when checking the page offset value passed in via system
  * calls. This value will be converted to a loff_t which is signed.
@@ -852,8 +829,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 
 	/*
 	 * Initialize a pseudo vma as this is required by the huge page
-	 * allocation routines. If NUMA is configured, use page index
-	 * as input to create an allocation policy.
+	 * allocation routines.
 	 */
 	vma_init(&pseudo_vma, mm);
 	vm_flags_init(&pseudo_vma, VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
@@ -901,9 +877,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		 * folios in these areas, we need to consume the reserves
 		 * to keep reservation accounting consistent.
 		 */
-		hugetlb_set_vma_policy(&pseudo_vma, inode, index);
 		folio = alloc_hugetlb_folio(&pseudo_vma, addr, 0);
-		hugetlb_drop_vma_policy(&pseudo_vma);
 		if (IS_ERR(folio)) {
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			error = PTR_ERR(folio);
@@ -1282,18 +1256,6 @@ static struct inode *hugetlbfs_alloc_inode(struct super_block *sb)
 		hugetlbfs_inc_free_inodes(sbinfo);
 		return NULL;
 	}
-
-	/*
-	 * Any time after allocation, hugetlbfs_destroy_inode can be called
-	 * for the inode. mpol_free_shared_policy is unconditionally called
-	 * as part of hugetlbfs_destroy_inode. So, initialize policy here
-	 * in case of a quick call to destroy.
-	 *
-	 * Note that the policy is initialized even if we are creating a
-	 * private inode. This simplifies hugetlbfs_destroy_inode.
-	 */
-	mpol_shared_policy_init(&p->policy, NULL);
-
 	return &p->vfs_inode;
 }
 
@@ -1305,7 +1267,6 @@ static void hugetlbfs_free_inode(struct inode *inode)
 static void hugetlbfs_destroy_inode(struct inode *inode)
 {
 	hugetlbfs_inc_free_inodes(HUGETLBFS_SB(inode->i_sb));
-	mpol_free_shared_policy(&HUGETLBFS_I(inode)->policy);
 }
 
 static const struct address_space_operations hugetlbfs_aops = {
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5b2626063f4f..6522eb3cd007 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -30,7 +30,6 @@ void free_huge_folio(struct folio *folio);
 
 #ifdef CONFIG_HUGETLB_PAGE
 
-#include
 #include
 #include
 
@@ -512,7 +511,6 @@ static inline struct hugetlbfs_sb_info *HUGETLBFS_SB(struct super_block *sb)
 }
 
 struct hugetlbfs_inode_info {
-	struct shared_policy policy;
 	struct inode vfs_inode;
 	unsigned int seals;
 };