From patchwork Tue Oct 22 21:34:47 2024
X-Patchwork-Submitter: Gregory Price
X-Patchwork-Id: 13846185
From: Gregory Price
To: x86@kernel.org, linux-kernel@vger.kernel.org,
    linux-acpi@vger.kernel.org, linux-mm@kvack.org
Cc: linux-cxl@kvack.org, Jonathan.Cameron@huawei.com,
    dan.j.williams@intel.com, rrichter@amd.com, Terry.Bowman@amd.com,
    dave.jiang@intel.com, ira.weiny@intel.com,
    alison.schofield@intel.com, gourry@gourry.net,
    dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
    rafael@kernel.org, lenb@kernel.org, david@redhat.com,
    osalvador@suse.de, gregkh@linuxfoundation.org,
    akpm@linux-foundation.org, rppt@kernel.org
Subject: [PATCH v3 0/3] memory,x86,acpi: hotplug memory alignment advisement
Date: Tue, 22 Oct 2024 17:34:47 -0400
Message-ID: <20241022213450.15041-1-gourry@gourry.net>

When physical address regions are not aligned to memory block size,
the misaligned portion is lost (stranded capacity).

Block size (min/max/selected) is architecture defined. Most
architectures tend to use the minimum block size or some simplistic
heuristic.

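To make the stranded capacity arithmetic concrete, here is a minimal
userspace sketch. It is not code from this series: stranded_capacity()
and struct stranded are hypothetical names, and the sketch assumes that
only whole, naturally aligned blocks of a region are usable and that the
block size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not part of the patch set. */
struct stranded {
	uint64_t head;	/* bytes lost before the first aligned block */
	uint64_t tail;	/* bytes lost after the last aligned block */
};

/*
 * Assumption: a region [start, start + size) can only be used in whole
 * blocks that start on a block_size boundary.
 */
static struct stranded stranded_capacity(uint64_t start, uint64_t size,
					 uint64_t block_size)
{
	struct stranded s = { 0, 0 };
	uint64_t end = start + size;
	uint64_t first, last;

	/* block_size must be a non-zero power of two */
	assert(block_size && (block_size & (block_size - 1)) == 0);

	first = (start + block_size - 1) & ~(block_size - 1);	/* round up */
	last = end & ~(block_size - 1);				/* round down */

	if (first >= last) {
		/* no whole block fits, so the entire region is stranded */
		s.head = size;
		return s;
	}

	s.head = first - start;
	s.tail = end - last;
	return s;
}
```

For a 4GB region starting at 0x110000000 (256MB aligned), a 2GB block
size strands 1792MB at the head and 256MB at the tail in this model,
while a 256MB block size strands nothing.
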
On x86, memory block size increases up to 2GB, and is otherwise fitted
to the alignment of non-hotplug (special purpose) memory.

CXL exposes its memory for management through the ACPI CEDT (CXL Early
Discovery Table) in a field called the CXL Fixed Memory Window (CFMW).
Per the CXL specification, this memory must be aligned to at least
256MB.

When a CFMW aligns on a size less than the block size, this causes a
loss of up to 2GB per CFMW on x86. It is not uncommon for CFMW to be
allocated per-device, though this behavior is BIOS defined.

This patch set provides 3 things:
 1) implement advise/probe functions in drivers/base/memory.c to
    report/probe architecture agnostic hotplug memory alignment advice.
 2) update x86 memblock size logic to consider the hotplug advice
 3) add code in acpi/numa/srat.c to report CFMW alignment advice

The advisement interfaces are designed to be called during arch_init
code prior to allocator and smp_init. start_kernel will call these
through setup_arch() (via acpi and mm/init_64.c on x86), which occurs
prior to mm_core_init and smp_init, so there is no need for atomics.

There's an attempt to signal callers to advise() that probe has already
occurred, but this is predicated on the notion that probe() actually
occurs (which presently only happens on x86 and acpi logic). This is to
assist debugging future users. Once probe is called the first time, it
will always return the same value.

Interfaces return -EBUSY and 0 respectively on systems without hotplug.

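The advise/probe contract described above can be modeled in userspace
roughly as follows. This is a sketch under stated assumptions, not the
code in drivers/base/memory.c: the model_* names are hypothetical, the
"smallest advice wins" rule and the -EINVAL check are assumptions, and
returning -EBUSY once probe has run is one way to implement the
signalling mentioned above. Plain variables are used because the callers
run before smp_init.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model; names are hypothetical, not the series' symbols. */
static unsigned long advised_size;	/* 0 means no advice was given */
static bool advice_probed;		/* set once probe has been called */

/*
 * Record a maximum block size suggestion.
 * Assumptions of this sketch: sizes must be powers of two, the smallest
 * suggestion wins, and advice after the first probe is rejected.
 */
static int model_advise_max_size(unsigned long size)
{
	if (size == 0 || (size & (size - 1)) != 0)
		return -EINVAL;
	if (advice_probed)
		return -EBUSY;
	if (advised_size == 0 || size < advised_size)
		advised_size = size;
	return 0;
}

/*
 * Return the advised size, or 0 if none was given. After the first
 * call the value can no longer change, so every call returns the same
 * result.
 */
static unsigned long model_probe_max_size(void)
{
	advice_probed = true;
	return advised_size;
}
```

In this model, advising 1GB and then 256MB makes probe return 256MB, and
any later advise call fails with -EBUSY without changing that value.
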
Suggested-by: Ira Weiny
Suggested-by: David Hildenbrand
Suggested-by: Dan Williams
Signed-off-by: Gregory Price

Gregory Price (3):
  memory: implement memory_block_advise/probe_max_size
  x86: probe memory block size advisement value during mm init
  acpi,srat: give memory block size advice based on CFMWS alignment

 arch/x86/mm/init_64.c    | 14 ++++++++-----
 drivers/acpi/numa/srat.c | 33 ++++++++++++++++++++++++++++++
 drivers/base/memory.c    | 43 ++++++++++++++++++++++++++++++++++++++++
 include/linux/memory.h   | 10 ++++++++++
 4 files changed, 95 insertions(+), 5 deletions(-)