From patchwork Wed May 24 17:18:49 2023
X-Patchwork-Submitter: Catalin Marinas
X-Patchwork-Id: 13254339
From: Catalin Marinas
To: Linus Torvalds, Christoph Hellwig, Robin Murphy
Cc: Arnd Bergmann, Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan, Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown, Mike Snitzer, "Rafael J. Wysocki", linux-mm@kvack.org, iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 00/15] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
Date: Wed, 24 May 2023 18:18:49 +0100
Message-Id: <20230524171904.3967031-1-catalin.marinas@arm.com>
Hi,

Another version of the series reducing the kmalloc() minimum alignment on arm64 to 8 (from 128). Other architectures can easily opt in by defining ARCH_KMALLOC_MINALIGN as 8 and selecting DMA_BOUNCE_UNALIGNED_KMALLOC.

The first 10 patches decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN and, for arm64, limit the kmalloc() caches to those aligned to the run-time probed cache_line_size(). On arm64 we gain the kmalloc-{64,192} caches.

The subsequent patches (11 to 15) further reduce the kmalloc() caches to kmalloc-{8,16,32,96} if the default swiotlb is present, by bouncing small buffers in the DMA API.

Changes since v4:

- Following Robin's suggestions, reworked the iommu handling so that the buffer size checks are done in the dev_use_swiotlb() and dev_use_sg_swiotlb() functions (together with dev_is_untrusted()). The sync operations can now check for the SG_DMA_USE_SWIOTLB flag. Since this flag is no longer specific to kmalloc() bouncing (it covers dev_is_untrusted() as well), the sg_is_dma_use_swiotlb() and sg_dma_mark_use_swiotlb() functions are always defined if CONFIG_SWIOTLB is enabled.

- Dropped ARCH_WANT_KMALLOC_DMA_BOUNCE, leaving only the DMA_BOUNCE_UNALIGNED_KMALLOC option, selectable by the arch code. NEED_SG_DMA_FLAGS is now selected by IOMMU_DMA if SWIOTLB.

- Rather than adding another config option, allow dma_get_cache_alignment() to be overridden by the arch code (Christoph's suggestion).

- Added a comment to the dma_kmalloc_needs_bounce() function on the heuristics behind the bouncing.

- Added acked-by/reviewed-by tags (not adding Ard's tested-by yet as there were some changes).

The updated patches are also available on this branch:

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/kmalloc-minalign

Thanks.
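As an illustration of the bouncing idea described above, here is a minimal user-space sketch (not kernel code): kmalloc() rounds a request up to one of a fixed set of cache sizes, and a buffer is safe for non-coherent DMA only if the cache it comes from is a multiple of the run-time cache line, otherwise it may share a line with unrelated data and must be bounced through swiotlb. The 64-byte line size, the bucket table and the helper names here are assumptions for this model, not the kernel's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed run-time probed cache line size (cache_line_size() on many
 * arm64 systems); the real kernel probes this value at boot. */
#define CACHE_LINE 64

/* Model of the small kmalloc cache sizes; larger requests round up
 * to the next power of two. */
static size_t kmalloc_bucket(size_t size)
{
	static const size_t buckets[] = { 8, 16, 32, 64, 96, 128, 192, 256 };
	size_t i, n;

	for (i = 0; i < sizeof(buckets) / sizeof(buckets[0]); i++)
		if (size <= buckets[i])
			return buckets[i];

	n = 256;
	while (n < size)
		n <<= 1;	/* powers of two are cache-line multiples */
	return n;
}

/* A buffer needs bouncing if its kmalloc cache size is not a multiple
 * of the cache line, i.e. the allocation can share a line with data
 * from a neighbouring object. */
static bool needs_bounce(size_t size)
{
	return kmalloc_bucket(size) % CACHE_LINE != 0;
}
```

With a 64-byte line, allocations served from the modelled kmalloc-{8,16,32,96} caches bounce while kmalloc-{64,128,192,...} do not, matching the small caches the series enables by bouncing.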
Catalin Marinas (14):
  mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
  dma: Allow dma_get_cache_alignment() to be overridden by the arch
    code
  mm/slab: Simplify create_kmalloc_cache() args and make it static
  mm/slab: Limit kmalloc() minimum alignment to
    dma_get_cache_alignment()
  drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/md: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  arm64: Allow kmalloc() caches aligned to the smaller cache_line_size()
  dma-mapping: Force bouncing if the kmalloc() size is not
    cache-line-aligned
  iommu/dma: Force bouncing if the size is not cacheline-aligned
  mm: slab: Reduce the kmalloc() minimum alignment if DMA bouncing
    possible
  arm64: Enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64

Robin Murphy (1):
  scatterlist: Add dedicated config for DMA flags

 arch/arm64/Kconfig             |  1 +
 arch/arm64/include/asm/cache.h |  3 ++
 arch/arm64/mm/init.c           |  7 +++-
 drivers/base/devres.c          |  6 ++--
 drivers/gpu/drm/drm_managed.c  |  6 ++--
 drivers/iommu/Kconfig          |  1 +
 drivers/iommu/dma-iommu.c      | 50 +++++++++++++++++++++++-----
 drivers/md/dm-crypt.c          |  2 +-
 drivers/pci/Kconfig            |  1 +
 drivers/spi/spidev.c           |  2 +-
 drivers/usb/core/buffer.c      |  8 ++---
 include/linux/dma-map-ops.h    | 61 ++++++++++++++++++++++++++++++++++
 include/linux/dma-mapping.h    |  4 ++-
 include/linux/scatterlist.h    | 29 +++++++++++++---
 include/linux/slab.h           | 14 ++++++--
 kernel/dma/Kconfig             |  7 ++++
 kernel/dma/direct.h            |  3 +-
 mm/slab.c                      |  6 +---
 mm/slab.h                      |  5 ++-
 mm/slab_common.c               | 46 +++++++++++++++++++------
 20 files changed, 213 insertions(+), 49 deletions(-)
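The SG_DMA_USE_SWIOTLB flag handling mentioned in the changelog can be pictured with a stand-alone model: the map path marks a scatterlist entry when it was bounced, and the sync path tests the flag instead of re-deriving the decision. The helper names come from the series; the struct layout, field and bit value below are invented purely for this sketch and do not match the kernel's struct scatterlist.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct scatterlist: only the DMA flags word that
 * the series adds behind NEED_SG_DMA_FLAGS is modelled here. */
struct sg_model {
	unsigned int dma_flags;
};

/* Bit value is an assumption for this model. */
#define SG_DMA_USE_SWIOTLB	(1U << 0)

/* Set by the map path when the buffer was bounced (small unaligned
 * kmalloc() buffer, or an untrusted device). */
static void sg_dma_mark_use_swiotlb(struct sg_model *sg)
{
	sg->dma_flags |= SG_DMA_USE_SWIOTLB;
}

/* Checked by the sync operations instead of repeating the size or
 * dev_is_untrusted() heuristics. */
static bool sg_is_dma_use_swiotlb(const struct sg_model *sg)
{
	return sg->dma_flags & SG_DMA_USE_SWIOTLB;
}
```

The design point the changelog makes is that, because the flag also covers dev_is_untrusted() devices, the helpers are defined whenever CONFIG_SWIOTLB is enabled rather than being tied to the kmalloc() bouncing option.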