From patchwork Thu May 18 17:33:48 2023
X-Patchwork-Submitter: Catalin Marinas
X-Patchwork-Id: 13247148
From: Catalin Marinas
To: Linus Torvalds, Arnd Bergmann, Christoph Hellwig, Greg Kroah-Hartman
Cc: Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel,
    Isaac Manjarres, Saravana Kannan, Alasdair Kergon, Daniel Vetter,
    Joerg Roedel, Mark Brown, Mike Snitzer, "Rafael J. Wysocki",
    Robin Murphy, linux-mm@kvack.org, iommu@lists.linux.dev,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 00/15] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
Date: Thu, 18 May 2023 18:33:48 +0100
Message-Id: <20230518173403.1150549-1-catalin.marinas@arm.com>

Hi,

This is the fourth version of the series reducing the kmalloc() minimum
alignment on arm64 to 8 (from 128).

The first 10 patches decouple ARCH_KMALLOC_MINALIGN from
ARCH_DMA_MINALIGN and, for arm64, limit the kmalloc() caches to those
aligned to the run-time probed cache_line_size(). The advantage on arm64
is that we gain the kmalloc-{64,192} caches.

The subsequent patches (11 to 15) further reduce the kmalloc() minimum
alignment, enabling the kmalloc-{8,16,32,96} caches, if the default
swiotlb is present, by bouncing small buffers in the DMA API. For iommu,
following discussions with Robin, we concluded that it's still simpler
to walk the sg list if the device is non-coherent and follow the
bouncing path when any of the elements may originate from a small
kmalloc() allocation.

Main changes since v3:

- Reorganise the series so that the first 10 patches can be applied
  before the DMA bouncing. They are still useful on arm64, reducing the
  kmalloc() alignment to 64.

- There is no dma_sg_kmalloc_needs_bounce() function any more; it has
  been unrolled in the iommu_dma_sync_sg_for_device() function.

- No crypto changes are needed following Herbert's reworking of the
  crypto code (thanks!).

The patches are also available on this branch:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/kmalloc-minalign

Thanks.
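
For readers skimming the cover letter, the two ideas above can be
modelled in a few lines of userspace C. This is only a sketch and not
code from the series: the model_* names are hypothetical, the cache line
size is passed in as a plain parameter instead of being probed, and the
bounce check is simplified to the two conditions named in the letter
(non-coherent device, size not cache-line-aligned).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

/* True if size is a multiple of align; align must be a power of two. */
static bool model_is_aligned(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size & (align - 1)) == 0;
}

/*
 * A kmalloc-<size> cache is assumed usable only if its object size is
 * a multiple of the minimum alignment in force.
 */
static bool model_cache_usable(size_t cache_size, size_t min_align)
{
	return model_is_aligned(cache_size, min_align);
}

/*
 * Simplified bounce decision: bounce when the device is not
 * cache-coherent and the buffer size is not a multiple of the cache
 * line size. The checks in the actual patches may be more involved.
 */
static bool model_needs_bounce(bool dev_coherent, size_t size,
			       size_t cache_line)
{
	if (dev_coherent)
		return false;
	return !model_is_aligned(size, cache_line);
}
```

Under this model, model_cache_usable(192, 64) holds while
model_cache_usable(192, 128) does not, which matches gaining
kmalloc-{64,192} once the alignment drops from 128 to 64. Likewise a
96-byte buffer on a non-coherent device with 64-byte cache lines would
be bounced, while a 128-byte one would not.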

Catalin Marinas (14):
  mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
  dma: Allow dma_get_cache_alignment() to return the smaller
    cache_line_size()
  mm/slab: Simplify create_kmalloc_cache() args and make it static
  mm/slab: Limit kmalloc() minimum alignment to
    dma_get_cache_alignment()
  drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  drivers/md: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
  arm64: Allow kmalloc() caches aligned to the smaller cache_line_size()
  dma-mapping: Force bouncing if the kmalloc() size is not
    cache-line-aligned
  iommu/dma: Force bouncing if the size is not cacheline-aligned
  mm: slab: Reduce the kmalloc() minimum alignment if DMA bouncing
    possible
  arm64: Enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64

Robin Murphy (1):
  scatterlist: Add dedicated config for DMA flags

 arch/arm64/Kconfig             |  2 ++
 arch/arm64/include/asm/cache.h |  1 +
 arch/arm64/mm/init.c           |  7 ++++-
 drivers/base/devres.c          |  6 ++---
 drivers/gpu/drm/drm_managed.c  |  6 ++---
 drivers/iommu/dma-iommu.c      | 25 ++++++++++++++----
 drivers/md/dm-crypt.c          |  2 +-
 drivers/pci/Kconfig            |  1 +
 drivers/spi/spidev.c           |  2 +-
 drivers/usb/core/buffer.c      |  8 +++---
 include/linux/dma-map-ops.h    | 48 ++++++++++++++++++++++++++++++++++
 include/linux/dma-mapping.h    |  4 ++-
 include/linux/scatterlist.h    | 29 +++++++++++++++++---
 include/linux/slab.h           | 16 +++++++++---
 kernel/dma/Kconfig             | 19 ++++++++++++++
 kernel/dma/direct.h            |  3 ++-
 mm/slab.c                      |  6 +----
 mm/slab.h                      |  5 ++--
 mm/slab_common.c               | 43 +++++++++++++++++++++++-------
 19 files changed, 188 insertions(+), 45 deletions(-)

Tested-by: Ard Biesheuvel # tx2