From patchwork Wed Feb 1 12:52:45 2023
X-Patchwork-Submitter: Jean-Philippe Brucker
X-Patchwork-Id: 13124368
From: Jean-Philippe Brucker
To: maz@kernel.org, catalin.marinas@arm.com, will@kernel.org, joro@8bytes.org
Cc: robin.murphy@arm.com, james.morse@arm.com, suzuki.poulose@arm.com,
 oliver.upton@linux.dev, yuzenghui@huawei.com, smostafa@google.com,
 dbrazdil@google.com, ryan.roberts@arm.com, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, iommu@lists.linux.dev, Jean-Philippe Brucker
Subject: [RFC PATCH 01/45] iommu/io-pgtable-arm: Split the page table driver
Date: Wed, 1 Feb 2023 12:52:45 +0000
Message-Id: <20230201125328.2186498-2-jean-philippe@linaro.org>
In-Reply-To: <20230201125328.2186498-1-jean-philippe@linaro.org>
References: <20230201125328.2186498-1-jean-philippe@linaro.org>

To allow the KVM IOMMU driver to populate page tables using the
io-pgtable-arm code, move the shared bits into io-pgtable-arm-common.c.
Here we move the bulk of the common code, and a subsequent patch handles
the bits that require more care.

phys_to_virt() and virt_to_phys() do need special handling here, because
the hypervisor will have its own versions. It will also implement its own
versions of __arm_lpae_alloc_pages(), __arm_lpae_free_pages() and
__arm_lpae_sync_pte(), since the hypervisor needs some assistance for
allocating pages.
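For illustration only (not part of this patch): the hypervisor-side versions
of those three hooks could look roughly like the sketch below. It assumes a
hypothetical hyp-owned page pool ("iommu_pool") fed by pages donated from the
host, restricts page tables to a single page, and cleans PTEs to the PoC for
the non-coherent walk case; the real nVHE implementation arrives later in
this series and will differ.

/*
 * Illustrative sketch only: possible EL2 implementations of the
 * host/hyp-specific hooks declared in <linux/io-pgtable-arm.h>.
 * "iommu_pool" and the one-page-per-table restriction are assumptions
 * made for this example.
 */
#include <linux/io-pgtable-arm.h>

#include <asm/kvm_mmu.h>
#include <nvhe/gfp.h>
#include <nvhe/memory.h>

static struct hyp_pool iommu_pool;	/* assumed: filled with host-donated pages */

void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp, struct io_pgtable_cfg *cfg)
{
	void *table;

	/* No GFP flags at EL2; the pool only hands out pre-donated memory. */
	if (size != PAGE_SIZE)
		return NULL;

	table = hyp_alloc_pages(&iommu_pool, 0);
	if (table)
		memset(table, 0, size);
	return table;
}

void __arm_lpae_free_pages(void *pages, size_t size, struct io_pgtable_cfg *cfg)
{
	/* Return the table page to the hyp pool. */
	hyp_put_page(&iommu_pool, pages);
}

void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries,
			 struct io_pgtable_cfg *cfg)
{
	/* Non-coherent SMMU walker: clean the updated PTEs to the PoC. */
	kvm_flush_dcache_to_poc(ptep, sizeof(*ptep) * num_entries);
}

In the host build, the existing implementations in io-pgtable-arm.c
(DMA-mapped page allocations plus dma_sync_single_for_device()) are kept
as-is and simply exported by this patch.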
Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/Makefile | 2 +- drivers/iommu/io-pgtable-arm.h | 30 - include/linux/io-pgtable-arm.h | 187 ++++++ .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +- drivers/iommu/io-pgtable-arm-common.c | 500 ++++++++++++++ drivers/iommu/io-pgtable-arm.c | 634 +----------------- 6 files changed, 696 insertions(+), 659 deletions(-) delete mode 100644 drivers/iommu/io-pgtable-arm.h create mode 100644 include/linux/io-pgtable-arm.h create mode 100644 drivers/iommu/io-pgtable-arm-common.c diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index f461d0651385..c616acf534f8 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -7,7 +7,7 @@ obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o -obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o +obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o io-pgtable-arm-common.o obj-$(CONFIG_IOMMU_IO_PGTABLE_DART) += io-pgtable-dart.o obj-$(CONFIG_IOASID) += ioasid.o obj-$(CONFIG_IOMMU_IOVA) += iova.o diff --git a/drivers/iommu/io-pgtable-arm.h b/drivers/iommu/io-pgtable-arm.h deleted file mode 100644 index ba7cfdf7afa0..000000000000 --- a/drivers/iommu/io-pgtable-arm.h +++ /dev/null @@ -1,30 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-only */ -#ifndef IO_PGTABLE_ARM_H_ -#define IO_PGTABLE_ARM_H_ - -#define ARM_LPAE_TCR_TG0_4K 0 -#define ARM_LPAE_TCR_TG0_64K 1 -#define ARM_LPAE_TCR_TG0_16K 2 - -#define ARM_LPAE_TCR_TG1_16K 1 -#define ARM_LPAE_TCR_TG1_4K 2 -#define ARM_LPAE_TCR_TG1_64K 3 - -#define ARM_LPAE_TCR_SH_NS 0 -#define ARM_LPAE_TCR_SH_OS 2 -#define ARM_LPAE_TCR_SH_IS 3 - -#define ARM_LPAE_TCR_RGN_NC 0 -#define ARM_LPAE_TCR_RGN_WBWA 1 -#define ARM_LPAE_TCR_RGN_WT 2 -#define ARM_LPAE_TCR_RGN_WB 3 - -#define ARM_LPAE_TCR_PS_32_BIT 0x0ULL -#define ARM_LPAE_TCR_PS_36_BIT 0x1ULL -#define ARM_LPAE_TCR_PS_40_BIT 0x2ULL -#define ARM_LPAE_TCR_PS_42_BIT 0x3ULL -#define ARM_LPAE_TCR_PS_44_BIT 0x4ULL -#define ARM_LPAE_TCR_PS_48_BIT 0x5ULL -#define ARM_LPAE_TCR_PS_52_BIT 0x6ULL - -#endif /* IO_PGTABLE_ARM_H_ */ diff --git a/include/linux/io-pgtable-arm.h b/include/linux/io-pgtable-arm.h new file mode 100644 index 000000000000..594b5030b450 --- /dev/null +++ b/include/linux/io-pgtable-arm.h @@ -0,0 +1,187 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef IO_PGTABLE_H_ +#define IO_PGTABLE_H_ + +#include + +extern bool selftest_running; + +typedef u64 arm_lpae_iopte; + +struct arm_lpae_io_pgtable { + struct io_pgtable iop; + + int pgd_bits; + int start_level; + int bits_per_level; + + void *pgd; +}; + +/* Struct accessors */ +#define io_pgtable_to_data(x) \ + container_of((x), struct arm_lpae_io_pgtable, iop) + +#define io_pgtable_ops_to_data(x) \ + io_pgtable_to_data(io_pgtable_ops_to_pgtable(x)) + +/* + * Calculate the right shift amount to get to the portion describing level l + * in a virtual address mapped by the pagetable in d. + */ +#define ARM_LPAE_LVL_SHIFT(l,d) \ + (((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level) + \ + ilog2(sizeof(arm_lpae_iopte))) + +#define ARM_LPAE_GRANULE(d) \ + (sizeof(arm_lpae_iopte) << (d)->bits_per_level) +#define ARM_LPAE_PGD_SIZE(d) \ + (sizeof(arm_lpae_iopte) << (d)->pgd_bits) + +#define ARM_LPAE_PTES_PER_TABLE(d) \ + (ARM_LPAE_GRANULE(d) >> ilog2(sizeof(arm_lpae_iopte))) + +/* + * Calculate the index at level l used to map virtual address a using the + * pagetable in d. 
+ */ +#define ARM_LPAE_PGD_IDX(l,d) \ + ((l) == (d)->start_level ? (d)->pgd_bits - (d)->bits_per_level : 0) + +#define ARM_LPAE_LVL_IDX(a,l,d) \ + (((u64)(a) >> ARM_LPAE_LVL_SHIFT(l,d)) & \ + ((1 << ((d)->bits_per_level + ARM_LPAE_PGD_IDX(l,d))) - 1)) + +/* Calculate the block/page mapping size at level l for pagetable in d. */ +#define ARM_LPAE_BLOCK_SIZE(l,d) (1ULL << ARM_LPAE_LVL_SHIFT(l,d)) + +/* Page table bits */ +#define ARM_LPAE_PTE_TYPE_SHIFT 0 +#define ARM_LPAE_PTE_TYPE_MASK 0x3 + +#define ARM_LPAE_PTE_TYPE_BLOCK 1 +#define ARM_LPAE_PTE_TYPE_TABLE 3 +#define ARM_LPAE_PTE_TYPE_PAGE 3 + +#define ARM_LPAE_PTE_ADDR_MASK GENMASK_ULL(47,12) + +#define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63) +#define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53) +#define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10) +#define ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8) +#define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8) +#define ARM_LPAE_PTE_SH_IS (((arm_lpae_iopte)3) << 8) +#define ARM_LPAE_PTE_NS (((arm_lpae_iopte)1) << 5) +#define ARM_LPAE_PTE_VALID (((arm_lpae_iopte)1) << 0) + +#define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2) +/* Ignore the contiguous bit for block splitting */ +#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52) +#define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK | \ + ARM_LPAE_PTE_ATTR_HI_MASK) +/* Software bit for solving coherency races */ +#define ARM_LPAE_PTE_SW_SYNC (((arm_lpae_iopte)1) << 55) + +/* Stage-1 PTE */ +#define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6) +#define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2) << 6) +#define ARM_LPAE_PTE_ATTRINDX_SHIFT 2 +#define ARM_LPAE_PTE_nG (((arm_lpae_iopte)1) << 11) + +/* Stage-2 PTE */ +#define ARM_LPAE_PTE_HAP_FAULT (((arm_lpae_iopte)0) << 6) +#define ARM_LPAE_PTE_HAP_READ (((arm_lpae_iopte)1) << 6) +#define ARM_LPAE_PTE_HAP_WRITE (((arm_lpae_iopte)2) << 6) +#define ARM_LPAE_PTE_MEMATTR_OIWB (((arm_lpae_iopte)0xf) << 2) +#define ARM_LPAE_PTE_MEMATTR_NC (((arm_lpae_iopte)0x5) << 2) +#define ARM_LPAE_PTE_MEMATTR_DEV (((arm_lpae_iopte)0x1) << 2) + +/* Register bits */ +#define ARM_LPAE_VTCR_SL0_MASK 0x3 + +#define ARM_LPAE_TCR_T0SZ_SHIFT 0 + +#define ARM_LPAE_TCR_TG0_4K 0 +#define ARM_LPAE_TCR_TG0_64K 1 +#define ARM_LPAE_TCR_TG0_16K 2 + +#define ARM_LPAE_TCR_TG1_16K 1 +#define ARM_LPAE_TCR_TG1_4K 2 +#define ARM_LPAE_TCR_TG1_64K 3 + +#define ARM_LPAE_TCR_SH_NS 0 +#define ARM_LPAE_TCR_SH_OS 2 +#define ARM_LPAE_TCR_SH_IS 3 + +#define ARM_LPAE_TCR_RGN_NC 0 +#define ARM_LPAE_TCR_RGN_WBWA 1 +#define ARM_LPAE_TCR_RGN_WT 2 +#define ARM_LPAE_TCR_RGN_WB 3 + +#define ARM_LPAE_TCR_PS_32_BIT 0x0ULL +#define ARM_LPAE_TCR_PS_36_BIT 0x1ULL +#define ARM_LPAE_TCR_PS_40_BIT 0x2ULL +#define ARM_LPAE_TCR_PS_42_BIT 0x3ULL +#define ARM_LPAE_TCR_PS_44_BIT 0x4ULL +#define ARM_LPAE_TCR_PS_48_BIT 0x5ULL +#define ARM_LPAE_TCR_PS_52_BIT 0x6ULL + +#define ARM_LPAE_VTCR_PS_SHIFT 16 +#define ARM_LPAE_VTCR_PS_MASK 0x7 + +#define ARM_LPAE_MAIR_ATTR_SHIFT(n) ((n) << 3) +#define ARM_LPAE_MAIR_ATTR_MASK 0xff +#define ARM_LPAE_MAIR_ATTR_DEVICE 0x04 +#define ARM_LPAE_MAIR_ATTR_NC 0x44 +#define ARM_LPAE_MAIR_ATTR_INC_OWBRWA 0xf4 +#define ARM_LPAE_MAIR_ATTR_WBRWA 0xff +#define ARM_LPAE_MAIR_ATTR_IDX_NC 0 +#define ARM_LPAE_MAIR_ATTR_IDX_CACHE 1 +#define ARM_LPAE_MAIR_ATTR_IDX_DEV 2 +#define ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE 3 + +#define ARM_MALI_LPAE_TTBR_ADRMODE_TABLE (3u << 0) +#define ARM_MALI_LPAE_TTBR_READ_INNER BIT(2) +#define ARM_MALI_LPAE_TTBR_SHARE_OUTER BIT(4) + +#define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL 
+#define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL + +#define ARM_LPAE_MAX_LEVELS 4 + +#define iopte_type(pte) \ + (((pte) >> ARM_LPAE_PTE_TYPE_SHIFT) & ARM_LPAE_PTE_TYPE_MASK) + +#define iopte_prot(pte) ((pte) & ARM_LPAE_PTE_ATTR_MASK) + +static inline bool iopte_leaf(arm_lpae_iopte pte, int lvl, + enum io_pgtable_fmt fmt) +{ + if (lvl == (ARM_LPAE_MAX_LEVELS - 1) && fmt != ARM_MALI_LPAE) + return iopte_type(pte) == ARM_LPAE_PTE_TYPE_PAGE; + + return iopte_type(pte) == ARM_LPAE_PTE_TYPE_BLOCK; +} + +#define __arm_lpae_virt_to_phys __pa +#define __arm_lpae_phys_to_virt __va + +/* Generic functions */ +int arm_lpae_map_pages(struct io_pgtable_ops *ops, unsigned long iova, + phys_addr_t paddr, size_t pgsize, size_t pgcount, + int iommu_prot, gfp_t gfp, size_t *mapped); +size_t arm_lpae_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova, + size_t pgsize, size_t pgcount, + struct iommu_iotlb_gather *gather); +phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, + unsigned long iova); +void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl, + arm_lpae_iopte *ptep); + +/* Host/hyp-specific functions */ +void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp, struct io_pgtable_cfg *cfg); +void __arm_lpae_free_pages(void *pages, size_t size, struct io_pgtable_cfg *cfg); +void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries, + struct io_pgtable_cfg *cfg); + +#endif /* IO_PGTABLE_H_ */ diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index a5a63b1c947e..df288f29a5c1 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -3,6 +3,7 @@ * Implementation of the IOMMU SVA API for the ARM SMMUv3 */ +#include #include #include #include @@ -11,7 +12,6 @@ #include "arm-smmu-v3.h" #include "../../iommu-sva.h" -#include "../../io-pgtable-arm.h" struct arm_smmu_mmu_notifier { struct mmu_notifier mn; diff --git a/drivers/iommu/io-pgtable-arm-common.c b/drivers/iommu/io-pgtable-arm-common.c new file mode 100644 index 000000000000..74d962712d15 --- /dev/null +++ b/drivers/iommu/io-pgtable-arm-common.c @@ -0,0 +1,500 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * CPU-agnostic ARM page table allocator. + * A copy of this library is embedded in the KVM nVHE image. 
+ * + * Copyright (C) 2022 Arm Limited + * + * Author: Will Deacon + */ + +#include + +#include +#include + +#define iopte_deref(pte, d) __arm_lpae_phys_to_virt(iopte_to_paddr(pte, d)) + +static arm_lpae_iopte paddr_to_iopte(phys_addr_t paddr, + struct arm_lpae_io_pgtable *data) +{ + arm_lpae_iopte pte = paddr; + + /* Of the bits which overlap, either 51:48 or 15:12 are always RES0 */ + return (pte | (pte >> (48 - 12))) & ARM_LPAE_PTE_ADDR_MASK; +} + +static phys_addr_t iopte_to_paddr(arm_lpae_iopte pte, + struct arm_lpae_io_pgtable *data) +{ + u64 paddr = pte & ARM_LPAE_PTE_ADDR_MASK; + + if (ARM_LPAE_GRANULE(data) < SZ_64K) + return paddr; + + /* Rotate the packed high-order bits back to the top */ + return (paddr | (paddr << (48 - 12))) & (ARM_LPAE_PTE_ADDR_MASK << 4); +} + +static void __arm_lpae_clear_pte(arm_lpae_iopte *ptep, struct io_pgtable_cfg *cfg) +{ + + *ptep = 0; + + if (!cfg->coherent_walk) + __arm_lpae_sync_pte(ptep, 1, cfg); +} + +static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, + struct iommu_iotlb_gather *gather, + unsigned long iova, size_t size, size_t pgcount, + int lvl, arm_lpae_iopte *ptep); + +static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, + phys_addr_t paddr, arm_lpae_iopte prot, + int lvl, int num_entries, arm_lpae_iopte *ptep) +{ + arm_lpae_iopte pte = prot; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data); + int i; + + if (data->iop.fmt != ARM_MALI_LPAE && lvl == ARM_LPAE_MAX_LEVELS - 1) + pte |= ARM_LPAE_PTE_TYPE_PAGE; + else + pte |= ARM_LPAE_PTE_TYPE_BLOCK; + + for (i = 0; i < num_entries; i++) + ptep[i] = pte | paddr_to_iopte(paddr + i * sz, data); + + if (!cfg->coherent_walk) + __arm_lpae_sync_pte(ptep, num_entries, cfg); +} + +static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, + unsigned long iova, phys_addr_t paddr, + arm_lpae_iopte prot, int lvl, int num_entries, + arm_lpae_iopte *ptep) +{ + int i; + + for (i = 0; i < num_entries; i++) + if (iopte_leaf(ptep[i], lvl, data->iop.fmt)) { + /* We require an unmap first */ + WARN_ON(!selftest_running); + return -EEXIST; + } else if (iopte_type(ptep[i]) == ARM_LPAE_PTE_TYPE_TABLE) { + /* + * We need to unmap and free the old table before + * overwriting it with a block entry. + */ + arm_lpae_iopte *tblp; + size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data); + + tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data); + if (__arm_lpae_unmap(data, NULL, iova + i * sz, sz, 1, + lvl, tblp) != sz) { + WARN_ON(1); + return -EINVAL; + } + } + + __arm_lpae_init_pte(data, paddr, prot, lvl, num_entries, ptep); + return 0; +} + +static arm_lpae_iopte arm_lpae_install_table(arm_lpae_iopte *table, + arm_lpae_iopte *ptep, + arm_lpae_iopte curr, + struct arm_lpae_io_pgtable *data) +{ + arm_lpae_iopte old, new; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + + new = paddr_to_iopte(__arm_lpae_virt_to_phys(table), data) | + ARM_LPAE_PTE_TYPE_TABLE; + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS) + new |= ARM_LPAE_PTE_NSTABLE; + + /* + * Ensure the table itself is visible before its PTE can be. + * Whilst we could get away with cmpxchg64_release below, this + * doesn't have any ordering semantics when !CONFIG_SMP. 
+ */ + dma_wmb(); + + old = cmpxchg64_relaxed(ptep, curr, new); + + if (cfg->coherent_walk || (old & ARM_LPAE_PTE_SW_SYNC)) + return old; + + /* Even if it's not ours, there's no point waiting; just kick it */ + __arm_lpae_sync_pte(ptep, 1, cfg); + if (old == curr) + WRITE_ONCE(*ptep, new | ARM_LPAE_PTE_SW_SYNC); + + return old; +} + +int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova, + phys_addr_t paddr, size_t size, size_t pgcount, + arm_lpae_iopte prot, int lvl, arm_lpae_iopte *ptep, + gfp_t gfp, size_t *mapped) +{ + arm_lpae_iopte *cptep, pte; + size_t block_size = ARM_LPAE_BLOCK_SIZE(lvl, data); + size_t tblsz = ARM_LPAE_GRANULE(data); + struct io_pgtable_cfg *cfg = &data->iop.cfg; + int ret = 0, num_entries, max_entries, map_idx_start; + + /* Find our entry at the current level */ + map_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data); + ptep += map_idx_start; + + /* If we can install a leaf entry at this level, then do so */ + if (size == block_size) { + max_entries = ARM_LPAE_PTES_PER_TABLE(data) - map_idx_start; + num_entries = min_t(int, pgcount, max_entries); + ret = arm_lpae_init_pte(data, iova, paddr, prot, lvl, num_entries, ptep); + if (!ret) + *mapped += num_entries * size; + + return ret; + } + + /* We can't allocate tables at the final level */ + if (WARN_ON(lvl >= ARM_LPAE_MAX_LEVELS - 1)) + return -EINVAL; + + /* Grab a pointer to the next level */ + pte = READ_ONCE(*ptep); + if (!pte) { + cptep = __arm_lpae_alloc_pages(tblsz, gfp, cfg); + if (!cptep) + return -ENOMEM; + + pte = arm_lpae_install_table(cptep, ptep, 0, data); + if (pte) + __arm_lpae_free_pages(cptep, tblsz, cfg); + } else if (!cfg->coherent_walk && !(pte & ARM_LPAE_PTE_SW_SYNC)) { + __arm_lpae_sync_pte(ptep, 1, cfg); + } + + if (pte && !iopte_leaf(pte, lvl, data->iop.fmt)) { + cptep = iopte_deref(pte, data); + } else if (pte) { + /* We require an unmap first */ + WARN_ON(!selftest_running); + return -EEXIST; + } + + /* Rinse, repeat */ + return __arm_lpae_map(data, iova, paddr, size, pgcount, prot, lvl + 1, + cptep, gfp, mapped); +} + +static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, + int prot) +{ + arm_lpae_iopte pte; + + if (data->iop.fmt == ARM_64_LPAE_S1 || + data->iop.fmt == ARM_32_LPAE_S1) { + pte = ARM_LPAE_PTE_nG; + if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) + pte |= ARM_LPAE_PTE_AP_RDONLY; + if (!(prot & IOMMU_PRIV)) + pte |= ARM_LPAE_PTE_AP_UNPRIV; + } else { + pte = ARM_LPAE_PTE_HAP_FAULT; + if (prot & IOMMU_READ) + pte |= ARM_LPAE_PTE_HAP_READ; + if (prot & IOMMU_WRITE) + pte |= ARM_LPAE_PTE_HAP_WRITE; + } + + /* + * Note that this logic is structured to accommodate Mali LPAE + * having stage-1-like attributes but stage-2-like permissions. + */ + if (data->iop.fmt == ARM_64_LPAE_S2 || + data->iop.fmt == ARM_32_LPAE_S2) { + if (prot & IOMMU_MMIO) + pte |= ARM_LPAE_PTE_MEMATTR_DEV; + else if (prot & IOMMU_CACHE) + pte |= ARM_LPAE_PTE_MEMATTR_OIWB; + else + pte |= ARM_LPAE_PTE_MEMATTR_NC; + } else { + if (prot & IOMMU_MMIO) + pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV + << ARM_LPAE_PTE_ATTRINDX_SHIFT); + else if (prot & IOMMU_CACHE) + pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE + << ARM_LPAE_PTE_ATTRINDX_SHIFT); + } + + /* + * Also Mali has its own notions of shareability wherein its Inner + * domain covers the cores within the GPU, and its Outer domain is + * "outside the GPU" (i.e. either the Inner or System domain in CPU + * terms, depending on coherency). 
+ */ + if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE) + pte |= ARM_LPAE_PTE_SH_IS; + else + pte |= ARM_LPAE_PTE_SH_OS; + + if (prot & IOMMU_NOEXEC) + pte |= ARM_LPAE_PTE_XN; + + if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS) + pte |= ARM_LPAE_PTE_NS; + + if (data->iop.fmt != ARM_MALI_LPAE) + pte |= ARM_LPAE_PTE_AF; + + return pte; +} + +int arm_lpae_map_pages(struct io_pgtable_ops *ops, unsigned long iova, + phys_addr_t paddr, size_t pgsize, size_t pgcount, + int iommu_prot, gfp_t gfp, size_t *mapped) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + struct io_pgtable_cfg *cfg = &data->iop.cfg; + arm_lpae_iopte *ptep = data->pgd; + int ret, lvl = data->start_level; + arm_lpae_iopte prot; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!pgsize || (pgsize & cfg->pgsize_bitmap) != pgsize)) + return -EINVAL; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext || paddr >> cfg->oas)) + return -ERANGE; + + /* If no access, then nothing to do */ + if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE))) + return 0; + + prot = arm_lpae_prot_to_pte(data, iommu_prot); + ret = __arm_lpae_map(data, iova, paddr, pgsize, pgcount, prot, lvl, + ptep, gfp, mapped); + /* + * Synchronise all PTE updates for the new mapping before there's + * a chance for anything to kick off a table walk for the new iova. + */ + wmb(); + + return ret; +} + +void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl, + arm_lpae_iopte *ptep) +{ + arm_lpae_iopte *start, *end; + unsigned long table_size; + + if (lvl == data->start_level) + table_size = ARM_LPAE_PGD_SIZE(data); + else + table_size = ARM_LPAE_GRANULE(data); + + start = ptep; + + /* Only leaf entries at the last level */ + if (lvl == ARM_LPAE_MAX_LEVELS - 1) + end = ptep; + else + end = (void *)ptep + table_size; + + while (ptep != end) { + arm_lpae_iopte pte = *ptep++; + + if (!pte || iopte_leaf(pte, lvl, data->iop.fmt)) + continue; + + __arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data)); + } + + __arm_lpae_free_pages(start, table_size, &data->iop.cfg); +} + +static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data, + struct iommu_iotlb_gather *gather, + unsigned long iova, size_t size, + arm_lpae_iopte blk_pte, int lvl, + arm_lpae_iopte *ptep, size_t pgcount) +{ + struct io_pgtable_cfg *cfg = &data->iop.cfg; + arm_lpae_iopte pte, *tablep; + phys_addr_t blk_paddr; + size_t tablesz = ARM_LPAE_GRANULE(data); + size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data); + int ptes_per_table = ARM_LPAE_PTES_PER_TABLE(data); + int i, unmap_idx_start = -1, num_entries = 0, max_entries; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg); + if (!tablep) + return 0; /* Bytes unmapped */ + + if (size == split_sz) { + unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data); + max_entries = ptes_per_table - unmap_idx_start; + num_entries = min_t(int, pgcount, max_entries); + } + + blk_paddr = iopte_to_paddr(blk_pte, data); + pte = iopte_prot(blk_pte); + + for (i = 0; i < ptes_per_table; i++, blk_paddr += split_sz) { + /* Unmap! */ + if (i >= unmap_idx_start && i < (unmap_idx_start + num_entries)) + continue; + + __arm_lpae_init_pte(data, blk_paddr, pte, lvl, 1, &tablep[i]); + } + + pte = arm_lpae_install_table(tablep, ptep, blk_pte, data); + if (pte != blk_pte) { + __arm_lpae_free_pages(tablep, tablesz, cfg); + /* + * We may race against someone unmapping another part of this + * block, but anything else is invalid. 
We can't misinterpret + * a page entry here since we're never at the last level. + */ + if (iopte_type(pte) != ARM_LPAE_PTE_TYPE_TABLE) + return 0; + + tablep = iopte_deref(pte, data); + } else if (unmap_idx_start >= 0) { + for (i = 0; i < num_entries; i++) + io_pgtable_tlb_add_page(&data->iop, gather, iova + i * size, size); + + return num_entries * size; + } + + return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl, tablep); +} + +static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, + struct iommu_iotlb_gather *gather, + unsigned long iova, size_t size, size_t pgcount, + int lvl, arm_lpae_iopte *ptep) +{ + arm_lpae_iopte pte; + struct io_pgtable *iop = &data->iop; + int i = 0, num_entries, max_entries, unmap_idx_start; + + /* Something went horribly wrong and we ran out of page table */ + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data); + ptep += unmap_idx_start; + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return 0; + + /* If the size matches this level, we're in the right place */ + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + max_entries = ARM_LPAE_PTES_PER_TABLE(data) - unmap_idx_start; + num_entries = min_t(int, pgcount, max_entries); + + while (i < num_entries) { + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + break; + + __arm_lpae_clear_pte(ptep, &iop->cfg); + + if (!iopte_leaf(pte, lvl, iop->fmt)) { + /* Also flush any partial walks */ + io_pgtable_tlb_flush_walk(iop, iova + i * size, size, + ARM_LPAE_GRANULE(data)); + __arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data)); + } else if (!iommu_iotlb_gather_queued(gather)) { + io_pgtable_tlb_add_page(iop, gather, iova + i * size, size); + } + + ptep++; + i++; + } + + return i * size; + } else if (iopte_leaf(pte, lvl, iop->fmt)) { + /* + * Insert a table at the next level to map the old region, + * minus the part we want to unmap + */ + return arm_lpae_split_blk_unmap(data, gather, iova, size, pte, + lvl + 1, ptep, pgcount); + } + + /* Keep on walkin' */ + ptep = iopte_deref(pte, data); + return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl + 1, ptep); +} + +size_t arm_lpae_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova, + size_t pgsize, size_t pgcount, + struct iommu_iotlb_gather *gather) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + struct io_pgtable_cfg *cfg = &data->iop.cfg; + arm_lpae_iopte *ptep = data->pgd; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!pgsize || (pgsize & cfg->pgsize_bitmap) != pgsize || !pgcount)) + return 0; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext)) + return 0; + + return __arm_lpae_unmap(data, gather, iova, pgsize, pgcount, + data->start_level, ptep); +} + +phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, + unsigned long iova) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + arm_lpae_iopte pte, *ptep = data->pgd; + int lvl = data->start_level; + + do { + /* Valid IOPTE pointer? */ + if (!ptep) + return 0; + + /* Grab the IOPTE we're interested in */ + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + + /* Valid entry? */ + if (!pte) + return 0; + + /* Leaf entry? 
*/ + if (iopte_leaf(pte, lvl, data->iop.fmt)) + goto found_translation; + + /* Take it to the next level */ + ptep = iopte_deref(pte, data); + } while (++lvl < ARM_LPAE_MAX_LEVELS); + + /* Ran out of page tables to walk */ + return 0; + +found_translation: + iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1); + return iopte_to_paddr(pte, data) | iova; +} diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 72dcdd468cf3..db42aed6ad7b 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* * CPU-agnostic ARM page table allocator. + * Host-specific functions. The rest is in io-pgtable-arm-common.c. * * Copyright (C) 2014 ARM Limited * @@ -11,7 +12,7 @@ #include #include -#include +#include #include #include #include @@ -20,175 +21,17 @@ #include -#include "io-pgtable-arm.h" - #define ARM_LPAE_MAX_ADDR_BITS 52 #define ARM_LPAE_S2_MAX_CONCAT_PAGES 16 -#define ARM_LPAE_MAX_LEVELS 4 - -/* Struct accessors */ -#define io_pgtable_to_data(x) \ - container_of((x), struct arm_lpae_io_pgtable, iop) - -#define io_pgtable_ops_to_data(x) \ - io_pgtable_to_data(io_pgtable_ops_to_pgtable(x)) - -/* - * Calculate the right shift amount to get to the portion describing level l - * in a virtual address mapped by the pagetable in d. - */ -#define ARM_LPAE_LVL_SHIFT(l,d) \ - (((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level) + \ - ilog2(sizeof(arm_lpae_iopte))) -#define ARM_LPAE_GRANULE(d) \ - (sizeof(arm_lpae_iopte) << (d)->bits_per_level) -#define ARM_LPAE_PGD_SIZE(d) \ - (sizeof(arm_lpae_iopte) << (d)->pgd_bits) - -#define ARM_LPAE_PTES_PER_TABLE(d) \ - (ARM_LPAE_GRANULE(d) >> ilog2(sizeof(arm_lpae_iopte))) - -/* - * Calculate the index at level l used to map virtual address a using the - * pagetable in d. - */ -#define ARM_LPAE_PGD_IDX(l,d) \ - ((l) == (d)->start_level ? (d)->pgd_bits - (d)->bits_per_level : 0) - -#define ARM_LPAE_LVL_IDX(a,l,d) \ - (((u64)(a) >> ARM_LPAE_LVL_SHIFT(l,d)) & \ - ((1 << ((d)->bits_per_level + ARM_LPAE_PGD_IDX(l,d))) - 1)) - -/* Calculate the block/page mapping size at level l for pagetable in d. 
*/ -#define ARM_LPAE_BLOCK_SIZE(l,d) (1ULL << ARM_LPAE_LVL_SHIFT(l,d)) - -/* Page table bits */ -#define ARM_LPAE_PTE_TYPE_SHIFT 0 -#define ARM_LPAE_PTE_TYPE_MASK 0x3 - -#define ARM_LPAE_PTE_TYPE_BLOCK 1 -#define ARM_LPAE_PTE_TYPE_TABLE 3 -#define ARM_LPAE_PTE_TYPE_PAGE 3 - -#define ARM_LPAE_PTE_ADDR_MASK GENMASK_ULL(47,12) - -#define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63) -#define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53) -#define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10) -#define ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8) -#define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8) -#define ARM_LPAE_PTE_SH_IS (((arm_lpae_iopte)3) << 8) -#define ARM_LPAE_PTE_NS (((arm_lpae_iopte)1) << 5) -#define ARM_LPAE_PTE_VALID (((arm_lpae_iopte)1) << 0) - -#define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2) -/* Ignore the contiguous bit for block splitting */ -#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52) -#define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK | \ - ARM_LPAE_PTE_ATTR_HI_MASK) -/* Software bit for solving coherency races */ -#define ARM_LPAE_PTE_SW_SYNC (((arm_lpae_iopte)1) << 55) - -/* Stage-1 PTE */ -#define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6) -#define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2) << 6) -#define ARM_LPAE_PTE_ATTRINDX_SHIFT 2 -#define ARM_LPAE_PTE_nG (((arm_lpae_iopte)1) << 11) - -/* Stage-2 PTE */ -#define ARM_LPAE_PTE_HAP_FAULT (((arm_lpae_iopte)0) << 6) -#define ARM_LPAE_PTE_HAP_READ (((arm_lpae_iopte)1) << 6) -#define ARM_LPAE_PTE_HAP_WRITE (((arm_lpae_iopte)2) << 6) -#define ARM_LPAE_PTE_MEMATTR_OIWB (((arm_lpae_iopte)0xf) << 2) -#define ARM_LPAE_PTE_MEMATTR_NC (((arm_lpae_iopte)0x5) << 2) -#define ARM_LPAE_PTE_MEMATTR_DEV (((arm_lpae_iopte)0x1) << 2) - -/* Register bits */ -#define ARM_LPAE_VTCR_SL0_MASK 0x3 - -#define ARM_LPAE_TCR_T0SZ_SHIFT 0 - -#define ARM_LPAE_VTCR_PS_SHIFT 16 -#define ARM_LPAE_VTCR_PS_MASK 0x7 - -#define ARM_LPAE_MAIR_ATTR_SHIFT(n) ((n) << 3) -#define ARM_LPAE_MAIR_ATTR_MASK 0xff -#define ARM_LPAE_MAIR_ATTR_DEVICE 0x04 -#define ARM_LPAE_MAIR_ATTR_NC 0x44 -#define ARM_LPAE_MAIR_ATTR_INC_OWBRWA 0xf4 -#define ARM_LPAE_MAIR_ATTR_WBRWA 0xff -#define ARM_LPAE_MAIR_ATTR_IDX_NC 0 -#define ARM_LPAE_MAIR_ATTR_IDX_CACHE 1 -#define ARM_LPAE_MAIR_ATTR_IDX_DEV 2 -#define ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE 3 - -#define ARM_MALI_LPAE_TTBR_ADRMODE_TABLE (3u << 0) -#define ARM_MALI_LPAE_TTBR_READ_INNER BIT(2) -#define ARM_MALI_LPAE_TTBR_SHARE_OUTER BIT(4) - -#define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL -#define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL - -/* IOPTE accessors */ -#define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d)) - -#define iopte_type(pte) \ - (((pte) >> ARM_LPAE_PTE_TYPE_SHIFT) & ARM_LPAE_PTE_TYPE_MASK) - -#define iopte_prot(pte) ((pte) & ARM_LPAE_PTE_ATTR_MASK) - -struct arm_lpae_io_pgtable { - struct io_pgtable iop; - - int pgd_bits; - int start_level; - int bits_per_level; - - void *pgd; -}; - -typedef u64 arm_lpae_iopte; - -static inline bool iopte_leaf(arm_lpae_iopte pte, int lvl, - enum io_pgtable_fmt fmt) -{ - if (lvl == (ARM_LPAE_MAX_LEVELS - 1) && fmt != ARM_MALI_LPAE) - return iopte_type(pte) == ARM_LPAE_PTE_TYPE_PAGE; - - return iopte_type(pte) == ARM_LPAE_PTE_TYPE_BLOCK; -} - -static arm_lpae_iopte paddr_to_iopte(phys_addr_t paddr, - struct arm_lpae_io_pgtable *data) -{ - arm_lpae_iopte pte = paddr; - - /* Of the bits which overlap, either 51:48 or 15:12 are always RES0 */ - return (pte | (pte >> (48 - 12))) & ARM_LPAE_PTE_ADDR_MASK; -} - -static phys_addr_t 
iopte_to_paddr(arm_lpae_iopte pte, - struct arm_lpae_io_pgtable *data) -{ - u64 paddr = pte & ARM_LPAE_PTE_ADDR_MASK; - - if (ARM_LPAE_GRANULE(data) < SZ_64K) - return paddr; - - /* Rotate the packed high-order bits back to the top */ - return (paddr | (paddr << (48 - 12))) & (ARM_LPAE_PTE_ADDR_MASK << 4); -} - -static bool selftest_running = false; +bool selftest_running = false; static dma_addr_t __arm_lpae_dma_addr(void *pages) { return (dma_addr_t)virt_to_phys(pages); } -static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp, - struct io_pgtable_cfg *cfg) +void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp, struct io_pgtable_cfg *cfg) { struct device *dev = cfg->iommu_dev; int order = get_order(size); @@ -225,8 +68,7 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp, return NULL; } -static void __arm_lpae_free_pages(void *pages, size_t size, - struct io_pgtable_cfg *cfg) +void __arm_lpae_free_pages(void *pages, size_t size, struct io_pgtable_cfg *cfg) { if (!cfg->coherent_walk) dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages), @@ -234,299 +76,13 @@ static void __arm_lpae_free_pages(void *pages, size_t size, free_pages((unsigned long)pages, get_order(size)); } -static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries, - struct io_pgtable_cfg *cfg) +void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries, + struct io_pgtable_cfg *cfg) { dma_sync_single_for_device(cfg->iommu_dev, __arm_lpae_dma_addr(ptep), sizeof(*ptep) * num_entries, DMA_TO_DEVICE); } -static void __arm_lpae_clear_pte(arm_lpae_iopte *ptep, struct io_pgtable_cfg *cfg) -{ - - *ptep = 0; - - if (!cfg->coherent_walk) - __arm_lpae_sync_pte(ptep, 1, cfg); -} - -static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, - struct iommu_iotlb_gather *gather, - unsigned long iova, size_t size, size_t pgcount, - int lvl, arm_lpae_iopte *ptep); - -static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, - phys_addr_t paddr, arm_lpae_iopte prot, - int lvl, int num_entries, arm_lpae_iopte *ptep) -{ - arm_lpae_iopte pte = prot; - struct io_pgtable_cfg *cfg = &data->iop.cfg; - size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data); - int i; - - if (data->iop.fmt != ARM_MALI_LPAE && lvl == ARM_LPAE_MAX_LEVELS - 1) - pte |= ARM_LPAE_PTE_TYPE_PAGE; - else - pte |= ARM_LPAE_PTE_TYPE_BLOCK; - - for (i = 0; i < num_entries; i++) - ptep[i] = pte | paddr_to_iopte(paddr + i * sz, data); - - if (!cfg->coherent_walk) - __arm_lpae_sync_pte(ptep, num_entries, cfg); -} - -static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, - unsigned long iova, phys_addr_t paddr, - arm_lpae_iopte prot, int lvl, int num_entries, - arm_lpae_iopte *ptep) -{ - int i; - - for (i = 0; i < num_entries; i++) - if (iopte_leaf(ptep[i], lvl, data->iop.fmt)) { - /* We require an unmap first */ - WARN_ON(!selftest_running); - return -EEXIST; - } else if (iopte_type(ptep[i]) == ARM_LPAE_PTE_TYPE_TABLE) { - /* - * We need to unmap and free the old table before - * overwriting it with a block entry. 
- */ - arm_lpae_iopte *tblp; - size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data); - - tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data); - if (__arm_lpae_unmap(data, NULL, iova + i * sz, sz, 1, - lvl, tblp) != sz) { - WARN_ON(1); - return -EINVAL; - } - } - - __arm_lpae_init_pte(data, paddr, prot, lvl, num_entries, ptep); - return 0; -} - -static arm_lpae_iopte arm_lpae_install_table(arm_lpae_iopte *table, - arm_lpae_iopte *ptep, - arm_lpae_iopte curr, - struct arm_lpae_io_pgtable *data) -{ - arm_lpae_iopte old, new; - struct io_pgtable_cfg *cfg = &data->iop.cfg; - - new = paddr_to_iopte(__pa(table), data) | ARM_LPAE_PTE_TYPE_TABLE; - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS) - new |= ARM_LPAE_PTE_NSTABLE; - - /* - * Ensure the table itself is visible before its PTE can be. - * Whilst we could get away with cmpxchg64_release below, this - * doesn't have any ordering semantics when !CONFIG_SMP. - */ - dma_wmb(); - - old = cmpxchg64_relaxed(ptep, curr, new); - - if (cfg->coherent_walk || (old & ARM_LPAE_PTE_SW_SYNC)) - return old; - - /* Even if it's not ours, there's no point waiting; just kick it */ - __arm_lpae_sync_pte(ptep, 1, cfg); - if (old == curr) - WRITE_ONCE(*ptep, new | ARM_LPAE_PTE_SW_SYNC); - - return old; -} - -static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova, - phys_addr_t paddr, size_t size, size_t pgcount, - arm_lpae_iopte prot, int lvl, arm_lpae_iopte *ptep, - gfp_t gfp, size_t *mapped) -{ - arm_lpae_iopte *cptep, pte; - size_t block_size = ARM_LPAE_BLOCK_SIZE(lvl, data); - size_t tblsz = ARM_LPAE_GRANULE(data); - struct io_pgtable_cfg *cfg = &data->iop.cfg; - int ret = 0, num_entries, max_entries, map_idx_start; - - /* Find our entry at the current level */ - map_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data); - ptep += map_idx_start; - - /* If we can install a leaf entry at this level, then do so */ - if (size == block_size) { - max_entries = ARM_LPAE_PTES_PER_TABLE(data) - map_idx_start; - num_entries = min_t(int, pgcount, max_entries); - ret = arm_lpae_init_pte(data, iova, paddr, prot, lvl, num_entries, ptep); - if (!ret) - *mapped += num_entries * size; - - return ret; - } - - /* We can't allocate tables at the final level */ - if (WARN_ON(lvl >= ARM_LPAE_MAX_LEVELS - 1)) - return -EINVAL; - - /* Grab a pointer to the next level */ - pte = READ_ONCE(*ptep); - if (!pte) { - cptep = __arm_lpae_alloc_pages(tblsz, gfp, cfg); - if (!cptep) - return -ENOMEM; - - pte = arm_lpae_install_table(cptep, ptep, 0, data); - if (pte) - __arm_lpae_free_pages(cptep, tblsz, cfg); - } else if (!cfg->coherent_walk && !(pte & ARM_LPAE_PTE_SW_SYNC)) { - __arm_lpae_sync_pte(ptep, 1, cfg); - } - - if (pte && !iopte_leaf(pte, lvl, data->iop.fmt)) { - cptep = iopte_deref(pte, data); - } else if (pte) { - /* We require an unmap first */ - WARN_ON(!selftest_running); - return -EEXIST; - } - - /* Rinse, repeat */ - return __arm_lpae_map(data, iova, paddr, size, pgcount, prot, lvl + 1, - cptep, gfp, mapped); -} - -static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, - int prot) -{ - arm_lpae_iopte pte; - - if (data->iop.fmt == ARM_64_LPAE_S1 || - data->iop.fmt == ARM_32_LPAE_S1) { - pte = ARM_LPAE_PTE_nG; - if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) - pte |= ARM_LPAE_PTE_AP_RDONLY; - if (!(prot & IOMMU_PRIV)) - pte |= ARM_LPAE_PTE_AP_UNPRIV; - } else { - pte = ARM_LPAE_PTE_HAP_FAULT; - if (prot & IOMMU_READ) - pte |= ARM_LPAE_PTE_HAP_READ; - if (prot & IOMMU_WRITE) - pte |= ARM_LPAE_PTE_HAP_WRITE; - } - - /* - * Note that this logic is structured 
to accommodate Mali LPAE - * having stage-1-like attributes but stage-2-like permissions. - */ - if (data->iop.fmt == ARM_64_LPAE_S2 || - data->iop.fmt == ARM_32_LPAE_S2) { - if (prot & IOMMU_MMIO) - pte |= ARM_LPAE_PTE_MEMATTR_DEV; - else if (prot & IOMMU_CACHE) - pte |= ARM_LPAE_PTE_MEMATTR_OIWB; - else - pte |= ARM_LPAE_PTE_MEMATTR_NC; - } else { - if (prot & IOMMU_MMIO) - pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV - << ARM_LPAE_PTE_ATTRINDX_SHIFT); - else if (prot & IOMMU_CACHE) - pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE - << ARM_LPAE_PTE_ATTRINDX_SHIFT); - } - - /* - * Also Mali has its own notions of shareability wherein its Inner - * domain covers the cores within the GPU, and its Outer domain is - * "outside the GPU" (i.e. either the Inner or System domain in CPU - * terms, depending on coherency). - */ - if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE) - pte |= ARM_LPAE_PTE_SH_IS; - else - pte |= ARM_LPAE_PTE_SH_OS; - - if (prot & IOMMU_NOEXEC) - pte |= ARM_LPAE_PTE_XN; - - if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_NS) - pte |= ARM_LPAE_PTE_NS; - - if (data->iop.fmt != ARM_MALI_LPAE) - pte |= ARM_LPAE_PTE_AF; - - return pte; -} - -static int arm_lpae_map_pages(struct io_pgtable_ops *ops, unsigned long iova, - phys_addr_t paddr, size_t pgsize, size_t pgcount, - int iommu_prot, gfp_t gfp, size_t *mapped) -{ - struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); - struct io_pgtable_cfg *cfg = &data->iop.cfg; - arm_lpae_iopte *ptep = data->pgd; - int ret, lvl = data->start_level; - arm_lpae_iopte prot; - long iaext = (s64)iova >> cfg->ias; - - if (WARN_ON(!pgsize || (pgsize & cfg->pgsize_bitmap) != pgsize)) - return -EINVAL; - - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) - iaext = ~iaext; - if (WARN_ON(iaext || paddr >> cfg->oas)) - return -ERANGE; - - /* If no access, then nothing to do */ - if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE))) - return 0; - - prot = arm_lpae_prot_to_pte(data, iommu_prot); - ret = __arm_lpae_map(data, iova, paddr, pgsize, pgcount, prot, lvl, - ptep, gfp, mapped); - /* - * Synchronise all PTE updates for the new mapping before there's - * a chance for anything to kick off a table walk for the new iova. 
- */ - wmb(); - - return ret; -} - -static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl, - arm_lpae_iopte *ptep) -{ - arm_lpae_iopte *start, *end; - unsigned long table_size; - - if (lvl == data->start_level) - table_size = ARM_LPAE_PGD_SIZE(data); - else - table_size = ARM_LPAE_GRANULE(data); - - start = ptep; - - /* Only leaf entries at the last level */ - if (lvl == ARM_LPAE_MAX_LEVELS - 1) - end = ptep; - else - end = (void *)ptep + table_size; - - while (ptep != end) { - arm_lpae_iopte pte = *ptep++; - - if (!pte || iopte_leaf(pte, lvl, data->iop.fmt)) - continue; - - __arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data)); - } - - __arm_lpae_free_pages(start, table_size, &data->iop.cfg); -} - static void arm_lpae_free_pgtable(struct io_pgtable *iop) { struct arm_lpae_io_pgtable *data = io_pgtable_to_data(iop); @@ -535,182 +91,6 @@ static void arm_lpae_free_pgtable(struct io_pgtable *iop) kfree(data); } -static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data, - struct iommu_iotlb_gather *gather, - unsigned long iova, size_t size, - arm_lpae_iopte blk_pte, int lvl, - arm_lpae_iopte *ptep, size_t pgcount) -{ - struct io_pgtable_cfg *cfg = &data->iop.cfg; - arm_lpae_iopte pte, *tablep; - phys_addr_t blk_paddr; - size_t tablesz = ARM_LPAE_GRANULE(data); - size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data); - int ptes_per_table = ARM_LPAE_PTES_PER_TABLE(data); - int i, unmap_idx_start = -1, num_entries = 0, max_entries; - - if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) - return 0; - - tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg); - if (!tablep) - return 0; /* Bytes unmapped */ - - if (size == split_sz) { - unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data); - max_entries = ptes_per_table - unmap_idx_start; - num_entries = min_t(int, pgcount, max_entries); - } - - blk_paddr = iopte_to_paddr(blk_pte, data); - pte = iopte_prot(blk_pte); - - for (i = 0; i < ptes_per_table; i++, blk_paddr += split_sz) { - /* Unmap! */ - if (i >= unmap_idx_start && i < (unmap_idx_start + num_entries)) - continue; - - __arm_lpae_init_pte(data, blk_paddr, pte, lvl, 1, &tablep[i]); - } - - pte = arm_lpae_install_table(tablep, ptep, blk_pte, data); - if (pte != blk_pte) { - __arm_lpae_free_pages(tablep, tablesz, cfg); - /* - * We may race against someone unmapping another part of this - * block, but anything else is invalid. We can't misinterpret - * a page entry here since we're never at the last level. 
- */ - if (iopte_type(pte) != ARM_LPAE_PTE_TYPE_TABLE) - return 0; - - tablep = iopte_deref(pte, data); - } else if (unmap_idx_start >= 0) { - for (i = 0; i < num_entries; i++) - io_pgtable_tlb_add_page(&data->iop, gather, iova + i * size, size); - - return num_entries * size; - } - - return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl, tablep); -} - -static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, - struct iommu_iotlb_gather *gather, - unsigned long iova, size_t size, size_t pgcount, - int lvl, arm_lpae_iopte *ptep) -{ - arm_lpae_iopte pte; - struct io_pgtable *iop = &data->iop; - int i = 0, num_entries, max_entries, unmap_idx_start; - - /* Something went horribly wrong and we ran out of page table */ - if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) - return 0; - - unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data); - ptep += unmap_idx_start; - pte = READ_ONCE(*ptep); - if (WARN_ON(!pte)) - return 0; - - /* If the size matches this level, we're in the right place */ - if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { - max_entries = ARM_LPAE_PTES_PER_TABLE(data) - unmap_idx_start; - num_entries = min_t(int, pgcount, max_entries); - - while (i < num_entries) { - pte = READ_ONCE(*ptep); - if (WARN_ON(!pte)) - break; - - __arm_lpae_clear_pte(ptep, &iop->cfg); - - if (!iopte_leaf(pte, lvl, iop->fmt)) { - /* Also flush any partial walks */ - io_pgtable_tlb_flush_walk(iop, iova + i * size, size, - ARM_LPAE_GRANULE(data)); - __arm_lpae_free_pgtable(data, lvl + 1, iopte_deref(pte, data)); - } else if (!iommu_iotlb_gather_queued(gather)) { - io_pgtable_tlb_add_page(iop, gather, iova + i * size, size); - } - - ptep++; - i++; - } - - return i * size; - } else if (iopte_leaf(pte, lvl, iop->fmt)) { - /* - * Insert a table at the next level to map the old region, - * minus the part we want to unmap - */ - return arm_lpae_split_blk_unmap(data, gather, iova, size, pte, - lvl + 1, ptep, pgcount); - } - - /* Keep on walkin' */ - ptep = iopte_deref(pte, data); - return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl + 1, ptep); -} - -static size_t arm_lpae_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova, - size_t pgsize, size_t pgcount, - struct iommu_iotlb_gather *gather) -{ - struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); - struct io_pgtable_cfg *cfg = &data->iop.cfg; - arm_lpae_iopte *ptep = data->pgd; - long iaext = (s64)iova >> cfg->ias; - - if (WARN_ON(!pgsize || (pgsize & cfg->pgsize_bitmap) != pgsize || !pgcount)) - return 0; - - if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) - iaext = ~iaext; - if (WARN_ON(iaext)) - return 0; - - return __arm_lpae_unmap(data, gather, iova, pgsize, pgcount, - data->start_level, ptep); -} - -static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, - unsigned long iova) -{ - struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); - arm_lpae_iopte pte, *ptep = data->pgd; - int lvl = data->start_level; - - do { - /* Valid IOPTE pointer? */ - if (!ptep) - return 0; - - /* Grab the IOPTE we're interested in */ - ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); - pte = READ_ONCE(*ptep); - - /* Valid entry? */ - if (!pte) - return 0; - - /* Leaf entry? 
*/ - if (iopte_leaf(pte, lvl, data->iop.fmt)) - goto found_translation; - - /* Take it to the next level */ - ptep = iopte_deref(pte, data); - } while (++lvl < ARM_LPAE_MAX_LEVELS); - - /* Ran out of page tables to walk */ - return 0; - -found_translation: - iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1); - return iopte_to_paddr(pte, data) | iova; -} - static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes;