From patchwork Wed Mar 10 09:06:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12127347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62920C433E0 for ; Wed, 10 Mar 2021 09:11:04 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C673C64FF1 for ; Wed, 10 Mar 2021 09:11:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C673C64FF1 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding :Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-ID:Date: Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=ycruSaNxl2srmd8TLZbSwR673lRLF85DaGdq63hbvRA=; b=GasdHA7Ev4CeliEknz3y74l9M jo4lUcNutaSVj58rwYS0enYMwDczSED3bt/0Drdbbnho4f1THbsXTmP+IpY1YAkYBOaLV3rSI8gFz 
ArL1UK6Hpuz+ISGm4zVqGLGM9VKC2Rh9O3iT7X27L8X5RauWvQGD1AzZRyhUtim9fVkIgzJZ3FcFX iqtpu0W67nmLfwOUxlar9nyUBIsS7eXTVUEiQ4eWZ2XRiN3mfdLrQbVooU6WZ02ePFvEj+a1q6fiz oVZcV9vcJppLxXllCNKA12QpdPdwRU/NK0HsowRbWzNX7y1goOpwGOa8UN0zYdMj21F7aOHCkFwod BhjhVK0hg==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1lJuqD-006PiH-8X; Wed, 10 Mar 2021 09:09:17 +0000 Received: from szxga05-in.huawei.com ([45.249.212.191]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lJuoA-006OrT-Sf for linux-arm-kernel@lists.infradead.org; Wed, 10 Mar 2021 09:07:16 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4DwR3z0lTdzrTKk; Wed, 10 Mar 2021 17:05:03 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.498.0; Wed, 10 Mar 2021 17:06:19 +0800 From: Keqian Zhu To: , , , Alex Williamson , Robin Murphy , Yi Sun , Will Deacon CC: Kirti Wankhede , Cornelia Huck , Marc Zyngier , Catalin Marinas , Mark Rutland , James Morse , Suzuki K Poulose , , , , , Jean-Philippe Brucker Subject: [PATCH v2 01/11] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update Date: Wed, 10 Mar 2021 17:06:04 +0800 Message-ID: <20210310090614.26668-2-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com> References: <20210310090614.26668-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210310_090712_365163_E60BF57F X-CRM114-Status: GOOD ( 20.61 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: Jean-Philippe Brucker If the SMMU supports it and the kernel was built with HTTU support, enable hardware update of access and dirty flags. This is essential for shared page tables, to reduce the number of access faults on the fault queue. Normal DMA with io-pgtables doesn't currently use the access or dirty flags. We can enable HTTU even if CPUs don't support it, because the kernel always checks for HW dirty bit and updates the PTE flags atomically. Signed-off-by: Jean-Philippe Brucker --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 + drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 ++++++++++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 ++++ 3 files changed, 50 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index bb251cab61f3..ae075e675892 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -121,10 +121,12 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) if (err) goto out_free_asid; + /* HA and HD will be filtered out later if not supported by the SMMU */ tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - vabits_actual) | FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) | FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) | FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) | + CTXDESC_CD_0_TCR_HA | CTXDESC_CD_0_TCR_HD | CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; switch (PAGE_SIZE) { diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8594b4a83043..b6d965504f44 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1012,10 +1012,17 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain 
*smmu_domain, int ssid, * this substream's traffic */ } else { /* (1) and (2) */ + u64 tcr = cd->tcr; + cdptr[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); cdptr[2] = 0; cdptr[3] = cpu_to_le64(cd->mair); + if (!(smmu->features & ARM_SMMU_FEAT_HD)) + tcr &= ~CTXDESC_CD_0_TCR_HD; + if (!(smmu->features & ARM_SMMU_FEAT_HA)) + tcr &= ~CTXDESC_CD_0_TCR_HA; + /* * STE is live, and the SMMU might read dwords of this CD in any * order. Ensure that it observes valid values before reading @@ -1023,7 +1030,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, */ arm_smmu_sync_cd(smmu_domain, ssid, true); - val = cd->tcr | + val = tcr | #ifdef __BIG_ENDIAN CTXDESC_CD_0_ENDI | #endif @@ -3196,6 +3203,28 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) return 0; } +static void arm_smmu_get_httu(struct arm_smmu_device *smmu, u32 reg) +{ + u32 fw_features = smmu->features & (ARM_SMMU_FEAT_HA | ARM_SMMU_FEAT_HD); + u32 features = 0; + + switch (FIELD_GET(IDR0_HTTU, reg)) { + case IDR0_HTTU_ACCESS_DIRTY: + features |= ARM_SMMU_FEAT_HD; + fallthrough; + case IDR0_HTTU_ACCESS: + features |= ARM_SMMU_FEAT_HA; + } + + if (smmu->dev->of_node) + smmu->features |= features; + else if (features != fw_features) + /* ACPI IORT sets the HTTU bits */ + dev_warn(smmu->dev, + "IDR0.HTTU overridden by FW configuration (0x%x)\n", + fw_features); +} + static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) { u32 reg; @@ -3256,6 +3285,8 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) smmu->features |= ARM_SMMU_FEAT_E2H; } + arm_smmu_get_httu(smmu, reg); + /* * The coherency feature as set by FW is used in preference to the ID * register, but warn on mismatch. 
@@ -3441,6 +3472,14 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev, if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE) smmu->features |= ARM_SMMU_FEAT_COHERENCY; + switch (FIELD_GET(ACPI_IORT_SMMU_V3_HTTU_OVERRIDE, iort_smmu->flags)) { + case IDR0_HTTU_ACCESS_DIRTY: + smmu->features |= ARM_SMMU_FEAT_HD; + fallthrough; + case IDR0_HTTU_ACCESS: + smmu->features |= ARM_SMMU_FEAT_HA; + } + return 0; } #else diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index f985817c967a..26d6b935b383 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -33,6 +33,9 @@ #define IDR0_ASID16 (1 << 12) #define IDR0_ATS (1 << 10) #define IDR0_HYP (1 << 9) +#define IDR0_HTTU GENMASK(7, 6) +#define IDR0_HTTU_ACCESS 1 +#define IDR0_HTTU_ACCESS_DIRTY 2 #define IDR0_COHACC (1 << 4) #define IDR0_TTF GENMASK(3, 2) #define IDR0_TTF_AARCH64 2 @@ -285,6 +288,9 @@ #define CTXDESC_CD_0_TCR_IPS GENMASK_ULL(34, 32) #define CTXDESC_CD_0_TCR_TBI0 (1ULL << 38) +#define CTXDESC_CD_0_TCR_HA (1UL << 43) +#define CTXDESC_CD_0_TCR_HD (1UL << 42) + #define CTXDESC_CD_0_AA64 (1UL << 41) #define CTXDESC_CD_0_S (1UL << 44) #define CTXDESC_CD_0_R (1UL << 45) @@ -607,6 +613,8 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_BTM (1 << 16) #define ARM_SMMU_FEAT_SVA (1 << 17) #define ARM_SMMU_FEAT_E2H (1 << 18) +#define ARM_SMMU_FEAT_HA (1 << 19) +#define ARM_SMMU_FEAT_HD (1 << 20) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) From patchwork Wed Mar 10 09:06:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12127329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, 
DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F7C3C433E0 for ; Wed, 10 Mar 2021 09:09:04 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 72D4564FEF for ; Wed, 10 Mar 2021 09:09:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 72D4564FEF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding :Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-ID:Date: Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=e+wxvmS5X4nJJoW93J3KtJCcXftWwp14wa1t8jenvmU=; b=QpvqyLNx5lv3xGj1eEjXQ7jFT k1oITC1bEfBU8Sjb/YTzBqVcqywR3h5f7wTRgUOQBg2XUrAE3pRncP+cH3qeFwh3f3TSQk+NWVRmk Aqj2sme0m3AXZ0iRJQOA1TCifMSw6XVypst+pLaKoGBwgQ4G5P9ElyKnn8DXv89Cuq+wLk1aYaHT6 Twe3U7hPXDBOzjT6sG4dyXowsrCQd9nRwNL3NYfvKXxg58XSt3n7Iy3OdxkA5ra0kDXLJgnvEMCCN ChR/b9d7XBknRAMd4fKk0r3Ljj5Cd67PXgf8zRiS72iu3mgZIF5GTTznDrRRDg+VSX10xT66JBkiU 7+ybopTuQ==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1lJuoJ-006Owg-Sw; Wed, 10 Mar 2021 09:07:20 +0000 Received: from 
szxga05-in.huawei.com ([45.249.212.191]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lJuo2-006Onj-Ln for linux-arm-kernel@lists.infradead.org; Wed, 10 Mar 2021 09:07:06 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4DwR3z1DfBzrTKp; Wed, 10 Mar 2021 17:05:03 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.498.0; Wed, 10 Mar 2021 17:06:19 +0800 From: Keqian Zhu To: , , , Alex Williamson , Robin Murphy , Yi Sun , Will Deacon CC: Kirti Wankhede , Cornelia Huck , Marc Zyngier , Catalin Marinas , Mark Rutland , James Morse , Suzuki K Poulose , , , , Subject: [PATCH v2 02/11] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Date: Wed, 10 Mar 2021 17:06:05 +0800 Message-ID: <20210310090614.26668-3-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com> References: <20210310090614.26668-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210310_090703_377497_096CFDAF X-CRM114-Status: GOOD ( 15.41 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: jiangkunkun If HTTU is supported, we enable HA/HD bits in the SMMU CD (stage 1 mapping), and set DBM bit for writable TTD. 
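For illustration only (not part of this patch): a minimal user-space sketch of how the protection flags could map to stage 1 PTE bits once the HD quirk is set. All SKETCH_* names and the iopte_t type are hypothetical stand-ins. The DBM position (bit 51) is taken from this patch; the nG, AP[1] and AP[2] positions and the prot flag values are assumptions based on the existing io-pgtable-arm.c and iommu.h definitions. The dirty predicate assumes the Arm stage 1 convention that a DBM descriptor with AP[2] clear has been written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t iopte_t;	/* hypothetical stand-in for arm_lpae_iopte */
static_assert(sizeof(iopte_t) == 8, "LPAE descriptors are 64 bits wide");

/* Assumed stand-ins for the kernel's IOMMU_READ/WRITE/PRIV prot flags. */
#define SKETCH_IOMMU_READ	(1 << 0)
#define SKETCH_IOMMU_WRITE	(1 << 1)
#define SKETCH_IOMMU_PRIV	(1 << 5)

/* Descriptor bits: DBM is from this patch, the others are assumptions. */
#define SKETCH_PTE_DBM		((iopte_t)1 << 51)
#define SKETCH_PTE_NG		((iopte_t)1 << 11)
#define SKETCH_PTE_AP_RDONLY	((iopte_t)2 << 6)	/* AP[2] */
#define SKETCH_PTE_AP_UNPRIV	((iopte_t)1 << 6)	/* AP[1] */

/* Mirrors the stage 1 branch of arm_lpae_prot_to_pte() in this patch. */
static iopte_t sketch_prot_to_pte_s1(int prot, bool hd_quirk)
{
	iopte_t pte = SKETCH_PTE_NG;

	if (!(prot & SKETCH_IOMMU_WRITE) && (prot & SKETCH_IOMMU_READ))
		pte |= SKETCH_PTE_AP_RDONLY;	/* read-only: never DBM */
	else if (hd_quirk)
		pte |= SKETCH_PTE_DBM;		/* writable: let HW track dirty */

	if (!(prot & SKETCH_IOMMU_PRIV))
		pte |= SKETCH_PTE_AP_UNPRIV;

	return pte;
}

/* Assumed convention: DBM set and AP[2] clear means writable and dirty. */
static bool sketch_pte_writable_dirty(iopte_t pte)
{
	return (pte & SKETCH_PTE_DBM) && !(pte & SKETCH_PTE_AP_RDONLY);
}
```

Under these assumptions, a freshly created writable mapping already has AP[2] clear, so it reads as dirty until software marks it clean by setting AP[2].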
The dirty state information is encoded using the access permission bits AP[2] (stage 1) or S2AP[1] (stage 2) in conjunction with the DBM (Dirty Bit Modifier) bit, where DBM means writable and AP[2]/ S2AP[1] means dirty. Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- changelog: v2: - Use a new quirk flag named IO_PGTABLE_QUIRK_ARM_HD to transfer SMMU HD feature to io-pgtable. (Robin) - Rebase on Jean's HTTU patch(#1). --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++ drivers/iommu/io-pgtable-arm.c | 7 ++++++- include/linux/io-pgtable.h | 3 +++ 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index b6d965504f44..369c0ea7a104 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1921,6 +1921,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) | FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) | FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) | + CTXDESC_CD_0_TCR_HA | CTXDESC_CD_0_TCR_HD | CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; cfg->cd.mair = pgtbl_cfg->arm_lpae_s1_cfg.mair; @@ -2026,6 +2027,8 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, if (smmu_domain->non_strict) pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT; + if (smmu->features & ARM_SMMU_FEAT_HD) + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD; pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain); if (!pgtbl_ops) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 87def58e79b5..94d790b8ed27 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -72,6 +72,7 @@ #define ARM_LPAE_PTE_NSTABLE (((arm_lpae_iopte)1) << 63) #define ARM_LPAE_PTE_XN (((arm_lpae_iopte)3) << 53) +#define ARM_LPAE_PTE_DBM (((arm_lpae_iopte)1) << 51) #define ARM_LPAE_PTE_AF (((arm_lpae_iopte)1) << 10) #define 
ARM_LPAE_PTE_SH_NS (((arm_lpae_iopte)0) << 8) #define ARM_LPAE_PTE_SH_OS (((arm_lpae_iopte)2) << 8) @@ -81,7 +82,7 @@ #define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2) /* Ignore the contiguous bit for block splitting */ -#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52) +#define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)13) << 51) #define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK | \ ARM_LPAE_PTE_ATTR_HI_MASK) /* Software bit for solving coherency races */ @@ -379,6 +380,7 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova, static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, int prot) { + struct io_pgtable_cfg *cfg = &data->iop.cfg; arm_lpae_iopte pte; if (data->iop.fmt == ARM_64_LPAE_S1 || @@ -386,6 +388,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, pte = ARM_LPAE_PTE_nG; if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) pte |= ARM_LPAE_PTE_AP_RDONLY; + else if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_HD) + pte |= ARM_LPAE_PTE_DBM; + if (!(prot & IOMMU_PRIV)) pte |= ARM_LPAE_PTE_AP_UNPRIV; } else { diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index a4c9ca2c31f1..64cee6831c97 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -82,6 +82,8 @@ struct io_pgtable_cfg { * * IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the outer-cacheability * attributes set in the TCR for a non-coherent page-table walker. + * + * IO_PGTABLE_QUIRK_ARM_HD: Support hardware management of dirty status. 
*/ #define IO_PGTABLE_QUIRK_ARM_NS BIT(0) #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1) @@ -89,6 +91,7 @@ struct io_pgtable_cfg { #define IO_PGTABLE_QUIRK_NON_STRICT BIT(4) #define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5) #define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA BIT(6) + #define IO_PGTABLE_QUIRK_ARM_HD BIT(7) unsigned long quirks; unsigned long pgsize_bitmap; unsigned int ias; From patchwork Wed Mar 10 09:06:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12127335 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2CA90C433DB for ; Wed, 10 Mar 2021 09:09:23 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9D20764F2D for ; Wed, 10 Mar 2021 09:09:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9D20764F2D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding :Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: 
List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-ID:Date: Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=JVt8BwxuK88nSE2K+jE1rmwX4C5ln47dPuIT7m0OBzg=; b=YyF+T1P78y8xvXb/BeV4fNbC9 fbmMWXo8mvvO/RJ6+u0pgv+aLQKIc9SXVKHrf4QBuySWwT6juZoALRm8tSErpmDsnHbDEcX+r1VJ6 /8Jsge82xP+sS06ZIDuMHuAPAJK90CafCP470DQgPnXZr4uw89VUAFV4nTzaXRsfpcNUXvr6LksDR fhYbbp2ZuGDOg7Vwfqnx8UmMOKtbqsh4K22iUrTvk4ysEkB3epnK179aDpAnfp/X1pVANW4cLXwPn /t+ktUHEjS8QOOjnDSyFp7SNY0tdpyJ5cSvjmT3tyP+8di2H0E9M7In8ABV5+72CxMO62RwpGjEB7 PBIoTcgPw==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1lJuor-006P4v-4D; Wed, 10 Mar 2021 09:07:54 +0000 Received: from szxga05-in.huawei.com ([45.249.212.191]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lJuo4-006Onl-2U for linux-arm-kernel@lists.infradead.org; Wed, 10 Mar 2021 09:07:07 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4DwR3y6rjKzrTKP; Wed, 10 Mar 2021 17:05:02 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.498.0; Wed, 10 Mar 2021 17:06:20 +0800 From: Keqian Zhu To: , , , Alex Williamson , Robin Murphy , Yi Sun , Will Deacon CC: Kirti Wankhede , Cornelia Huck , Marc Zyngier , Catalin Marinas , Mark Rutland , James Morse , Suzuki K Poulose , , , , Subject: [PATCH v2 03/11] iommu/arm-smmu-v3: Add feature detection for BBML Date: Wed, 10 Mar 2021 17:06:06 +0800 Message-ID: <20210310090614.26668-4-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com> References: <20210310090614.26668-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: 
Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210310_090705_487022_B49A278F X-CRM114-Status: GOOD ( 13.26 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: jiangkunkun When altering a translation table descriptor in certain cases, we require the break-before-make procedure. But this might cause problems when the TTD is live, because the I/O streams might not tolerate translation faults. If the SMMU supports BBM level 1 or BBM level 2, we can change the block size without using the break-before-make sequence. This adds feature detection for BBML; no functional change expected. Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- changelog: v2: - Use two new quirk flags named IO_PGTABLE_QUIRK_ARM_BBML1/2 to transfer SMMU BBML feature to io-pgtable. 
(Robin) --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 ++++++ include/linux/io-pgtable.h | 8 ++++++++ 3 files changed, 33 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 369c0ea7a104..443ac19c6da9 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2030,6 +2030,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, if (smmu->features & ARM_SMMU_FEAT_HD) pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_HD; + if (smmu->features & ARM_SMMU_FEAT_BBML1) + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_BBML1; + else if (smmu->features & ARM_SMMU_FEAT_BBML2) + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_BBML2; + pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain); if (!pgtbl_ops) return -ENOMEM; @@ -3373,6 +3378,20 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) /* IDR3 */ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3); + switch (FIELD_GET(IDR3_BBML, reg)) { + case IDR3_BBML0: + break; + case IDR3_BBML1: + smmu->features |= ARM_SMMU_FEAT_BBML1; + break; + case IDR3_BBML2: + smmu->features |= ARM_SMMU_FEAT_BBML2; + break; + default: + dev_err(smmu->dev, "unknown/unsupported BBM behavior level\n"); + return -ENXIO; + } + if (FIELD_GET(IDR3_RIL, reg)) smmu->features |= ARM_SMMU_FEAT_RANGE_INV; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 26d6b935b383..a74125675544 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -54,6 +54,10 @@ #define IDR1_SIDSIZE GENMASK(5, 0) #define ARM_SMMU_IDR3 0xc +#define IDR3_BBML GENMASK(12, 11) +#define IDR3_BBML0 0 +#define IDR3_BBML1 1 +#define IDR3_BBML2 2 #define IDR3_RIL (1 << 10) #define ARM_SMMU_IDR5 0x14 @@ -615,6 +619,8 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_E2H (1 << 
18) #define ARM_SMMU_FEAT_HA (1 << 19) #define ARM_SMMU_FEAT_HD (1 << 20) +#define ARM_SMMU_FEAT_BBML1 (1 << 21) +#define ARM_SMMU_FEAT_BBML2 (1 << 22) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 64cee6831c97..857932357f1d 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -84,6 +84,12 @@ struct io_pgtable_cfg { * attributes set in the TCR for a non-coherent page-table walker. * * IO_PGTABLE_QUIRK_ARM_HD: Support hardware management of dirty status. + * + * IO_PGTABLE_QUIRK_ARM_BBML1: ARM SMMU supports BBM Level 1 behavior + * when changing block size. + * + * IO_PGTABLE_QUIRK_ARM_BBML2: ARM SMMU supports BBM Level 2 behavior + * when changing block size. */ #define IO_PGTABLE_QUIRK_ARM_NS BIT(0) #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1) @@ -92,6 +98,8 @@ struct io_pgtable_cfg { #define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5) #define IO_PGTABLE_QUIRK_ARM_OUTER_WBWA BIT(6) #define IO_PGTABLE_QUIRK_ARM_HD BIT(7) + #define IO_PGTABLE_QUIRK_ARM_BBML1 BIT(8) + #define IO_PGTABLE_QUIRK_ARM_BBML2 BIT(9) unsigned long quirks; unsigned long pgsize_bitmap; unsigned int ias; From patchwork Wed Mar 10 09:06:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12127349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 19A74C433E0 for ; Wed, 10 Mar 2021 09:11:15 +0000 (UTC) Received: from 
desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 665F764F2D for ; Wed, 10 Mar 2021 09:11:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 665F764F2D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding :Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-ID:Date: Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=4MaGUTfTMd7rcHdEkYMmapWdpY+sc5wxSgZqxp1K4oc=; b=obGKoGo/b+0ZNq4cZ/lhJzp7q q5p8579uvvVNoBSeI3A/i3SU8Lbsvumi71qQafqnt3+vcS/UmGrO5eH09/fGNRgwgEfTpxkO62OUY D8gIuv0fkUvDM4f3DPzcQjj3FGxqW4iKg0lfEub0P7AVKDxjjPwqupCi6oVClq3wmj9tupx9j4r/x lg8KMsLE8pynYK28eTY+KJDflIbN+M//tIeGRhoirggcHsvtIEJ1lbTD7/vODGpHytHbo6ZkEtzSf Zj7yhO4ysuk9CN+5nx698R5cpZ0kW0aLtU6Jlml2gw5cEByxAFouH/IOgiaKWXIzpdunf11mU8sWo bHmPT/2OA==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1lJuqP-006PoI-H9; Wed, 10 Mar 2021 09:09:29 +0000 Received: from szxga05-in.huawei.com ([45.249.212.191]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lJuoA-006OrU-St for linux-arm-kernel@lists.infradead.org; Wed, 10 Mar 2021 09:07:16 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4DwR3z2v4vzrTLG; Wed, 10 Mar 2021 17:05:03 +0800 (CST) Received: from 
DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.498.0; Wed, 10 Mar 2021 17:06:21 +0800 From: Keqian Zhu To: , , , Alex Williamson , Robin Murphy , Yi Sun , Will Deacon CC: Kirti Wankhede , Cornelia Huck , Marc Zyngier , Catalin Marinas , Mark Rutland , James Morse , Suzuki K Poulose , , , , Subject: [PATCH v2 04/11] iommu/arm-smmu-v3: Split block descriptor when start dirty log Date: Wed, 10 Mar 2021 17:06:07 +0800 Message-ID: <20210310090614.26668-5-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com> References: <20210310090614.26668-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210310_090711_621244_546B68DD X-CRM114-Status: GOOD ( 26.45 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: jiangkunkun A block descriptor is not a proper granule for dirty log tracking. To take an extreme example, if DMA writes one byte under a 1G mapping, the dirty amount reported to userspace is 1G, but under a 4K mapping, the dirty amount is just 4K. This adds a new interface named start_dirty_log in the iommu layer, and arm smmuv3 implements it by splitting each block descriptor into a span of page descriptors. Other types of IOMMU will perform architecture-specific actions to start dirty log. To allow code reuse, the split_block operation is realized as an iommu_ops too. We flush all iotlbs after the whole procedure is completed to ease the pressure on the iommu, as we will generally handle a huge range of mappings. 
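For illustration only (not part of this patch): a minimal user-space sketch of the granularity argument. The helpers dirty_bytes_reported() and pages_per_block() are hypothetical names, and SZ_4K/SZ_1G are defined locally here to mirror the kernel's size constants. The sketch assumes dirty state is reported in whole mapping granules and that granules are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Local definitions for this sketch; they mirror the kernel's SZ_* names. */
#define SZ_4K	0x1000ULL
#define SZ_1G	0x40000000ULL

/*
 * Bytes reported dirty when a DMA write of @len bytes at @iova is tracked
 * at @granule bytes per mapping. @granule must be a power of two.
 */
static uint64_t dirty_bytes_reported(uint64_t iova, uint64_t len,
				     uint64_t granule)
{
	uint64_t first, last;

	assert(granule && (granule & (granule - 1)) == 0);
	if (!len)
		return 0;

	first = iova & ~(granule - 1);			/* start of first granule */
	last = (iova + len - 1) | (granule - 1);	/* end of last granule */
	return last - first + 1;
}

/* Number of page descriptors that one fully split block expands to. */
static uint64_t pages_per_block(uint64_t block_size, uint64_t page_size)
{
	assert(page_size && block_size % page_size == 0);
	return block_size / page_size;
}
```

With these helpers, a one-byte write reports SZ_1G dirty under a 1G block mapping but only SZ_4K under a 4K page mapping, and fully splitting one 1G block yields 262144 page descriptors of 4K each.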
Splitting a block does not work concurrently with other pgtable ops. As the only intended user is vfio, which always holds a lock, race conditions are not considered in the pgtable ops. Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- changelog: v2: - Change the return type of split_block(). size_t -> int. - Change commit message to properly describe race condition. (Robin) - Change commit message to properly describe the need for split block. - Add a new interface named start_dirty_log(). (Sun Yi) - Change commit message to explain the relationship of split_block() and start_dirty_log(). --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 52 +++++++++ drivers/iommu/io-pgtable-arm.c | 122 ++++++++++++++++++++ drivers/iommu/iommu.c | 48 ++++++++ include/linux/io-pgtable.h | 2 + include/linux/iommu.h | 24 ++++ 5 files changed, 248 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 443ac19c6da9..5d2fb926a08e 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2537,6 +2537,56 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain, return ret; } +static int arm_smmu_split_block(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops; + size_t handled_size; + + if (!(smmu->features & (ARM_SMMU_FEAT_BBML1 | ARM_SMMU_FEAT_BBML2))) { + dev_err(smmu->dev, "don't support BBML1/2, can't split block\n"); + return -ENODEV; + } + if (!ops || !ops->split_block) { + pr_err("io-pgtable don't realize split block\n"); + return -ENODEV; + } + + handled_size = ops->split_block(ops, iova, size); + if (handled_size != size) { + pr_err("split block failed\n"); + return -EFAULT; + } + + return 0; +} + +/* + * For SMMU, the action to start dirty log is splitting block 
mapping. The + * hardware dirty management is always enabled if hardware supports HTTU HD. + */ +static int arm_smmu_start_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_device *smmu = smmu_domain->smmu; + + if (!(smmu->features & ARM_SMMU_FEAT_HD)) + return -ENODEV; + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) + return -EINVAL; + + /* + * Even if the split operation fails, we can still track dirty at block + * granule, which is still a much better choice compared to full dirty + * policy. + */ + iommu_split_block(domain, iova, size); + return 0; +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2636,6 +2686,8 @@ static struct iommu_ops arm_smmu_ops = { .device_group = arm_smmu_device_group, .domain_get_attr = arm_smmu_domain_get_attr, .domain_set_attr = arm_smmu_domain_set_attr, + .split_block = arm_smmu_split_block, + .start_dirty_log = arm_smmu_start_dirty_log, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 94d790b8ed27..4c4eec3c0698 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -79,6 +79,8 @@ #define ARM_LPAE_PTE_SH_IS (((arm_lpae_iopte)3) << 8) #define ARM_LPAE_PTE_NS (((arm_lpae_iopte)1) << 5) #define ARM_LPAE_PTE_VALID (((arm_lpae_iopte)1) << 0) +/* Block descriptor bits */ +#define ARM_LPAE_PTE_NT (((arm_lpae_iopte)1) << 16) #define ARM_LPAE_PTE_ATTR_LO_MASK (((arm_lpae_iopte)0x3ff) << 2) /* Ignore the contiguous bit for block splitting */ @@ -679,6 +681,125 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops, return iopte_to_paddr(pte, data) | iova; } +static size_t __arm_lpae_split_block(struct arm_lpae_io_pgtable *data, + unsigned long 
iova, size_t size, int lvl, + arm_lpae_iopte *ptep); + +static size_t arm_lpae_do_split_blk(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, + arm_lpae_iopte blk_pte, int lvl, + arm_lpae_iopte *ptep) +{ + struct io_pgtable_cfg *cfg = &data->iop.cfg; + arm_lpae_iopte pte, *tablep; + phys_addr_t blk_paddr; + size_t tablesz = ARM_LPAE_GRANULE(data); + size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data); + int i; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg); + if (!tablep) + return 0; + + blk_paddr = iopte_to_paddr(blk_pte, data); + pte = iopte_prot(blk_pte); + for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz) + __arm_lpae_init_pte(data, blk_paddr, pte, lvl, &tablep[i]); + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_BBML1) { + /* Race does not exist */ + blk_pte |= ARM_LPAE_PTE_NT; + __arm_lpae_set_pte(ptep, blk_pte, cfg); + io_pgtable_tlb_flush_walk(&data->iop, iova, size, size); + } + /* Race does not exist */ + pte = arm_lpae_install_table(tablep, ptep, blk_pte, cfg); + + /* Have splited it into page? */ + if (lvl == (ARM_LPAE_MAX_LEVELS - 1)) + return size; + + /* Go back to lvl - 1 */ + ptep -= ARM_LPAE_LVL_IDX(iova, lvl - 1, data); + return __arm_lpae_split_block(data, iova, size, lvl - 1, ptep); +} + +static size_t __arm_lpae_split_block(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, int lvl, + arm_lpae_iopte *ptep) +{ + arm_lpae_iopte pte; + struct io_pgtable *iop = &data->iop; + size_t base, next_size, total_size; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return 0; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) { + if (lvl == (ARM_LPAE_MAX_LEVELS - 1) || + (pte & ARM_LPAE_PTE_AP_RDONLY)) + return size; + + /* We find a writable block, split it. 
*/ + return arm_lpae_do_split_blk(data, iova, size, pte, + lvl + 1, ptep); + } else { + /* If it is the last table level, then nothing to do */ + if (lvl == (ARM_LPAE_MAX_LEVELS - 2)) + return size; + + total_size = 0; + next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data); + ptep = iopte_deref(pte, data); + for (base = 0; base < size; base += next_size) + total_size += __arm_lpae_split_block(data, + iova + base, next_size, lvl + 1, + ptep); + return total_size; + } + } else if (iopte_leaf(pte, lvl, iop->fmt)) { + WARN(1, "Can't split behind a block.\n"); + return 0; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_split_block(data, iova, size, lvl + 1, ptep); +} + +static size_t arm_lpae_split_block(struct io_pgtable_ops *ops, + unsigned long iova, size_t size) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + arm_lpae_iopte *ptep = data->pgd; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + int lvl = data->start_level; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return 0; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext)) + return 0; + + /* If it is smallest granule, then nothing to do */ + if (size == ARM_LPAE_BLOCK_SIZE(ARM_LPAE_MAX_LEVELS - 1, data)) + return size; + + return __arm_lpae_split_block(data, iova, size, lvl, ptep); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -757,6 +878,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .map = arm_lpae_map, .unmap = arm_lpae_unmap, .iova_to_phys = arm_lpae_iova_to_phys, + .split_block = arm_lpae_split_block, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index d0b0a15dba84..f644e0b16843 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2720,6 +2720,54 @@ int iommu_domain_set_attr(struct iommu_domain *domain, } EXPORT_SYMBOL_GPL(iommu_domain_set_attr); +int 
iommu_split_block(struct iommu_domain *domain, unsigned long iova, + size_t size) +{ + const struct iommu_ops *ops = domain->ops; + unsigned int min_pagesz; + size_t pgsize; + int ret = 0; + + if (unlikely(!ops || !ops->split_block)) + return -ENODEV; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + if (!IS_ALIGNED(iova | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n", + iova, size, min_pagesz); + return -EINVAL; + } + + while (size) { + pgsize = iommu_pgsize(domain, iova, size); + + ret = ops->split_block(domain, iova, pgsize); + if (ret) + break; + + pr_debug("split handled: iova 0x%lx size 0x%zx\n", iova, pgsize); + + iova += pgsize; + size -= pgsize; + } + iommu_flush_iotlb_all(domain); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_split_block); + +int iommu_start_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t size) +{ + const struct iommu_ops *ops = domain->ops; + + if (unlikely(!ops || !ops->start_dirty_log)) + return -ENODEV; + + return ops->start_dirty_log(domain, iova, size); +} +EXPORT_SYMBOL_GPL(iommu_start_dirty_log); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 857932357f1d..d86dd2ade6ad 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -167,6 +167,8 @@ struct io_pgtable_ops { size_t size, struct iommu_iotlb_gather *gather); phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops, unsigned long iova); + size_t (*split_block)(struct io_pgtable_ops *ops, unsigned long iova, + size_t size); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 5e7fe519430a..85ffa451547d 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -205,6 +205,8 @@ struct iommu_iotlb_gather { * @device_group: find iommu group for a particular device * @domain_get_attr: Query domain attributes * @domain_set_attr: 
Change domain attributes + * @split_block: Split block mapping into page mapping + * @start_dirty_log: Perform actions to start dirty log tracking * @get_resv_regions: Request list of reserved regions for a device * @put_resv_regions: Free list of reserved regions for a device * @apply_resv_region: Temporary helper call-back for iova reserved ranges @@ -260,6 +262,12 @@ struct iommu_ops { int (*domain_set_attr)(struct iommu_domain *domain, enum iommu_attr attr, void *data); + /* Track dirty log */ + int (*split_block)(struct iommu_domain *domain, unsigned long iova, + size_t size); + int (*start_dirty_log)(struct iommu_domain *domain, unsigned long iova, + size_t size); + /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); void (*put_resv_regions)(struct device *dev, struct list_head *list); @@ -511,6 +519,10 @@ extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr, void *data); extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr, void *data); +extern int iommu_split_block(struct iommu_domain *domain, unsigned long iova, + size_t size); +extern int iommu_start_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -901,6 +913,18 @@ static inline int iommu_domain_set_attr(struct iommu_domain *domain, return -EINVAL; } +static inline int iommu_split_block(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + return -EINVAL; +} + +static inline int iommu_start_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + return -EINVAL; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV; From patchwork Wed Mar 10 09:06:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12127339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEDE3C433DB for ; Wed, 10 Mar 2021 09:10:28 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1EB8F64FEF for ; Wed, 10 Mar 2021 09:10:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1EB8F64FEF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding :Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-ID:Date: Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=DJ/kL0RIdseeAzZYo/aCz+WyPiHqtfN1TIHp2uNUDWA=; b=HJixAX7GsDNPs2vCPcEyH+vr9 gaC9Xy1HB11Ur8flPYROmZnNmaJC36TnZfICADvDlPgDYRzPjYFttFm+wJvNMlo5FgHct86zr2zq4 5Sm3xE8MGHHb766ZGMIjfj8Q05mMU2Kp8h2CFe3obpCKGb55B+M0MQYQ3H53gOPtLzFT6x0p4RBPF up7ytHiSPcrdqIQkLXzdELjA+1LmtiosGRdjLR48u/JNeVUDTFowWTDMO2ZCnMULz02NFqcF3oYzv 
qJoBGXaszDJz2yoDPxNycmloB6uEbuAZ90rFuWME5zVkvdMvYPWRG1X4ZE55AqkTkEYBAYZ6juiNP 3AitjJ0ug==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1lJupO-006PNp-Ga; Wed, 10 Mar 2021 09:08:28 +0000 Received: from szxga05-in.huawei.com ([45.249.212.191]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lJuo9-006Ord-ON for linux-arm-kernel@lists.infradead.org; Wed, 10 Mar 2021 09:07:14 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4DwR3z3PF8zrTKj; Wed, 10 Mar 2021 17:05:03 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.498.0; Wed, 10 Mar 2021 17:06:22 +0800 From: Keqian Zhu To: , , , Alex Williamson , Robin Murphy , Yi Sun , Will Deacon CC: Kirti Wankhede , Cornelia Huck , Marc Zyngier , Catalin Marinas , Mark Rutland , James Morse , Suzuki K Poulose , , , , Subject: [PATCH v2 05/11] iommu/arm-smmu-v3: Merge a span of page when stop dirty log Date: Wed, 10 Mar 2021 17:06:08 +0800 Message-ID: <20210310090614.26668-6-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com> References: <20210310090614.26668-1-zhukeqian1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210310_090711_140713_D3713F05 X-CRM114-Status: GOOD ( 23.53 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: jiangkunkun When stop dirty log tracking, we need to 
recover all block descriptors that were split when dirty log tracking
started. This adds a new interface named stop_dirty_log to the iommu layer.
The arm smmuv3 driver implements it by reinstalling block mappings and
unmapping the span of page mappings. Other types of IOMMU perform
architecture-specific actions to stop dirty log.

To allow code reuse, the merge_page operation is implemented as an iommu_ops
callback too. We flush all iotlbs once, after the whole procedure has
completed, to ease the pressure on the iommu, as we generally handle a huge
range of mappings.

Merging pages must not run concurrently with other pgtable ops. The only
intended user is vfio, which always holds a lock, so race conditions are not
handled in the pgtable ops.

Co-developed-by: Keqian Zhu
Signed-off-by: Kunkun Jiang
---

changelog:

v2:
 - Change the return type of merge_page(). size_t -> int.
 - Change commit message to properly describe race condition. (Robin)
 - Add a new interface named stop_dirty_log(). (Sun Yi)
 - Change commit message to explain the relationship of merge_page() and
   stop_dirty_log().
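For illustration only, the span handling described in the commit message can
be sketched in user-space C. This is not the code from this patch:
toy_contiguous_pages() and toy_pick_pgsize() are hypothetical helpers, the
phys[] array stands in for iommu_iova_to_phys(), and a 4 KiB granule is
assumed. The first helper counts how many pages of an IOVA range are
physically contiguous. The second picks the largest supported size that fits
the span and its alignment, which is roughly the decision the core code has
to make before it asks the driver to merge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumption: 4 KiB granule */

/*
 * Hypothetical helper: count how many pages, starting at index `start`,
 * are physically contiguous. phys[i] is the physical address backing the
 * i-th page of the IOVA range. The caller must ensure start < npages.
 */
size_t toy_contiguous_pages(const uint64_t *phys, size_t npages, size_t start)
{
	size_t n = 1;

	while (start + n < npages &&
	       phys[start + n] == phys[start] + (uint64_t)n * TOY_PAGE_SIZE)
		n++;
	return n;
}

/*
 * Hypothetical helper: pick the largest size in `pgsize_bitmap` that is
 * not bigger than `size` and to which `addr_merge` (iova | paddr) is
 * aligned. Returns 0 if no supported size fits.
 */
uint64_t toy_pick_pgsize(uint64_t pgsize_bitmap, uint64_t addr_merge,
			 uint64_t size)
{
	uint64_t best = 0;
	int bit;

	for (bit = 0; bit < 64; bit++) {
		uint64_t pgsize = UINT64_C(1) << bit;

		if (!(pgsize_bitmap & pgsize))
			continue;	/* size not supported by the format */
		if (pgsize > size)
			break;		/* larger sizes cannot fit either */
		if ((addr_merge & (pgsize - 1)) == 0)
			best = pgsize;	/* fits and is aligned */
	}
	return best;
}
```

With a pgsize bitmap of 4K | 2M | 1G, a 4M span whose iova and paddr are both
2M aligned would be handled 2M at a time under this sketch, while a span that
is only 4K aligned stays at page granularity.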
--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 52 +++++++++++++ drivers/iommu/io-pgtable-arm.c | 78 ++++++++++++++++++++ drivers/iommu/iommu.c | 82 +++++++++++++++++++++ include/linux/io-pgtable.h | 2 + include/linux/iommu.h | 24 ++++++ 5 files changed, 238 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 5d2fb926a08e..ac0d881c77b8 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2587,6 +2587,56 @@ static int arm_smmu_start_dirty_log(struct iommu_domain *domain, return 0; } +static int arm_smmu_merge_page(struct iommu_domain *domain, + unsigned long iova, phys_addr_t paddr, + size_t size, int prot) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops; + size_t handled_size; + + if (!(smmu->features & (ARM_SMMU_FEAT_BBML1 | ARM_SMMU_FEAT_BBML2))) { + dev_err(smmu->dev, "BBML1/2 not supported, can't merge page\n"); + return -ENODEV; + } + if (!ops || !ops->merge_page) { + pr_err("io-pgtable doesn't implement merge page\n"); + return -ENODEV; + } + + handled_size = ops->merge_page(ops, iova, paddr, size, prot); + if (handled_size != size) { + pr_err("merge page failed\n"); + return -EFAULT; + } + + return 0; +} + +/* + * For SMMU, the action to stop dirty log is to merge page mappings. The + * hardware dirty management is always enabled if hardware supports HTTU HD. + */ +static int arm_smmu_stop_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, int prot) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_device *smmu = smmu_domain->smmu; + + if (!(smmu->features & ARM_SMMU_FEAT_HD)) + return -ENODEV; + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) + return -EINVAL; + + /* + * Even if the merge operation fails, it only affects the performance of + * DMA transactions.
+ */ + iommu_merge_page(domain, iova, size, prot); + return 0; +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2688,6 +2738,8 @@ static struct iommu_ops arm_smmu_ops = { .domain_set_attr = arm_smmu_domain_set_attr, .split_block = arm_smmu_split_block, .start_dirty_log = arm_smmu_start_dirty_log, + .merge_page = arm_smmu_merge_page, + .stop_dirty_log = arm_smmu_stop_dirty_log, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 4c4eec3c0698..9028328b99b0 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -800,6 +800,83 @@ static size_t arm_lpae_split_block(struct io_pgtable_ops *ops, return __arm_lpae_split_block(data, iova, size, lvl, ptep); } +static size_t __arm_lpae_merge_page(struct arm_lpae_io_pgtable *data, + unsigned long iova, phys_addr_t paddr, + size_t size, int lvl, arm_lpae_iopte *ptep, + arm_lpae_iopte prot) +{ + arm_lpae_iopte pte, *tablep; + struct io_pgtable *iop = &data->iop; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return 0; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return 0; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) + return size; + + /* Race does not exist */ + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_BBML1) { + prot |= ARM_LPAE_PTE_NT; + __arm_lpae_init_pte(data, paddr, prot, lvl, ptep); + io_pgtable_tlb_flush_walk(iop, iova, size, + ARM_LPAE_GRANULE(data)); + + prot &= ~(ARM_LPAE_PTE_NT); + __arm_lpae_init_pte(data, paddr, prot, lvl, ptep); + } else { + __arm_lpae_init_pte(data, paddr, prot, lvl, ptep); + } + + tablep = iopte_deref(pte, data); + __arm_lpae_free_pgtable(data, lvl + 1, tablep); + return size; + } else 
if (iopte_leaf(pte, lvl, iop->fmt)) { + /* The size is too small, already merged */ + return size; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_merge_page(data, iova, paddr, size, lvl + 1, ptep, prot); +} + +static size_t arm_lpae_merge_page(struct io_pgtable_ops *ops, unsigned long iova, + phys_addr_t paddr, size_t size, int iommu_prot) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + struct io_pgtable_cfg *cfg = &data->iop.cfg; + arm_lpae_iopte *ptep = data->pgd; + int lvl = data->start_level; + arm_lpae_iopte prot; + long iaext = (s64)iova >> cfg->ias; + + /* If no access, then nothing to do */ + if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE))) + return size; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return 0; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext || paddr >> cfg->oas)) + return 0; + + /* If it is smallest granule, then nothing to do */ + if (size == ARM_LPAE_BLOCK_SIZE(ARM_LPAE_MAX_LEVELS - 1, data)) + return size; + + prot = arm_lpae_prot_to_pte(data, iommu_prot); + return __arm_lpae_merge_page(data, iova, paddr, size, lvl, ptep, prot); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -879,6 +956,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .unmap = arm_lpae_unmap, .iova_to_phys = arm_lpae_iova_to_phys, .split_block = arm_lpae_split_block, + .merge_page = arm_lpae_merge_page, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index f644e0b16843..2a10294b62a3 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2768,6 +2768,88 @@ int iommu_start_dirty_log(struct iommu_domain *domain, unsigned long iova, } EXPORT_SYMBOL_GPL(iommu_start_dirty_log); +static int __iommu_merge_page(struct iommu_domain *domain, + unsigned long iova, phys_addr_t paddr, + size_t size, int prot) +{ + const struct iommu_ops *ops = domain->ops; + 
unsigned int min_pagesz; + size_t pgsize; + int ret = 0; + + if (unlikely(!ops || !ops->merge_page)) + return -ENODEV; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + if (!IS_ALIGNED(iova | paddr | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx pa %pa size 0x%zx min_pagesz 0x%x\n", + iova, &paddr, size, min_pagesz); + return -EINVAL; + } + + while (size) { + pgsize = iommu_pgsize(domain, iova | paddr, size); + + ret = ops->merge_page(domain, iova, paddr, pgsize, prot); + if (ret) + break; + + pr_debug("merge handled: iova 0x%lx pa %pa size 0x%zx\n", + iova, &paddr, pgsize); + + iova += pgsize; + paddr += pgsize; + size -= pgsize; + } + + return ret; +} + +int iommu_merge_page(struct iommu_domain *domain, unsigned long iova, + size_t size, int prot) +{ + phys_addr_t phys; + dma_addr_t p, i; + size_t cont_size; + int ret = 0; + + while (size) { + phys = iommu_iova_to_phys(domain, iova); + cont_size = PAGE_SIZE; + p = phys + cont_size; + i = iova + cont_size; + + while (cont_size < size && p == iommu_iova_to_phys(domain, i)) { + p += PAGE_SIZE; + i += PAGE_SIZE; + cont_size += PAGE_SIZE; + } + + ret = __iommu_merge_page(domain, iova, phys, cont_size, prot); + if (ret) + break; + + iova += cont_size; + size -= cont_size; + } + iommu_flush_iotlb_all(domain); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_merge_page); + +int iommu_stop_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t size, int prot) +{ + const struct iommu_ops *ops = domain->ops; + + if (unlikely(!ops || !ops->stop_dirty_log)) + return -ENODEV; + + return ops->stop_dirty_log(domain, iova, size, prot); +} +EXPORT_SYMBOL_GPL(iommu_stop_dirty_log); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index d86dd2ade6ad..38b4e17c70f0 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -169,6 +169,8 @@ struct 
io_pgtable_ops { unsigned long iova); size_t (*split_block)(struct io_pgtable_ops *ops, unsigned long iova, size_t size); + size_t (*merge_page)(struct io_pgtable_ops *ops, unsigned long iova, + phys_addr_t phys, size_t size, int prot); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 85ffa451547d..28111009cf6f 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -207,6 +207,8 @@ struct iommu_iotlb_gather { * @domain_set_attr: Change domain attributes * @split_block: Split block mapping into page mapping * @start_dirty_log: Perform actions to start dirty log tracking + * @merge_page: Merge page mapping into block mapping + * @stop_dirty_log: Perform actions to stop dirty log tracking * @get_resv_regions: Request list of reserved regions for a device * @put_resv_regions: Free list of reserved regions for a device * @apply_resv_region: Temporary helper call-back for iova reserved ranges @@ -267,6 +269,10 @@ struct iommu_ops { size_t size); int (*start_dirty_log)(struct iommu_domain *domain, unsigned long iova, size_t size); + int (*merge_page)(struct iommu_domain *domain, unsigned long iova, + phys_addr_t phys, size_t size, int prot); + int (*stop_dirty_log)(struct iommu_domain *domain, unsigned long iova, + size_t size, int prot); /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); @@ -523,6 +529,10 @@ extern int iommu_split_block(struct iommu_domain *domain, unsigned long iova, size_t size); extern int iommu_start_dirty_log(struct iommu_domain *domain, unsigned long iova, size_t size); +extern int iommu_merge_page(struct iommu_domain *domain, unsigned long iova, + size_t size, int prot); +extern int iommu_stop_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, int prot); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -925,6 +935,20 @@ static inline int
iommu_start_dirty_log(struct iommu_domain *domain, return -EINVAL; } +static inline int iommu_merge_page(struct iommu_domain *domain, + unsigned long iova, size_t size, + int prot) +{ + return -EINVAL; +} + +static inline int iommu_stop_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + int prot) +{ + return -EINVAL; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV; From patchwork Wed Mar 10 09:06:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12127337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C041C433E6 for ; Wed, 10 Mar 2021 09:09:47 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 56E6164FF3 for ; Wed, 10 Mar 2021 09:09:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 56E6164FF3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding 
:Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-ID:Date: Subject:CC:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=eZRUFRxGe21Wqmb9SrJikkdNE3cQFjsz4U6PrAdX7Mk=; b=o9E0QuemzIQcstGIDqK8yoZiN naR6xPi9YkoC9rd+BHaHMJKoHVe5hWBaRaewGDkQdsSyL1z2wBui9SEXAzPefkWQjQxAbA4oYoCeS L6V6hr1f6CsNqVXrHrRteVVHk5Gn0mxCcJn/OUUqNYLNlnKAaXDV9zQVsTN5QkWrCs3z+kb1JYjR2 I6B7OHCzUKy5o+Phsiw/nqSBMDYXESTQXup7EUqiS+mgjb75Eq9qt63eAOosuZjCTQsNitSttWJXY RBu0D9seXPOZ8d0ofqobtYlgtp03B9NefvNDtC51unOLZkY8tf+duEyfUhIOT7PHrB070YJ+bweax 5PPNa0ZKQ==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1lJup7-006PDC-RF; Wed, 10 Mar 2021 09:08:11 +0000 Received: from szxga05-in.huawei.com ([45.249.212.191]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lJuo6-006Onk-Gu for linux-arm-kernel@lists.infradead.org; Wed, 10 Mar 2021 09:07:12 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4DwR3z1svGzrTKL; Wed, 10 Mar 2021 17:05:03 +0800 (CST) Received: from DESKTOP-5IS4806.china.huawei.com (10.174.184.42) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.498.0; Wed, 10 Mar 2021 17:06:23 +0800 From: Keqian Zhu To: , , , Alex Williamson , Robin Murphy , Yi Sun , Will Deacon CC: Kirti Wankhede , Cornelia Huck , Marc Zyngier , Catalin Marinas , Mark Rutland , James Morse , Suzuki K Poulose , , , , Subject: [PATCH v2 06/11] iommu/arm-smmu-v3: Scan leaf TTD to sync hardware dirty log Date: Wed, 10 Mar 2021 17:06:09 +0800 Message-ID: <20210310090614.26668-7-zhukeqian1@huawei.com> X-Mailer: git-send-email 2.8.4.windows.1 In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com> References: <20210310090614.26668-1-zhukeqian1@huawei.com> 
MIME-Version: 1.0 X-Originating-IP: [10.174.184.42] X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210310_090708_000260_B2719830 X-CRM114-Status: GOOD ( 21.90 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

From: jiangkunkun

During dirty log tracking, the user will try to retrieve the dirty log from
the iommu if it supports hardware dirty logging. This adds a new interface
named sync_dirty_log to the iommu layer. The arm smmuv3 driver implements it
by scanning the leaf TTDs and treating a TTD as dirty if it is writable. As
HTTU is only enabled for stage 1, this means checking that AP[2] is not set.

Co-developed-by: Keqian Zhu
Signed-off-by: Kunkun Jiang
---

changelog:

v2:
 - Add new sanity check in arm_smmu_sync_dirty_log().
   (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
 - Document the purpose of flush_iotlb in arm_smmu_sync_dirty_log().
(Robin) --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 30 +++++++ drivers/iommu/io-pgtable-arm.c | 90 +++++++++++++++++++++ drivers/iommu/iommu.c | 38 +++++++++ include/linux/io-pgtable.h | 4 + include/linux/iommu.h | 18 +++++ 5 files changed, 180 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index ac0d881c77b8..7407896a710e 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2637,6 +2637,35 @@ static int arm_smmu_stop_dirty_log(struct iommu_domain *domain, return 0; } +static int arm_smmu_sync_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops; + struct arm_smmu_device *smmu = smmu_domain->smmu; + + if (!(smmu->features & ARM_SMMU_FEAT_HD)) + return -ENODEV; + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) + return -EINVAL; + + if (!ops || !ops->sync_dirty_log) { + pr_err("io-pgtable doesn't implement sync dirty log\n"); + return -ENODEV; + } + + /* + * Flush iotlb to ensure all inflight transactions are completed. + * See doc IHI0070Da 3.13.4 "HTTU behavior summary".
+ */ + arm_smmu_flush_iotlb_all(domain); + return ops->sync_dirty_log(ops, iova, size, bitmap, base_iova, + bitmap_pgshift); +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2740,6 +2769,7 @@ static struct iommu_ops arm_smmu_ops = { .start_dirty_log = arm_smmu_start_dirty_log, .merge_page = arm_smmu_merge_page, .stop_dirty_log = arm_smmu_stop_dirty_log, + .sync_dirty_log = arm_smmu_sync_dirty_log, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 9028328b99b0..67a208a05ab2 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -877,6 +877,95 @@ static size_t arm_lpae_merge_page(struct io_pgtable_ops *ops, unsigned long iova return __arm_lpae_merge_page(data, iova, paddr, size, lvl, ptep, prot); } +static int __arm_lpae_sync_dirty_log(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, + int lvl, arm_lpae_iopte *ptep, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + arm_lpae_iopte pte; + struct io_pgtable *iop = &data->iop; + size_t base, next_size; + unsigned long offset; + int nbits, ret; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return -EINVAL; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return -EINVAL; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) { + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + /* It is writable, set the bitmap */ + nbits = size >> bitmap_pgshift; + offset = (iova - base_iova) >> bitmap_pgshift; + bitmap_set(bitmap, offset, nbits); + return 0; + } else { + /* To traverse next level */ + next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data); + ptep = iopte_deref(pte, data); + for (base = 0; base < size; base += 
next_size) { + ret = __arm_lpae_sync_dirty_log(data, + iova + base, next_size, lvl + 1, + ptep, bitmap, base_iova, bitmap_pgshift); + if (ret) + return ret; + } + return 0; + } + } else if (iopte_leaf(pte, lvl, iop->fmt)) { + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + /* Though the size is too small, also set bitmap */ + nbits = size >> bitmap_pgshift; + offset = (iova - base_iova) >> bitmap_pgshift; + bitmap_set(bitmap, offset, nbits); + return 0; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_sync_dirty_log(data, iova, size, lvl + 1, ptep, + bitmap, base_iova, bitmap_pgshift); +} + +static int arm_lpae_sync_dirty_log(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + arm_lpae_iopte *ptep = data->pgd; + int lvl = data->start_level; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return -EINVAL; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext)) + return -EINVAL; + + if (data->iop.fmt != ARM_64_LPAE_S1 && + data->iop.fmt != ARM_32_LPAE_S1) + return -EINVAL; + + return __arm_lpae_sync_dirty_log(data, iova, size, lvl, ptep, + bitmap, base_iova, bitmap_pgshift); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -957,6 +1046,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .iova_to_phys = arm_lpae_iova_to_phys, .split_block = arm_lpae_split_block, .merge_page = arm_lpae_merge_page, + .sync_dirty_log = arm_lpae_sync_dirty_log, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 2a10294b62a3..44dfb78f9050 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2850,6 +2850,44 @@ int iommu_stop_dirty_log(struct 
iommu_domain *domain, unsigned long iova, } EXPORT_SYMBOL_GPL(iommu_stop_dirty_log); +int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t size, unsigned long *bitmap, + unsigned long base_iova, unsigned long bitmap_pgshift) +{ + const struct iommu_ops *ops = domain->ops; + unsigned int min_pagesz; + size_t pgsize; + int ret = 0; + + if (unlikely(!ops || !ops->sync_dirty_log)) + return -ENODEV; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + if (!IS_ALIGNED(iova | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx size 0x%zx min_pagesz 0x%x\n", + iova, size, min_pagesz); + return -EINVAL; + } + + while (size) { + pgsize = iommu_pgsize(domain, iova, size); + + ret = ops->sync_dirty_log(domain, iova, pgsize, + bitmap, base_iova, bitmap_pgshift); + if (ret) + break; + + pr_debug("dirty_log_sync handle: iova 0x%lx pagesz 0x%zx\n", + iova, pgsize); + + iova += pgsize; + size -= pgsize; + } + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_sync_dirty_log); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 38b4e17c70f0..5107a9d4ac79 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -171,6 +171,10 @@ struct io_pgtable_ops { size_t size); size_t (*merge_page)(struct io_pgtable_ops *ops, unsigned long iova, phys_addr_t phys, size_t size, int prot); + int (*sync_dirty_log)(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 28111009cf6f..7d5777acfdb7 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -209,6 +209,7 @@ struct iommu_iotlb_gather { * @start_dirty_log: Perform actions to start dirty log tracking * @merge_page: Merge page mapping into block mapping * @stop_dirty_log: Perform 
actions to stop dirty log tracking + * @sync_dirty_log: Sync dirty log from IOMMU into a dirty bitmap * @get_resv_regions: Request list of reserved regions for a device * @put_resv_regions: Free list of reserved regions for a device * @apply_resv_region: Temporary helper call-back for iova reserved ranges @@ -273,6 +274,10 @@ struct iommu_ops { phys_addr_t phys, size_t size, int prot); int (*stop_dirty_log)(struct iommu_domain *domain, unsigned long iova, size_t size, int prot); + int (*sync_dirty_log)(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); @@ -533,6 +538,10 @@ extern int iommu_merge_page(struct iommu_domain *domain, unsigned long iova, size_t size, int prot); extern int iommu_stop_dirty_log(struct iommu_domain *domain, unsigned long iova, size_t size, int prot); +extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t size, unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -949,6 +958,15 @@ static inline int iommu_stop_dirty_log(struct iommu_domain *domain, return -EINVAL; } +static inline int iommu_sync_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long pgshift) +{ + return -EINVAL; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV;

From patchwork Wed Mar 10 09:06:10 2021
X-Patchwork-Submitter: zhukeqian
X-Patchwork-Id: 12127327
From: Keqian Zhu
To: Alex Williamson, Robin Murphy, Yi Sun, Will Deacon
CC: Kirti Wankhede, Cornelia Huck, Marc Zyngier, Catalin Marinas, Mark Rutland, James Morse, Suzuki K Poulose
Subject: [PATCH v2 07/11] iommu/arm-smmu-v3: Clear dirty log according to bitmap
Date: Wed, 10 Mar 2021 17:06:10 +0800
Message-ID: <20210310090614.26668-8-zhukeqian1@huawei.com>
In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com>
References: <20210310090614.26668-1-zhukeqian1@huawei.com>

From: jiangkunkun

After the dirty log is retrieved, the user should clear it to re-enable dirty log tracking for the dirtied pages. This adds a new interface named clear_dirty_log to the iommu layer, and arm smmuv3 implements it. The implementation clears the dirty state of the TTDs specified by the user-provided bitmap. As HTTU is only enabled for stage 1, clearing the dirty state means setting the AP[2] bit.

Co-developed-by: Keqian Zhu
Signed-off-by: Kunkun Jiang
---
changelog:

v2:
 - Add new sanity check in arm_smmu_sync_dirty_log(). (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
 - Remove extra flush_iotlb in __iommu_clear_dirty_log().
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 25 ++++++
 drivers/iommu/io-pgtable-arm.c | 95 +++++++++++++++++++++
 drivers/iommu/iommu.c | 68 +++++++++++++++
 include/linux/io-pgtable.h | 4 +
 include/linux/iommu.h | 17 ++++
 5 files changed, 209 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 7407896a710e..696df51a3282 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2666,6 +2666,30 @@ static int arm_smmu_sync_dirty_log(struct iommu_domain *domain, bitmap_pgshift); } +static int arm_smmu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops; + struct arm_smmu_device *smmu = smmu_domain->smmu; + + if (!(smmu->features & ARM_SMMU_FEAT_HD)) + return -ENODEV; + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) + return -EINVAL; + + if (!ops || !ops->clear_dirty_log) { + pr_err("io-pgtable don't realize clear dirty log\n"); + return -ENODEV; + } + + return ops->clear_dirty_log(ops, iova, size, bitmap, base_iova, + bitmap_pgshift); +} + static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args) { return iommu_fwspec_add_ids(dev, args->args, 1); @@ -2770,6 +2794,7 @@ static struct iommu_ops arm_smmu_ops = { .merge_page =
arm_smmu_merge_page, .stop_dirty_log = arm_smmu_stop_dirty_log, .sync_dirty_log = arm_smmu_sync_dirty_log, + .clear_dirty_log = arm_smmu_clear_dirty_log, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 67a208a05ab2..e3ef0f50611c 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -966,6 +966,100 @@ static int arm_lpae_sync_dirty_log(struct io_pgtable_ops *ops, bitmap, base_iova, bitmap_pgshift); } +static int __arm_lpae_clear_dirty_log(struct arm_lpae_io_pgtable *data, + unsigned long iova, size_t size, + int lvl, arm_lpae_iopte *ptep, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + arm_lpae_iopte pte; + struct io_pgtable *iop = &data->iop; + unsigned long offset; + size_t base, next_size; + int nbits, ret, i; + + if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS)) + return -EINVAL; + + ptep += ARM_LPAE_LVL_IDX(iova, lvl, data); + pte = READ_ONCE(*ptep); + if (WARN_ON(!pte)) + return -EINVAL; + + if (size == ARM_LPAE_BLOCK_SIZE(lvl, data)) { + if (iopte_leaf(pte, lvl, iop->fmt)) { + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + /* Ensure all corresponding bits are set */ + nbits = size >> bitmap_pgshift; + offset = (iova - base_iova) >> bitmap_pgshift; + for (i = offset; i < offset + nbits; i++) { + if (!test_bit(i, bitmap)) + return 0; + } + + /* Race does not exist */ + pte |= ARM_LPAE_PTE_AP_RDONLY; + __arm_lpae_set_pte(ptep, pte, &iop->cfg); + return 0; + } else { + /* To traverse next level */ + next_size = ARM_LPAE_BLOCK_SIZE(lvl + 1, data); + ptep = iopte_deref(pte, data); + for (base = 0; base < size; base += next_size) { + ret = __arm_lpae_clear_dirty_log(data, + iova + base, next_size, lvl + 1, + ptep, bitmap, base_iova, + bitmap_pgshift); + if (ret) + return ret; + } + return 0; + } + } else if (iopte_leaf(pte, lvl, iop->fmt)) 
{ + /* Though the size is too small, it is already clean */ + if (pte & ARM_LPAE_PTE_AP_RDONLY) + return 0; + + return -EINVAL; + } + + /* Keep on walkin */ + ptep = iopte_deref(pte, data); + return __arm_lpae_clear_dirty_log(data, iova, size, lvl + 1, ptep, + bitmap, base_iova, bitmap_pgshift); +} + +static int arm_lpae_clear_dirty_log(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops); + arm_lpae_iopte *ptep = data->pgd; + int lvl = data->start_level; + struct io_pgtable_cfg *cfg = &data->iop.cfg; + long iaext = (s64)iova >> cfg->ias; + + if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size)) + return -EINVAL; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) + iaext = ~iaext; + if (WARN_ON(iaext)) + return -EINVAL; + + if (data->iop.fmt != ARM_64_LPAE_S1 && + data->iop.fmt != ARM_32_LPAE_S1) + return -EINVAL; + + return __arm_lpae_clear_dirty_log(data, iova, size, lvl, ptep, + bitmap, base_iova, bitmap_pgshift); +} + static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg) { unsigned long granule, page_sizes; @@ -1047,6 +1141,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg) .split_block = arm_lpae_split_block, .merge_page = arm_lpae_merge_page, .sync_dirty_log = arm_lpae_sync_dirty_log, + .clear_dirty_log = arm_lpae_clear_dirty_log, }; return data; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 44dfb78f9050..105e4c1f015e 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2888,6 +2888,74 @@ int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, } EXPORT_SYMBOL_GPL(iommu_sync_dirty_log); +static int __iommu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + const struct iommu_ops *ops = domain->ops; + size_t pgsize; 
+ int ret = 0; + + if (unlikely(!ops || !ops->clear_dirty_log)) + return -ENODEV; + + while (size) { + pgsize = iommu_pgsize(domain, iova, size); + + ret = ops->clear_dirty_log(domain, iova, pgsize, bitmap, + base_iova, bitmap_pgshift); + if (ret) + break; + + pr_debug("dirty_log_clear handled: iova 0x%lx pagesz 0x%zx\n", + iova, pgsize); + + iova += pgsize; + size -= pgsize; + } + + return ret; +} + +int iommu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift) +{ + unsigned long riova, rsize; + unsigned int min_pagesz; + bool flush = false; + int rs, re, start, end; + int ret = 0; + + min_pagesz = 1 << __ffs(domain->pgsize_bitmap); + if (!IS_ALIGNED(iova | size, min_pagesz)) { + pr_err("unaligned: iova 0x%lx min_pagesz 0x%x\n", + iova, min_pagesz); + return -EINVAL; + } + + start = (iova - base_iova) >> bitmap_pgshift; + end = start + (size >> bitmap_pgshift); + bitmap_for_each_set_region(bitmap, rs, re, start, end) { + flush = true; + riova = iova + (rs << bitmap_pgshift); + rsize = (re - rs) << bitmap_pgshift; + ret = __iommu_clear_dirty_log(domain, riova, rsize, bitmap, + base_iova, bitmap_pgshift); + if (ret) + break; + } + + if (flush) + iommu_flush_iotlb_all(domain); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_clear_dirty_log); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 5107a9d4ac79..48dbbd2e12b2 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -175,6 +175,10 @@ struct io_pgtable_ops { unsigned long iova, size_t size, unsigned long *bitmap, unsigned long base_iova, unsigned long bitmap_pgshift); + int (*clear_dirty_log)(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); }; 
/** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 7d5777acfdb7..4f7db5d23b23 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -278,6 +278,10 @@ struct iommu_ops { unsigned long iova, size_t size, unsigned long *bitmap, unsigned long base_iova, unsigned long bitmap_pgshift); + int (*clear_dirty_log)(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, unsigned long base_iova, + unsigned long bitmap_pgshift); /* Request/Free a list of reserved regions for a device */ void (*get_resv_regions)(struct device *dev, struct list_head *list); @@ -542,6 +546,10 @@ extern int iommu_sync_dirty_log(struct iommu_domain *domain, unsigned long iova, size_t size, unsigned long *bitmap, unsigned long base_iova, unsigned long bitmap_pgshift); +extern int iommu_clear_dirty_log(struct iommu_domain *domain, unsigned long iova, + size_t dma_size, unsigned long *bitmap, + unsigned long base_iova, + unsigned long bitmap_pgshift); /* Window handling function prototypes */ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr, @@ -967,6 +975,15 @@ static inline int iommu_sync_dirty_log(struct iommu_domain *domain, return -EINVAL; } +static inline int iommu_clear_dirty_log(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long *bitmap, + unsigned long base_iova, + unsigned long pgshift) +{ + return -EINVAL; +} + static inline int iommu_device_register(struct iommu_device *iommu) { return -ENODEV;

From patchwork Wed Mar 10 09:06:11 2021
X-Patchwork-Submitter: zhukeqian
X-Patchwork-Id: 12127343
From: Keqian Zhu
To: Alex Williamson, Robin Murphy, Yi Sun, Will Deacon
CC: Kirti Wankhede, Cornelia Huck, Marc Zyngier, Catalin Marinas, Mark Rutland, James Morse, Suzuki K Poulose
Subject: [PATCH v2 08/11] iommu/arm-smmu-v3: Add HWDBM device feature reporting
Date: Wed, 10 Mar 2021 17:06:11 +0800
Message-ID: <20210310090614.26668-9-zhukeqian1@huawei.com>
In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com>
References: <20210310090614.26668-1-zhukeqian1@huawei.com>

From: jiangkunkun

We have implemented the interfaces required to support iommu dirty log tracking. The last step is to report this feature to the upper-layer user, so that the user can build higher-level policy on top of it. This adds a new dev feature named IOMMU_DEV_FEAT_HWDBM to the iommu layer. For arm smmuv3, it is equivalent to ARM_SMMU_FEAT_HD and is enabled by default if supported.
Other types of IOMMU can enable it by default or when dev_enable_feature() is called.

Co-developed-by: Keqian Zhu
Signed-off-by: Kunkun Jiang
---
changelog:

v2:
 - As dev_has_feature() has been removed from the iommu layer, IOMMU_DEV_FEAT_HWDBM is designed to be used through the "enable" interface.
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 ++++
 include/linux/iommu.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 696df51a3282..cd1627123e80 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2722,6 +2722,8 @@ static bool arm_smmu_dev_has_feature(struct device *dev, switch (feat) { case IOMMU_DEV_FEAT_SVA: return arm_smmu_master_sva_supported(master); + case IOMMU_DEV_FEAT_HWDBM: + return !!(master->smmu->features & ARM_SMMU_FEAT_HD); default: return false; } @@ -2738,6 +2740,8 @@ static bool arm_smmu_dev_feature_enabled(struct device *dev, switch (feat) { case IOMMU_DEV_FEAT_SVA: return arm_smmu_master_sva_enabled(master); + case IOMMU_DEV_FEAT_HWDBM: + return arm_smmu_dev_has_feature(dev, feat); default: return false; } diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 4f7db5d23b23..88584a2d027c 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -160,6 +160,7 @@ struct iommu_resv_region { enum iommu_dev_features { IOMMU_DEV_FEAT_AUX, /* Aux-domain feature */ IOMMU_DEV_FEAT_SVA, /* Shared Virtual Addresses */ + IOMMU_DEV_FEAT_HWDBM, /* Hardware Dirty Bit Management */ }; #define IOMMU_PASID_INVALID (-1U)

From patchwork Wed Mar 10 09:06:12 2021
X-Patchwork-Submitter: zhukeqian
X-Patchwork-Id: 12127333
From: Keqian Zhu
To: Alex Williamson, Robin Murphy, Yi Sun, Will Deacon
CC: Kirti Wankhede, Cornelia Huck, Marc Zyngier, Catalin Marinas, Mark Rutland, James Morse, Suzuki K Poulose
Subject: [PATCH v2 09/11] vfio/iommu_type1: Add HWDBM status maintanance
Date: Wed, 10 Mar 2021 17:06:12 +0800
Message-ID: <20210310090614.26668-10-zhukeqian1@huawei.com>
In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com>
References: <20210310090614.26668-1-zhukeqian1@huawei.com>

From: jiangkunkun

We are going to optimize dirty log tracking based on the iommu HWDBM feature, but the dirty log from the iommu is useful only when all iommu-backed groups are connected to an iommu with the HWDBM feature. This maintains a counter of the iommu-backed groups that lack HWDBM (num_non_hwdbm_groups).
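
As a rough standalone illustration of the bookkeeping described above (not part of the patch): the sketch below keeps a count of attached groups that lack HWDBM and treats the IOMMU dirty log as usable only while that count is zero. `struct container`, `struct group`, `attach_group()`, `detach_group()` and `iommu_dirty_log_usable()` are simplified, hypothetical stand-ins for the vfio structures and paths; locking and the real per-device feature probe are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct vfio_group. */
struct group {
	bool iommu_hwdbm;	/* true if the group's IOMMU supports HWDBM */
};

/* Hypothetical, simplified stand-in for struct vfio_iommu. */
struct container {
	uint64_t num_non_hwdbm_groups;	/* attached groups lacking HWDBM */
};

/* On attach, count the group if it cannot report dirty pages via HWDBM. */
static void attach_group(struct container *c, const struct group *g)
{
	if (!g->iommu_hwdbm)
		c->num_non_hwdbm_groups++;
}

/* On detach, undo the accounting done at attach time. */
static void detach_group(struct container *c, const struct group *g)
{
	if (!g->iommu_hwdbm)
		c->num_non_hwdbm_groups--;
}

/* The IOMMU dirty log is only complete when no such group is attached. */
static bool iommu_dirty_log_usable(const struct container *c)
{
	return c->num_non_hwdbm_groups == 0;
}
```

A counter rather than a single flag means that detaching one non-HWDBM group does not wrongly re-enable the optimization while another such group is still attached.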
Co-developed-by: Keqian Zhu Signed-off-by: Kunkun Jiang --- changelog: v2: - Simplify vfio_group_supports_hwdbm(). - AS feature report of HWDBM has been changed, so change vfio_dev_has_feature() to vfio_dev_enable_feature(). --- drivers/vfio/vfio_iommu_type1.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 4bb162c1d649..876351c061e4 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -79,6 +79,7 @@ struct vfio_iommu { bool dirty_page_tracking; bool pinned_page_dirty_scope; bool container_open; + uint64_t num_non_hwdbm_groups; }; struct vfio_domain { @@ -116,6 +117,7 @@ struct vfio_group { struct list_head next; bool mdev_group; /* An mdev group */ bool pinned_page_dirty_scope; + bool iommu_hwdbm; /* For iommu-backed group */ }; struct vfio_iova { @@ -1187,6 +1189,24 @@ static void vfio_update_pgsize_bitmap(struct vfio_iommu *iommu) } } +static int vfio_dev_enable_feature(struct device *dev, void *data) +{ + enum iommu_dev_features *feat = data; + + if (iommu_dev_feature_enabled(dev, *feat)) + return 0; + + return iommu_dev_enable_feature(dev, *feat); +} + +static bool vfio_group_supports_hwdbm(struct vfio_group *group) +{ + enum iommu_dev_features feat = IOMMU_DEV_FEAT_HWDBM; + + return !iommu_group_for_each_dev(group->iommu_group, &feat, + vfio_dev_enable_feature); +} + static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu, struct vfio_dma *dma, dma_addr_t base_iova, size_t pgsize) @@ -2435,6 +2455,12 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, * capable via the page pinning interface. 
*/ iommu->num_non_pinned_groups++; + + /* Update the hwdbm status of group and iommu */ + group->iommu_hwdbm = vfio_group_supports_hwdbm(group); + if (!group->iommu_hwdbm) + iommu->num_non_hwdbm_groups++; + mutex_unlock(&iommu->lock); vfio_iommu_resv_free(&group_resv_regions); @@ -2571,6 +2597,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, struct vfio_domain *domain; struct vfio_group *group; bool update_dirty_scope = false; + bool update_iommu_hwdbm = false; LIST_HEAD(iova_copy); mutex_lock(&iommu->lock); @@ -2609,6 +2636,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, vfio_iommu_detach_group(domain, group); update_dirty_scope = !group->pinned_page_dirty_scope; + update_iommu_hwdbm = !group->iommu_hwdbm; list_del(&group->next); kfree(group); /* @@ -2651,6 +2679,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data, if (iommu->dirty_page_tracking) vfio_iommu_populate_bitmap_full(iommu); } + if (update_iommu_hwdbm) + iommu->num_non_hwdbm_groups--; mutex_unlock(&iommu->lock); } From patchwork Wed Mar 10 09:06:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: zhukeqian X-Patchwork-Id: 12127345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60B1AC433DB for ; Wed, 10 Mar 2021 09:10:50 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
From: Keqian Zhu
To: Alex Williamson, Robin Murphy, Yi Sun, Will Deacon
CC: Kirti Wankhede, Cornelia Huck, Marc Zyngier, Catalin Marinas, Mark Rutland, James Morse, Suzuki K Poulose
Subject: [PATCH v2 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM
Date: Wed, 10 Mar 2021 17:06:13 +0800
Message-ID: <20210310090614.26668-11-zhukeqian1@huawei.com>
In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com>
References: <20210310090614.26668-1-zhukeqian1@huawei.com>

From: jiangkunkun

Previously, if a vfio_iommu was not of pinned_page_dirty_scope and a
vfio_dma was iommu_mapped, we populated a full dirty bitmap for that
vfio_dma. Now we can try to get the dirty log from the IOMMU before
making that pessimistic decision.

In detail, if all vfio_groups are of pinned_page_dirty_scope, dirty
bitmap population is not affected. If there are vfio_groups that are not
of pinned_page_dirty_scope and their domains support HWDBM, we try to
get the dirty log from the IOMMU. Otherwise, we fall back to a full
dirty bitmap.

We should also start dirty logging for newly added DMA ranges and
domains.

Co-developed-by: Keqian Zhu
Signed-off-by: Kunkun Jiang
---
changelog:

v2:
 - Use the new interface to start|stop dirty log, as
   split_block|merge_page are related to ARM SMMU. (Sun Yi)
 - Bugfix: Start dirty log for newly added DMA ranges and domains.
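Reviewer note (illustrative only): the fallback order described in the commit message can be modelled in a few lines of standalone C. This is a sketch, not code from the patch. The enum and the function name pick_dirty_source() are hypothetical; the three inputs are assumed to mirror iommu->num_non_pinned_groups, iommu->num_non_hwdbm_groups and dma->iommu_mapped as they are tested in update_user_bitmap().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the three ways a vfio_dma bitmap may be populated. */
enum dirty_source {
	DIRTY_FROM_PINNED_PAGES,	/* keep the bitmap built from pinned pages */
	DIRTY_FROM_IOMMU_LOG,		/* sync the hardware dirty log (HWDBM) */
	DIRTY_MARK_ALL,			/* fall back to a full dirty bitmap */
};

/* Mirrors the order of the checks added to update_user_bitmap(). */
enum dirty_source pick_dirty_source(uint64_t num_non_pinned_groups,
				    uint64_t num_non_hwdbm_groups,
				    bool iommu_mapped)
{
	/* All groups report dirty pages by pinning, or nothing is mapped. */
	if (!num_non_pinned_groups || !iommu_mapped)
		return DIRTY_FROM_PINNED_PAGES;

	/* Every group is backed by a domain with HWDBM. */
	if (!num_non_hwdbm_groups)
		return DIRTY_FROM_IOMMU_LOG;

	/* At least one group can report nothing: assume everything is dirty. */
	return DIRTY_MARK_ALL;
}
```

With two non-pinned groups that both support HWDBM and an IOMMU-mapped range, the model picks DIRTY_FROM_IOMMU_LOG; a single group without HWDBM is enough to force DIRTY_MARK_ALL.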
---
 drivers/vfio/vfio_iommu_type1.c | 136 +++++++++++++++++++++++++++++++-
 1 file changed, 132 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 876351c061e4..a7ab0279eda0 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1207,6 +1207,25 @@ static bool vfio_group_supports_hwdbm(struct vfio_group *group)
 				       vfio_dev_enable_feature);
 }
 
+static int vfio_iommu_dirty_log_clear(struct vfio_iommu *iommu,
+				      dma_addr_t start_iova, size_t size,
+				      unsigned long *bitmap_buffer,
+				      dma_addr_t base_iova, size_t pgsize)
+{
+	struct vfio_domain *d;
+	unsigned long pgshift = __ffs(pgsize);
+	int ret;
+
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		ret = iommu_clear_dirty_log(d->domain, start_iova, size,
+					    bitmap_buffer, base_iova, pgshift);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 			      struct vfio_dma *dma, dma_addr_t base_iova,
 			      size_t pgsize)
@@ -1218,13 +1237,28 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 	unsigned long shift = bit_offset % BITS_PER_LONG;
 	unsigned long leftover;
 
+	if (!iommu->num_non_pinned_groups || !dma->iommu_mapped)
+		goto bitmap_done;
+
+	/* try to get dirty log from IOMMU */
+	if (!iommu->num_non_hwdbm_groups) {
+		struct vfio_domain *d;
+
+		list_for_each_entry(d, &iommu->domain_list, next) {
+			if (iommu_sync_dirty_log(d->domain, dma->iova, dma->size,
+						 dma->bitmap, dma->iova, pgshift))
+				return -EFAULT;
+		}
+		goto bitmap_done;
+	}
+
 	/*
 	 * mark all pages dirty if any IOMMU capable device is not able
 	 * to report dirty pages and all pages are pinned and mapped.
 	 */
-	if (iommu->num_non_pinned_groups && dma->iommu_mapped)
-		bitmap_set(dma->bitmap, 0, nbits);
+	bitmap_set(dma->bitmap, 0, nbits);
 
+bitmap_done:
 	if (shift) {
 		bitmap_shift_left(dma->bitmap, dma->bitmap, shift,
 				  nbits + shift);
@@ -1286,6 +1320,18 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 		 */
 		bitmap_clear(dma->bitmap, 0, dma->size >> pgshift);
 		vfio_dma_populate_bitmap(dma, pgsize);
+
+		/* Clear iommu dirty log to re-enable dirty log tracking */
+		if (!iommu->pinned_page_dirty_scope &&
+		    dma->iommu_mapped && !iommu->num_non_hwdbm_groups) {
+			ret = vfio_iommu_dirty_log_clear(iommu, dma->iova,
+					dma->size, dma->bitmap, dma->iova,
+					pgsize);
+			if (ret) {
+				pr_warn("dma dirty log clear failed!\n");
+				return ret;
+			}
+		}
 	}
 	return 0;
 }
@@ -1561,6 +1607,9 @@ static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
 	return list_empty(iova);
 }
 
+static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
+				     struct vfio_dma *dma);
+
 static int vfio_dma_do_map(struct vfio_iommu *iommu,
 			   struct vfio_iommu_type1_dma_map *map)
 {
@@ -1684,8 +1733,13 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 
 	if (!ret && iommu->dirty_page_tracking) {
 		ret = vfio_dma_bitmap_alloc(dma, pgsize);
-		if (ret)
+		if (ret) {
 			vfio_remove_dma(iommu, dma);
+			goto out_unlock;
+		}
+
+		/* Start dirty log for newly added dma */
+		vfio_dma_dirty_log_start(iommu, dma);
 	}
 
 out_unlock:
@@ -2262,6 +2316,9 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
 	list_splice_tail(iova_copy, iova);
 }
 
+static void vfio_domain_dirty_log_start(struct vfio_iommu *iommu,
+					struct vfio_domain *d);
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 					 struct iommu_group *iommu_group)
 {
@@ -2445,6 +2502,10 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 
 	list_add(&domain->next, &iommu->domain_list);
 	vfio_update_pgsize_bitmap(iommu);
+
+	/* Start dirty log for newly added vfio domain */
+	if (iommu->dirty_page_tracking)
+		vfio_domain_dirty_log_start(iommu, domain);
 done:
 	/* Delete the old one and insert new iova list */
 	vfio_iommu_iova_insert_copy(iommu, &iova_copy);
@@ -3022,6 +3083,70 @@ static int vfio_iommu_type1_unmap_dma(struct vfio_iommu *iommu,
 			-EFAULT : 0;
 }
 
+static void vfio_domain_dirty_log_start(struct vfio_iommu *iommu,
+					struct vfio_domain *d)
+{
+	struct rb_node *n;
+
+	/* Go through all dmas even if some dmas failed */
+	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
+
+		if (!dma->iommu_mapped)
+			continue;
+
+		iommu_start_dirty_log(d->domain, dma->iova, dma->size);
+	}
+}
+
+static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
+				     struct vfio_dma *dma)
+{
+	struct vfio_domain *d;
+
+	if (!dma->iommu_mapped)
+		return;
+
+	/* Go through all domains even if some domain failed */
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		iommu_start_dirty_log(d->domain, dma->iova, dma->size);
+	}
+}
+
+static void vfio_dma_dirty_log_stop(struct vfio_iommu *iommu,
+				    struct vfio_dma *dma)
+{
+	struct vfio_domain *d;
+
+	if (!dma->iommu_mapped)
+		return;
+
+	/* Go through all domains even if some domain failed */
+	list_for_each_entry(d, &iommu->domain_list, next) {
+		iommu_stop_dirty_log(d->domain, dma->iova, dma->size,
+				     d->prot | dma->prot);
+	}
+}
+
+static void vfio_iommu_dirty_log_switch(struct vfio_iommu *iommu, bool start)
+{
+	struct rb_node *n;
+
+	/*
+	 * Go ahead even if all iommu domains don't support HWDBM for now, as
+	 * we can get dirty log from IOMMU when these domains without HWDBM
+	 * are detached.
+	 */
+	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
+
+		if (start)
+			vfio_dma_dirty_log_start(iommu, dma);
+		else
+			vfio_dma_dirty_log_stop(iommu, dma);
+	}
+}
+
 static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 					unsigned long arg)
 {
@@ -3054,8 +3179,10 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 		pgsize = 1 << __ffs(iommu->pgsize_bitmap);
 		if (!iommu->dirty_page_tracking) {
 			ret = vfio_dma_bitmap_alloc_all(iommu, pgsize);
-			if (!ret)
+			if (!ret) {
 				iommu->dirty_page_tracking = true;
+				vfio_iommu_dirty_log_switch(iommu, true);
+			}
 		}
 		mutex_unlock(&iommu->lock);
 		return ret;
@@ -3064,6 +3191,7 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 		if (iommu->dirty_page_tracking) {
 			iommu->dirty_page_tracking = false;
 			vfio_dma_bitmap_free_all(iommu);
+			vfio_iommu_dirty_log_switch(iommu, false);
 		}
 		mutex_unlock(&iommu->lock);
 		return 0;

From patchwork Wed Mar 10 09:06:14 2021
X-Patchwork-Submitter: zhukeqian
X-Patchwork-Id: 12127341
From: Keqian Zhu
To: Alex Williamson, Robin Murphy, Yi Sun, Will Deacon
CC: Kirti Wankhede, Cornelia Huck, Marc Zyngier, Catalin Marinas, Mark Rutland, James Morse, Suzuki K Poulose
Subject: [PATCH v2 11/11] vfio/iommu_type1: Add support for manual dirty log clear
Date: Wed, 10 Mar 2021 17:06:14 +0800
Message-ID: <20210310090614.26668-12-zhukeqian1@huawei.com>
In-Reply-To: <20210310090614.26668-1-zhukeqian1@huawei.com>
References: <20210310090614.26668-1-zhukeqian1@huawei.com>

From: jiangkunkun

Previously, we cleared the dirty log immediately after syncing it to
userspace. This may cause redundant dirty handling if userspace handles
the dirty log iteratively: after vfio clears the dirty log, new dirty
log entries start to be generated. These new entries are reported to
userspace even if they were generated before userspace handled the same
dirty page.

That is to say, we should minimize the time gap between dirty log
clearing and dirty log handling. To do so, give userspace an interface
to clear the dirty log.

Co-developed-by: Keqian Zhu
Signed-off-by: Kunkun Jiang
---
changelog:

v2:
 - Rebase to newest code, so change VFIO_DIRTY_LOG_MANUAL_CLEAR from 9
   to 11.
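Reviewer note (illustrative only): the effect of a manual clear on one vfio_dma can be modelled in userspace C as below. This is a sketch under stated assumptions, not the patch's implementation. The helper names test_bit64(), clear_bit64() and clear_reported() are hypothetical, page positions are given as absolute page numbers rather than IOVAs, and the page-by-page loop stands in for the kernel's bitmap_for_each_set_region(). Only pages that userspace names in its bitmap lose their dirty bit, so a page dirtied after the bitmap was fetched stays dirty.

```c
#include <stddef.h>
#include <stdint.h>

static int test_bit64(const uint64_t *map, size_t bit)
{
	return (map[bit / 64] >> (bit % 64)) & 1;
}

static void clear_bit64(uint64_t *map, size_t bit)
{
	map[bit / 64] &= ~(UINT64_C(1) << (bit % 64));
}

/*
 * user_map:  bitmap supplied by userspace, bit 0 is page req_first
 * dma_map:   per-mapping dirty bitmap, bit 0 is page dma_first
 *
 * Clears, in dma_map, every page that lies in both ranges and is set in
 * user_map. Returns how many such pages userspace asked to clear.
 */
size_t clear_reported(const uint64_t *user_map, uint64_t req_first,
		      size_t req_pages, uint64_t *dma_map,
		      uint64_t dma_first, size_t dma_pages)
{
	uint64_t req_end = req_first + req_pages;
	uint64_t dma_end = dma_first + dma_pages;
	uint64_t start = req_first > dma_first ? req_first : dma_first;
	uint64_t end = req_end < dma_end ? req_end : dma_end;
	size_t cleared = 0;

	/* The loop body never runs when the two ranges do not overlap. */
	for (uint64_t page = start; page < end; page++) {
		if (!test_bit64(user_map, page - req_first))
			continue;
		clear_bit64(dma_map, page - dma_first);
		cleared++;
	}

	return cleared;
}
```

For example, with a request covering pages 10..13 whose bitmap names pages 10 and 12, a mapping covering the same pages with all four bits set is left with only pages 11 and 13 marked dirty.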
---
 drivers/vfio/vfio_iommu_type1.c | 104 ++++++++++++++++++++++++++++++--
 include/uapi/linux/vfio.h       |  28 ++++++++-
 2 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a7ab0279eda0..94306f567894 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -77,6 +77,7 @@ struct vfio_iommu {
 	bool			v2;
 	bool			nesting;
 	bool			dirty_page_tracking;
+	bool			dirty_log_manual_clear;
 	bool			pinned_page_dirty_scope;
 	bool			container_open;
 	uint64_t		num_non_hwdbm_groups;
@@ -1226,6 +1227,78 @@ static int vfio_iommu_dirty_log_clear(struct vfio_iommu *iommu,
 	return 0;
 }
 
+static int vfio_iova_dirty_log_clear(u64 __user *bitmap,
+				     struct vfio_iommu *iommu,
+				     dma_addr_t iova, size_t size,
+				     size_t pgsize)
+{
+	struct vfio_dma *dma;
+	struct rb_node *n;
+	dma_addr_t start_iova, end_iova, riova;
+	unsigned long pgshift = __ffs(pgsize);
+	unsigned long bitmap_size;
+	unsigned long *bitmap_buffer = NULL;
+	bool clear_valid;
+	int rs, re, start, end, dma_offset;
+	int ret = 0;
+
+	bitmap_size = DIRTY_BITMAP_BYTES(size >> pgshift);
+	bitmap_buffer = kvmalloc(bitmap_size, GFP_KERNEL);
+	if (!bitmap_buffer) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	if (copy_from_user(bitmap_buffer, bitmap, bitmap_size)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+		dma = rb_entry(n, struct vfio_dma, node);
+		if (!dma->iommu_mapped)
+			continue;
+		if ((dma->iova + dma->size - 1) < iova)
+			continue;
+		if (dma->iova > iova + size - 1)
+			break;
+
+		start_iova = max(iova, dma->iova);
+		end_iova = min(iova + size, dma->iova + dma->size);
+
+		/* Similar logic as the tail of vfio_iova_dirty_bitmap */
+
+		clear_valid = false;
+		start = (start_iova - iova) >> pgshift;
+		end = (end_iova - iova) >> pgshift;
+		bitmap_for_each_set_region(bitmap_buffer, rs, re, start, end) {
+			clear_valid = true;
+			riova = iova + (rs << pgshift);
+			dma_offset = (riova - dma->iova) >> pgshift;
+			bitmap_clear(dma->bitmap, dma_offset, re - rs);
+		}
+
+		if (clear_valid)
+			vfio_dma_populate_bitmap(dma, pgsize);
+
+		if (clear_valid && !iommu->pinned_page_dirty_scope &&
+		    dma->iommu_mapped && !iommu->num_non_hwdbm_groups) {
+			ret = vfio_iommu_dirty_log_clear(iommu, start_iova,
+					end_iova - start_iova, bitmap_buffer,
+					iova, pgsize);
+			if (ret) {
+				pr_warn("dma dirty log clear failed!\n");
+				goto out;
+			}
+		}
+
+	}
+
+out:
+	kvfree(bitmap_buffer);
+	return ret;
+}
+
 static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 			      struct vfio_dma *dma, dma_addr_t base_iova,
 			      size_t pgsize)
@@ -1275,6 +1348,11 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 			 DIRTY_BITMAP_BYTES(nbits + shift)))
 		return -EFAULT;
 
+	/* Recover the bitmap under manual clear */
+	if (shift && iommu->dirty_log_manual_clear)
+		bitmap_shift_right(dma->bitmap, dma->bitmap, shift,
+				   nbits + shift);
+
 	return 0;
 }
 
@@ -1313,6 +1391,9 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 		if (ret)
 			return ret;
 
+		if (iommu->dirty_log_manual_clear)
+			continue;
+
 		/*
 		 * Re-populate bitmap to include all pinned pages which are
 		 * considered as dirty but exclude pages which are unpinned and
@@ -2850,6 +2931,11 @@ static int vfio_iommu_type1_check_extension(struct vfio_iommu *iommu,
 		if (!iommu)
 			return 0;
 		return vfio_domains_have_iommu_cache(iommu);
+	case VFIO_DIRTY_LOG_MANUAL_CLEAR:
+		if (!iommu)
+			return 0;
+		iommu->dirty_log_manual_clear = true;
+		return 1;
 	default:
 		return 0;
 	}
@@ -3153,7 +3239,8 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 	struct vfio_iommu_type1_dirty_bitmap dirty;
 	uint32_t mask = VFIO_IOMMU_DIRTY_PAGES_FLAG_START |
 			VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP |
-			VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP;
+			VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP |
+			VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP;
 	unsigned long minsz;
 	int ret = 0;
 
@@ -3195,7 +3282,8 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 		}
 		mutex_unlock(&iommu->lock);
 		return 0;
-	} else if (dirty.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP) {
+	} else if (dirty.flags & (VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP |
+				  VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP)) {
 		struct vfio_iommu_type1_dirty_bitmap_get range;
 		unsigned long pgshift;
 		size_t data_size = dirty.argsz - minsz;
@@ -3238,13 +3326,21 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 			goto out_unlock;
 		}
 
-		if (iommu->dirty_page_tracking)
+		if (!iommu->dirty_page_tracking) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+
+		if (dirty.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP)
 			ret = vfio_iova_dirty_bitmap(range.bitmap.data,
 						     iommu, range.iova,
 						     range.size,
 						     range.bitmap.pgsize);
 		else
-			ret = -EINVAL;
+			ret = vfio_iova_dirty_log_clear(range.bitmap.data,
+							iommu, range.iova,
+							range.size,
+							range.bitmap.pgsize);
 out_unlock:
 		mutex_unlock(&iommu->lock);
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 8ce36c1d53ca..784dc3cf2a8f 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -52,6 +52,14 @@
 /* Supports the vaddr flag for DMA map and unmap */
 #define VFIO_UPDATE_VADDR		10
 
+/*
+ * The vfio_iommu driver may support clearing the dirty log manually: the dirty
+ * log is not cleared automatically after it is copied to userspace, and it is
+ * the user's duty to clear it. Note: when the user queries this extension and
+ * the vfio_iommu driver supports it, it is enabled.
+ */
+#define VFIO_DIRTY_LOG_MANUAL_CLEAR	11
+
 /*
  * The IOCTL interface is designed for extensibility by embedding the
  * structure length (argsz) and flags into structures passed between
@@ -1188,7 +1196,24 @@ struct vfio_iommu_type1_dma_unmap {
  * actual bitmap. If dirty pages logging is not enabled, an error will be
  * returned.
  *
- * Only one of the flags _START, _STOP and _GET may be specified at a time.
+ * Calling the IOCTL with VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP flag set,
+ * instructs the IOMMU driver to clear the dirty status of pages in a bitmap
+ * for IOMMU container for a given IOVA range. The user must specify the IOVA
+ * range, the bitmap and the pgsize through the structure
+ * vfio_iommu_type1_dirty_bitmap_get in the data[] portion. This interface
+ * supports clearing a bitmap of the smallest supported pgsize only and can be
+ * modified in future to clear a bitmap of any specified supported pgsize. The
+ * user must provide a memory area for the bitmap memory and specify its size
+ * in bitmap.size. One bit is used to represent one page consecutively starting
+ * from iova offset. The user should provide page size in bitmap.pgsize field.
+ * A bit set in the bitmap indicates that the dirty status of the page at that
+ * offset from iova is cleared and dirty tracking is re-enabled for it. The
+ * caller must set argsz to a value including the size of structure
+ * vfio_iommu_type1_dirty_bitmap_get, but excluding the size of the actual
+ * bitmap. If dirty pages logging is not enabled, an error will be returned.
+ *
+ * Only one of the flags _START, _STOP, _GET and _CLEAR may be specified at a
+ * time.
  *
  */
 struct vfio_iommu_type1_dirty_bitmap {
@@ -1197,6 +1222,7 @@ struct vfio_iommu_type1_dirty_bitmap {
 #define VFIO_IOMMU_DIRTY_PAGES_FLAG_START	(1 << 0)
 #define VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP	(1 << 1)
 #define VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP	(1 << 2)
+#define VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP (1 << 3)
 	__u8         data[];
 };
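
Reviewer note (illustrative only): two rules from the uAPI comment above lend themselves to a small standalone check. The sketch below is not part of the patch. The helper names dirty_bitmap_bytes() and dirty_flags_valid() are hypothetical; the flag values are copied from this patch; rounding the bitmap up to whole 64-bit words is an assumption based on the DIRTY_BITMAP_BYTES() macro in vfio_iommu_type1.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in this patch's include/uapi/linux/vfio.h. */
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_START		(1u << 0)
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP		(1u << 1)
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP		(1u << 2)
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP	(1u << 3)

/*
 * Bytes userspace should allocate for bitmap.data: one bit per page,
 * rounded up to whole 64-bit words (assumed to match DIRTY_BITMAP_BYTES()).
 */
uint64_t dirty_bitmap_bytes(uint64_t range_size, uint64_t pgsize)
{
	uint64_t npages = range_size / pgsize;
	uint64_t words = (npages + 63) / 64;

	return words * sizeof(uint64_t);
}

/* Exactly one of _START, _STOP, _GET and _CLEAR may be set at a time. */
bool dirty_flags_valid(uint32_t flags)
{
	const uint32_t mask = VFIO_IOMMU_DIRTY_PAGES_FLAG_START |
			      VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP |
			      VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP |
			      VFIO_IOMMU_DIRTY_PAGES_FLAG_CLEAR_BITMAP;

	if (flags & ~mask)
		return false;

	/* Non-zero power of two means a single bit is set. */
	return flags != 0 && (flags & (flags - 1)) == 0;
}
```

A 1 MiB range with 4 KiB pages covers 256 pages and therefore needs a 32-byte bitmap; a request that sets both _GET and _CLEAR is rejected by the model, matching the one-flag-at-a-time rule.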