From patchwork Fri Mar 17 06:15:12 2017
X-Patchwork-Submitter: Oza Pawandeep
X-Patchwork-Id: 9629881
From: Oza Oza
References: <1489731045-6719-1-git-send-email-oza.oza@broadcom.com>
In-Reply-To: <1489731045-6719-1-git-send-email-oza.oza@broadcom.com>
Date: Fri, 17 Mar 2017 11:45:12 +0530
Subject: RE: [RFC PATCH] iommu/dma: account pci host bridge dma_mask for IOVA allocation
To: Joerg Roedel, Robin Murphy
Cc: devicetree@vger.kernel.org, iommu@lists.linux-foundation.org, bcm-kernel-feedback-list@broadcom.com, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org

Hi,

There are certain areas which require contemplation. This problem needs more attention from the PCI OF framework and the IOMMU framework, and from the integration of the two.

Regards,
Oza.

-----Original Message-----
From: Oza Pawandeep [mailto:oza.oza@broadcom.com]
Sent: Friday, March 17, 2017 11:41 AM
To: Joerg Roedel; Robin Murphy
Cc: iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org; devicetree@vger.kernel.org; bcm-kernel-feedback-list@broadcom.com; Oza Pawandeep
Subject: [RFC PATCH] iommu/dma: account pci host bridge dma_mask for IOVA allocation

It is possible that a PCI device supports 64-bit DMA addressing, and thus its driver sets the device's dma_mask to DMA_BIT_MASK(64); however, the PCI host bridge may have limitations on inbound transaction addressing. As an example, consider an NVMe SSD connected to the iproc-PCIe controller.

Currently, the IOMMU DMA ops only consider the PCI device's dma_mask when allocating an IOVA.
This is particularly problematic on ARM/ARM64 SoCs where the IOMMU (i.e. SMMU) translates IOVA to PA for inbound transactions only after the PCI host has forwarded these transactions on the SoC I/O bus. This means that on such ARM/ARM64 SoCs the IOVA of inbound transactions has to honor the addressing restrictions of the PCI host.

This patch is inspired by
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1306545.html
http://www.spinics.net/lists/arm-kernel/msg566947.html
but the above inspiration solves only half of the problem. The rest of the problem, which we face on iproc-based SoCs, is described below.

The current PCIe framework and OF framework integration assumes dma-ranges in the form that memory-mapped devices use to define them:
dma-ranges: (child-bus-address, parent-bus-address, length).

But iproc-based SoCs, and even R-Car-based SoCs, have PCI-world dma-ranges:
dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>;

of_dma_configure() is written specifically to take care of memory-mapped devices, but no implementation exists for PCI to take care of PCIe-based memory ranges. In fact, the PCI world doesn't seem to define standard dma-ranges. Since these are absent, the dma_mask used to remain 32-bit because of the 0 size returned (parsed by of_dma_configure()).

This patch also implements of_pci_get_dma_ranges() to cater to PCI-world dma-ranges, so that the returned size yields the best possible (largest) dma_mask. For example, with
dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7fffffffff.

Conclusion: there are the following problems.
1) The Linux PCI and IOMMU framework integration has glitches with respect to dma-ranges.
2) The Linux PCI framework looks very uncertain about dma-ranges; rather, the binding is not defined the way it is defined for memory-mapped devices.
   R-Car and iproc-based SoCs use their own custom dma-ranges (which could rather be standard).
3) Even with the default parser, of_dma_get_ranges(), it throws an error, "no dma-ranges found for node", because of an existing bug. The following lines should be moved to the end of the while(1) loop:
   839 node = of_get_next_parent(node);
   840 if (!node)
   841         break;

Reviewed-by: Anup Patel
Reviewed-by: Scott Branden
Signed-off-by: Oza Pawandeep

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8c7c244..20cfff7 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -217,6 +217,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
 	def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+	def_bool y
+
 config SMP
 	def_bool y
 
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 73d5bab..64b4dc3 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -20,6 +20,7 @@ struct dev_archdata {
 #ifdef CONFIG_IOMMU_API
 	void *iommu;			/* private IOMMU data */
 #endif
+	u64 parent_dma_mask;
 	bool dma_coherent;
 };
 
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 81cdb2e..5845ecd 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -564,6 +564,7 @@ static void flush_page(struct device *dev, const void *virt, phys_addr_t phys)
 	__dma_flush_area(virt, PAGE_SIZE);
 }
 
+
 static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 				 dma_addr_t *handle, gfp_t gfp,
 				 unsigned long attrs)
@@ -795,6 +796,20 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
 	iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
 }
 
+static int __iommu_set_dma_mask(struct device *dev, u64 mask)
+{
+	/* device is not DMA capable */
+	if (!dev->dma_mask)
+		return -EIO;
+
+	if (mask > dev->archdata.parent_dma_mask)
+		mask = dev->archdata.parent_dma_mask;
+
+	*dev->dma_mask = mask;
+
+	return 0;
+}
+
 static const struct dma_map_ops iommu_dma_ops = {
 	.alloc = __iommu_alloc_attrs,
 	.free = __iommu_free_attrs,
@@ -811,8 +826,21 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
 	.map_resource = iommu_dma_map_resource,
 	.unmap_resource = iommu_dma_unmap_resource,
 	.mapping_error = iommu_dma_mapping_error,
+	.set_dma_mask = __iommu_set_dma_mask,
 };
 
+int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+	if (get_dma_ops(dev) == &iommu_dma_ops &&
+	    mask > dev->archdata.parent_dma_mask)
+		mask = dev->archdata.parent_dma_mask;
+
+	dev->coherent_dma_mask = mask;
+	return 0;
+}
+EXPORT_SYMBOL(dma_set_coherent_mask);
+
+
 /*
  * TODO: Right now __iommu_setup_dma_ops() gets called too early to do
  * everything it needs to - the device is only partially created and the
@@ -975,6 +1003,8 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 	if (!dev->dma_ops)
 		dev->dma_ops = &swiotlb_dma_ops;
 
+	dev->archdata.parent_dma_mask = size - 1;
+
 	dev->archdata.dma_coherent = coherent;
 	__iommu_setup_dma_ops(dev, dma_base, size, iommu);
 }
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..5804717 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,51 @@ int of_pci_get_host_bridge_resources(struct device_node *dev,
 	return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+{
+	struct device_node *node = of_node_get(np);
+	int rlen, naddr, nsize, pna;
+	int ret = 0;
+	const int na = 3, ns = 2;
+	struct of_pci_range_parser parser;
+	struct of_pci_range range;
+
+	if (!node)
+		return -EINVAL;
+
+	parser.node = node;
+	parser.pna = of_n_addr_cells(node);
+	parser.np = parser.pna + na + ns;
+
+	parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+	if (!parser.range) {
+		pr_debug("pcie device has no dma-ranges defined for node(%s)\n", np->full_name);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	parser.end = parser.range + rlen / sizeof(__be32);
+
+	/* how do we take care of multiple dma windows ?. */
+	for_each_of_pci_range(&parser, &range) {
+		*dma_addr = range.pci_addr;
+		*size = range.size;
+		*paddr = range.cpu_addr;
+	}
+
+	pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+		 *dma_addr, *paddr, *size);
+	*dma_addr = range.pci_addr;
+	*size = range.size;
+
+out:
+	of_node_put(node);
+	return ret;
+
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..907ace0 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
 			unsigned char busno, unsigned char bus_max,
 			struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
 			unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,11 @@ static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
 {
 	return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+{
+	return -EINVAL;
+}
 #endif
 
 #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
---
1.9.1
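
The mask arithmetic described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel code: struct dma_range, cells_to_u64(), parse_pci_dma_range(), parent_dma_mask() and clamp_dma_mask() are hypothetical names that do not exist in the kernel. The 3 + 2 + 2 cell layout follows the example dma-ranges entry and the patch's na = 3, ns = 2; the parent address cell count of 2 is assumed here, whereas the patch reads it with of_n_addr_cells().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of one PCI-style dma-ranges entry. */
struct dma_range {
	uint64_t pci_addr;	/* child (PCI bus) address */
	uint64_t cpu_addr;	/* parent (CPU/SoC bus) address */
	uint64_t size;		/* window length in bytes */
};

/* Combine n (1 or 2) 32-bit cells, most significant first, into a u64. */
static uint64_t cells_to_u64(const uint32_t *cells, int n)
{
	uint64_t v = 0;

	assert(n >= 1 && n <= 2);
	while (n--)
		v = (v << 32) | *cells++;
	return v;
}

/*
 * Assumed layout: 3 PCI address cells (flags, addr hi, addr lo),
 * 2 parent address cells, 2 size cells.
 */
static struct dma_range parse_pci_dma_range(const uint32_t cells[7])
{
	struct dma_range r;

	r.pci_addr = cells_to_u64(&cells[1], 2);	/* skip the flags cell */
	r.cpu_addr = cells_to_u64(&cells[3], 2);
	r.size = cells_to_u64(&cells[5], 2);
	return r;
}

/* Mirrors the patch's "size - 1"; note that a size of 0 wraps to all ones. */
static uint64_t parent_dma_mask(uint64_t size)
{
	return size - 1;
}

/* Limit a device's requested mask to what the host bridge can address. */
static uint64_t clamp_dma_mask(uint64_t requested, uint64_t parent)
{
	return requested > parent ? parent : requested;
}
```

For the example entry <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>, the size cells combine to 0x8000000000, so the parent mask is 0x7fffffffff, and a driver asking for DMA_BIT_MASK(64) would be limited to that value, while a 32-bit request passes through unchanged.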