From patchwork Wed Feb 17 03:44:00 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gavin Shan
X-Patchwork-Id: 8334451
X-Patchwork-Delegate: bhelgaas@google.com
From: Gavin Shan
To: linuxppc-dev@lists.ozlabs.org
Cc: linux-pci@vger.kernel.org, devicetree@vger.kernel.org,
 benh@kernel.crashing.org, mpe@ellerman.id.au, aik@ozlabs.ru, dja@axtens.net,
 bhelgaas@google.com, robherring2@gmail.com, grant.likely@linaro.org,
 Gavin Shan
Subject: [PATCH v8 17/45] powerpc/powernv/ioda1: Improve DMA32 segment track
Date: Wed, 17 Feb 2016 14:44:00 +1100
Message-Id: <1455680668-23298-18-git-send-email-gwshan@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.1.0
In-Reply-To: <1455680668-23298-1-git-send-email-gwshan@linux.vnet.ibm.com>
References: <1455680668-23298-1-git-send-email-gwshan@linux.vnet.ibm.com>
Sender: linux-pci-owner@vger.kernel.org
X-Mailing-List: linux-pci@vger.kernel.org

In the current implementation, the DMA32 segments required by a specific
PE aren't calculated from the information held in that PE alone. This
conflicts with the PCI hotplug design, which is PE-centric: a PE's DMA32
segments should be calculated from the information held in the PE
independently.

This introduces an array (@dma32_segmap) for every PHB to track DMA32
segment usage. It also moves the logic that calculates the DMA32 segments
consumed by a PE to pnv_pci_ioda1_setup_dma_pe(), so that the segments are
calculated and allocated from the information held in the PE (its DMA32
weight). The logic is improved as well: we try to allocate as many DMA32
segments as we can, and it is acceptable to end up with fewer segments
than the expected number.

Signed-off-by: Gavin Shan
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 111 +++++++++++++++++-------------
 arch/powerpc/platforms/powernv/pci.h      |   7 +-
 2 files changed, 66 insertions(+), 52 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 0fc2309..59782fba 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2007,20 +2007,54 @@ static unsigned int pnv_pci_ioda_total_dma_weight(struct pnv_phb *phb)
 }
 
 static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
-				       struct pnv_ioda_pe *pe,
-				       unsigned int base,
-				       unsigned int segs)
+				       struct pnv_ioda_pe *pe)
 {
 
 	struct page *tce_mem = NULL;
 	struct iommu_table *tbl;
-	unsigned int tce32_segsz, i;
+	unsigned int weight, total_weight;
+	unsigned int tce32_segsz, base, segs, i;
 	int64_t rc;
 	void *addr;
 
 	/* XXX FIXME: Handle 64-bit only DMA devices */
 	/* XXX FIXME: Provide 64-bit DMA facilities & non-4K TCE tables etc.. */
 	/* XXX FIXME: Allocate multi-level tables on PHB3 */
+	total_weight = pnv_pci_ioda_total_dma_weight(phb);
+	weight = pnv_pci_ioda_pe_dma_weight(pe);
+
+	segs = (weight * phb->ioda.dma32_count) / total_weight;
+	if (!segs)
+		segs = 1;
+
+	/*
+	 * Allocate contiguous DMA32 segments. We begin with the expected
+	 * number of segments. With one more attempt, the number of DMA32
+	 * segments to be allocated is decreased by one until one segment
+	 * is allocated successfully.
+	 */
+	while (segs) {
+		for (base = 0; base <= phb->ioda.dma32_count - segs; base++) {
+			for (i = base; i < base + segs; i++) {
+				if (phb->ioda.dma32_segmap[i] !=
+				    IODA_INVALID_PE)
+					break;
+			}
+
+			if (i >= base + segs)
+				break;
+		}
+
+		if (i >= base + segs)
+			break;
+
+		segs--;
+	}
+
+	if (!segs) {
+		pe_warn(pe, "No available DMA32 segments\n");
+		return;
+	}
 
 	tbl = pnv_pci_table_alloc(phb->hose->node);
 	iommu_register_group(&pe->table_group, phb->hose->global_number,
@@ -2028,6 +2062,8 @@ static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
 	pnv_pci_link_table_and_group(phb->hose->node, 0, tbl, &pe->table_group);
 
 	/* Grab a 32-bit TCE table */
+	pe_info(pe, "DMA weight %d (%d), assigned (%d) %d DMA32 segments\n",
+		weight, total_weight, base, segs);
 	pe_info(pe, " Setting up 32-bit TCE table at %08x..%08x\n",
 		base * PNV_IODA1_DMA32_SEGSIZE,
 		(base + segs) * PNV_IODA1_DMA32_SEGSIZE - 1);
@@ -2064,6 +2100,10 @@ static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
 		}
 	}
 
+	/* Setup DMA32 segment mapping */
+	for (i = base; i < base + segs; i++)
+		phb->ioda.dma32_segmap[i] = pe->pe_number;
+
 	/* Setup linux iommu table */
 	pnv_pci_setup_iommu_table(tbl, addr, tce32_segsz * segs,
 				  base * PNV_IODA1_DMA32_SEGSIZE,
@@ -2538,70 +2578,34 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 {
 	struct pci_controller *hose = phb->hose;
-	unsigned int weight, total_weight, dma_pe_count;
-	unsigned int residual, remaining, segs, base;
 	struct pnv_ioda_pe *pe;
-
-	total_weight = pnv_pci_ioda_total_dma_weight(phb);
-	dma_pe_count = 0;
-	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
-		weight = pnv_pci_ioda_pe_dma_weight(pe);
-		if (weight > 0)
-			dma_pe_count++;
-	}
+	unsigned int weight;
 
 	/* If we have more PE# than segments available, hand out one
 	 * per PE until we run out and let the rest fail. If not,
 	 * then we assign at least one segment per PE, plus more based
 	 * on the amount of devices under that PE
 	 */
-	if (dma_pe_count > phb->ioda.tce32_count)
-		residual = 0;
-	else
-		residual = phb->ioda.tce32_count - dma_pe_count;
-
 	pr_info("PCI: Domain %04x has %ld available 32-bit DMA segments\n",
-		hose->global_number, phb->ioda.tce32_count);
-	pr_info("PCI: %d PE# for a total weight of %d\n",
-		dma_pe_count, total_weight);
+		hose->global_number, phb->ioda.dma32_count);
 
 	pnv_pci_ioda_setup_opal_tce_kill(phb);
 
-	/* Walk our PE list and configure their DMA segments, hand them
-	 * out one base segment plus any residual segments based on
-	 * weight
-	 */
-	remaining = phb->ioda.tce32_count;
-	base = 0;
+	/* Walk our PE list and configure their DMA segments */
 	list_for_each_entry(pe, &phb->ioda.pe_list, list) {
 		weight = pnv_pci_ioda_pe_dma_weight(pe);
 		if (!weight)
 			continue;
 
-		if (!remaining) {
-			pe_warn(pe, "No DMA32 resources available\n");
-			continue;
-		}
-		segs = 1;
-		if (residual) {
-			segs += ((weight * residual) + (total_weight / 2)) /
-				total_weight;
-			if (segs > remaining)
-				segs = remaining;
-		}
-
 		/*
 		 * For IODA2 compliant PHB3, we needn't care about the weight.
 		 * The all available 32-bits DMA space will be assigned to
 		 * the specific PE.
 		 */
 		if (phb->type == PNV_PHB_IODA1) {
-			pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
-				weight, segs);
-			pnv_pci_ioda1_setup_dma_pe(phb, pe, base, segs);
+			pnv_pci_ioda1_setup_dma_pe(phb, pe);
 		} else if (phb->type == PNV_PHB_IODA2) {
 			pe_info(pe, "Assign DMA32 space\n");
-			segs = 0;
 			pnv_pci_ioda2_setup_dma_pe(phb, pe);
 		} else if (phb->type == PNV_PHB_NPU) {
 			/*
@@ -2611,9 +2615,6 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 			 * as the PHB3 TVT.
 			 */
 		}
-
-		remaining -= segs;
-		base += segs;
 	}
 }
 
@@ -3313,7 +3314,8 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 {
 	struct pci_controller *hose;
 	struct pnv_phb *phb;
-	unsigned long size, m64map_off, m32map_off, pemap_off, iomap_off = 0;
+	unsigned long size, m64map_off, m32map_off, pemap_off;
+	unsigned long iomap_off = 0, dma32map_off = 0;
 	const __be64 *prop64;
 	const __be32 *prop32;
 	int i, len;
@@ -3398,6 +3400,10 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	phb->ioda.io_segsize = phb->ioda.io_size / phb->ioda.total_pe_num;
 	phb->ioda.io_pci_base = 0; /* XXX calculate this ? */
 
+	/* Calculate how many 32-bit TCE segments we have */
+	phb->ioda.dma32_count = phb->ioda.m32_pci_base /
+				PNV_IODA1_DMA32_SEGSIZE;
+
 	/* Allocate aux data & arrays. We don't have IO ports on PHB3 */
 	size = _ALIGN_UP(phb->ioda.total_pe_num / 8, sizeof(unsigned long));
 	m64map_off = size;
@@ -3407,6 +3413,9 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	if (phb->type == PNV_PHB_IODA1) {
 		iomap_off = size;
 		size += phb->ioda.total_pe_num * sizeof(phb->ioda.io_segmap[0]);
+		dma32map_off = size;
+		size += phb->ioda.dma32_count *
+			sizeof(phb->ioda.dma32_segmap[0]);
 	}
 	pemap_off = size;
 	size += phb->ioda.total_pe_num * sizeof(struct pnv_ioda_pe);
@@ -3422,6 +3431,10 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 		phb->ioda.io_segmap = aux + iomap_off;
 		for (i = 0; i < phb->ioda.total_pe_num; i++)
 			phb->ioda.io_segmap[i] = IODA_INVALID_PE;
+
+		phb->ioda.dma32_segmap = aux + dma32map_off;
+		for (i = 0; i < phb->ioda.dma32_count; i++)
+			phb->ioda.dma32_segmap[i] = IODA_INVALID_PE;
 	}
 	phb->ioda.pe_array = aux + pemap_off;
 	set_bit(phb->ioda.reserved_pe_idx, phb->ioda.pe_alloc);
@@ -3430,7 +3443,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	mutex_init(&phb->ioda.pe_list_mutex);
 
 	/* Calculate how many 32-bit TCE segments we have */
-	phb->ioda.tce32_count = phb->ioda.m32_pci_base /
+	phb->ioda.dma32_count = phb->ioda.m32_pci_base /
 				PNV_IODA1_DMA32_SEGSIZE;
 
 #if 0 /* We should really do that ... */
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index e90bcbe..350e630 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -146,6 +146,10 @@ struct pnv_phb {
 			int			*m32_segmap;
 			int			*io_segmap;
 
+			/* DMA32 segment maps - IODA1 only */
+			unsigned long		dma32_count;
+			int			*dma32_segmap;
+
 			/* IRQ chip */
 			int			irq_chip_init;
 			struct irq_chip		irq_chip;
@@ -162,9 +166,6 @@ struct pnv_phb {
 			 */
 			unsigned char		pe_rmap[0x10000];
 
-			/* 32-bit TCE tables allocation */
-			unsigned long		tce32_count;
-
 			/* TCE cache invalidate registers (physical and
 			 * remapped)
 			 */
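
For readers who want to try the allocation strategy outside the kernel, the
following is a minimal user-space sketch of what the commit message describes:
derive the expected segment count from the PE's weight, then look for a free
contiguous run, retrying with one segment fewer each time. It is not the
kernel code. The names dma32_expected_segs(), dma32_alloc_segs() and SEG_FREE
are hypothetical, and SEG_FREE only stands in for IODA_INVALID_PE, whose
actual value is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker for an unowned segment (stand-in for IODA_INVALID_PE). */
#define SEG_FREE (-1)

/* Expected number of segments for a PE: its weighted share, but at least one. */
static unsigned int dma32_expected_segs(unsigned int weight,
					unsigned int total_weight,
					unsigned int count)
{
	unsigned int segs;

	assert(total_weight > 0);
	segs = (weight * count) / total_weight;
	return segs ? segs : 1;
}

/*
 * Try to claim `want` contiguous free segments for `pe`. If no run of that
 * length exists, retry with one segment fewer until a run is found or the
 * length reaches zero. On success the run is marked in segmap, its first
 * index is stored in *base, and its length is returned. Returns 0 when no
 * free segment is available.
 */
static unsigned int dma32_alloc_segs(int *segmap, unsigned int count,
				     int pe, unsigned int want,
				     unsigned int *base)
{
	unsigned int segs, start, i;

	if (want > count)
		want = count;

	for (segs = want; segs > 0; segs--) {
		for (start = 0; start + segs <= count; start++) {
			/* Stop at the first segment that is already owned. */
			for (i = start; i < start + segs; i++)
				if (segmap[i] != SEG_FREE)
					break;

			if (i == start + segs) {
				/* Whole run is free: record the new owner. */
				for (i = start; i < start + segs; i++)
					segmap[i] = pe;
				*base = start;
				return segs;
			}
		}
	}

	return 0;
}
```

With a map of eight segments where indexes 2 and 6 are already owned, asking
for four segments yields a run of three starting at index 3, because no free
run of four exists. The sketch writes the search bound as
start + segs <= count rather than base <= count - segs so that the unsigned
arithmetic stays well-defined even if count is smaller than segs; whether that
case can occur in the kernel path is not something this patch states.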