From patchwork Wed Jun 7 12:29:07 2017
X-Patchwork-Submitter: Sricharan Ramabadhran
X-Patchwork-Id: 9771455
From: Sricharan R <sricharan@codeaurora.org>
To: andy.gross@linaro.org, david.brown@linaro.org, vinod.koul@intel.com,
	dan.j.williams@intel.com, linux-arm-msm@vger.kernel.org,
	linux-soc@vger.kernel.org, dmaengine@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: sricharan@codeaurora.org
Subject: [PATCH] dmaengine: qcom-bam: Process multiple pending descriptors
Date: Wed, 7 Jun 2017 17:59:07 +0530
Message-Id: <1496838547-15092-1-git-send-email-sricharan@codeaurora.org>
X-Mailer: git-send-email 1.9.1

The BAM dmaengine has a circular FIFO to which we add hw descriptors
that describe the transactions. The FIFO has space for about 4096 hw
descriptors.

Currently we add one descriptor, wait for it to complete with an
interrupt, and only then start processing the next pending descriptor.
This underutilises the FIFO and adds interrupt overhead for every
descriptor, which costs throughput for clients that submit multiple
descriptors that the BAM and the client peripheral could process back
to back.

Instead, when a client calls issue_pending, keep pulling descriptors
from the pending queue and adding them to the FIFO until either the
FIFO is full or the client has requested an interrupt notification for
a descriptor. After that, a completion interrupt implies that all
descriptors submitted so far have completed, so notify completion for
all of them.
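To make the batching policy concrete, here is a minimal, standalone C
sketch (illustration only, not part of the patch; FIFO_SLOTS,
struct fake_desc, circ_space() and wants_irq are hypothetical
stand-ins for MAX_DESCRIPTORS, struct bam_async_desc, CIRC_SPACE() and
the DESC_FLAG_EOT handling). It pulls descriptors from a pending list
into a circular FIFO until the FIFO is full or a descriptor requests
an interrupt:

/*
 * Userspace sketch of the batching idea, not the driver code.
 * All names are made up for illustration.
 */
#include <stdio.h>
#include <stdbool.h>

#define FIFO_SLOTS 8		/* stand-in for MAX_DESCRIPTORS, power of two */

struct fake_desc {
	int id;
	bool wants_irq;		/* stand-in for DESC_FLAG_EOT */
};

/* free slots in a ring that keeps one slot empty (size must be 2^n) */
static unsigned int circ_space(unsigned int head, unsigned int tail,
			       unsigned int size)
{
	return (head - tail - 1) & (size - 1);
}

int main(void)
{
	struct fake_desc pending[] = {
		{ 0, false }, { 1, false }, { 2, true }, { 3, false },
	};
	unsigned int npending = sizeof(pending) / sizeof(pending[0]);
	unsigned int head = 0, tail = 0, i = 0;

	/* issue_pending(): fill the FIFO instead of queueing one at a time */
	while (i < npending && circ_space(head, tail, FIFO_SLOTS)) {
		struct fake_desc *d = &pending[i++];

		tail = (tail + 1) % FIFO_SLOTS;	/* "write" one hw descriptor */
		printf("queued desc %d\n", d->id);

		/* stop batching once a descriptor asks for a completion IRQ */
		if (d->wants_irq) {
			printf("desc %d wants an IRQ, stop batching\n", d->id);
			break;
		}
	}

	/* a single completion interrupt now covers descs 0..i-1 */
	printf("batched %u descriptors before the interrupt\n", i);
	return 0;
}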
CURRENT:

            ------      -------      ---------------
           |DES 0|     |DESC 1|     |DESC 2 + INT  |
            ------      -------      ---------------
              |            |               |
              |            |               |
INTERRUPT:  (INT)        (INT)           (INT)
CALLBACK:    (CB)         (CB)            (CB)

	MTD_SPEEDTEST READ PAGE:  3560 KiB/s
	MTD_SPEEDTEST WRITE PAGE: 2664 KiB/s
	IOZONE READ:  2456 KB/s
	IOZONE WRITE: 1230 KB/s

	bam dma interrupts (after tests): 96508

CHANGE:

            ------      -------      ---------------
           |DES 0|     |DESC 1|     |DESC 2 + INT  |
            ------      -------      ---------------
                                           |
                                           |
                                         (INT)
                                   (CB for 0, 1, 2)

	MTD_SPEEDTEST READ PAGE:  3860 KiB/s
	MTD_SPEEDTEST WRITE PAGE: 2837 KiB/s
	IOZONE READ:  2677 KB/s
	IOZONE WRITE: 1308 KB/s

	bam dma interrupts (after tests): 58806

Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
 drivers/dma/qcom/bam_dma.c | 180 +++++++++++++++++++++++++++++----------------
 1 file changed, 115 insertions(+), 65 deletions(-)

diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
index 03c4eb3..97892f7 100644
--- a/drivers/dma/qcom/bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -46,6 +46,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -76,7 +77,8 @@ struct bam_async_desc {
 	u16 flags;
 
 	struct bam_desc_hw *curr_desc;
-
+	/* list node for the desc in the bam_chan list of descriptors */
+	struct list_head desc_node;
 	enum dma_transfer_direction dir;
 	size_t length;
 	struct bam_desc_hw desc[0];
@@ -371,6 +373,8 @@ struct bam_chan {
 	unsigned int initialized;	/* is the channel hw initialized? */
 	unsigned int paused;		/* is the channel paused? */
 	unsigned int reconfigure;	/* new slave config? */
+	/* list of descriptors currently processed */
+	struct list_head desc_list;
 
 	struct list_head node;
 };
@@ -486,6 +490,8 @@ static void bam_chan_init_hw(struct bam_chan *bchan,
 
 	bchan->initialized = 1;
 
+	INIT_LIST_HEAD(&bchan->desc_list);
+
 	/* init FIFO pointers */
 	bchan->head = 0;
 	bchan->tail = 0;
@@ -631,8 +637,6 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
 
 	if (flags & DMA_PREP_INTERRUPT)
 		async_desc->flags |= DESC_FLAG_EOT;
-	else
-		async_desc->flags |= DESC_FLAG_INT;
 
 	async_desc->num_desc = num_alloc;
 	async_desc->curr_desc = async_desc->desc;
@@ -680,13 +684,18 @@ static struct dma_async_tx_descriptor *bam_prep_slave_sg(struct dma_chan *chan,
 static int bam_dma_terminate_all(struct dma_chan *chan)
 {
 	struct bam_chan *bchan = to_bam_chan(chan);
+	struct bam_async_desc *async_desc;
 	unsigned long flag;
 	LIST_HEAD(head);
 
 	/* remove all transactions, including active transaction */
 	spin_lock_irqsave(&bchan->vc.lock, flag);
 	if (bchan->curr_txd) {
-		list_add(&bchan->curr_txd->vd.node, &bchan->vc.desc_issued);
+		list_for_each_entry(async_desc, &bchan->desc_list, desc_node) {
+			bchan->curr_txd = async_desc;
+			list_add(&bchan->curr_txd->vd.node,
+				 &bchan->vc.desc_issued);
+		}
 		bchan->curr_txd = NULL;
 	}
 
@@ -761,7 +770,7 @@ static u32 process_channel_irqs(struct bam_device *bdev)
 {
 	u32 i, srcs, pipe_stts;
 	unsigned long flags;
-	struct bam_async_desc *async_desc;
+	struct bam_async_desc *async_desc, *tmp;
 
 	srcs = readl_relaxed(bam_addr(bdev, 0, BAM_IRQ_SRCS_EE));
 
@@ -777,33 +786,39 @@ static u32 process_channel_irqs(struct bam_device *bdev)
 
 		/* clear pipe irq */
 		pipe_stts = readl_relaxed(bam_addr(bdev, i, BAM_P_IRQ_STTS));
-
 		writel_relaxed(pipe_stts, bam_addr(bdev, i, BAM_P_IRQ_CLR));
 
 		spin_lock_irqsave(&bchan->vc.lock, flags);
-		async_desc = bchan->curr_txd;
-
-		if (async_desc) {
-			async_desc->num_desc -= async_desc->xfer_len;
-			async_desc->curr_desc += async_desc->xfer_len;
-			bchan->curr_txd = NULL;
-
-			/* manage FIFO */
-			bchan->head += async_desc->xfer_len;
-			bchan->head %= MAX_DESCRIPTORS;
-
-			/*
-			 * if complete, process cookie. Otherwise
-			 * push back to front of desc_issued so that
-			 * it gets restarted by the tasklet
-			 */
-			if (!async_desc->num_desc)
-				vchan_cookie_complete(&async_desc->vd);
-			else
-				list_add(&async_desc->vd.node,
-					&bchan->vc.desc_issued);
+
+		list_for_each_entry_safe(async_desc, tmp,
+					 &bchan->desc_list, desc_node) {
+			bchan->curr_txd = async_desc;
+
+			if (async_desc) {
+				async_desc->num_desc -= async_desc->xfer_len;
+				async_desc->curr_desc += async_desc->xfer_len;
+
+				/* manage FIFO */
+				bchan->head += async_desc->xfer_len;
+				bchan->head %= MAX_DESCRIPTORS;
+
+				/*
+				 * if complete, process cookie. Otherwise
+				 * push back to front of desc_issued so that
+				 * it gets restarted by the tasklet
+				 */
+				if (!async_desc->num_desc) {
+					vchan_cookie_complete(&async_desc->vd);
+				} else {
+					list_add(&async_desc->vd.node,
+						&bchan->vc.desc_issued);
+				}
+				list_del(&async_desc->desc_node);
+			}
 		}
 
+		bchan->curr_txd = NULL;
+
 		spin_unlock_irqrestore(&bchan->vc.lock, flags);
 	}
 
@@ -863,6 +878,7 @@ static enum dma_status bam_tx_status(struct dma_chan *chan, dma_cookie_t cookie,
 		struct dma_tx_state *txstate)
 {
 	struct bam_chan *bchan = to_bam_chan(chan);
+	struct bam_async_desc *async_desc;
 	struct virt_dma_desc *vd;
 	int ret;
 	size_t residue = 0;
@@ -877,12 +893,21 @@ static enum dma_status bam_tx_status(struct dma_chan *chan, dma_cookie_t cookie,
 		return bchan->paused ? DMA_PAUSED : ret;
 
 	spin_lock_irqsave(&bchan->vc.lock, flags);
+
 	vd = vchan_find_desc(&bchan->vc, cookie);
-	if (vd)
+	if (vd) {
 		residue = container_of(vd, struct bam_async_desc, vd)->length;
-	else if (bchan->curr_txd && bchan->curr_txd->vd.tx.cookie == cookie)
-		for (i = 0; i < bchan->curr_txd->num_desc; i++)
-			residue += bchan->curr_txd->curr_desc[i].size;
+	} else if (bchan->curr_txd) {
+		list_for_each_entry(async_desc, &bchan->desc_list, desc_node) {
+			bchan->curr_txd = async_desc;
+
+			if (bchan->curr_txd->vd.tx.cookie != cookie)
+				continue;
+
+			for (i = 0; i < bchan->curr_txd->num_desc; i++)
+				residue += bchan->curr_txd->curr_desc[i].size;
+		}
+	}
 
 	spin_unlock_irqrestore(&bchan->vc.lock, flags);
 
@@ -923,63 +948,88 @@ static void bam_start_dma(struct bam_chan *bchan)
 {
 	struct virt_dma_desc *vd = vchan_next_desc(&bchan->vc);
 	struct bam_device *bdev = bchan->bdev;
-	struct bam_async_desc *async_desc;
+	struct bam_async_desc *async_desc = NULL;
 	struct bam_desc_hw *desc;
 	struct bam_desc_hw *fifo = PTR_ALIGN(bchan->fifo_virt,
					sizeof(struct bam_desc_hw));
 	int ret;
+	unsigned int avail;
 
 	lockdep_assert_held(&bchan->vc.lock);
 
 	if (!vd)
 		return;
 
-	list_del(&vd->node);
-
-	async_desc = container_of(vd, struct bam_async_desc, vd);
-	bchan->curr_txd = async_desc;
-
 	ret = pm_runtime_get_sync(bdev->dev);
 	if (ret < 0)
 		return;
 
-	/* on first use, initialize the channel hardware */
-	if (!bchan->initialized)
-		bam_chan_init_hw(bchan, async_desc->dir);
+	while (vd) {
+		list_del(&vd->node);
 
-	/* apply new slave config changes, if necessary */
-	if (bchan->reconfigure)
-		bam_apply_new_config(bchan, async_desc->dir);
+		async_desc = container_of(vd, struct bam_async_desc, vd);
 
-	desc = bchan->curr_txd->curr_desc;
+		/* on first use, initialize the channel hardware */
+		if (!bchan->initialized)
+			bam_chan_init_hw(bchan, async_desc->dir);
 
-	if (async_desc->num_desc > MAX_DESCRIPTORS)
-		async_desc->xfer_len = MAX_DESCRIPTORS;
-	else
-		async_desc->xfer_len = async_desc->num_desc;
+		/* apply new slave config changes, if necessary */
+		if (bchan->reconfigure)
+			bam_apply_new_config(bchan, async_desc->dir);
 
-	/* set any special flags on the last descriptor */
-	if (async_desc->num_desc == async_desc->xfer_len)
-		desc[async_desc->xfer_len - 1].flags =
-			cpu_to_le16(async_desc->flags);
-	else
-		desc[async_desc->xfer_len - 1].flags |=
+		bchan->curr_txd = async_desc;
+		desc = bchan->curr_txd->curr_desc;
+		avail = CIRC_SPACE(bchan->tail, bchan->head,
+				   MAX_DESCRIPTORS + 1);
+
+		if (async_desc->num_desc > avail)
+			async_desc->xfer_len = avail;
+		else
+			async_desc->xfer_len = async_desc->num_desc;
+
+		/* set any special flags on the last descriptor */
+		if (async_desc->num_desc == async_desc->xfer_len)
+			desc[async_desc->xfer_len - 1].flags |=
+				cpu_to_le16(async_desc->flags);
+
+		vd = vchan_next_desc(&bchan->vc);
+
+		/*
+		 * This will be the last descriptor in the chain if,
+		 * - FIFO is FULL.
+		 * - No more descriptors to add.
+		 * - This descriptor has interrupt flags set,
+		 *   so that we will have to indicate finishing of
+		 *   that descriptor.
+		 */
+		if (!(avail - async_desc->xfer_len) || !vd ||
+		    (async_desc->flags & DESC_FLAG_EOT)) {
+			/* set INT flag for the last descriptor if unset */
+			if ((async_desc->num_desc != async_desc->xfer_len) ||
+			    (!(async_desc->flags & DESC_FLAG_EOT)))
+				desc[async_desc->xfer_len - 1].flags |=
 					cpu_to_le16(DESC_FLAG_INT);
+			vd = NULL;
+		}
 
-	if (bchan->tail + async_desc->xfer_len > MAX_DESCRIPTORS) {
-		u32 partial = MAX_DESCRIPTORS - bchan->tail;
+		if (bchan->tail + async_desc->xfer_len > MAX_DESCRIPTORS) {
+			u32 partial = MAX_DESCRIPTORS - bchan->tail;
 
-		memcpy(&fifo[bchan->tail], desc,
-		       partial * sizeof(struct bam_desc_hw));
-		memcpy(fifo, &desc[partial], (async_desc->xfer_len - partial) *
+			memcpy(&fifo[bchan->tail], desc,
+			       partial * sizeof(struct bam_desc_hw));
+			memcpy(fifo, &desc[partial],
+			       (async_desc->xfer_len - partial) *
 			       sizeof(struct bam_desc_hw));
-	} else {
-		memcpy(&fifo[bchan->tail], desc,
-		       async_desc->xfer_len * sizeof(struct bam_desc_hw));
-	}
+		} else {
+			memcpy(&fifo[bchan->tail], desc,
+			       async_desc->xfer_len *
+			       sizeof(struct bam_desc_hw));
+		}
 
-	bchan->tail += async_desc->xfer_len;
-	bchan->tail %= MAX_DESCRIPTORS;
+		bchan->tail += async_desc->xfer_len;
+		bchan->tail %= MAX_DESCRIPTORS;
+		list_add_tail(&async_desc->desc_node, &bchan->desc_list);
+	}
 
 	/* ensure descriptor writes and dma start not reordered */
 	wmb();