From patchwork Mon Dec 23 13:22:03 2013
From: Barry Song <21cnbao@gmail.com>
To: cjb@laptop.org
Cc: linux-mmc@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	workgroup.linux@csr.com, Bin Shi, Barry Song
Subject: [PATCH 1/2] mmc: sdhci: combined dma buffer support for sdma
Date: Mon, 23 Dec 2013 21:22:03 +0800
Message-Id: <1387804924-3372-1-git-send-email-21cnbao@gmail.com>
X-Patchwork-Id: 3396611
List-ID: X-Mailing-List: linux-mmc@vger.kernel.org

From: Bin Shi

Some SD host controllers support only SDMA for the SD/MMC/SDIO host, with
no ADMA. For SDMA the current sdhci driver sets max_segs to 1, so every
scatterlist passed down has a single entry and the host driver completes
the segments one by one, paying the full request/cmd_done/irq_handler/
tasklet_handler cost for each one, which hurts read/write performance.

A better solution is to copy the scatterlist data into a pre-allocated
DMA-coherent buffer before writing to the card, and on reads to copy the
data from the DMA buffer back into the scatterlist for the upper layers.
The new quirk SDHCI_QUIRK2_SG_LIST_COMBINED_DMA_BUFFER distinguishes this
path from normal DMA mapping.

This adds one extra memory copy, but overall I/O performance improves:
on the CSR SiRFprimaII, reading 8192KB speeds up from 17444KB/s to
18687KB/s, a 7% gain.
Signed-off-by: Bin Shi
Signed-off-by: Barry Song
---
 drivers/mmc/host/sdhci.c  | 98 ++++++++++++++++++++++++++++++++++++---------
 include/linux/mmc/sdhci.h |  5 ++
 2 files changed, 84 insertions(+), 19 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index bd8a098..aac92bd 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -726,6 +726,40 @@ static void sdhci_set_transfer_irqs(struct sdhci_host *host)
 	sdhci_clear_set_irqs(host, dma_irqs, pio_irqs);
 }
 
+static inline void sdhci_sg_to_dma(struct sdhci_host *host, struct mmc_data *data)
+{
+	unsigned int len, i;
+	struct scatterlist *sg;
+	void *dmabuf = host->combined_dma_buffer;
+	void *sgbuf;
+
+	sg = data->sg;
+	len = data->sg_len;
+
+	for (i = 0; i < len; i++) {
+		sgbuf = sg_virt(&sg[i]);
+		memcpy(dmabuf, sgbuf, sg[i].length);
+		dmabuf += sg[i].length;
+	}
+}
+
+static inline void sdhci_dma_to_sg(struct sdhci_host *host, struct mmc_data *data)
+{
+	unsigned int len, i;
+	struct scatterlist *sg;
+	void *dmabuf = host->combined_dma_buffer;
+	void *sgbuf;
+
+	sg = data->sg;
+	len = data->sg_len;
+
+	for (i = 0; i < len; i++) {
+		sgbuf = sg_virt(&sg[i]);
+		memcpy(sgbuf, dmabuf, sg[i].length);
+		dmabuf += sg[i].length;
+	}
+}
+
 static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
 {
 	u8 count;
@@ -836,22 +870,34 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
 	} else {
 		int sg_cnt;
 
-		sg_cnt = dma_map_sg(mmc_dev(host->mmc),
-				data->sg, data->sg_len,
-				(data->flags & MMC_DATA_READ) ?
-					DMA_FROM_DEVICE :
-					DMA_TO_DEVICE);
-		if (sg_cnt == 0) {
-			/*
-			 * This only happens when someone fed
-			 * us an invalid request.
-			 */
-			WARN_ON(1);
-			host->flags &= ~SDHCI_REQ_USE_DMA;
-		} else {
-			WARN_ON(sg_cnt != 1);
-			sdhci_writel(host, sg_dma_address(data->sg),
+		/*
+		 * Transfer data from the SG list to
+		 * the DMA buffer.
+		 */
+		if (host->quirks2 & SDHCI_QUIRK2_SG_LIST_COMBINED_DMA_BUFFER) {
+			if (data->flags & MMC_DATA_WRITE)
+				sdhci_sg_to_dma(host, data);
+			sdhci_writel(host, host->combined_dma_addr,
 				SDHCI_DMA_ADDRESS);
+		} else {
+
+			sg_cnt = dma_map_sg(mmc_dev(host->mmc),
+					data->sg, data->sg_len,
+					(data->flags & MMC_DATA_READ) ?
+						DMA_FROM_DEVICE :
+						DMA_TO_DEVICE);
+			if (sg_cnt == 0) {
+				/*
+				 * This only happens when someone fed
+				 * us an invalid request.
+				 */
+				WARN_ON(1);
+				host->flags &= ~SDHCI_REQ_USE_DMA;
+			} else {
+				WARN_ON(sg_cnt != 1);
+				sdhci_writel(host, sg_dma_address(data->sg),
+						SDHCI_DMA_ADDRESS);
+			}
 		}
 	}
 }
@@ -939,9 +985,11 @@ static void sdhci_finish_data(struct sdhci_host *host)
 		if (host->flags & SDHCI_USE_ADMA)
 			sdhci_adma_table_post(host, data);
 		else {
-			dma_unmap_sg(mmc_dev(host->mmc), data->sg,
-				data->sg_len, (data->flags & MMC_DATA_READ) ?
-				DMA_FROM_DEVICE : DMA_TO_DEVICE);
+			if (!(host->quirks2 & SDHCI_QUIRK2_SG_LIST_COMBINED_DMA_BUFFER)) {
+				dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+					data->sg_len, (data->flags & MMC_DATA_READ) ?
+					DMA_FROM_DEVICE : DMA_TO_DEVICE);
+			}
 		}
 	}
 
@@ -2147,6 +2195,15 @@ static void sdhci_tasklet_finish(unsigned long param)
 	mrq = host->mrq;
 
 	/*
+	 * Transfer data from DMA buffer to
+	 * SG list.
+	 */
+	if ((host->quirks2 & SDHCI_QUIRK2_SG_LIST_COMBINED_DMA_BUFFER) &&
+		mrq->data && (mrq->data->flags & MMC_DATA_READ))
+		if (host->flags & SDHCI_REQ_USE_DMA)
+			sdhci_dma_to_sg(host, mrq->data);
+
+	/*
 	 * The controller needs a reset of internal state machines
 	 * upon error conditions.
 	 */
@@ -3152,7 +3209,10 @@ int sdhci_add_host(struct sdhci_host *host)
 	if (host->flags & SDHCI_USE_ADMA)
 		mmc->max_segs = 128;
 	else if (host->flags & SDHCI_USE_SDMA)
-		mmc->max_segs = 1;
+		if (host->quirks2 & SDHCI_QUIRK2_SG_LIST_COMBINED_DMA_BUFFER)
+			mmc->max_segs = 128;
+		else
+			mmc->max_segs = 1;
 	else /* PIO */
 		mmc->max_segs = 128;
 
diff --git a/include/linux/mmc/sdhci.h b/include/linux/mmc/sdhci.h
index 3e781b8..c2fc13f 100644
--- a/include/linux/mmc/sdhci.h
+++ b/include/linux/mmc/sdhci.h
@@ -98,6 +98,8 @@ struct sdhci_host {
 #define SDHCI_QUIRK2_CARD_ON_NEEDS_BUS_ON		(1<<4)
 /* Controller has a non-standard host control register */
 #define SDHCI_QUIRK2_BROKEN_HOST_CONTROL		(1<<5)
+/* For better performance for SDMA controller, alloc a buffer to combine */
+#define SDHCI_QUIRK2_SG_LIST_COMBINED_DMA_BUFFER	(1<<6)
 
 	int irq;		/* Device IRQ */
 	void __iomem *ioaddr;	/* Mapped address */
@@ -160,6 +162,9 @@ struct sdhci_host {
 	dma_addr_t adma_addr;	/* Mapped ADMA descr. table */
 	dma_addr_t align_addr;	/* Mapped bounce buffer */
 
+	dma_addr_t combined_dma_addr;	/* combined dma buffer */
+	void *combined_dma_buffer;	/* Mapped combined dma buffer */
+
 	struct tasklet_struct card_tasklet;	/* Tasklet structures */
 	struct tasklet_struct finish_tasklet;