From patchwork Fri Mar 30 09:37:14 2018
X-Patchwork-Submitter: Chaitra P B
X-Patchwork-Id: 10317475
From: Chaitra P B
To: linux-scsi@vger.kernel.org, stable@vger.kernel.org
Cc: suganath-prabu.subramani@broadcom.com, Sathya.Prakash@broadcom.com,
	sreekanth.reddy@broadcom.com, Chaitra P B
Subject: [PATCH 05/15] mpt3sas: Optimize I/O memory consumption in driver.
Date: Fri, 30 Mar 2018 15:07:14 +0530
Message-Id: <1522402644-3016-6-git-send-email-chaitra.basappa@broadcom.com>
In-Reply-To: <1522402644-3016-1-git-send-email-chaitra.basappa@broadcom.com>
References: <1522402644-3016-1-git-send-email-chaitra.basappa@broadcom.com>

For each I/O, the driver allocates a PAGE-sized buffer for handling native
NVMe PRPs, and in addition allocates memory for chain buffers (chains needed
per I/O * chain buffer size, e.g. 38 * 128 bytes). However, at any point in
time a given I/O request targets either an NVMe device (where the PRP page is
used for framing PRPs) or a SCSI device (where chain buffers are used for
framing chain SGEs), never both.

This patch modifies the driver to reuse the same pre-allocated PRP page
buffers as chain buffers for I/Os targeted at SCSI devices, so there is no
need to allocate separate buffers for chain SGEs. If the chain buffers needed
for an I/O do not all fit in the PRP page, the driver maintains separate
buffers only for the chain buffers that exceed the PRP page size. For
example, with a PRP page size of 4K and a chain buffer size of 128 bytes, the
number of chain buffers that fit in a PRP page is 4096/128 = 32. If the
number of chain buffers needed per I/O exceeds 32, say 36, the driver
allocates the remaining 4 chain buffers individually from the chain DMA pool
(a standalone sketch of this arithmetic follows the diff).

Signed-off-by: Chaitra P B
Signed-off-by: Suganath Prabu S
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 80 +++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 1e8e399..701e1e7 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -4188,7 +4188,8 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 	kfree(ioc->internal_lookup);
 	if (ioc->chain_lookup) {
 		for (i = 0; i < ioc->scsiio_depth; i++) {
-			for (j = 0; j < ioc->chains_needed_per_io; j++) {
+			for (j = ioc->chains_per_prp_buffer;
+			    j < ioc->chains_needed_per_io; j++) {
 				ct = &ioc->chain_lookup[i].chains_per_smid[j];
 				if (ct && ct->chain_buffer)
 					dma_pool_free(ioc->chain_dma_pool,
@@ -4506,7 +4507,7 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 	ioc->chain_lookup = kzalloc(sz, GFP_KERNEL);
 	if (!ioc->chain_lookup) {
 		pr_err(MPT3SAS_FMT "chain_lookup: __get_free_pages "
-			"failed\n", ioc->name);
+		    "failed\n", ioc->name);
 		goto out;
 	}
 
@@ -4520,33 +4521,6 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 		}
 	}
 
-	ioc->chain_dma_pool = dma_pool_create("chain pool", &ioc->pdev->dev,
-	    ioc->chain_segment_sz, 16, 0);
-	if (!ioc->chain_dma_pool) {
-		pr_err(MPT3SAS_FMT "chain_dma_pool: dma_pool_create failed\n",
-			ioc->name);
-		goto out;
-	}
-	for (i = 0; i < ioc->scsiio_depth; i++) {
-		for (j = 0; j < ioc->chains_needed_per_io; j++) {
-			ct = &ioc->chain_lookup[i].chains_per_smid[j];
-			ct->chain_buffer = dma_pool_alloc(
-			    ioc->chain_dma_pool , GFP_KERNEL,
-			    &ct->chain_buffer_dma);
-			if (!ct->chain_buffer) {
-				pr_err(MPT3SAS_FMT "chain_lookup: "
-				    " pci_pool_alloc failed\n", ioc->name);
-				goto out;
-			}
-		}
-		total_sz += ioc->chain_segment_sz;
-	}
-
-	dinitprintk(ioc, pr_info(MPT3SAS_FMT
-		"chain pool depth(%d), frame_size(%d), pool_size(%d kB)\n",
-		ioc->name, ioc->chain_depth, ioc->chain_segment_sz,
-		((ioc->chain_depth * ioc->chain_segment_sz))/1024));
-
 	/* initialize hi-priority queue smid's */
 	ioc->hpr_lookup = kcalloc(ioc->hi_priority_depth,
 	    sizeof(struct request_tracker), GFP_KERNEL);
@@ -4587,6 +4561,7 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 	 * be required for NVMe PRP's, only each set of NVMe blocks will be
 	 * contiguous, so a new set is allocated for each possible I/O.
 	 */
+	ioc->chains_per_prp_buffer = 0;
 	if (ioc->facts.ProtocolFlags & MPI2_IOCFACTS_PROTOCOL_NVME_DEVICES) {
 		nvme_blocks_needed =
 			(ioc->shost->sg_tablesize * NVME_PRP_SIZE) - 1;
@@ -4609,6 +4584,11 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 			ioc->name);
 			goto out;
 		}
+
+		ioc->chains_per_prp_buffer = sz/ioc->chain_segment_sz;
+		ioc->chains_per_prp_buffer = min(ioc->chains_per_prp_buffer,
+		    ioc->chains_needed_per_io);
+
 		for (i = 0; i < ioc->scsiio_depth; i++) {
 			ioc->pcie_sg_lookup[i].pcie_sgl = dma_pool_alloc(
 			    ioc->pcie_sgl_dma_pool, GFP_KERNEL,
@@ -4619,13 +4599,55 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 			ioc->name);
 			goto out;
 			}
+			for (j = 0; j < ioc->chains_per_prp_buffer; j++) {
+				ct = &ioc->chain_lookup[i].chains_per_smid[j];
+				ct->chain_buffer =
+				    ioc->pcie_sg_lookup[i].pcie_sgl +
+				    (j * ioc->chain_segment_sz);
+				ct->chain_buffer_dma =
+				    ioc->pcie_sg_lookup[i].pcie_sgl_dma +
+				    (j * ioc->chain_segment_sz);
+			}
 		}
 
 		dinitprintk(ioc, pr_info(MPT3SAS_FMT "PCIe sgl pool depth(%d), "
 			"element_size(%d), pool_size(%d kB)\n", ioc->name,
 			ioc->scsiio_depth, sz, (sz * ioc->scsiio_depth)/1024));
+		dinitprintk(ioc, pr_info(MPT3SAS_FMT "Number of chains can "
+		    "fit in a PRP page(%d)\n", ioc->name,
+		    ioc->chains_per_prp_buffer));
 		total_sz += sz * ioc->scsiio_depth;
 	}
+
+	ioc->chain_dma_pool = dma_pool_create("chain pool", &ioc->pdev->dev,
+	    ioc->chain_segment_sz, 16, 0);
+	if (!ioc->chain_dma_pool) {
+		pr_err(MPT3SAS_FMT "chain_dma_pool: dma_pool_create failed\n",
+			ioc->name);
+		goto out;
+	}
+	for (i = 0; i < ioc->scsiio_depth; i++) {
+		for (j = ioc->chains_per_prp_buffer;
+		    j < ioc->chains_needed_per_io; j++) {
+			ct = &ioc->chain_lookup[i].chains_per_smid[j];
+			ct->chain_buffer = dma_pool_alloc(
+			    ioc->chain_dma_pool, GFP_KERNEL,
+			    &ct->chain_buffer_dma);
+			if (!ct->chain_buffer) {
+				pr_err(MPT3SAS_FMT "chain_lookup: "
+				    " pci_pool_alloc failed\n", ioc->name);
+				_base_release_memory_pools(ioc);
+				goto out;
+			}
+		}
+		total_sz += ioc->chain_segment_sz;
+	}
+
+	dinitprintk(ioc, pr_info(MPT3SAS_FMT
+		"chain pool depth(%d), frame_size(%d), pool_size(%d kB)\n",
+		ioc->name, ioc->chain_depth, ioc->chain_segment_sz,
+		((ioc->chain_depth * ioc->chain_segment_sz))/1024));
+
 	/* sense buffers, 4 byte align */
 	sz = ioc->scsiio_depth * SCSI_SENSE_BUFFERSIZE;
 	ioc->sense_dma_pool = dma_pool_create("sense pool", &ioc->pdev->dev, sz,
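
A minimal standalone sketch of the sizing arithmetic described in the commit
message above (not driver code; the constants and variable names merely
mirror the example of a 4K PRP page, 128-byte chain segments, and 36 chains
needed per I/O):

#include <stdio.h>

int main(void)
{
	unsigned int prp_page_sz = 4096;        /* PRP page allocated per I/O */
	unsigned int chain_segment_sz = 128;    /* size of one chain buffer */
	unsigned int chains_needed_per_io = 36; /* example from the message */
	unsigned int chains_per_prp_buffer, chains_from_pool;

	/* Chain buffers that fit inside one PRP page: 4096/128 = 32, capped
	 * at the number actually needed per I/O. */
	chains_per_prp_buffer = prp_page_sz / chain_segment_sz;
	if (chains_per_prp_buffer > chains_needed_per_io)
		chains_per_prp_buffer = chains_needed_per_io;

	/* The remainder must still come from the separate chain DMA pool. */
	chains_from_pool = chains_needed_per_io - chains_per_prp_buffer;

	printf("chains carved from PRP page: %u\n", chains_per_prp_buffer);
	printf("chains allocated individually: %u\n", chains_from_pool);
	return 0;
}

With the example values this prints 32 and 4, matching the commit message.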
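The pointer math the patch uses to carve chain buffers out of the PRP page
can also be sketched in isolation, as below. This is a simplified,
hypothetical mirror of the patched loop ('struct chain_tracker' and
'dma_addr_t' are stand-ins for the kernel types): the CPU virtual address
and the DMA address of segment j are derived from their respective bases by
the same byte offset, so the two views of the page stay in sync.

#include <stdint.h>
#include <stddef.h>

typedef uint64_t dma_addr_t;            /* stand-in for the kernel type */

struct chain_tracker {
	void *chain_buffer;             /* CPU virtual address */
	dma_addr_t chain_buffer_dma;    /* bus/DMA address */
};

static void carve_prp_page(void *pcie_sgl, dma_addr_t pcie_sgl_dma,
		size_t chain_segment_sz, struct chain_tracker *ct,
		unsigned int chains_per_prp_buffer)
{
	unsigned int j;

	for (j = 0; j < chains_per_prp_buffer; j++) {
		/* Segment j starts j * chain_segment_sz bytes into the page. */
		ct[j].chain_buffer = (char *)pcie_sgl +
		    (j * chain_segment_sz);
		ct[j].chain_buffer_dma = pcie_sgl_dma +
		    (j * chain_segment_sz);
	}
}

Segments with index >= chains_per_prp_buffer do not alias the PRP page and
are therefore still allocated from (and, in _base_release_memory_pools(),
freed back to) the chain DMA pool.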