From patchwork Thu Jul 26 18:54:21 2018
From: Tony Battersby <tonyb@cybernetics.com>
Subject: [PATCH 1/3] dmapool: improve scalability of dma_pool_alloc
To: Christoph Hellwig, Marek Szyprowski, Matthew Wilcox, Sathya Prakash,
    Chaitra P B, Suganath Prabu Subramani, iommu@lists.linux-foundation.org,
    linux-mm@kvack.org, linux-scsi@vger.kernel.org,
    MPT-FusionLinux.pdl@broadcom.com
Message-ID: <15ff502d-d840-1003-6c45-bc17f0d81262@cybernetics.com>
Date: Thu, 26 Jul 2018 14:54:21 -0400

dma_pool_alloc() scales poorly when allocating a large number of pages
because it does a linear scan of all previously-allocated pages before
allocating a new one.  Improve its scalability by maintaining a separate
list of pages that have free blocks ready to (re)allocate.  In big O
notation, this improves the algorithm from O(n^2) to O(n).

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---

Using list_del_init() in dma_pool_alloc() makes it safe to call
list_del() unconditionally when freeing the page.
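For readers less familiar with the idiom, the minimal sketch below
(illustrative only; demo_page and the other names are invented and are not
part of the patch) shows why list_del_init() pairs safely with a later
list_empty() membership test:

/* Illustrative only -- invented names, not part of the patch. */
#include <linux/list.h>

struct demo_page {
	struct list_head avail_link;	/* links pages that have free blocks */
};

static LIST_HEAD(demo_avail_list);

static void demo_remove_from_avail(struct demo_page *p)
{
	/* list_del_init() leaves avail_link pointing at itself... */
	list_del_init(&p->avail_link);
}

static void demo_return_to_avail(struct demo_page *p)
{
	/* ...so list_empty() is a reliable "not currently on the list" test. */
	if (list_empty(&p->avail_link))
		list_add(&p->avail_link, &demo_avail_list);
}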
In dma_pool_free(), the check for being already in avail_page_list could be written several different ways. The most obvious way is: if (page->offset >= pool->allocation) list_add(&page->avail_page_link, &pool->avail_page_list); Another way would be to check page->in_use. But since it is already using list_del_init(), checking the list pointers directly is safest to prevent any possible list corruption in case the caller misuses the API (e.g. double-dma_pool_free()) with DMAPOOL_DEBUG disabled. --- a/mm/dmapool.c +++ b/mm/dmapool.c @@ -20,6 +20,10 @@ * least 'size' bytes. Free blocks are tracked in an unsorted singly-linked * list of free blocks within the page. Used blocks aren't tracked, but we * keep a count of how many are currently allocated from each page. + * + * The avail_page_list keeps track of pages that have one or more free blocks + * available to (re)allocate. Pages are moved in and out of avail_page_list + * as their blocks are allocated and freed. */ #include @@ -44,6 +48,7 @@ struct dma_pool { /* the pool */ struct list_head page_list; + struct list_head avail_page_list; spinlock_t lock; size_t size; struct device *dev; @@ -55,6 +60,7 @@ struct dma_pool { /* the pool */ struct dma_page { /* cacheable header for 'allocation' bytes */ struct list_head page_list; + struct list_head avail_page_link; void *vaddr; dma_addr_t dma; unsigned int in_use; @@ -164,6 +170,7 @@ struct dma_pool *dma_pool_create(const c retval->dev = dev; INIT_LIST_HEAD(&retval->page_list); + INIT_LIST_HEAD(&retval->avail_page_list); spin_lock_init(&retval->lock); retval->size = size; retval->boundary = boundary; @@ -256,6 +263,7 @@ static void pool_free_page(struct dma_po #endif dma_free_coherent(pool->dev, pool->allocation, page->vaddr, dma); list_del(&page->page_list); + list_del(&page->avail_page_link); kfree(page); } @@ -298,6 +306,7 @@ void dma_pool_destroy(struct dma_pool *p pool->name, page->vaddr); /* leak the still-in-use consistent memory */ list_del(&page->page_list); + list_del(&page->avail_page_link); kfree(page); } else pool_free_page(pool, page); @@ -328,9 +337,11 @@ void *dma_pool_alloc(struct dma_pool *po might_sleep_if(gfpflags_allow_blocking(mem_flags)); spin_lock_irqsave(&pool->lock, flags); - list_for_each_entry(page, &pool->page_list, page_list) { - if (page->offset < pool->allocation) - goto ready; + if (!list_empty(&pool->avail_page_list)) { + page = list_first_entry(&pool->avail_page_list, + struct dma_page, + avail_page_link); + goto ready; } /* pool_alloc_page() might sleep, so temporarily drop &pool->lock */ @@ -343,10 +354,13 @@ void *dma_pool_alloc(struct dma_pool *po spin_lock_irqsave(&pool->lock, flags); list_add(&page->page_list, &pool->page_list); + list_add(&page->avail_page_link, &pool->avail_page_list); ready: page->in_use++; offset = page->offset; page->offset = *(int *)(page->vaddr + offset); + if (page->offset >= pool->allocation) + list_del_init(&page->avail_page_link); retval = offset + page->vaddr; *handle = offset + page->dma; #ifdef DMAPOOL_DEBUG @@ -461,6 +475,10 @@ void dma_pool_free(struct dma_pool *pool memset(vaddr, POOL_POISON_FREED, pool->size); #endif + /* This test checks if the page is already in avail_page_list. 
*/ + if (list_empty(&page->avail_page_link)) + list_add(&page->avail_page_link, &pool->avail_page_list); + page->in_use--; *(int *)vaddr = page->offset; page->offset = offset;

From patchwork Thu Jul 26 18:54:56 2018
From: Tony Battersby <tonyb@cybernetics.com>
Subject: [PATCH 2/3] dmapool: improve scalability of dma_pool_free
To: Christoph Hellwig, Marek Szyprowski, Matthew Wilcox, Sathya Prakash,
    Chaitra P B, Suganath Prabu Subramani, iommu@lists.linux-foundation.org,
    linux-mm@kvack.org, linux-scsi@vger.kernel.org,
    MPT-FusionLinux.pdl@broadcom.com
Message-ID: <1288e597-a67a-25b3-b7c6-db883ca67a25@cybernetics.com>
Date: Thu, 26 Jul 2018 14:54:56 -0400

dma_pool_free() scales poorly when the pool contains many pages because
pool_find_page() does a linear scan of all allocated pages.  Improve its
scalability by replacing the linear scan with a red-black tree lookup.
In big O notation, this improves the algorithm from O(n^2) to
O(n * log n).
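Before diving into the diff, the standalone sketch below restates the lookup
idea (illustrative only; demo_range and demo_lookup are invented names, and
the real implementation is the pool_find_page()/pool_insert_page() pair added
below): pages are keyed by their base DMA address, every range is a fixed
pool->allocation bytes long, and ranges never overlap, so each step of the
descent can decide left, right, or found in O(log n) overall.

/* Illustrative only -- invented names, not part of the patch. */
#include <linux/rbtree.h>
#include <linux/types.h>

struct demo_range {
	struct rb_node node;
	dma_addr_t base;		/* key: start of the range */
};

static struct demo_range *demo_lookup(struct rb_root *root, dma_addr_t addr,
				      size_t range_size)
{
	struct rb_node *n = root->rb_node;

	while (n) {
		struct demo_range *r = rb_entry(n, struct demo_range, node);

		if (addr < r->base)
			n = n->rb_left;
		else if (addr - r->base >= range_size)
			n = n->rb_right;
		else
			return r;	/* r->base <= addr < r->base + range_size */
	}
	return NULL;
}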
Signed-off-by: Tony Battersby --- I moved some code from dma_pool_destroy() into pool_free_page() to avoid code repetition. --- a/mm/dmapool.c +++ b/mm/dmapool.c @@ -15,11 +15,12 @@ * Many older drivers still have their own code to do this. * * The current design of this allocator is fairly simple. The pool is - * represented by the 'struct dma_pool' which keeps a doubly-linked list of - * allocated pages. Each page in the page_list is split into blocks of at - * least 'size' bytes. Free blocks are tracked in an unsorted singly-linked - * list of free blocks within the page. Used blocks aren't tracked, but we - * keep a count of how many are currently allocated from each page. + * represented by the 'struct dma_pool' which keeps a red-black tree of all + * allocated pages, keyed by DMA address for fast lookup when freeing. + * Each page in the page_tree is split into blocks of at least 'size' bytes. + * Free blocks are tracked in an unsorted singly-linked list of free blocks + * within the page. Used blocks aren't tracked, but we keep a count of how + * many are currently allocated from each page. * * The avail_page_list keeps track of pages that have one or more free blocks * available to (re)allocate. Pages are moved in and out of avail_page_list @@ -41,13 +42,14 @@ #include #include #include +#include #if defined(CONFIG_DEBUG_SLAB) || defined(CONFIG_SLUB_DEBUG_ON) #define DMAPOOL_DEBUG 1 #endif struct dma_pool { /* the pool */ - struct list_head page_list; + struct rb_root page_tree; struct list_head avail_page_list; spinlock_t lock; size_t size; @@ -59,7 +61,7 @@ struct dma_pool { /* the pool */ }; struct dma_page { /* cacheable header for 'allocation' bytes */ - struct list_head page_list; + struct rb_node page_node; struct list_head avail_page_link; void *vaddr; dma_addr_t dma; @@ -78,6 +80,7 @@ show_pools(struct device *dev, struct de char *next; struct dma_page *page; struct dma_pool *pool; + struct rb_node *node; next = buf; size = PAGE_SIZE; @@ -92,7 +95,10 @@ show_pools(struct device *dev, struct de unsigned blocks = 0; spin_lock_irq(&pool->lock); - list_for_each_entry(page, &pool->page_list, page_list) { + for (node = rb_first(&pool->page_tree); + node; + node = rb_next(node)) { + page = rb_entry(node, struct dma_page, page_node); pages++; blocks += page->in_use; } @@ -169,7 +175,7 @@ struct dma_pool *dma_pool_create(const c retval->dev = dev; - INIT_LIST_HEAD(&retval->page_list); + retval->page_tree = RB_ROOT; INIT_LIST_HEAD(&retval->avail_page_list); spin_lock_init(&retval->lock); retval->size = size; @@ -210,6 +216,65 @@ struct dma_pool *dma_pool_create(const c } EXPORT_SYMBOL(dma_pool_create); +/* + * Find the dma_page that manages the given DMA address. + */ +static struct dma_page *pool_find_page(struct dma_pool *pool, dma_addr_t dma) +{ + struct rb_node *node = pool->page_tree.rb_node; + + while (node) { + struct dma_page *page = + container_of(node, struct dma_page, page_node); + + if (dma < page->dma) + node = node->rb_left; + else if ((dma - page->dma) >= pool->allocation) + node = node->rb_right; + else + return page; + } + return NULL; +} + +/* + * Insert a dma_page into the page_tree. 
+ */ +static int pool_insert_page(struct dma_pool *pool, struct dma_page *new_page) +{ + dma_addr_t dma = new_page->dma; + struct rb_node **node = &(pool->page_tree.rb_node), *parent = NULL; + + while (*node) { + struct dma_page *this_page = + container_of(*node, struct dma_page, page_node); + + parent = *node; + if (dma < this_page->dma) + node = &((*node)->rb_left); + else if (likely((dma - this_page->dma) >= pool->allocation)) + node = &((*node)->rb_right); + else { + /* + * A page that overlaps the new DMA range is already + * present in the tree. This should not happen. + */ + WARN(1, + "%s: %s: DMA address overlap: old 0x%llx new 0x%llx len %zu\n", + pool->dev ? dev_name(pool->dev) : "(nodev)", + pool->name, (u64) this_page->dma, (u64) dma, + pool->allocation); + return -1; + } + } + + /* Add new node and rebalance tree. */ + rb_link_node(&new_page->page_node, parent, node); + rb_insert_color(&new_page->page_node, &pool->page_tree); + + return 0; +} + static void pool_initialise_page(struct dma_pool *pool, struct dma_page *page) { unsigned int offset = 0; @@ -254,15 +319,36 @@ static inline bool is_page_busy(struct d return page->in_use != 0; } -static void pool_free_page(struct dma_pool *pool, struct dma_page *page) +static void pool_free_page(struct dma_pool *pool, + struct dma_page *page, + bool destroying_pool) { - dma_addr_t dma = page->dma; - + if (destroying_pool && is_page_busy(page)) { + if (pool->dev) + dev_err(pool->dev, + "dma_pool_destroy %s, %p busy\n", + pool->name, page->vaddr); + else + pr_err("dma_pool_destroy %s, %p busy\n", + pool->name, page->vaddr); + /* leak the still-in-use consistent memory */ + } else { #ifdef DMAPOOL_DEBUG - memset(page->vaddr, POOL_POISON_FREED, pool->allocation); + memset(page->vaddr, POOL_POISON_FREED, pool->allocation); #endif - dma_free_coherent(pool->dev, pool->allocation, page->vaddr, dma); - list_del(&page->page_list); + dma_free_coherent(pool->dev, + pool->allocation, + page->vaddr, + page->dma); + } + + /* + * If the pool is being destroyed, it is not safe to modify the + * page_tree while iterating over it, and it is also unnecessary since + * the whole tree will be discarded anyway. 
+ */ + if (!destroying_pool) + rb_erase(&page->page_node, &pool->page_tree); list_del(&page->avail_page_link); kfree(page); } @@ -277,6 +363,7 @@ static void pool_free_page(struct dma_po */ void dma_pool_destroy(struct dma_pool *pool) { + struct dma_page *page, *tmp_page; bool empty = false; if (unlikely(!pool)) @@ -292,24 +379,11 @@ void dma_pool_destroy(struct dma_pool *p device_remove_file(pool->dev, &dev_attr_pools); mutex_unlock(&pools_reg_lock); - while (!list_empty(&pool->page_list)) { - struct dma_page *page; - page = list_entry(pool->page_list.next, - struct dma_page, page_list); - if (is_page_busy(page)) { - if (pool->dev) - dev_err(pool->dev, - "dma_pool_destroy %s, %p busy\n", - pool->name, page->vaddr); - else - pr_err("dma_pool_destroy %s, %p busy\n", - pool->name, page->vaddr); - /* leak the still-in-use consistent memory */ - list_del(&page->page_list); - list_del(&page->avail_page_link); - kfree(page); - } else - pool_free_page(pool, page); + rbtree_postorder_for_each_entry_safe(page, + tmp_page, + &pool->page_tree, + page_node) { + pool_free_page(pool, page, true); } kfree(pool); @@ -353,7 +427,15 @@ void *dma_pool_alloc(struct dma_pool *po spin_lock_irqsave(&pool->lock, flags); - list_add(&page->page_list, &pool->page_list); + if (unlikely(pool_insert_page(pool, page))) { + /* + * This should not happen, so something must have gone horribly + * wrong. Instead of crashing, intentionally leak the memory + * and make for the exit. + */ + spin_unlock_irqrestore(&pool->lock, flags); + return NULL; + } list_add(&page->avail_page_link, &pool->avail_page_list); ready: page->in_use++; @@ -400,19 +482,6 @@ void *dma_pool_alloc(struct dma_pool *po } EXPORT_SYMBOL(dma_pool_alloc); -static struct dma_page *pool_find_page(struct dma_pool *pool, dma_addr_t dma) -{ - struct dma_page *page; - - list_for_each_entry(page, &pool->page_list, page_list) { - if (dma < page->dma) - continue; - if ((dma - page->dma) < pool->allocation) - return page; - } - return NULL; -} - /** * dma_pool_free - put block back into dma pool * @pool: the dma pool holding the block @@ -484,7 +553,7 @@ void dma_pool_free(struct dma_pool *pool page->offset = offset; /* * Resist a temptation to do - * if (!is_page_busy(page)) pool_free_page(pool, page); + * if (!is_page_busy(page)) pool_free_page(pool, page, false); * Better have a few empty pages hang around. 
*/ spin_unlock_irqrestore(&pool->lock, flags);

From patchwork Thu Jul 26 18:55:31 2018
From: Tony Battersby <tonyb@cybernetics.com>
Subject: [PATCH 3/3] [SCSI] mpt3sas: replace chain_dma_pool
To: Christoph Hellwig, Marek Szyprowski, Matthew Wilcox, Sathya Prakash,
    Chaitra P B, Suganath Prabu Subramani, iommu@lists.linux-foundation.org,
    linux-mm@kvack.org, linux-scsi@vger.kernel.org,
    MPT-FusionLinux.pdl@broadcom.com
Date: Thu, 26 Jul 2018 14:55:31 -0400

Replace chain_dma_pool with direct calls to dma_alloc_coherent() and
dma_free_coherent().  Since the chain lookup can involve hundreds of
thousands of allocations, it is worthwhile to avoid the overhead of the
dma_pool API.
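The replacement, shown in the diff below, hands out many aligned chain
buffers from each large dma_alloc_coherent() block instead of making one
dma_pool_alloc() call per buffer.  The following sketch shows that carving
pattern in isolation (illustrative only; demo_carve, demo_slice and the rest
are invented names, not the driver code):

/* Illustrative only -- invented names, not the driver code. */
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/kernel.h>

struct demo_slice {
	void *vaddr;
	dma_addr_t dma;
};

static int demo_carve(struct device *dev, size_t buf_sz, unsigned int align,
		      struct demo_slice *out, unsigned int count)
{
	size_t slice_sz = ALIGN(buf_sz, align);
	size_t alloc_sz = max_t(size_t, slice_sz, PAGE_SIZE);
	unsigned int per_alloc = alloc_sz / slice_sz;
	unsigned int avail = 0;
	dma_addr_t dma = 0;
	void *vaddr = NULL;
	unsigned int i;

	for (i = 0; i < count; i++) {
		if (avail == 0) {
			/* Start a new coherent block; earlier blocks stay live. */
			vaddr = dma_alloc_coherent(dev, alloc_sz, &dma,
						   GFP_KERNEL);
			if (!vaddr)
				return -ENOMEM;	/* caller frees what was set up */
			avail = per_alloc;
		}
		out[i].vaddr = vaddr;	/* hand out the next aligned slice */
		out[i].dma = dma;
		vaddr += slice_sz;
		dma += slice_sz;
		avail--;
	}
	return 0;
}

With, say, a 128-byte chain_segment_sz and 4 KiB pages, each coherent block
yields 32 chain buffers, so the number of dma_alloc_coherent() calls drops by
roughly that factor; a matching freeing sketch appears after the end of the
patch.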
Signed-off-by: Tony Battersby --- The original code called _base_release_memory_pools() before "goto out" if dma_pool_alloc() failed, but this was unnecessary because mpt3sas_base_attach() will call _base_release_memory_pools() after "goto out_free_resources". It may have been that way because the out-of-tree vendor driver (from https://www.broadcom.com/support/download-search) has a slightly-more-complicated error handler there that adjusts max_request_credit, calls _base_release_memory_pools() and then does "goto retry_allocation" under some circumstances, but that is missing from the in-tree driver. diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 569392d..2cb567a 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -4224,6 +4224,134 @@ void mpt3sas_base_clear_st(struct MPT3SAS_ADAPTER *ioc, } /** + * _base_release_chain_lookup - release chain_lookup memory pools + * @ioc: per adapter object + * + * Free memory allocated from _base_allocate_chain_lookup. + */ +static void +_base_release_chain_lookup(struct MPT3SAS_ADAPTER *ioc) +{ + unsigned int chains_avail = 0; + struct chain_tracker *ct; + int i, j; + + if (!ioc->chain_lookup) + return; + + /* + * NOTE + * + * To make this code easier to understand and maintain, the for loops + * and the management of the chains_avail value are designed to be + * similar to the _base_allocate_chain_lookup() function. That way, + * the code for freeing the memory is similar to the code for + * allocating the memory. + */ + for (i = 0; i < ioc->scsiio_depth; i++) { + if (!ioc->chain_lookup[i].chains_per_smid) + break; + + for (j = ioc->chains_per_prp_buffer; + j < ioc->chains_needed_per_io; j++) { + /* + * If chains_avail is 0, then the chain represents a + * real allocation, so free it. + * + * If chains_avail is nonzero, then the chain was + * initialized at an offset from a previous allocation, + * so don't free it. + */ + if (chains_avail == 0) { + ct = &ioc->chain_lookup[i].chains_per_smid[j]; + if (ct->chain_buffer) + dma_free_coherent( + &ioc->pdev->dev, + ioc->chain_allocation_sz, + ct->chain_buffer, + ct->chain_buffer_dma); + chains_avail = ioc->chains_per_allocation; + } + chains_avail--; + } + kfree(ioc->chain_lookup[i].chains_per_smid); + } + + kfree(ioc->chain_lookup); + ioc->chain_lookup = NULL; +} + +/** + * _base_allocate_chain_lookup - allocate chain_lookup memory pools + * @ioc: per adapter object + * @total_sz: external value that tracks total amount of memory allocated + * + * Return: 0 success, anything else error + */ +static int +_base_allocate_chain_lookup(struct MPT3SAS_ADAPTER *ioc, u32 *total_sz) +{ + unsigned int aligned_chain_segment_sz; + const unsigned int align = 16; + unsigned int chains_avail = 0; + struct chain_tracker *ct; + dma_addr_t dma_addr = 0; + void *vaddr = NULL; + int i, j; + + /* Round up the allocation size for alignment. */ + aligned_chain_segment_sz = ioc->chain_segment_sz; + if (aligned_chain_segment_sz % align != 0) + aligned_chain_segment_sz = + ALIGN(aligned_chain_segment_sz, align); + + /* Allocate a page of chain buffers at a time. */ + ioc->chain_allocation_sz = + max_t(unsigned int, aligned_chain_segment_sz, PAGE_SIZE); + + /* Calculate how many chain buffers we can get from one allocation. 
*/ + ioc->chains_per_allocation = + ioc->chain_allocation_sz / aligned_chain_segment_sz; + + for (i = 0; i < ioc->scsiio_depth; i++) { + for (j = ioc->chains_per_prp_buffer; + j < ioc->chains_needed_per_io; j++) { + /* + * Check if there are any chain buffers left in the + * previously-allocated block. + */ + if (chains_avail == 0) { + /* Allocate a new block of chain buffers. */ + vaddr = dma_alloc_coherent( + &ioc->pdev->dev, + ioc->chain_allocation_sz, + &dma_addr, + GFP_KERNEL); + if (!vaddr) { + pr_err(MPT3SAS_FMT + "chain_lookup: dma_alloc_coherent failed\n", + ioc->name); + return -1; + } + chains_avail = ioc->chains_per_allocation; + } + + ct = &ioc->chain_lookup[i].chains_per_smid[j]; + ct->chain_buffer = vaddr; + ct->chain_buffer_dma = dma_addr; + + /* Go to the next chain buffer in the block. */ + vaddr += aligned_chain_segment_sz; + dma_addr += aligned_chain_segment_sz; + *total_sz += ioc->chain_segment_sz; + chains_avail--; + } + } + + return 0; +} + +/** * _base_release_memory_pools - release memory * @ioc: per adapter object * @@ -4235,8 +4363,6 @@ void mpt3sas_base_clear_st(struct MPT3SAS_ADAPTER *ioc, _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc) { int i = 0; - int j = 0; - struct chain_tracker *ct; struct reply_post_struct *rps; dexitprintk(ioc, pr_info(MPT3SAS_FMT "%s\n", ioc->name, @@ -4326,22 +4452,7 @@ void mpt3sas_base_clear_st(struct MPT3SAS_ADAPTER *ioc, kfree(ioc->hpr_lookup); kfree(ioc->internal_lookup); - if (ioc->chain_lookup) { - for (i = 0; i < ioc->scsiio_depth; i++) { - for (j = ioc->chains_per_prp_buffer; - j < ioc->chains_needed_per_io; j++) { - ct = &ioc->chain_lookup[i].chains_per_smid[j]; - if (ct && ct->chain_buffer) - dma_pool_free(ioc->chain_dma_pool, - ct->chain_buffer, - ct->chain_buffer_dma); - } - kfree(ioc->chain_lookup[i].chains_per_smid); - } - dma_pool_destroy(ioc->chain_dma_pool); - kfree(ioc->chain_lookup); - ioc->chain_lookup = NULL; - } + _base_release_chain_lookup(ioc); } /** @@ -4784,29 +4895,8 @@ void mpt3sas_base_clear_st(struct MPT3SAS_ADAPTER *ioc, total_sz += sz * ioc->scsiio_depth; } - ioc->chain_dma_pool = dma_pool_create("chain pool", &ioc->pdev->dev, - ioc->chain_segment_sz, 16, 0); - if (!ioc->chain_dma_pool) { - pr_err(MPT3SAS_FMT "chain_dma_pool: dma_pool_create failed\n", - ioc->name); + if (_base_allocate_chain_lookup(ioc, &total_sz)) goto out; - } - for (i = 0; i < ioc->scsiio_depth; i++) { - for (j = ioc->chains_per_prp_buffer; - j < ioc->chains_needed_per_io; j++) { - ct = &ioc->chain_lookup[i].chains_per_smid[j]; - ct->chain_buffer = dma_pool_alloc( - ioc->chain_dma_pool, GFP_KERNEL, - &ct->chain_buffer_dma); - if (!ct->chain_buffer) { - pr_err(MPT3SAS_FMT "chain_lookup: " - " pci_pool_alloc failed\n", ioc->name); - _base_release_memory_pools(ioc); - goto out; - } - } - total_sz += ioc->chain_segment_sz; - } dinitprintk(ioc, pr_info(MPT3SAS_FMT "chain pool depth(%d), frame_size(%d), pool_size(%d kB)\n", diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index f02974c..7ee81d5 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -1298,7 +1298,6 @@ struct MPT3SAS_ADAPTER { /* chain */ struct chain_lookup *chain_lookup; struct list_head free_chain_list; - struct dma_pool *chain_dma_pool; ulong chain_pages; u16 max_sges_in_main_message; u16 max_sges_in_chain_message; @@ -1306,6 +1305,8 @@ struct MPT3SAS_ADAPTER { u32 chain_depth; u16 chain_segment_sz; u16 chains_per_prp_buffer; + u32 chain_allocation_sz; + u32 chains_per_allocation; /* 
hi-priority queue */ u16 hi_priority_smid;
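To round out the demo_carve() sketch shown with the patch description above
(again illustrative only, with invented names; the driver's real teardown is
_base_release_chain_lookup() in the diff), the freeing side releases only the
slice that begins each coherent block, assuming the slice array was
zero-initialized:

/* Illustrative only -- counterpart to demo_carve(), invented names. */
static void demo_release(struct device *dev, size_t buf_sz, unsigned int align,
			 struct demo_slice *slices, unsigned int count)
{
	size_t slice_sz = ALIGN(buf_sz, align);
	size_t alloc_sz = max_t(size_t, slice_sz, PAGE_SIZE);
	unsigned int per_alloc = alloc_sz / slice_sz;
	unsigned int i;

	for (i = 0; i < count; i++) {
		/* Only the first slice of each block owns the allocation. */
		if ((i % per_alloc) == 0 && slices[i].vaddr)
			dma_free_coherent(dev, alloc_sz, slices[i].vaddr,
					  slices[i].dma);
	}
}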