Message ID | 15ff502d-d840-1003-6c45-bc17f0d81262@cybernetics.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | mpt3sas and dmapool scalability |
On Thu, Jul 26, 2018 at 9:54 PM, Tony Battersby <tonyb@cybernetics.com> wrote:
> dma_pool_alloc() scales poorly when allocating a large number of pages
> because it does a linear scan of all previously-allocated pages before
> allocating a new one.  Improve its scalability by maintaining a separate
> list of pages that have free blocks ready to (re)allocate.  In big O
> notation, this improves the algorithm from O(n^2) to O(n).

>  	spin_lock_irqsave(&pool->lock, flags);
> -	list_for_each_entry(page, &pool->page_list, page_list) {
> -		if (page->offset < pool->allocation)
> -			goto ready;
> +	if (!list_empty(&pool->avail_page_list)) {
> +		page = list_first_entry(&pool->avail_page_list,
> +					struct dma_page,
> +					avail_page_link);
> +		goto ready;
>  	}

It looks like

	page = list_first_entry_or_null();
	if (page)
		goto ready;

Though I don't know which one produces better code in the result.

From a reader's perspective I would go with my variant.

> +	/* This test checks if the page is already in avail_page_list. */
> +	if (list_empty(&page->avail_page_link))
> +		list_add(&page->avail_page_link, &pool->avail_page_list);

How can you be sure that the page you are testing for is the first one?

It seems you are relying on the list containing either 0 or 1 pages.
In that case, what is the point of having a list?
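For reference, the shape of the list_first_entry_or_null() variant suggested above can be sketched in userspace. The list primitives and the `struct dma_page` below are simplified stand-ins written for this illustration, not the kernel's list.h or mm/dmapool.c:

```c
/* Userspace sketch of the list_first_entry_or_null() pattern.  The list
 * primitives and the toy dma_page struct are minimal re-implementations
 * for illustration only. */
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Recover the containing struct from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

/* Returns the first entry, or NULL if the list is empty -- letting the
 * caller collapse the list_empty() check and the entry lookup into one
 * step, as suggested in the review above. */
#define list_first_entry_or_null(head, type, member) \
	(list_empty(head) ? NULL : list_first_entry(head, type, member))

struct dma_page {		/* simplified stand-in for illustration */
	struct list_head avail_page_link;
	int offset;
};
```

With this helper the allocation path reads as `page = list_first_entry_or_null(...); if (page) goto ready;` instead of an explicit `list_empty()` check followed by `list_first_entry()`.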
On 07/26/2018 03:37 PM, Andy Shevchenko wrote:
> On Thu, Jul 26, 2018 at 9:54 PM, Tony Battersby <tonyb@cybernetics.com> wrote:
>> dma_pool_alloc() scales poorly when allocating a large number of pages
>> because it does a linear scan of all previously-allocated pages before
>> allocating a new one.  Improve its scalability by maintaining a separate
>> list of pages that have free blocks ready to (re)allocate.  In big O
>> notation, this improves the algorithm from O(n^2) to O(n).
>
>>  	spin_lock_irqsave(&pool->lock, flags);
>> -	list_for_each_entry(page, &pool->page_list, page_list) {
>> -		if (page->offset < pool->allocation)
>> -			goto ready;
>> +	if (!list_empty(&pool->avail_page_list)) {
>> +		page = list_first_entry(&pool->avail_page_list,
>> +					struct dma_page,
>> +					avail_page_link);
>> +		goto ready;
>>  	}
>
> It looks like
>
> 	page = list_first_entry_or_null();
> 	if (page)
> 		goto ready;
>
> Though I don't know which one produces better code in the result.
>
> From a reader's perspective I would go with my variant.

Thanks, I didn't know about list_first_entry_or_null().

>> +	/* This test checks if the page is already in avail_page_list. */
>> +	if (list_empty(&page->avail_page_link))
>> +		list_add(&page->avail_page_link, &pool->avail_page_list);
>
> How can you be sure that the page you are testing for is the first one?
>
> It seems you are relying on the list containing either 0 or 1 pages.
> In that case, what is the point of having a list?

That would be true if the test were "if (list_empty(&pool->avail_page_list))".
But it is testing the list pointers in the item rather than the list pointers
in the pool.  It may be a bit confusing if you have never seen that usage
before, which is why I added a comment.  Basically, if you use
list_del_init() instead of list_del(), then you can use list_empty() on the
item itself to test whether the item is present in a list.
For example, the comments in list.h warn not to use list_empty() on the
entry after just list_del():

/**
 * list_del - deletes entry from list.
 * @entry: the element to delete from the list.
 * Note: list_empty() on entry does not return true after this, the entry is
 * in an undefined state.
 */
> That would be true if the test were "if
> (list_empty(&pool->avail_page_list))".  But it is testing the list
> pointers in the item rather than the list pointers in the pool.  It may
> be a bit confusing if you have never seen that usage before, which is
> why I added a comment.  Basically, if you use list_del_init() instead of
> list_del(), then you can use list_empty() on the item itself to test
> whether the item is present in a list.  For example, the comments in
> list.h warn not to use list_empty() on the entry after just list_del():
>
> /**
>  * list_del - deletes entry from list.
>  * @entry: the element to delete from the list.
>  * Note: list_empty() on entry does not return true after this, the entry is
>  * in an undefined state.
>  */

Sorry for the crappy line length (fixed above).  Should have just used
Preformat in Thunderbird like always rather than following
Documentation/process/email-clients.rst and changing mailnews.wraplength
from 72 to 0.

Anyway, I have been using list_del_init() for a long time in various
places, but now I can't find where any of its useful features are
documented.  I will have to submit a separate patch to add more
documentation for it.  I find it useful for two things:

1) If you use list_del_init(), you can delete an item from a list
multiple times without checking whether it has already been deleted.  So
the following is safe:

	list_add(entry, list);
	list_del_init(entry);
	list_del_init(entry);

That would not be safe if just using list_del().

2) If you use list_del_init(), you can test whether an item is present
in a list using list_empty() on the entry.  So:

	list_add(entry, list);
	/* list_empty(entry) is false */
	list_del_init(entry);
	/* list_empty(entry) is true */

That would not work if using just list_del().  Since the list_empty()
name is unintuitive for this purpose, it might be useful to add a new
macro for this use case.
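The two list_del_init() properties described above can be demonstrated with a small userspace sketch. The primitives below are a minimal re-implementation of the kernel's circular doubly-linked list, written for this illustration only; names mirror include/linux/list.h but this is not the kernel code:

```c
/* Minimal userspace re-implementation of the kernel list primitives,
 * to illustrate the two list_del_init() properties discussed above. */

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* On a list head this means "list has no members"; on an entry that is
 * maintained with list_del_init(), it means "entry is not on any list". */
static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Unlink the entry, then re-initialize it to point at itself.  The
 * re-init is exactly what plain list_del() omits: it makes a second
 * delete harmless (property 1) and makes list_empty() on the entry a
 * valid membership test (property 2). */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}
```

After `list_add(&item, &pool)`, `list_empty(&item)` is false; after one (or several) `list_del_init(&item)` calls it is true again, and the pool list is left intact.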
--- a/mm/dmapool.c
+++ b/mm/dmapool.c
@@ -20,6 +20,10 @@
  * least 'size' bytes.  Free blocks are tracked in an unsorted singly-linked
  * list of free blocks within the page.  Used blocks aren't tracked, but we
  * keep a count of how many are currently allocated from each page.
+ *
+ * The avail_page_list keeps track of pages that have one or more free blocks
+ * available to (re)allocate.  Pages are moved in and out of avail_page_list
+ * as their blocks are allocated and freed.
  */
 
 #include <linux/device.h>
@@ -44,6 +48,7 @@
 struct dma_pool {		/* the pool */
 	struct list_head page_list;
+	struct list_head avail_page_list;
 	spinlock_t lock;
 	size_t size;
 	struct device *dev;
@@ -55,6 +60,7 @@ struct dma_pool {		/* the pool */
 
 struct dma_page {		/* cacheable header for 'allocation' bytes */
 	struct list_head page_list;
+	struct list_head avail_page_link;
 	void *vaddr;
 	dma_addr_t dma;
 	unsigned int in_use;
@@ -164,6 +170,7 @@ struct dma_pool *dma_pool_create(const c
 	retval->dev = dev;
 
 	INIT_LIST_HEAD(&retval->page_list);
+	INIT_LIST_HEAD(&retval->avail_page_list);
 	spin_lock_init(&retval->lock);
 	retval->size = size;
 	retval->boundary = boundary;
@@ -256,6 +263,7 @@ static void pool_free_page(struct dma_po
 #endif
 	dma_free_coherent(pool->dev, pool->allocation, page->vaddr, dma);
 	list_del(&page->page_list);
+	list_del(&page->avail_page_link);
 	kfree(page);
 }
 
@@ -298,6 +306,7 @@ void dma_pool_destroy(struct dma_pool *p
 				   pool->name, page->vaddr);
 			/* leak the still-in-use consistent memory */
 			list_del(&page->page_list);
+			list_del(&page->avail_page_link);
 			kfree(page);
 		} else
 			pool_free_page(pool, page);
@@ -328,9 +337,11 @@ void *dma_pool_alloc(struct dma_pool *po
 	might_sleep_if(gfpflags_allow_blocking(mem_flags));
 
 	spin_lock_irqsave(&pool->lock, flags);
-	list_for_each_entry(page, &pool->page_list, page_list) {
-		if (page->offset < pool->allocation)
-			goto ready;
+	if (!list_empty(&pool->avail_page_list)) {
+		page = list_first_entry(&pool->avail_page_list,
+					struct dma_page,
+					avail_page_link);
+		goto ready;
 	}
 
 	/* pool_alloc_page() might sleep, so temporarily drop &pool->lock */
@@ -343,10 +354,13 @@ void *dma_pool_alloc(struct dma_pool *po
 	spin_lock_irqsave(&pool->lock, flags);
 
 	list_add(&page->page_list, &pool->page_list);
+	list_add(&page->avail_page_link, &pool->avail_page_list);
  ready:
 	page->in_use++;
 	offset = page->offset;
 	page->offset = *(int *)(page->vaddr + offset);
+	if (page->offset >= pool->allocation)
+		list_del_init(&page->avail_page_link);
 	retval = offset + page->vaddr;
 	*handle = offset + page->dma;
 #ifdef	DMAPOOL_DEBUG
@@ -461,6 +475,10 @@ void dma_pool_free(struct dma_pool *pool
 	memset(vaddr, POOL_POISON_FREED, pool->size);
 #endif
 
+	/* This test checks if the page is already in avail_page_list. */
+	if (list_empty(&page->avail_page_link))
+		list_add(&page->avail_page_link, &pool->avail_page_list);
+
 	page->in_use--;
 	*(int *)vaddr = page->offset;
 	page->offset = offset;
dma_pool_alloc() scales poorly when allocating a large number of pages
because it does a linear scan of all previously-allocated pages before
allocating a new one.  Improve its scalability by maintaining a separate
list of pages that have free blocks ready to (re)allocate.  In big O
notation, this improves the algorithm from O(n^2) to O(n).

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---

Using list_del_init() in dma_pool_alloc() makes it safe to call
list_del() unconditionally when freeing the page.

In dma_pool_free(), the check for being already in avail_page_list could
be written several different ways.  The most obvious way is:

	if (page->offset >= pool->allocation)
		list_add(&page->avail_page_link, &pool->avail_page_list);

Another way would be to check page->in_use.  But since it is already
using list_del_init(), checking the list pointers directly is safest to
prevent any possible list corruption in case the caller misuses the API
(e.g. double-dma_pool_free()) with DMAPOOL_DEBUG disabled.
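The robustness argument above can be sketched in userspace: because the entry is maintained with list_del_init(), checking the entry's own pointers makes the re-add idempotent, so even a double free cannot link the same page into avail_page_list twice. The helper name `mark_page_available` and the toy structs below are inventions for this sketch, not functions in mm/dmapool.c:

```c
/* Userspace sketch of the dma_pool_free() guard discussed above.  Only
 * the avail_page_list bookkeeping is modeled; the list primitives and
 * dma_page struct are minimal stand-ins, not kernel code. */

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

struct dma_page {		/* simplified stand-in for illustration */
	struct list_head avail_page_link;
};

/* Hypothetical helper modeling the guard in dma_pool_free():
 * link the page into avail_page_list only if its own entry pointers
 * show it is not already there.  Calling this twice in a row (a
 * double-free by a buggy caller) is a harmless no-op the second time,
 * instead of corrupting the list. */
static void mark_page_available(struct dma_page *page,
				struct list_head *avail_page_list)
{
	/* This test checks if the page is already in avail_page_list. */
	if (list_empty(&page->avail_page_link))
		list_add(&page->avail_page_link, avail_page_list);
}
```

Calling mark_page_available() twice on the same page leaves exactly one link in the list, which is the misuse-tolerance the commit notes describe for checking the list pointers directly rather than page->offset or page->in_use.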