Message ID | 20231013064827.61135-1-linyunsheng@huawei.com (mailing list archive) |
---|---|
Series | introduce page_pool_alloc() related API |
On Fri, 13 Oct 2023 14:48:20 +0800 Yunsheng Lin wrote:
> v5 RFC: Add a new page_pool_cache_alloc() API, and other minor
> change as discussed in v4. As there seems to be three
> consumers that might be made use of the new API, so
> repost it as RFC and CC the relevant authors to see
> if the new API fits their need.

I have looked thru the v4 discussion (admittedly it was pretty huge).
I can't find where the "cache" API was suggested.
And I can't figure out now what the "cache" in the name is referring to.
Looks like these are just convenience wrappers which return VA instead
of struct page..
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 13 Oct 2023 14:48:20 +0800 you wrote:
> In [1] & [2] & [3], there are usecases for veth and virtio_net
> to use frag support in page pool to reduce memory usage, and it
> may request different frag size depending on the head/tail
> room space for xdp_frame/shinfo and mtu/packet size. When the
> requested frag size is large enough that a single page can not
> be split into more than one frag, using frag support only have
> performance penalty because of the extra frag count handling
> for frag support.
>
> [...]

Here is the summary with links:
  - [net-next,v11,1/6] page_pool: fragment API support for 32-bit arch with 64-bit DMA
    https://git.kernel.org/netdev/net-next/c/90de47f020db
  - [net-next,v11,2/6] page_pool: unify frag_count handling in page_pool_is_last_frag()
    (no matching commit)
  - [net-next,v11,3/6] page_pool: remove PP_FLAG_PAGE_FRAG
    (no matching commit)
  - [net-next,v11,4/6] page_pool: introduce page_pool[_cache]_alloc() API
    (no matching commit)
  - [net-next,v11,5/6] page_pool: update document about fragment API
    (no matching commit)
  - [net-next,v11,6/6] net: veth: use newly added page pool API for veth with xdp
    (no matching commit)

You are awesome, thank you!
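The trade-off described in the quoted cover letter can be illustrated with a small user-space model. This is only a sketch: `MODEL_PAGE_SIZE`, `model_use_frag()` and `model_alloc_size()` are hypothetical names invented for illustration, not page_pool functions, and the half-page threshold is simply the point at which a page can no longer hold two fragments of the requested size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the largest size one allocation can serve. */
#define MODEL_PAGE_SIZE 4096u

/*
 * Decide whether a request is worth serving as a fragment of a shared page.
 * A request larger than half a page cannot share the page with a second
 * fragment of the same size, so fragment accounting would only add cost.
 */
static bool model_use_frag(unsigned int size)
{
	assert(size > 0 && size <= MODEL_PAGE_SIZE);
	return size <= MODEL_PAGE_SIZE / 2;
}

/* Memory consumed by the request: the whole page when it is not split. */
static unsigned int model_alloc_size(unsigned int size)
{
	return model_use_frag(size) ? size : MODEL_PAGE_SIZE;
}
```

In this model a 1024-byte request is served as a fragment, while a 3000-byte request falls back to a whole page and reports the larger size back to the caller, similar in spirit to the in/out `size` parameter discussed later in the thread.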
On 2023/10/17 9:27, Jakub Kicinski wrote:
> On Fri, 13 Oct 2023 14:48:20 +0800 Yunsheng Lin wrote:
>> v5 RFC: Add a new page_pool_cache_alloc() API, and other minor
>> change as discussed in v4. As there seems to be three
>> consumers that might be made use of the new API, so
>> repost it as RFC and CC the relevant authors to see
>> if the new API fits their need.
>
> I have looked thru the v4 discussion (admittedly it was pretty huge).
> I can't find where the "cache" API was suggested.

Actually, the discussion happened in V3, as the discussions in V3
and V4 seem to have been happening concurrently:
https://lore.kernel.org/all/f8ce176f-f975-af11-641c-b56c53a8066a@redhat.com/

> And I can't figure out now what the "cache" in the name is referring to.
> Looks like these are just convenience wrappers which return VA instead
> of struct page..

Yes, it is corresponding to some API like napi_alloc_frag() returning va
instead of 'struct page' mentioned in patch 5.

Anyway, naming is hard, any suggestion for a better naming is always
welcomed:)
On Tue, 17 Oct 2023 15:56:48 +0800 Yunsheng Lin wrote:
> > And I can't figure out now what the "cache" in the name is referring to.
> > Looks like these are just convenience wrappers which return VA instead
> > of struct page..
>
> Yes, it is corresponding to some API like napi_alloc_frag() returning va
> instead of 'struct page' mentioned in patch 5.
>
> Anyway, naming is hard, any suggestion for a better naming is always
> welcomed:)

I'd just throw a _va (for virtual address) at the end. And not really
mention it in the documentation. Plus the kdoc of the function should
say that this is just a thin wrapper around other page pool APIs, and
it's safe to mix it with other page pool APIs?
On Tue, 17 Oct 2023 08:13:03 -0700 Jakub Kicinski wrote:
> I'd just throw a _va (for virtual address) at the end
To be clear I mean:
page_pool_alloc_va()
page_pool_dev_alloc_va()
page_pool_free_va()
On 2023/10/17 23:13, Jakub Kicinski wrote:
> On Tue, 17 Oct 2023 15:56:48 +0800 Yunsheng Lin wrote:
>>> And I can't figure out now what the "cache" in the name is referring to.
>>> Looks like these are just convenience wrappers which return VA instead
>>> of struct page..
>>
>> Yes, it is corresponding to some API like napi_alloc_frag() returning va
>> instead of 'struct page' mentioned in patch 5.
>>
>> Anyway, naming is hard, any suggestion for a better naming is always
>> welcomed:)
>
> I'd just throw a _va (for virtual address) at the end. And not really

_va seems fine:)

> mention it in the documentation. Plus the kdoc of the function should
> say that this is just a thin wrapper around other page pool APIs, and
> it's safe to mix it with other page pool APIs?

I am not sure I understand what do 'safe' and 'mix' mean here.

For 'safe' part, I suppose you mean if there is a va associated with
a 'struct page' without calling some API like kmap()? For that, I suppose
it is safe when the driver is calling page_pool API without the
__GFP_HIGHMEM flag. Maybe we should mention that in the kdoc and give a
warning if page_pool_*alloc_va() is called with the __GFP_HIGHMEM flag?

For the 'mix', I suppose you mean the below:
1. Allocate a page with the page_pool_*alloc_va() API and free a page with
   page_pool_free() API.
2. Allocate a page with the page_pool_*alloc() API and free a page with
   page_pool_free_va() API.

For 1, it seems it is ok as some virt_to_head_page() and page_address() call
between va and 'struct page' does not seem to change anything if we have
enforce page_pool_*alloc_va() to be called without the __GFP_HIGHMEM flag.

For 2, if the va is returned from page_address() which the allocation API is
called without __GFP_HIGHMEM flag. If not, the va is from kmap*()? which means
we may be calling page_pool_free_va() before kunmap*()? Is that possible?
On Wed, 18 Oct 2023 19:47:16 +0800 Yunsheng Lin wrote:
> > mention it in the documentation. Plus the kdoc of the function should
> > say that this is just a thin wrapper around other page pool APIs, and
> > it's safe to mix it with other page pool APIs?
>
> I am not sure I understand what do 'safe' and 'mix' mean here.
>
> For 'safe' part, I suppose you mean if there is a va associated with
> a 'struct page' without calling some API like kmap()? For that, I suppose
> it is safe when the driver is calling page_pool API without the
> __GFP_HIGHMEM flag. Maybe we should mention that in the kdoc and give a
> warning if page_pool_*alloc_va() is called with the __GFP_HIGHMEM flag?

Sounds good. Warning wrapped in #if CONFIG_DEBUG_NET perhaps?

> For the 'mix', I suppose you mean the below:
> 1. Allocate a page with the page_pool_*alloc_va() API and free a page with
>    page_pool_free() API.
> 2. Allocate a page with the page_pool_*alloc() API and free a page with
>    page_pool_free_va() API.
>
> For 1, it seems it is ok as some virt_to_head_page() and page_address() call
> between va and 'struct page' does not seem to change anything if we have
> enforce page_pool_*alloc_va() to be called without the __GFP_HIGHMEM flag.
>
> For 2, if the va is returned from page_address() which the allocation API is
> called without __GFP_HIGHMEM flag. If not, the va is from kmap*()? which means
> we may be calling page_pool_free_va() before kunmap*()? Is that possible?

Right, if someone passes kmap()'ed address they are trying quite hard
to break their own driver. Technically possible but I wouldn't worry.

I just mean that in the common case of non-HIGHMEM page, calling
page_pool_free_va() with the address returned by page_address()
is perfectly legal.
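The reason mixing the two flavours is harmless for non-HIGHMEM pages can be seen in a user-space model of the address round trip. This is a sketch under stated assumptions: `struct model_page`, `model_page_address()` and `model_virt_to_page()` are hypothetical stand-ins for 'struct page', page_address() and virt_to_head_page(), and the flat array stands in for a fixed linear mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model constants, not kernel definitions. */
#define MODEL_PAGE_SIZE 4096u
#define MODEL_NR_PAGES  4u

/* Stand-in for 'struct page': one descriptor per page of backing memory. */
struct model_page {
	unsigned int refcnt; /* unused here, only gives the struct a body */
};

static struct model_page model_pages[MODEL_NR_PAGES];
static unsigned char model_mem[MODEL_NR_PAGES * MODEL_PAGE_SIZE];

/* Models page_address() for a non-HIGHMEM page: a fixed linear mapping. */
static void *model_page_address(struct model_page *page)
{
	size_t idx = (size_t)(page - model_pages);

	assert(idx < MODEL_NR_PAGES);
	return &model_mem[idx * MODEL_PAGE_SIZE];
}

/* Models virt_to_head_page(): any address inside a page maps back to it. */
static struct model_page *model_virt_to_page(void *va)
{
	size_t off = (size_t)((unsigned char *)va - model_mem);

	assert(off < sizeof(model_mem));
	return &model_pages[off / MODEL_PAGE_SIZE];
}
```

Because the mapping is fixed, converting a page to an address and back always yields the same descriptor, so it does not matter which flavour of the API produced the pointer. A kmap()'ed address has no such fixed relationship, which is the corner case dismissed above.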
On 2023/10/18 23:35, Jakub Kicinski wrote:
> On Wed, 18 Oct 2023 19:47:16 +0800 Yunsheng Lin wrote:
>>> mention it in the documentation. Plus the kdoc of the function should
>>> say that this is just a thin wrapper around other page pool APIs, and
>>> it's safe to mix it with other page pool APIs?
>>
>> I am not sure I understand what do 'safe' and 'mix' mean here.
>>
>> For 'safe' part, I suppose you mean if there is a va associated with
>> a 'struct page' without calling some API like kmap()? For that, I suppose
>> it is safe when the driver is calling page_pool API without the
>> __GFP_HIGHMEM flag. Maybe we should mention that in the kdoc and give a
>> warning if page_pool_*alloc_va() is called with the __GFP_HIGHMEM flag?
>
> Sounds good. Warning wrapped in #if CONFIG_DEBUG_NET perhaps?

How about something like __get_free_pages() does with gfp flags?
https://elixir.free-electrons.com/linux/v6.4-rc6/source/mm/page_alloc.c#L4818

How about something like below on top of this patchset:

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 7550beeacf3d..61cee55606c0 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -167,13 +167,13 @@ static inline struct page *page_pool_dev_alloc(struct page_pool *pool,
 	return page_pool_alloc(pool, offset, size, gfp);
 }

-static inline void *page_pool_cache_alloc(struct page_pool *pool,
-					  unsigned int *size, gfp_t gfp)
+static inline void *page_pool_alloc_va(struct page_pool *pool,
+				       unsigned int *size, gfp_t gfp)
 {
 	unsigned int offset;
 	struct page *page;

-	page = page_pool_alloc(pool, &offset, size, gfp);
+	page = page_pool_alloc(pool, &offset, size, gfp & ~__GFP_HIGHMEM);
 	if (unlikely(!page))
 		return NULL;

@@ -181,21 +181,22 @@ static inline void *page_pool_cache_alloc(struct page_pool *pool,
 }

 /**
- * page_pool_dev_cache_alloc() - allocate a cache.
+ * page_pool_dev_alloc_va() - allocate a page or a page fragment.
  * @pool: pool from which to allocate
  * @size: in as the requested size, out as the allocated size
  *
- * Get a cache from the page allocator or page_pool caches.
+ * This is just a thin wrapper around the page_pool_alloc() API, and
+ * it returns va of the allocated page or page fragment.
  *
  * Return:
- * Return the addr for the allocated cache, otherwise return NULL.
+ * Return the va for the allocated page or page fragment, otherwise return NULL.
  */
-static inline void *page_pool_dev_cache_alloc(struct page_pool *pool,
-					      unsigned int *size)
+static inline void *page_pool_dev_alloc_va(struct page_pool *pool,
+					   unsigned int *size)
 {
 	gfp_t gfp = (GFP_ATOMIC | __GFP_NOWARN);

-	return page_pool_cache_alloc(pool, size, gfp);
+	return page_pool_alloc_va(pool, size, gfp);
 }

 /**
@@ -338,17 +339,17 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
 		(sizeof(dma_addr_t) > sizeof(unsigned long))

 /**
- * page_pool_cache_free() - free a cache into the page_pool
- * @pool: pool from which cache was allocated
- * @data: addr of cache to be free
+ * page_pool_free_va() - free a va into the page_pool
+ * @pool: pool from which va was allocated
+ * @va: va to be free
  * @allow_direct: freed by the consumer, allow lockless caching
  *
  * Free a cache allocated from page_pool_dev_cache_alloc().
  */
-static inline void page_pool_cache_free(struct page_pool *pool, void *data,
-					bool allow_direct)
+static inline void page_pool_free_va(struct page_pool *pool, void *va,
+				     bool allow_direct)
 {
-	page_pool_put_page(pool, virt_to_head_page(data), -1, allow_direct);
+	page_pool_put_page(pool, virt_to_head_page(va), -1, allow_direct);
 }
On Thu, 19 Oct 2023 21:22:07 +0800 Yunsheng Lin wrote:
> > Sounds good. Warning wrapped in #if CONFIG_DEBUG_NET perhaps?
>
> How about something like __get_free_pages() does with gfp flags?
> https://elixir.free-electrons.com/linux/v6.4-rc6/source/mm/page_alloc.c#L4818

Fine by me!
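The approach agreed on here, silently clearing the HIGHMEM bit instead of warning, can be sketched in isolation as plain bit masking. Everything below is a user-space illustration: `model_gfp_t`, the `MODEL_GFP_*` values and `model_va_gfp()` are hypothetical and do not correspond to the real gfp_t encoding.

```c
#include <assert.h>

/* Hypothetical flag type and bit values, chosen only for illustration. */
typedef unsigned int model_gfp_t;

#define MODEL_GFP_HIGHMEM 0x02u
#define MODEL_GFP_ATOMIC  0x20u
#define MODEL_GFP_NOWARN  0x200u

/*
 * Strip the HIGHMEM bit before allocating, so the returned page always has
 * a directly usable virtual address. All other flags pass through unchanged.
 */
static model_gfp_t model_va_gfp(model_gfp_t gfp)
{
	model_gfp_t masked = gfp & ~MODEL_GFP_HIGHMEM;

	assert(!(masked & MODEL_GFP_HIGHMEM));
	return masked;
}
```

Compared with a debug-only warning, masking means a caller that passes the flag by mistake still gets a valid address in every build configuration, at the cost of not being told about the mistake.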