From patchwork Tue Mar 5 02:01:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mina Almasry
X-Patchwork-Id: 13581492
Date: Mon, 4 Mar 2024 18:01:44 -0800
In-Reply-To: <20240305020153.2787423-1-almasrymina@google.com>
References: <20240305020153.2787423-1-almasrymina@google.com>
X-Mailer: git-send-email 2.44.0.rc1.240.g4c46232300-goog
Message-ID: <20240305020153.2787423-10-almasrymina@google.com>
Subject: [RFC PATCH net-next v6 09/15] memory-provider: dmabuf devmem memory provider
From: Mina Almasry
To: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
 linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
 sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
 linux-arch@vger.kernel.org, bpf@vger.kernel.org,
 linux-kselftest@vger.kernel.org,
 linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org
Cc: Mina Almasry, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Jonathan Corbet, Richard Henderson, Ivan Kokshaysky,
 Matt Turner, Thomas Bogendoerfer, "James E.J. Bottomley", Helge Deller,
 Andreas Larsson, Jesper Dangaard Brouer, Ilias Apalodimas, Steven Rostedt,
 Masami Hiramatsu, Mathieu Desnoyers, Arnd Bergmann, Alexei Starovoitov,
 Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
 Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, David Ahern, Willem de Bruijn, Shuah Khan,
 Sumit Semwal, Christian König, Pavel Begunkov, David Wei, Jason Gunthorpe,
 Yunsheng Lin, Shailend Chand, Harshitha Ramamurthy, Shakeel Butt,
 Jeroen de Borst, Praveen Kaligineedi, Willem de Bruijn, Kaiyuan Zhang
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"

Implement a memory provider that allocates dmabuf devmem in the form of
net_iov.

The provider receives a reference to the struct netdev_dmabuf_binding
via the pool->mp_priv pointer. The driver needs to set this pointer for
the provider in the net_iov.

The provider obtains a reference on the netdev_dmabuf_binding, which
guarantees that the binding and the underlying mapping remain alive
until the provider is destroyed.

PP_FLAG_DMA_MAP is required for this memory provider so that the
page_pool can provide the driver with the dma-addrs of the devmem.

For simplicity, PP_FLAG_DMA_SYNC_DEV and p.order != 0 are not supported.

Signed-off-by: Willem de Bruijn
Signed-off-by: Kaiyuan Zhang
Signed-off-by: Mina Almasry

---

v6:
- refactor new memory provider functions into net/core/devmem.c (Pavel)

v2:
- Disable devmem for p.order != 0

v1:
- static_branch check in page_is_page_pool_iov() (Willem & Paolo).
- PP_DEVMEM -> PP_IOV (David).
- Require PP_FLAG_DMA_MAP (Jakub).

---
 include/net/netmem.h            | 14 ++++++
 include/net/page_pool/helpers.h | 21 +++++++++
 include/net/page_pool/types.h   |  2 +
 net/core/devmem.c               | 82 +++++++++++++++++++++++++++++++++
 net/core/page_pool.c            | 35 ++++++--------
 5 files changed, 132 insertions(+), 22 deletions(-)

diff --git a/include/net/netmem.h b/include/net/netmem.h
index 8699788d587d..a2de9411025d 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -127,6 +127,20 @@ static inline struct page *netmem_to_page(netmem_ref netmem)
 	return (__force struct page *)netmem;
 }
 
+static inline struct net_iov *netmem_to_net_iov(netmem_ref netmem)
+{
+	if (netmem_is_net_iov(netmem))
+		return (struct net_iov *)((__force unsigned long)netmem & ~NET_IOV);
+
+	DEBUG_NET_WARN_ON_ONCE(true);
+	return NULL;
+}
+
+static inline netmem_ref net_iov_to_netmem(struct net_iov *niov)
+{
+	return (__force netmem_ref)((unsigned long)niov | NET_IOV);
+}
+
 static inline netmem_ref page_to_netmem(struct page *page)
 {
 	return (__force netmem_ref)page;
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index c6a55eddefae..00682b4de6e8 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -453,4 +453,25 @@ static inline void page_pool_nid_changed(struct page_pool *pool, int new_nid)
 	page_pool_update_nid(pool, new_nid);
 }
 
+static inline void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
+{
+	netmem_set_pp(netmem, pool);
+	netmem_or_pp_magic(netmem, PP_SIGNATURE);
+
+	/* Ensuring all pages have been split into one fragment initially:
+	 * page_pool_set_pp_info() is only called once for every page when it
+	 * is allocated from the page allocator and page_pool_fragment_page()
+	 * is dirtying the same cache line as the page->pp_magic above, so
+	 * the overhead is negligible.
+	 */
+	page_pool_fragment_netmem(netmem, 1);
+	if (pool->has_init_callback)
+		pool->slow.init_callback(netmem, pool->slow.init_arg);
+}
+
+static inline void page_pool_clear_pp_info(netmem_ref netmem)
+{
+	netmem_clear_pp_magic(netmem);
+	netmem_set_pp(netmem, NULL);
+}
 #endif /* _NET_PAGE_POOL_HELPERS_H */
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index e29e77f7934e..096cd2455b2c 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -136,6 +136,8 @@ struct memory_provider_ops {
 	bool (*release_page)(struct page_pool *pool, netmem_ref netmem);
 };
 
+extern const struct memory_provider_ops dmabuf_devmem_ops;
+
 struct page_pool {
 	struct page_pool_params_fast p;
 
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 57d3a1f223ef..3ced312f7860 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -329,3 +329,85 @@ int netdev_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
 	return err;
 }
 #endif
+
+/*** "Dmabuf devmem memory provider" ***/
+
+static int mp_dmabuf_devmem_init(struct page_pool *pool)
+{
+	struct netdev_dmabuf_binding *binding = pool->mp_priv;
+
+	if (!binding)
+		return -EINVAL;
+
+	if (!(pool->p.flags & PP_FLAG_DMA_MAP))
+		return -EOPNOTSUPP;
+
+	if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
+		return -EOPNOTSUPP;
+
+	if (pool->p.order != 0)
+		return -E2BIG;
+
+	netdev_dmabuf_binding_get(binding);
+	return 0;
+}
+
+static netmem_ref mp_dmabuf_devmem_alloc_pages(struct page_pool *pool,
+					       gfp_t gfp)
+{
+	struct netdev_dmabuf_binding *binding = pool->mp_priv;
+	netmem_ref netmem;
+	struct net_iov *niov;
+	dma_addr_t dma_addr;
+
+	niov = netdev_alloc_dmabuf(binding);
+	if (!niov)
+		return 0;
+
+	dma_addr = net_iov_dma_addr(niov);
+
+	netmem = net_iov_to_netmem(niov);
+
+	page_pool_set_pp_info(pool, netmem);
+
+	if (page_pool_set_dma_addr_netmem(netmem, dma_addr))
+		goto err_free;
+
+	pool->pages_state_hold_cnt++;
+	trace_page_pool_state_hold(pool, netmem, pool->pages_state_hold_cnt);
+	return netmem;
+
+err_free:
+	netdev_free_dmabuf(niov);
+	return 0;
+}
+
+static void mp_dmabuf_devmem_destroy(struct page_pool *pool)
+{
+	struct netdev_dmabuf_binding *binding = pool->mp_priv;
+
+	netdev_dmabuf_binding_put(binding);
+}
+
+static bool mp_dmabuf_devmem_release_page(struct page_pool *pool,
+					  netmem_ref netmem)
+{
+	WARN_ON_ONCE(!netmem_is_net_iov(netmem));
+	WARN_ON_ONCE(atomic_long_read(netmem_get_pp_ref_count_ref(netmem))
+		     != 1);
+
+	page_pool_clear_pp_info(netmem);
+
+	netdev_free_dmabuf(netmem_to_net_iov(netmem));
+
+	/* We don't want the page pool put_page()ing our net_iovs. */
+	return false;
+}
+
+const struct memory_provider_ops dmabuf_devmem_ops = {
+	.init			= mp_dmabuf_devmem_init,
+	.destroy		= mp_dmabuf_devmem_destroy,
+	.alloc_pages		= mp_dmabuf_devmem_alloc_pages,
+	.release_page		= mp_dmabuf_devmem_release_page,
+};
+EXPORT_SYMBOL(dmabuf_devmem_ops);
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 22e3d439da18..2cee7d9f6ca6 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -12,6 +12,7 @@
 
 #include
 #include
+#include
 
 #include
 #include
@@ -20,12 +21,15 @@
 #include
 #include
 #include
+#include
+#include
 
 #include
 
 #include "page_pool_priv.h"
 
 DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
+EXPORT_SYMBOL(page_pool_mem_providers);
 
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
@@ -178,6 +182,7 @@ static int page_pool_init(struct page_pool *pool,
 			  const struct page_pool_params *params,
 			  int cpuid)
 {
+	struct netdev_dmabuf_binding *binding = NULL;
 	unsigned int ring_qsize = 1024; /* Default */
 	int err;
 
@@ -251,6 +256,14 @@ static int page_pool_init(struct page_pool *pool,
 	/* Driver calling page_pool_create() also call page_pool_destroy() */
 	refcount_set(&pool->user_cnt, 1);
 
+	if (pool->p.queue)
+		binding = READ_ONCE(pool->p.queue->binding);
+
+	if (binding) {
+		pool->mp_ops = &dmabuf_devmem_ops;
+		pool->mp_priv = binding;
+	}
+
 	if (pool->mp_ops) {
 		err = pool->mp_ops->init(pool);
 		if (err) {
@@ -444,28 +457,6 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem)
 	return false;
 }
 
-static void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
-{
-	netmem_set_pp(netmem, pool);
-	netmem_or_pp_magic(netmem, PP_SIGNATURE);
-
-	/* Ensuring all pages have been split into one fragment initially:
-	 * page_pool_set_pp_info() is only called once for every page when it
-	 * is allocated from the page allocator and page_pool_fragment_page()
-	 * is dirtying the same cache line as the page->pp_magic above, so
-	 * the overhead is negligible.
-	 */
-	page_pool_fragment_netmem(netmem, 1);
-	if (pool->has_init_callback)
-		pool->slow.init_callback(netmem, pool->slow.init_arg);
-}
-
-static void page_pool_clear_pp_info(netmem_ref netmem)
-{
-	netmem_clear_pp_magic(netmem);
-	netmem_set_pp(netmem, NULL);
-}
-
 static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
 						 gfp_t gfp)
 {