Message ID:    20221106211154.3225784-5-horatiu.vultur@microchip.com
State:         Superseded
Delegated to:  Netdev Maintainers
Series:        net: lan966x: Add xdp support
From: Horatiu Vultur <horatiu.vultur@microchip.com>
Date: Sun, 6 Nov 2022 22:11:54 +0100

> Use the page_pool API for allocation, freeing and DMA handling instead
> of dev_alloc_pages, __free_pages and dma_map_page.
>
> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> ---
>  .../net/ethernet/microchip/lan966x/Kconfig    |  1 +
>  .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
>  .../ethernet/microchip/lan966x/lan966x_main.h |  3 +
>  3 files changed, 43 insertions(+), 33 deletions(-)

[...]

> @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
>          rx->last_entry = dcb;
>  }
>
> +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> +{
> +        struct lan966x *lan966x = rx->lan966x;
> +        struct page_pool_params pp_params = {
> +                .order = rx->page_order,
> +                .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> +                .pool_size = FDMA_DCB_MAX,
> +                .nid = NUMA_NO_NODE,
> +                .dev = lan966x->dev,
> +                .dma_dir = DMA_FROM_DEVICE,
> +                .offset = 0,
> +                .max_len = PAGE_SIZE << rx->page_order,

::max_len's primary purpose is to save time on DMA syncs.
First of all, you can subtract
`SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`: your HW never
writes to those last couple of hundred bytes.
But I suggest calculating ::max_len based on your current MTU
value. Let's say you have 16k pages and an MTU of 1500, that is a
huge difference (unless your DMA is always coherent, but I assume
that's not the case).

In lan966x_fdma_change_mtu() you do:

        max_mtu = lan966x_fdma_get_max_mtu(lan966x);
        max_mtu += IFH_LEN_BYTES;
        max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
        max_mtu += VLAN_HLEN * 2;

`lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
(i.e. 1536 for an MTU of 1500) is actually your max_len value, given
that you don't reserve any headroom (which is unfortunate, but I
guess you're working on this already, since XDP requires
%XDP_PACKET_HEADROOM).

> +        };
> +
> +        rx->page_pool = page_pool_create(&pp_params);
> +        if (IS_ERR(rx->page_pool))
> +                return PTR_ERR(rx->page_pool);
> +
> +        return 0;

        return PTR_ERR_OR_ZERO(rx->page_pool);

> +}
> +
>  static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
>  {
>          struct lan966x *lan966x = rx->lan966x;

[...]

> --
> 2.38.0

Thanks,
Olek
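[Editorial note: a minimal sketch of the ::max_len suggestion above. The
rx->max_mtu field is an assumption for illustration, standing in for the
value computed in lan966x_fdma_change_mtu() without the skb_shared_info
overhead; this is not the driver's final code.]

        /* Sketch only: rx->max_mtu is assumed to hold
         * lan966x_fdma_get_max_mtu() + IFH_LEN_BYTES + VLAN_HLEN * 2,
         * i.e. the largest number of bytes the HW can actually write.
         */
        static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
        {
                struct lan966x *lan966x = rx->lan966x;
                struct page_pool_params pp_params = {
                        .order = rx->page_order,
                        .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
                        .pool_size = FDMA_DCB_MAX,
                        .nid = NUMA_NO_NODE,
                        .dev = lan966x->dev,
                        .dma_dir = DMA_FROM_DEVICE,
                        .offset = 0,
                        /* Sync only the writable region on each refill,
                         * not the whole (possibly 16k) page.
                         */
                        .max_len = rx->max_mtu,
                };

                rx->page_pool = page_pool_create(&pp_params);

                /* PTR_ERR_OR_ZERO() folds the IS_ERR()/PTR_ERR() pair,
                 * as suggested in the review above.
                 */
                return PTR_ERR_OR_ZERO(rx->page_pool);
        }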
The 11/07/2022 17:40, Alexander Lobakin wrote:

Hi Olek,

>
> From: Horatiu Vultur <horatiu.vultur@microchip.com>
> Date: Sun, 6 Nov 2022 22:11:54 +0100
>
> > Use the page_pool API for allocation, freeing and DMA handling instead
> > of dev_alloc_pages, __free_pages and dma_map_page.
> >
> > Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> > ---
> >  .../net/ethernet/microchip/lan966x/Kconfig    |  1 +
> >  .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
> >  .../ethernet/microchip/lan966x/lan966x_main.h |  3 +
> >  3 files changed, 43 insertions(+), 33 deletions(-)
>
> [...]
>
> > @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
> >          rx->last_entry = dcb;
> >  }
> >
> > +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> > +{
> > +        struct lan966x *lan966x = rx->lan966x;
> > +        struct page_pool_params pp_params = {
> > +                .order = rx->page_order,
> > +                .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > +                .pool_size = FDMA_DCB_MAX,
> > +                .nid = NUMA_NO_NODE,
> > +                .dev = lan966x->dev,
> > +                .dma_dir = DMA_FROM_DEVICE,
> > +                .offset = 0,
> > +                .max_len = PAGE_SIZE << rx->page_order,
>
> ::max_len's primary purpose is to save time on DMA syncs.
> First of all, you can subtract
> `SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`: your HW never
> writes to those last couple of hundred bytes.
> But I suggest calculating ::max_len based on your current MTU
> value. Let's say you have 16k pages and an MTU of 1500, that is a
> huge difference (unless your DMA is always coherent, but I assume
> that's not the case).
>
> In lan966x_fdma_change_mtu() you do:
>
>         max_mtu = lan966x_fdma_get_max_mtu(lan966x);
>         max_mtu += IFH_LEN_BYTES;
>         max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>         max_mtu += VLAN_HLEN * 2;
>
> `lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
> (i.e. 1536 for an MTU of 1500) is actually your max_len value, given
> that you don't reserve any headroom (which is unfortunate, but I
> guess you're working on this already, since XDP requires
> %XDP_PACKET_HEADROOM).

Thanks for the suggestion. I will try it.
Regarding XDP_PACKET_HEADROOM, for XDP_DROP I didn't see it being
needed. Once support for XDP_TX or XDP_REDIRECT is added, then yes,
I will also need to reserve the headroom.

>
> > +        };
> > +
> > +        rx->page_pool = page_pool_create(&pp_params);
> > +        if (IS_ERR(rx->page_pool))
> > +                return PTR_ERR(rx->page_pool);
> > +
> > +        return 0;
>
>         return PTR_ERR_OR_ZERO(rx->page_pool);

Yes, I will use this.

>
> > +}
> > +
> >  static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
> >  {
> >          struct lan966x *lan966x = rx->lan966x;
>
> [...]
>
> > --
> > 2.38.0
>
> Thanks,
> Olek
From: Horatiu Vultur <horatiu.vultur@microchip.com>
Date: Mon, 7 Nov 2022 22:35:21 +0100

> The 11/07/2022 17:40, Alexander Lobakin wrote:
>
> Hi Olek,
>
> >
> > From: Horatiu Vultur <horatiu.vultur@microchip.com>
> > Date: Sun, 6 Nov 2022 22:11:54 +0100
> >
> > > Use the page_pool API for allocation, freeing and DMA handling instead
> > > of dev_alloc_pages, __free_pages and dma_map_page.
> > >
> > > Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> > > ---
> > >  .../net/ethernet/microchip/lan966x/Kconfig    |  1 +
> > >  .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
> > >  .../ethernet/microchip/lan966x/lan966x_main.h |  3 +
> > >  3 files changed, 43 insertions(+), 33 deletions(-)
> >
> > [...]
> >
> > > @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
> > >          rx->last_entry = dcb;
> > >  }
> > >
> > > +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> > > +{
> > > +        struct lan966x *lan966x = rx->lan966x;
> > > +        struct page_pool_params pp_params = {
> > > +                .order = rx->page_order,
> > > +                .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > > +                .pool_size = FDMA_DCB_MAX,
> > > +                .nid = NUMA_NO_NODE,
> > > +                .dev = lan966x->dev,
> > > +                .dma_dir = DMA_FROM_DEVICE,
> > > +                .offset = 0,
> > > +                .max_len = PAGE_SIZE << rx->page_order,
> >
> > ::max_len's primary purpose is to save time on DMA syncs.
> > First of all, you can subtract
> > `SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`: your HW never
> > writes to those last couple of hundred bytes.
> > But I suggest calculating ::max_len based on your current MTU
> > value. Let's say you have 16k pages and an MTU of 1500, that is a
> > huge difference (unless your DMA is always coherent, but I assume
> > that's not the case).
> >
> > In lan966x_fdma_change_mtu() you do:
> >
> >         max_mtu = lan966x_fdma_get_max_mtu(lan966x);
> >         max_mtu += IFH_LEN_BYTES;
> >         max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> >         max_mtu += VLAN_HLEN * 2;
> >
> > `lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
> > (i.e. 1536 for an MTU of 1500) is actually your max_len value, given
> > that you don't reserve any headroom (which is unfortunate, but I
> > guess you're working on this already, since XDP requires
> > %XDP_PACKET_HEADROOM).
>
> Thanks for the suggestion. I will try it.
> Regarding XDP_PACKET_HEADROOM, for XDP_DROP I didn't see it being
> needed. Once support for XDP_TX or XDP_REDIRECT is added, then yes,
> I will also need to reserve the headroom.

Correct, since you're disabling metadata support in
xdp_prepare_buff(), headroom is not needed for the pass and drop
actions.

It's always good to have at least %NET_SKB_PAD of headroom inside an
skb, so that the networking stack won't perform excessive
reallocations, and your code currently misses that -- if I understand
correctly, after converting the hardware-specific header to an
Ethernet header you have 28 - 14 = 14 bytes of headroom, which can
sometimes be not enough, for example in forwarding cases. It's not
related to XDP, but I would do that as a prerequisite patch for
Tx/redirect, since you'll be adding headroom support anyway :)

>
> > > +        };
> > > +
> > > +        rx->page_pool = page_pool_create(&pp_params);
> > > +        if (IS_ERR(rx->page_pool))
> > > +                return PTR_ERR(rx->page_pool);

[...]

> > > --
> > > 2.38.0
> >
> > Thanks,
> > Olek
>
> --
> /Horatiu

Thanks,
Olek
The 11/08/2022 12:33, Alexander Lobakin wrote:
>
> From: Horatiu Vultur <horatiu.vultur@microchip.com>
> Date: Mon, 7 Nov 2022 22:35:21 +0100
>
> > The 11/07/2022 17:40, Alexander Lobakin wrote:
> >
> > Hi Olek,
> >
> > >
> > > From: Horatiu Vultur <horatiu.vultur@microchip.com>
> > > Date: Sun, 6 Nov 2022 22:11:54 +0100
> > >
> > > > Use the page_pool API for allocation, freeing and DMA handling instead
> > > > of dev_alloc_pages, __free_pages and dma_map_page.
> > > >
> > > > Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
> > > > ---
> > > >  .../net/ethernet/microchip/lan966x/Kconfig    |  1 +
> > > >  .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
> > > >  .../ethernet/microchip/lan966x/lan966x_main.h |  3 +
> > > >  3 files changed, 43 insertions(+), 33 deletions(-)
> > >
> > > [...]
> > >
> > > > @@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
> > > >          rx->last_entry = dcb;
> > > >  }
> > > >
> > > > +static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
> > > > +{
> > > > +        struct lan966x *lan966x = rx->lan966x;
> > > > +        struct page_pool_params pp_params = {
> > > > +                .order = rx->page_order,
> > > > +                .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> > > > +                .pool_size = FDMA_DCB_MAX,
> > > > +                .nid = NUMA_NO_NODE,
> > > > +                .dev = lan966x->dev,
> > > > +                .dma_dir = DMA_FROM_DEVICE,
> > > > +                .offset = 0,
> > > > +                .max_len = PAGE_SIZE << rx->page_order,
> > >
> > > ::max_len's primary purpose is to save time on DMA syncs.
> > > First of all, you can subtract
> > > `SKB_DATA_ALIGN(sizeof(struct skb_shared_info))`: your HW never
> > > writes to those last couple of hundred bytes.
> > > But I suggest calculating ::max_len based on your current MTU
> > > value. Let's say you have 16k pages and an MTU of 1500, that is a
> > > huge difference (unless your DMA is always coherent, but I assume
> > > that's not the case).
> > >
> > > In lan966x_fdma_change_mtu() you do:
> > >
> > >         max_mtu = lan966x_fdma_get_max_mtu(lan966x);
> > >         max_mtu += IFH_LEN_BYTES;
> > >         max_mtu += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > >         max_mtu += VLAN_HLEN * 2;
> > >
> > > `lan966x_fdma_get_max_mtu(lan966x) + IFH_LEN_BYTES + VLAN_HLEN * 2`
> > > (i.e. 1536 for an MTU of 1500) is actually your max_len value, given
> > > that you don't reserve any headroom (which is unfortunate, but I
> > > guess you're working on this already, since XDP requires
> > > %XDP_PACKET_HEADROOM).
> >
> > Thanks for the suggestion. I will try it.
> > Regarding XDP_PACKET_HEADROOM, for XDP_DROP I didn't see it being
> > needed. Once support for XDP_TX or XDP_REDIRECT is added, then yes,
> > I will also need to reserve the headroom.
>
> Correct, since you're disabling metadata support in
> xdp_prepare_buff(), headroom is not needed for the pass and drop
> actions.
>
> It's always good to have at least %NET_SKB_PAD of headroom inside an
> skb, so that the networking stack won't perform excessive
> reallocations, and your code currently misses that -- if I understand
> correctly, after converting the hardware-specific header to an
> Ethernet header you have 28 - 14 = 14 bytes of headroom, which can
> sometimes be not enough, for example in forwarding cases. It's not
> related to XDP, but I would do that as a prerequisite patch for
> Tx/redirect, since you'll be adding headroom support anyway :)

Just a small comment here. There is no need to convert the
hardware-specific header, because the Ethernet header comes right
after it. So I would have 28 bytes left for headroom, but that is
still less than NET_SKB_PAD. But I got the idea. When I add the
Tx/redirect support, one of those patches will make sure we have
enough headroom.

> >
> > > > +        };
> > > > +
> > > > +        rx->page_pool = page_pool_create(&pp_params);
> > > > +        if (IS_ERR(rx->page_pool))
> > > > +                return PTR_ERR(rx->page_pool);
>
> [...]
>
> > > > --
> > > > 2.38.0
> > >
> > > Thanks,
> > > Olek
> >
> > --
> > /Horatiu
>
> Thanks,
> Olek
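[Editorial note: a minimal sketch of the headroom idea discussed above,
reserving XDP_PACKET_HEADROOM through the page pool's ::offset so both the
stack (NET_SKB_PAD) and a later XDP_TX/XDP_REDIRECT patch have room in
front of the frame. The helper name is an assumption for illustration and
is not part of the posted series.]

        /* Sketch only: how the page pool params could change once
         * headroom is reserved. XDP_PACKET_HEADROOM (256) already
         * covers NET_SKB_PAD.
         */
        static void lan966x_fdma_rx_set_pp_headroom(struct lan966x_rx *rx,
                                                    struct page_pool_params *pp)
        {
                /* HW starts writing after the reserved headroom ... */
                pp->offset = XDP_PACKET_HEADROOM;
                /* ... so the DMA sync only covers what the HW can touch */
                pp->max_len = (PAGE_SIZE << rx->page_order) -
                              XDP_PACKET_HEADROOM -
                              SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
        }

On the RX path, the same constant would then be handed to
xdp_prepare_buff(&xdp, page_address(page), XDP_PACKET_HEADROOM, len, false)
and to skb_reserve() on the non-XDP path, so the headroom stays consistent
between the pool, XDP and the skb.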
diff --git a/drivers/net/ethernet/microchip/lan966x/Kconfig b/drivers/net/ethernet/microchip/lan966x/Kconfig
index 49e1464a43139..b7ae5ce7d3f7a 100644
--- a/drivers/net/ethernet/microchip/lan966x/Kconfig
+++ b/drivers/net/ethernet/microchip/lan966x/Kconfig
@@ -7,5 +7,6 @@ config LAN966X_SWITCH
         depends on BRIDGE || BRIDGE=n
         select PHYLINK
         select PACKING
+        select PAGE_POOL
         help
           This driver supports the Lan966x network switch device.
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
index fa4198c617667..822c1b053e2d0 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
@@ -10,47 +10,25 @@ static int lan966x_fdma_channel_active(struct lan966x *lan966x)
 static struct page *lan966x_fdma_rx_alloc_page(struct lan966x_rx *rx,
                                                struct lan966x_db *db)
 {
-        struct lan966x *lan966x = rx->lan966x;
-        dma_addr_t dma_addr;
         struct page *page;

-        page = dev_alloc_pages(rx->page_order);
+        page = page_pool_dev_alloc_pages(rx->page_pool);
         if (unlikely(!page))
                 return NULL;

-        dma_addr = dma_map_page(lan966x->dev, page, 0,
-                                PAGE_SIZE << rx->page_order,
-                                DMA_FROM_DEVICE);
-        if (unlikely(dma_mapping_error(lan966x->dev, dma_addr)))
-                goto free_page;
-
-        db->dataptr = dma_addr;
+        db->dataptr = page_pool_get_dma_addr(page);

         return page;
-
-free_page:
-        __free_pages(page, rx->page_order);
-        return NULL;
 }

 static void lan966x_fdma_rx_free_pages(struct lan966x_rx *rx)
 {
-        struct lan966x *lan966x = rx->lan966x;
-        struct lan966x_rx_dcb *dcb;
-        struct lan966x_db *db;
         int i, j;

         for (i = 0; i < FDMA_DCB_MAX; ++i) {
-                dcb = &rx->dcbs[i];
-
-                for (j = 0; j < FDMA_RX_DCB_MAX_DBS; ++j) {
-                        db = &dcb->db[j];
-                        dma_unmap_single(lan966x->dev,
-                                         (dma_addr_t)db->dataptr,
-                                         PAGE_SIZE << rx->page_order,
-                                         DMA_FROM_DEVICE);
-                        __free_pages(rx->page[i][j], rx->page_order);
-                }
+                for (j = 0; j < FDMA_RX_DCB_MAX_DBS; ++j)
+                        page_pool_put_full_page(rx->page_pool,
+                                                rx->page[i][j], false);
         }
 }

@@ -62,7 +40,7 @@ static void lan966x_fdma_rx_free_page(struct lan966x_rx *rx)
         if (unlikely(!page))
                 return;

-        __free_pages(page, rx->page_order);
+        page_pool_recycle_direct(rx->page_pool, page);
 }

 static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
@@ -84,6 +62,27 @@ static void lan966x_fdma_rx_add_dcb(struct lan966x_rx *rx,
         rx->last_entry = dcb;
 }

+static int lan966x_fdma_rx_alloc_page_pool(struct lan966x_rx *rx)
+{
+        struct lan966x *lan966x = rx->lan966x;
+        struct page_pool_params pp_params = {
+                .order = rx->page_order,
+                .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
+                .pool_size = FDMA_DCB_MAX,
+                .nid = NUMA_NO_NODE,
+                .dev = lan966x->dev,
+                .dma_dir = DMA_FROM_DEVICE,
+                .offset = 0,
+                .max_len = PAGE_SIZE << rx->page_order,
+        };
+
+        rx->page_pool = page_pool_create(&pp_params);
+        if (IS_ERR(rx->page_pool))
+                return PTR_ERR(rx->page_pool);
+
+        return 0;
+}
+
 static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
 {
         struct lan966x *lan966x = rx->lan966x;
@@ -93,6 +92,9 @@ static int lan966x_fdma_rx_alloc(struct lan966x_rx *rx)
         int i, j;
         int size;

+        if (lan966x_fdma_rx_alloc_page_pool(rx))
+                return PTR_ERR(rx->page_pool);
+
         /* calculate how many pages are needed to allocate the dcbs */
         size = sizeof(struct lan966x_rx_dcb) * FDMA_DCB_MAX;
         size = ALIGN(size, PAGE_SIZE);
@@ -436,10 +438,6 @@ static int lan966x_fdma_rx_check_frame(struct lan966x_rx *rx, u64 *src_port)
                                 FDMA_DCB_STATUS_BLOCKL(db->status),
                                 DMA_FROM_DEVICE);

-        dma_unmap_single_attrs(lan966x->dev, (dma_addr_t)db->dataptr,
-                               PAGE_SIZE << rx->page_order, DMA_FROM_DEVICE,
-                               DMA_ATTR_SKIP_CPU_SYNC);
-
         lan966x_ifh_get_src_port(page_address(page), src_port);
         if (WARN_ON(*src_port >= lan966x->num_phys_ports))
                 return FDMA_ERROR;
@@ -468,6 +466,8 @@ static struct sk_buff *lan966x_fdma_rx_get_frame(struct lan966x_rx *rx,
         if (unlikely(!skb))
                 goto free_page;

+        skb_mark_for_recycle(skb);
+
         skb_put(skb, FDMA_DCB_STATUS_BLOCKL(db->status));

         lan966x_ifh_get_timestamp(skb->data, &timestamp);
@@ -495,7 +495,7 @@ static struct sk_buff *lan966x_fdma_rx_get_frame(struct lan966x_rx *rx,
         return skb;

 free_page:
-        __free_pages(page, rx->page_order);
+        page_pool_recycle_direct(rx->page_pool, page);

         return NULL;
 }
@@ -740,6 +740,7 @@ static int lan966x_qsys_sw_status(struct lan966x *lan966x)

 static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
 {
+        struct page_pool *page_pool;
         dma_addr_t rx_dma;
         void *rx_dcbs;
         u32 size;
@@ -748,6 +749,7 @@ static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
         /* Store these for later to free them */
         rx_dma = lan966x->rx.dma;
         rx_dcbs = lan966x->rx.dcbs;
+        page_pool = lan966x->rx.page_pool;

         napi_synchronize(&lan966x->napi);
         napi_disable(&lan966x->napi);
@@ -765,11 +767,14 @@ static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
         size = ALIGN(size, PAGE_SIZE);
         dma_free_coherent(lan966x->dev, size, rx_dcbs, rx_dma);

+        page_pool_destroy(page_pool);
+
         lan966x_fdma_wakeup_netdev(lan966x);
         napi_enable(&lan966x->napi);

         return err;
 restore:
+        lan966x->rx.page_pool = page_pool;
         lan966x->rx.dma = rx_dma;
         lan966x->rx.dcbs = rx_dcbs;
         lan966x_fdma_rx_start(&lan966x->rx);
@@ -876,5 +881,6 @@ void lan966x_fdma_deinit(struct lan966x *lan966x)
         lan966x_fdma_rx_free_pages(&lan966x->rx);
         lan966x_fdma_rx_free(&lan966x->rx);
+        page_pool_destroy(lan966x->rx.page_pool);
         lan966x_fdma_tx_free(&lan966x->tx);
 }
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
index 18f727f397fd6..97652a73017a2 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
@@ -9,6 +9,7 @@
 #include <linux/phy.h>
 #include <linux/phylink.h>
 #include <linux/ptp_clock_kernel.h>
+#include <net/page_pool.h>
 #include <net/pkt_cls.h>
 #include <net/pkt_sched.h>
 #include <net/switchdev.h>
@@ -162,6 +163,8 @@ struct lan966x_rx {
         u8 page_order;

         u8 channel_id;
+
+        struct page_pool *page_pool;
 };

 struct lan966x_tx_dcb_buf {
Use the page_pool API for allocation, freeing and DMA handling instead
of dev_alloc_pages, __free_pages and dma_map_page.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
---
 .../net/ethernet/microchip/lan966x/Kconfig    |  1 +
 .../ethernet/microchip/lan966x/lan966x_fdma.c | 72 ++++++++++---------
 .../ethernet/microchip/lan966x/lan966x_main.h |  3 +
 3 files changed, 43 insertions(+), 33 deletions(-)