Message ID | 20240620085200.583709-1-0x1207@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next,v1] net: stmmac: xgmac: increase length limit of descriptor ring | expand |
Hi Furong On Thu, Jun 20, 2024 at 04:52:00PM +0800, Furong Xu wrote: > DWXGMAC CORE supports a ring length of 65536 descriptors, bump max length > from 1024 to 65536 What XGMAC IP-core version are you talking about? The DW XGMAC IP-core databooks I have define the upper limit much lesser than that. Do you understand that specifying 65K descriptors will cause a huge amount of memory consumed, right? Each descriptor is equipped with at least 1-page buffer. If QoS/XGMAC SPH is enabled then each descriptor is equipped with a second buffer. So 65K-descriptor will cause allocation of at least 65536 * (4 * 4) bytes + 65536 * PAGE_SIZE bytes. So it's ~256MB for the smallest possible 4K-pages. Not to mention that there can be more than one queue, two buffers assigned to each descriptor and more than a single page allocated for each buffer in case of jumbos. All of that will multiply the basic ~256MB memory consumption. Taking all of the above into account, what is the practical reason of having so many descriptors allocated? Are you afraid your CPU won't keep up with some heavy incoming traffic? Just a note about GMACs. The only GMAC having the ring-length limited is DW QoS Eth (v4.x/v5.x). It may have up to 1K descriptors in the ring. DW GMAC v3.73a doesn't have the descriptors array length constraint. The last descriptor is marked by a special flag TDESC0.21 and RDESC1.15, after meeting which the DMA-engine gets back to the first descriptor in the ring. -Serge(y) > > Signed-off-by: Furong Xu <0x1207@gmail.com> > --- > .../net/ethernet/stmicro/stmmac/dwxgmac2.h | 2 ++ > .../ethernet/stmicro/stmmac/stmmac_ethtool.c | 24 +++++++++++++++---- > 2 files changed, 22 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > index 6a2c7d22df1e..264f4f876c74 100644 > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > @@ -11,6 +11,8 @@ > > /* Misc */ > #define XGMAC_JUMBO_LEN 16368 > +#define XGMAC_DMA_MAX_TX_SIZE 65536 > +#define XGMAC_DMA_MAX_RX_SIZE 65536 > > /* MAC Registers */ > #define XGMAC_TX_CONFIG 0x00000000 > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c > index 18468c0228f0..3ae465c5a712 100644 > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c > @@ -491,9 +491,16 @@ static void stmmac_get_ringparam(struct net_device *netdev, > struct netlink_ext_ack *extack) > { > struct stmmac_priv *priv = netdev_priv(netdev); > + u32 dma_max_rx_size = DMA_MAX_RX_SIZE; > + u32 dma_max_tx_size = DMA_MAX_TX_SIZE; > > - ring->rx_max_pending = DMA_MAX_RX_SIZE; > - ring->tx_max_pending = DMA_MAX_TX_SIZE; > + if (priv->plat->has_xgmac) { > + dma_max_rx_size = XGMAC_DMA_MAX_RX_SIZE; > + dma_max_tx_size = XGMAC_DMA_MAX_TX_SIZE; > + } > + > + ring->rx_max_pending = dma_max_rx_size; > + ring->tx_max_pending = dma_max_tx_size; Do you understand the consequence of this change, right? De > ring->rx_pending = priv->dma_conf.dma_rx_size; > ring->tx_pending = priv->dma_conf.dma_tx_size; > } > @@ -503,12 +510,21 @@ static int stmmac_set_ringparam(struct net_device *netdev, > struct kernel_ethtool_ringparam *kernel_ring, > struct netlink_ext_ack *extack) > { > + struct stmmac_priv *priv = netdev_priv(netdev); > + u32 dma_max_rx_size = DMA_MAX_RX_SIZE; > + u32 dma_max_tx_size = DMA_MAX_TX_SIZE; > + > + if (priv->plat->has_xgmac) { > + dma_max_rx_size = XGMAC_DMA_MAX_RX_SIZE; > + dma_max_tx_size = XGMAC_DMA_MAX_TX_SIZE; > + } > + > if (ring->rx_mini_pending || ring->rx_jumbo_pending || > ring->rx_pending < DMA_MIN_RX_SIZE || > - ring->rx_pending > DMA_MAX_RX_SIZE || > + ring->rx_pending > dma_max_rx_size || > !is_power_of_2(ring->rx_pending) || > ring->tx_pending < DMA_MIN_TX_SIZE || > - ring->tx_pending > DMA_MAX_TX_SIZE || > + ring->tx_pending > dma_max_tx_size || > !is_power_of_2(ring->tx_pending)) > return -EINVAL; > > -- > 2.34.1 > >
On Thu, Jun 20, 2024 at 04:52:00PM +0800, Furong Xu wrote: > DWXGMAC CORE supports a ring length of 65536 descriptors, bump max length > from 1024 to 65536 If anything, you want to be reducing the number of descriptors, especially for TX. Does stmmac implement BQL? If not, try adding that, so packets are kept in the queues for longer, and only passed to the MAC at a rate it can put them on the wire. Andrew --- pw-bot: cr
On Thu, 20 Jun 2024 15:06:57 +0300 Serge Semin <fancer.lancer@gmail.com> wrote: > Hi Furong > > On Thu, Jun 20, 2024 at 04:52:00PM +0800, Furong Xu wrote: > > DWXGMAC CORE supports a ring length of 65536 descriptors, bump max length > > from 1024 to 65536 > > What XGMAC IP-core version are you talking about? The DW XGMAC > IP-core databooks I have define the upper limit much lesser than that. > Hi Serge Thanks for this information. I double checked 3.10a and 3.20a, 3.10a do have a limit to 16K, and 3.20a bump the limit to 64K. So we need to lower the limit to fit all XGMAC versions. And what about your advice to set this limit? > Do you understand that specifying 65K descriptors will cause a huge > amount of memory consumed, right? Each descriptor is equipped with at > least 1-page buffer. If QoS/XGMAC SPH is enabled then each descriptor > is equipped with a second buffer. So 65K-descriptor will cause > allocation of at least 65536 * (4 * 4) bytes + 65536 * PAGE_SIZE > bytes. So it's ~256MB for the smallest possible 4K-pages. Not to > mention that there can be more than one queue, two buffers assigned to > each descriptor and more than a single page allocated for each buffer > in case of jumbos. All of that will multiply the basic ~256MB memory > consumption. > Fully agree with you. This patch is trying to make it possible for ethtool to set a longer descriptor length against XGMAC. All MAC cores still have 512 descriptors allocated by default for both TX and RX, which is defined by DMA_DEFAULT_TX_SIZE and DMA_DEFAULT_RX_SIZE in drivers/net/ethernet/stmicro/stmmac/common.h This patch does not change the default descriptor length for XGMAC core, but give ethtool a chance to set a bigger value than DMA_MAX_TX_SIZE and DMA_MAX_RX_SIZE defined in drivers/net/ethernet/stmicro/stmmac/common.h > Taking all of the above into account, what is the practical reason of > having so many descriptors allocated? Are you afraid your CPU won't > keep up with some heavy incoming traffic? > Heavy incoming traffic on some heavy load system, the max 1024 limit defined by DMA_MAX_RX_SIZE in drivers/net/ethernet/stmicro/stmmac/common.h is too few to achieve high throughput for XGMAC. With this patch, ethtool can set a new length than 1024 > Just a note about GMACs. The only GMAC having the ring-length limited > is DW QoS Eth (v4.x/v5.x). It may have up to 1K descriptors in the > ring. DW GMAC v3.73a doesn't have the descriptors array length constraint. > The last descriptor is marked by a special flag TDESC0.21 and > RDESC1.15, after meeting which the DMA-engine gets back to the first > descriptor in the ring. > > -Serge(y) > > > > > Signed-off-by: Furong Xu <0x1207@gmail.com> > > --- > > .../net/ethernet/stmicro/stmmac/dwxgmac2.h | 2 ++ > > .../ethernet/stmicro/stmmac/stmmac_ethtool.c | 24 +++++++++++++++---- > > 2 files changed, 22 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > index 6a2c7d22df1e..264f4f876c74 100644 > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > @@ -11,6 +11,8 @@ > > > > /* Misc */ > > #define XGMAC_JUMBO_LEN 16368 > > +#define XGMAC_DMA_MAX_TX_SIZE 65536 > > +#define XGMAC_DMA_MAX_RX_SIZE 65536 > > > > /* MAC Registers */ > > #define XGMAC_TX_CONFIG 0x00000000 > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c > > index 18468c0228f0..3ae465c5a712 100644 > > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c > > @@ -491,9 +491,16 @@ static void stmmac_get_ringparam(struct net_device *netdev, > > struct netlink_ext_ack *extack) > > { > > struct stmmac_priv *priv = netdev_priv(netdev); > > > + u32 dma_max_rx_size = DMA_MAX_RX_SIZE; > > + u32 dma_max_tx_size = DMA_MAX_TX_SIZE; > > > > - ring->rx_max_pending = DMA_MAX_RX_SIZE; > > - ring->tx_max_pending = DMA_MAX_TX_SIZE; > > + if (priv->plat->has_xgmac) { > > + dma_max_rx_size = XGMAC_DMA_MAX_RX_SIZE; > > + dma_max_tx_size = XGMAC_DMA_MAX_TX_SIZE; > > + } > > + > > + ring->rx_max_pending = dma_max_rx_size; > > + ring->tx_max_pending = dma_max_tx_size; > > Do you understand the consequence of this change, right? > De > > > ring->rx_pending = priv->dma_conf.dma_rx_size; > > ring->tx_pending = priv->dma_conf.dma_tx_size; > > } > > @@ -503,12 +510,21 @@ static int stmmac_set_ringparam(struct net_device *netdev, > > struct kernel_ethtool_ringparam *kernel_ring, > > struct netlink_ext_ack *extack) > > { > > + struct stmmac_priv *priv = netdev_priv(netdev); > > + u32 dma_max_rx_size = DMA_MAX_RX_SIZE; > > + u32 dma_max_tx_size = DMA_MAX_TX_SIZE; > > + > > + if (priv->plat->has_xgmac) { > > + dma_max_rx_size = XGMAC_DMA_MAX_RX_SIZE; > > + dma_max_tx_size = XGMAC_DMA_MAX_TX_SIZE; > > + } > > + > > if (ring->rx_mini_pending || ring->rx_jumbo_pending || > > ring->rx_pending < DMA_MIN_RX_SIZE || > > - ring->rx_pending > DMA_MAX_RX_SIZE || > > + ring->rx_pending > dma_max_rx_size || > > !is_power_of_2(ring->rx_pending) || > > ring->tx_pending < DMA_MIN_TX_SIZE || > > - ring->tx_pending > DMA_MAX_TX_SIZE || > > + ring->tx_pending > dma_max_tx_size || > > !is_power_of_2(ring->tx_pending)) > > return -EINVAL; > > > > -- > > 2.34.1 > > > >
> Heavy incoming traffic on some heavy load system, the max 1024 limit defined > by DMA_MAX_RX_SIZE in drivers/net/ethernet/stmicro/stmmac/common.h is too > few to achieve high throughput for XGMAC. > With this patch, ethtool can set a new length than 1024 Please include some benchmark results to show the improvement. But at some point, more buffers don't help you. If you are consistently overloaded, you will still overflow the buffers. So you might want to look at where is the bottleneck and how do you prioritise processing packets over whatever else is loading the system. Maybe this would help, if the bus is the problem: https://www.spinics.net/lists/netdev/msg1006370.html Andrew
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h index 6a2c7d22df1e..264f4f876c74 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h @@ -11,6 +11,8 @@ /* Misc */ #define XGMAC_JUMBO_LEN 16368 +#define XGMAC_DMA_MAX_TX_SIZE 65536 +#define XGMAC_DMA_MAX_RX_SIZE 65536 /* MAC Registers */ #define XGMAC_TX_CONFIG 0x00000000 diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c index 18468c0228f0..3ae465c5a712 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c @@ -491,9 +491,16 @@ static void stmmac_get_ringparam(struct net_device *netdev, struct netlink_ext_ack *extack) { struct stmmac_priv *priv = netdev_priv(netdev); + u32 dma_max_rx_size = DMA_MAX_RX_SIZE; + u32 dma_max_tx_size = DMA_MAX_TX_SIZE; - ring->rx_max_pending = DMA_MAX_RX_SIZE; - ring->tx_max_pending = DMA_MAX_TX_SIZE; + if (priv->plat->has_xgmac) { + dma_max_rx_size = XGMAC_DMA_MAX_RX_SIZE; + dma_max_tx_size = XGMAC_DMA_MAX_TX_SIZE; + } + + ring->rx_max_pending = dma_max_rx_size; + ring->tx_max_pending = dma_max_tx_size; ring->rx_pending = priv->dma_conf.dma_rx_size; ring->tx_pending = priv->dma_conf.dma_tx_size; } @@ -503,12 +510,21 @@ static int stmmac_set_ringparam(struct net_device *netdev, struct kernel_ethtool_ringparam *kernel_ring, struct netlink_ext_ack *extack) { + struct stmmac_priv *priv = netdev_priv(netdev); + u32 dma_max_rx_size = DMA_MAX_RX_SIZE; + u32 dma_max_tx_size = DMA_MAX_TX_SIZE; + + if (priv->plat->has_xgmac) { + dma_max_rx_size = XGMAC_DMA_MAX_RX_SIZE; + dma_max_tx_size = XGMAC_DMA_MAX_TX_SIZE; + } + if (ring->rx_mini_pending || ring->rx_jumbo_pending || ring->rx_pending < DMA_MIN_RX_SIZE || - ring->rx_pending > DMA_MAX_RX_SIZE || + ring->rx_pending > dma_max_rx_size || !is_power_of_2(ring->rx_pending) || ring->tx_pending < DMA_MIN_TX_SIZE || - ring->tx_pending > DMA_MAX_TX_SIZE || + ring->tx_pending > dma_max_tx_size || !is_power_of_2(ring->tx_pending)) return -EINVAL;
DWXGMAC CORE supports a ring length of 65536 descriptors, bump max length from 1024 to 65536 Signed-off-by: Furong Xu <0x1207@gmail.com> --- .../net/ethernet/stmicro/stmmac/dwxgmac2.h | 2 ++ .../ethernet/stmicro/stmmac/stmmac_ethtool.c | 24 +++++++++++++++---- 2 files changed, 22 insertions(+), 4 deletions(-)