Message ID | 1742732906-166564-1-git-send-email-tariqt@nvidia.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [net] net/mlx5e: SHAMPO, Make reserved size independent of page size | expand |
On Sun, Mar 23, 2025 at 02:28:26PM +0200, Tariq Toukan wrote: > From: Lama Kayal <lkayal@nvidia.com> > > When hw-gro is enabled, the maximum number of header entries that are > needed per wqe (hd_per_wqe) is calculated based on the size of the > reservations among other parameters. > > Miscalculation of the size of reservations leads to incorrect > calculation of hd_per_wqe as 0, particularly in the case of large page > size like in aarch64, this prevents the SHAMPO header from being > correctly initialized in the device, ultimately causing the following > cqe err that indicates a violation of PD. Hi Lama, Tariq, all, If I understand things correctly, hd_per_wqe is calculated in mlx5e_shampo_hd_per_wqe() like this: u32 mlx5e_shampo_hd_per_wqe(struct mlx5_core_dev *mdev, struct mlx5e_params *params, struct mlx5e_rq_param *rq_param) { int resv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * PAGE_SIZE; u16 num_strides = BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, NULL)); int pkt_per_resv = BIT(mlx5e_shampo_get_log_pkt_per_rsrv(mdev, params)); u8 log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, NULL); int wqe_size = BIT(log_stride_sz) * num_strides; u32 hd_per_wqe; /* Assumption: hd_per_wqe % 8 == 0. */ hd_per_wqe = (wqe_size / resv_size) * pkt_per_resv; mlx5_core_dbg(mdev, "%s hd_per_wqe = %d rsrv_size = %d wqe_size = %d pkt_per_resv = %d\n", __func__, hd_per_wqe, resv_size, wqe_size, pkt_per_resv); return hd_per_wqe; } I can see that if PAGE_SIZE was some multiple of 4k, and thus larger than wqe_size, then this could lead to hd_per_wqe being zero. But I note that mlx5e_mpwqe_get_log_stride_size() may return PAGE_SHIFT. And I wonder if that leads to wqe_size being larger than expected by this patch in cases where the PAGE_SIZE is greater than 4k. Likewise in mlx5e_shampo_get_log_cq_size(), which seems to have a large overlap codewise with mlx5e_shampo_hd_per_wqe(). > > mlx5_core 0000:00:08.0 eth2: ERR CQE on RQ: 0x1180 > mlx5_core 0000:00:08.0 eth2: Error cqe on cqn 0x510, ci 0x0, qn 0x1180, opcode 0xe, syndrome 0x4, vendor syndrome 0x32 > 00000000: 00 00 00 00 04 4a 00 00 00 00 00 00 20 00 93 32 > 00000010: 55 00 00 00 fb cc 00 00 00 00 00 00 07 18 00 00 > 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4a > 00000030: 00 00 00 9a 93 00 32 04 00 00 00 00 00 00 da e1 > > Use the correct formula for calculating the size of reservations, > precisely it shouldn't be dependent on page size, instead use the > correct multiply of MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE. > > Fixes: e5ca8fb08ab2 ("net/mlx5e: Add control path for SHAMPO feature") > Signed-off-by: Lama Kayal <lkayal@nvidia.com> > Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> > Signed-off-by: Tariq Toukan <tariqt@nvidia.com> > --- > drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c > index 64b62ed17b07..31eb99f09c63 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c > @@ -423,7 +423,7 @@ u8 mlx5e_shampo_get_log_pkt_per_rsrv(struct mlx5_core_dev *mdev, > struct mlx5e_params *params) > { > u32 resrv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * > - PAGE_SIZE; > + MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE; > > return order_base_2(DIV_ROUND_UP(resrv_size, params->sw_mtu)); > } > @@ -827,7 +827,8 @@ static u32 mlx5e_shampo_get_log_cq_size(struct mlx5_core_dev *mdev, > struct mlx5e_params *params, > struct mlx5e_xsk_param *xsk) > { > - int rsrv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * PAGE_SIZE; > + int rsrv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * > + MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE; > u16 num_strides = BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, xsk)); > int pkt_per_rsrv = BIT(mlx5e_shampo_get_log_pkt_per_rsrv(mdev, params)); > u8 log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, xsk); > @@ -1036,7 +1037,8 @@ u32 mlx5e_shampo_hd_per_wqe(struct mlx5_core_dev *mdev, > struct mlx5e_params *params, > struct mlx5e_rq_param *rq_param) > { > - int resv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * PAGE_SIZE; > + int resv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * > + MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE; > u16 num_strides = BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, NULL)); > int pkt_per_resv = BIT(mlx5e_shampo_get_log_pkt_per_rsrv(mdev, params)); > u8 log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, NULL); > > base-commit: ed3ba9b6e280e14cc3148c1b226ba453f02fa76c > -- > 2.31.1 > >
On 25/03/2025 16:04, Simon Horman wrote: > On Sun, Mar 23, 2025 at 02:28:26PM +0200, Tariq Toukan wrote: >> From: Lama Kayal <lkayal@nvidia.com> >> >> When hw-gro is enabled, the maximum number of header entries that are >> needed per wqe (hd_per_wqe) is calculated based on the size of the >> reservations among other parameters. >> >> Miscalculation of the size of reservations leads to incorrect >> calculation of hd_per_wqe as 0, particularly in the case of large page >> size like in aarch64, this prevents the SHAMPO header from being >> correctly initialized in the device, ultimately causing the following >> cqe err that indicates a violation of PD. > > Hi Lama, Tariq, all, > > If I understand things correctly, hd_per_wqe is calculated > in mlx5e_shampo_hd_per_wqe() like this: > > u32 mlx5e_shampo_hd_per_wqe(struct mlx5_core_dev *mdev, > struct mlx5e_params *params, struct mlx5e_rq_param *rq_param) > { > int resv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * PAGE_SIZE; > u16 num_strides = BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, NULL)); > int pkt_per_resv = BIT(mlx5e_shampo_get_log_pkt_per_rsrv(mdev, params)); > u8 log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, NULL); > int wqe_size = BIT(log_stride_sz) * num_strides; u32 hd_per_wqe; > > /* Assumption: hd_per_wqe % 8 == 0. */ > hd_per_wqe = (wqe_size / resv_size) * pkt_per_resv; mlx5_core_dbg(mdev, "%s hd_per_wqe = %d rsrv_size = %d wqe_size = %d pkt_per_resv = %d\n", __func__, hd_per_wqe, resv_size, wqe_size, pkt_per_resv); > return hd_per_wqe; > } > > I can see that if PAGE_SIZE was some multiple of 4k, and thus > larger than wqe_size, then this could lead to hd_per_wqe being zero. > > But I note that mlx5e_mpwqe_get_log_stride_size() may return PAGE_SHIFT. > And I wonder if that leads to wqe_size being larger than expected by this > patch in cases where the PAGE_SIZE is greater than 4k. > > Likewise in mlx5e_shampo_get_log_cq_size(), which seems to have a large overlap > codewise with mlx5e_shampo_hd_per_wqe(). > >> Hi Simon, Different settings lead to different combinations of num_strides and stride_size. However, they affect each other in a way that the resulting wqe_size has the expected (~preset) value. In mlx5e_mpwqe_get_log_num_strides() you can see that if stride_size grows, then num_strides decreases accordingly. In addition, to reduce mistakes/bugs, we have a few WARNs() along the calculations, in addition to a verifier function mlx5e_verify_rx_mpwqe_strides(). Thanks, Tariq
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Sun, 23 Mar 2025 14:28:26 +0200 you wrote: > From: Lama Kayal <lkayal@nvidia.com> > > When hw-gro is enabled, the maximum number of header entries that are > needed per wqe (hd_per_wqe) is calculated based on the size of the > reservations among other parameters. > > Miscalculation of the size of reservations leads to incorrect > calculation of hd_per_wqe as 0, particularly in the case of large page > size like in aarch64, this prevents the SHAMPO header from being > correctly initialized in the device, ultimately causing the following > cqe err that indicates a violation of PD. > > [...] Here is the summary with links: - [net] net/mlx5e: SHAMPO, Make reserved size independent of page size https://git.kernel.org/netdev/net/c/fab058356885 You are awesome, thank you!
On Thu, Mar 27, 2025 at 07:58:30PM +0200, Tariq Toukan wrote: > > > On 25/03/2025 16:04, Simon Horman wrote: > > On Sun, Mar 23, 2025 at 02:28:26PM +0200, Tariq Toukan wrote: > > > From: Lama Kayal <lkayal@nvidia.com> > > > > > > When hw-gro is enabled, the maximum number of header entries that are > > > needed per wqe (hd_per_wqe) is calculated based on the size of the > > > reservations among other parameters. > > > > > > Miscalculation of the size of reservations leads to incorrect > > > calculation of hd_per_wqe as 0, particularly in the case of large page > > > size like in aarch64, this prevents the SHAMPO header from being > > > correctly initialized in the device, ultimately causing the following > > > cqe err that indicates a violation of PD. > > > > Hi Lama, Tariq, all, > > > > If I understand things correctly, hd_per_wqe is calculated > > in mlx5e_shampo_hd_per_wqe() like this: > > > > u32 mlx5e_shampo_hd_per_wqe(struct mlx5_core_dev *mdev, > > struct mlx5e_params *params, struct mlx5e_rq_param *rq_param) > > { > > int resv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * PAGE_SIZE; > > u16 num_strides = BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, NULL)); > > int pkt_per_resv = BIT(mlx5e_shampo_get_log_pkt_per_rsrv(mdev, params)); > > u8 log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, NULL); > > int wqe_size = BIT(log_stride_sz) * num_strides; u32 hd_per_wqe; > > > > /* Assumption: hd_per_wqe % 8 == 0. */ > > hd_per_wqe = (wqe_size / resv_size) * pkt_per_resv; mlx5_core_dbg(mdev, "%s hd_per_wqe = %d rsrv_size = %d wqe_size = %d pkt_per_resv = %d\n", __func__, hd_per_wqe, resv_size, wqe_size, pkt_per_resv); > > return hd_per_wqe; > > } > > > > I can see that if PAGE_SIZE was some multiple of 4k, and thus > > larger than wqe_size, then this could lead to hd_per_wqe being zero. > > > > But I note that mlx5e_mpwqe_get_log_stride_size() may return PAGE_SHIFT. > > And I wonder if that leads to wqe_size being larger than expected by this > > patch in cases where the PAGE_SIZE is greater than 4k. > > > > Likewise in mlx5e_shampo_get_log_cq_size(), which seems to have a large overlap > > codewise with mlx5e_shampo_hd_per_wqe(). > > > > > > > Hi Simon, > > Different settings lead to different combinations of num_strides and > stride_size. However, they affect each other in a way that the resulting > wqe_size has the expected (~preset) value. > > In mlx5e_mpwqe_get_log_num_strides() you can see that if stride_size grows, > then num_strides decreases accordingly. > > In addition, to reduce mistakes/bugs, we have a few WARNs() along the > calculations, in addition to a verifier function > mlx5e_verify_rx_mpwqe_strides(). Hi Tariq, Sorry for the slow response, I was AFK for a few days. Thanks for confirming that this is as expected.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c index 64b62ed17b07..31eb99f09c63 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c @@ -423,7 +423,7 @@ u8 mlx5e_shampo_get_log_pkt_per_rsrv(struct mlx5_core_dev *mdev, struct mlx5e_params *params) { u32 resrv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * - PAGE_SIZE; + MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE; return order_base_2(DIV_ROUND_UP(resrv_size, params->sw_mtu)); } @@ -827,7 +827,8 @@ static u32 mlx5e_shampo_get_log_cq_size(struct mlx5_core_dev *mdev, struct mlx5e_params *params, struct mlx5e_xsk_param *xsk) { - int rsrv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * PAGE_SIZE; + int rsrv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * + MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE; u16 num_strides = BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, xsk)); int pkt_per_rsrv = BIT(mlx5e_shampo_get_log_pkt_per_rsrv(mdev, params)); u8 log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, xsk); @@ -1036,7 +1037,8 @@ u32 mlx5e_shampo_hd_per_wqe(struct mlx5_core_dev *mdev, struct mlx5e_params *params, struct mlx5e_rq_param *rq_param) { - int resv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * PAGE_SIZE; + int resv_size = BIT(mlx5e_shampo_get_log_rsrv_size(mdev, params)) * + MLX5E_SHAMPO_WQ_BASE_RESRV_SIZE; u16 num_strides = BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, NULL)); int pkt_per_resv = BIT(mlx5e_shampo_get_log_pkt_per_rsrv(mdev, params)); u8 log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, NULL);