[net-next,0/4] eth: mlx4: use the page pool for Rx buffers

Message ID 20250205031213.358973-1-kuba@kernel.org

Message

Jakub Kicinski Feb. 5, 2025, 3:12 a.m. UTC
Convert mlx4 to page pool. I've been sitting on these patches for
over a year, and Jonathan Lemon had a similar series years before.
We never deployed it or sent upstream because it didn't really show
much perf win under normal load (admittedly I think the real testing
was done before Ilias's work on recycling).

During the v6.9 kernel rollout Meta's CDN team noticed that machines
with CX3 Pro (mlx4) are prone to overloads (double digit % of CPU time
spent mapping buffers in the IOMMU). The problem does not occur with
modern NICs, so I dusted off this series and reportedly it still works.
And it makes the problem go away, no overloads, perf back in line with
older kernels. Something must have changed in IOMMU code, I guess.

This series is very simple, and can very likely be optimized further.
Thing is, I don't have access to any CX3 Pro NICs. They only exist
in CDN locations which haven't had a HW refresh for a while. So I can
say this series survives a week under traffic w/ XDP enabled, but
my ability to iterate and improve is a bit limited.
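
For context, the core of such a conversion is the generic page_pool API: the pool DMA-maps a page once and keeps the mapping alive while the page is recycled, which is exactly the per-packet IOMMU map/unmap cost described above. Below is a minimal sketch of that shape; the structures and function names (my_rx_ring, my_rx_desc, my_ring_create_pool, ...) are illustrative stand-ins, not the actual mlx4 code from these patches.

#include <linux/dma-mapping.h>
#include <net/page_pool/types.h>
#include <net/page_pool/helpers.h>

/* Illustrative stand-ins, not the mlx4 structures. */
struct my_rx_desc {
        struct page *page;
        dma_addr_t dma;
};

struct my_rx_ring {
        struct page_pool *pp;
};

/* Create a pool sized roughly to the ring. PP_FLAG_DMA_MAP makes the
 * pool map each page once and keep that mapping for as long as the
 * page keeps getting recycled.
 */
static int my_ring_create_pool(struct my_rx_ring *ring, struct device *dev,
                               unsigned int ring_size)
{
        struct page_pool_params pp = {
                .flags     = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
                .pool_size = ring_size,
                .nid       = NUMA_NO_NODE,
                .dev       = dev,
                .dma_dir   = DMA_BIDIRECTIONAL, /* XDP_TX writes back */
                .max_len   = PAGE_SIZE,
        };

        ring->pp = page_pool_create(&pp);
        return PTR_ERR_OR_ZERO(ring->pp);
}

/* Refill one descriptor: the page comes back already DMA-mapped. */
static int my_ring_refill_one(struct my_rx_ring *ring, struct my_rx_desc *desc)
{
        struct page *page = page_pool_dev_alloc_pages(ring->pp);

        if (!page)
                return -ENOMEM;

        desc->page = page;
        desc->dma = page_pool_get_dma_addr(page);
        return 0;
}

On the completion path a page either gets attached to an skb marked with skb_mark_for_recycle(), or returned with page_pool_put_full_page(); either way it lands back in the pool still mapped.
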

Jakub Kicinski (4):
  eth: mlx4: create a page pool for Rx
  eth: mlx4: don't try to complete XDP frames in netpoll
  eth: mlx4: remove the local XDP fast-recycling ring
  eth: mlx4: use the page pool for Rx buffers

 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  15 +--
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 120 +++++++------------
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   |  17 ++-
 3 files changed, 53 insertions(+), 99 deletions(-)
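
The likely mechanism behind patch 3 is page_pool's native XDP recycling: once the Rx queue's memory model is registered as MEM_TYPE_PAGE_POOL, xdp_return_frame() and XDP_TX completions feed pages back into the pool, so a driver-private fast-recycling ring becomes redundant. Roughly, again with hypothetical names and extending the my_rx_ring sketch above with an xdp_rxq_info member:

#include <net/xdp.h>

/* Register the Rx queue and point its memory model at the page pool so
 * the XDP core recycles frames into the pool instead of a driver ring.
 */
static int my_ring_reg_xdp(struct my_rx_ring *ring, struct net_device *netdev,
                           u32 queue_index, unsigned int napi_id)
{
        int err;

        err = xdp_rxq_info_reg(&ring->xdp_rxq, netdev, queue_index, napi_id);
        if (err)
                return err;

        err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, MEM_TYPE_PAGE_POOL,
                                         ring->pp);
        if (err)
                xdp_rxq_info_unreg(&ring->xdp_rxq);
        return err;
}

Teardown is then xdp_rxq_info_unreg() followed by page_pool_destroy(), which handles deferred release of any pages still in flight.
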

Comments

Tariq Toukan Feb. 6, 2025, 12:57 p.m. UTC | #1
On 05/02/2025 5:12, Jakub Kicinski wrote:
> Convert mlx4 to page pool. I've been sitting on these patches for
> over a year, and Jonathan Lemon had a similar series years before.
> We never deployed it or sent upstream because it didn't really show
> much perf win under normal load (admittedly I think the real testing
> was done before Ilias's work on recycling).
> 
> During the v6.9 kernel rollout Meta's CDN team noticed that machines
> with CX3 Pro (mlx4) are prone to overloads (double digit % of CPU time
> spent mapping buffers in the IOMMU). The problem does not occur with
> modern NICs, so I dusted off this series and reportedly it still works.
> And it makes the problem go away, no overloads, perf back in line with
> older kernels. Something must have changed in IOMMU code, I guess.
> 
> This series is very simple, and can very likely be optimized further.
> Thing is, I don't have access to any CX3 Pro NICs. They only exist
> in CDN locations which haven't had a HW refresh for a while. So I can
> say this series survives a week under traffic w/ XDP enabled, but
> my ability to iterate and improve is a bit limited.

Hi Jakub,

Thanks for your patches.

As this series touches a critical data-path area, and you had no real
option to test it, we are taking it through a regression cycle in
parallel to the code review.

We should have results early next week. We'll update.

Regards,
Tariq

> 
> Jakub Kicinski (4):
>    eth: mlx4: create a page pool for Rx
>    eth: mlx4: don't try to complete XDP frames in netpoll
>    eth: mlx4: remove the local XDP fast-recycling ring
>    eth: mlx4: use the page pool for Rx buffers
> 
>   drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  15 +--
>   drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 120 +++++++------------
>   drivers/net/ethernet/mellanox/mlx4/en_tx.c   |  17 ++-
>   3 files changed, 53 insertions(+), 99 deletions(-)
>
Jakub Kicinski Feb. 6, 2025, 3:58 p.m. UTC | #2
On Thu, 6 Feb 2025 14:57:59 +0200 Tariq Toukan wrote:
> On 05/02/2025 5:12, Jakub Kicinski wrote:
> > Convert mlx4 to page pool. I've been sitting on these patches for
> > over a year, and Jonathan Lemon had a similar series years before.
> > We never deployed it or sent upstream because it didn't really show
> > much perf win under normal load (admittedly I think the real testing
> > was done before Ilias's work on recycling).
> > 
> > During the v6.9 kernel rollout Meta's CDN team noticed that machines
> > with CX3 Pro (mlx4) are prone to overloads (double digit % of CPU time
> > spent mapping buffers in the IOMMU). The problem does not occur with
> > modern NICs, so I dusted off this series and reportedly it still works.
> > And it makes the problem go away, no overloads, perf back in line with
> > older kernels. Something must have changed in IOMMU code, I guess.
> > 
> > This series is very simple, and can very likely be optimized further.
> > Thing is, I don't have access to any CX3 Pro NICs. They only exist
> > in CDN locations which haven't had a HW refresh for a while. So I can
> > say this series survives a week under traffic w/ XDP enabled, but
> > my ability to iterate and improve is a bit limited.  
> 
> Hi Jakub,
> 
> Thanks for your patches.
> 
> As this series touches a critical data-path area, and you had no real
> option to test it, we are taking it through a regression cycle in
> parallel to the code review.
> 
> We should have results early next week. We'll update.

Sounds good, could you repost once ready?
I'll mark it as awaiting upstream in patchwork for now.
And feel free to drop the line pointed out by Ido, no real
preference either way there.
Tariq Toukan Feb. 11, 2025, 6:05 p.m. UTC | #3
On 06/02/2025 17:58, Jakub Kicinski wrote:
> On Thu, 6 Feb 2025 14:57:59 +0200 Tariq Toukan wrote:
>> On 05/02/2025 5:12, Jakub Kicinski wrote:
>>> Convert mlx4 to page pool. I've been sitting on these patches for
>>> over a year, and Jonathan Lemon had a similar series years before.
>>> We never deployed it or sent upstream because it didn't really show
>>> much perf win under normal load (admittedly I think the real testing
>>> was done before Ilias's work on recycling).
>>>
>>> During the v6.9 kernel rollout Meta's CDN team noticed that machines
>>> with CX3 Pro (mlx4) are prone to overloads (double digit % of CPU time
>>> spent mapping buffers in the IOMMU). The problem does not occur with
>>> modern NICs, so I dusted off this series and reportedly it still works.
>>> And it makes the problem go away, no overloads, perf back in line with
>>> older kernels. Something must have changed in IOMMU code, I guess.
>>>
>>> This series is very simple, and can very likely be optimized further.
>>> Thing is, I don't have access to any CX3 Pro NICs. They only exist
>>> in CDN locations which haven't had a HW refresh for a while. So I can
>>> say this series survives a week under traffic w/ XDP enabled, but
>>> my ability to iterate and improve is a bit limited.
>>
>> Hi Jakub,
>>
>> Thanks for your patches.
>>
>> As this series touches a critical data-path area, and you had no real
>> option to test it, we are taking it through a regression cycle in
>> parallel to the code review.
>>
>> We should have results early next week. We'll update.
> 
> Sounds good, could you repost once ready?
> I'll mark it as awaiting upstream in patchwork for now.
> And feel free to drop the line pointed out by Ido, no real
> preference either way there.

Hi,

The patches passed functional tests.

Overall, the patches look good.

Only a few comments:
1. The nit pointed out by Ido.
2. The pool size.
3. The XDP xmit support description.

How do you want to proceed?
Do you want to fix and re-spin?
Jakub Kicinski Feb. 11, 2025, 6:50 p.m. UTC | #4
On Tue, 11 Feb 2025 20:05:01 +0200 Tariq Toukan wrote:
> On 06/02/2025 17:58, Jakub Kicinski wrote:
> > On Thu, 6 Feb 2025 14:57:59 +0200 Tariq Toukan wrote:  
> >> Thanks for your patches.
> >>
> >> As this series touches a critical data-path area, and you had no real
> >> option to test it, we are taking it through a regression cycle in
> >> parallel to the code review.
> >>
> >> We should have results early next week. We'll update.  
> > 
> > Sounds good, could you repost once ready?
> > I'll mark it as awaiting upstream in patchwork for now.
> > And feel free to drop the line pointed out by Ido, no real
> > preference either way there.  
> 
> Hi,
> 
> Patches passed functional tests.
> 
> Overall, the patches look good.

Thanks!

> Only a few comments:
> 1. Nit by Ido.
> 2. pool size.
> 3. xdp xmit support description.
> 
> How do you want to proceed?
> Do you want to fix and re-spin?

Sure thing, will do.