[net-next,v3,0/3] Fix late DMA unmap crash for page pool

Message ID 20250326-page-pool-track-dma-v3-0-8e464016e0ac@redhat.com

Message

Toke Høiland-Jørgensen March 26, 2025, 8:18 a.m. UTC
This series fixes the late dma_unmap crash for page pool first reported
by Yonglong Liu in [0]. It is an alternative approach to the one
submitted by Yunsheng Lin, most recently in [1]. The first two commits
are small refactors of the page pool code, in preparation of the main
change in patch 3. See the commit message of patch 3 for the details.

-Toke

[0] https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/
[1] https://lore.kernel.org/r/20250307092356.638242-1-linyunsheng@huawei.com

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
Changes in v3:
- Use a full-width bool for pp->dma_sync instead of a full unsigned
  long (in patch 2), and leave pp->dma_sync_cpu alone.

- Link to v2: https://lore.kernel.org/r/20250325-page-pool-track-dma-v2-0-113ebc1946f3@redhat.com

Changes in v2:
- Always leave two bits at the top of pp_magic as zero, instead of one

- Add an rcu_read_lock() around __page_pool_dma_sync_for_device()

- Add a comment in poison.h with a reference to the bitmask definition

- Add a longer description of the logic of the bitmask definitions to
  the comment in types.h, and a summary of the security implications of
  using the pp_magic field to the commit message of patch 3

- Collect Mina's Reviewed-by and Yonglong's Tested-by tags

- Link to v1: https://lore.kernel.org/r/20250314-page-pool-track-dma-v1-0-c212e57a74c2@redhat.com

---
Toke Høiland-Jørgensen (3):
      page_pool: Move pp_magic check into helper functions
      page_pool: Turn dma_sync into a full-width bool field
      page_pool: Track DMA-mapped pages and unmap them when destroying the pool

 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c |  4 +-
 include/linux/poison.h                           |  4 ++
 include/net/page_pool/types.h                    | 65 ++++++++++++++++++-
 mm/page_alloc.c                                  |  9 +--
 net/core/netmem_priv.h                           | 33 +++++++++-
 net/core/page_pool.c                             | 81 ++++++++++++++++++++----
 net/core/skbuff.c                                | 16 +----
 net/core/xdp.c                                   |  4 +-
 8 files changed, 176 insertions(+), 40 deletions(-)
---
base-commit: 45e36a8e3c17c4d50ecbc863893f253fb46ac070
change-id: 20250310-page-pool-track-dma-0332343a460e

Comments

Jakub Kicinski March 26, 2025, 11:48 a.m. UTC | #1
On Wed, 26 Mar 2025 09:18:37 +0100 Toke Høiland-Jørgensen wrote:
> This series fixes the late dma_unmap crash for page pool first reported
> by Yonglong Liu in [0]. It is an alternative approach to the one
> submitted by Yunsheng Lin, most recently in [1]. The first two commits
> are small refactors of the page pool code, in preparation of the main
> change in patch 3. See the commit message of patch 3 for the details.

Doesn't apply, FWIW, maybe rebase/repost after Linus pull net-next, 
in case something conflicts on the MM side

Toke Høiland-Jørgensen March 26, 2025, 12:20 p.m. UTC | #2
Jakub Kicinski <kuba@kernel.org> writes:

> On Wed, 26 Mar 2025 09:18:37 +0100 Toke Høiland-Jørgensen wrote:
>> This series fixes the late dma_unmap crash for page pool first reported
>> by Yonglong Liu in [0]. It is an alternative approach to the one
>> submitted by Yunsheng Lin, most recently in [1]. The first two commits
>> are small refactors of the page pool code, in preparation of the main
>> change in patch 3. See the commit message of patch 3 for the details.
>
> Doesn't apply, FWIW,

Ugh, sorry about that; rebased yesterday before reposting, but forgot to
do so this morning :/

> maybe rebase/repost after Linus pull net-next, in case something
> conflicts on the MM side

As in, you want to wait until after the merge window? Sure, can do.

-Toke

Jakub Kicinski March 26, 2025, 2:49 p.m. UTC | #3
On Wed, 26 Mar 2025 13:20:37 +0100 Toke Høiland-Jørgensen wrote:
> > maybe rebase/repost after Linus pull net-next, in case something
> > conflicts on the MM side  
> 
> As in, you want to wait until after the merge window? Sure, can do.

I think we can try in this merge window, just after our first big PR
which will hopefully come out today.

Toke Høiland-Jørgensen March 27, 2025, 10:41 a.m. UTC | #4
Jakub Kicinski <kuba@kernel.org> writes:

> On Wed, 26 Mar 2025 13:20:37 +0100 Toke Høiland-Jørgensen wrote:
>> > maybe rebase/repost after Linus pull net-next, in case something
>> > conflicts on the MM side  
>> 
>> As in, you want to wait until after the merge window? Sure, can do.
>
> I think we can try in this merge window, just after our first big PR
> which will hopefully come out today.

Alright, cool. That seems to have been merged this morning, so I'll
rebase and send a v4.

-Toke