diff mbox series

[bpf] xsk: fix memory leak for failed bind

Message ID 20201214085127.3960-1-magnus.karlsson@gmail.com (mailing list archive)
State Accepted
Commit 8bee683384087a6275c9183a483435225f7bb209
Delegated to: BPF
Series [bpf] xsk: fix memory leak for failed bind

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf
netdev/subject_prefix success Link
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 18 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Magnus Karlsson Dec. 14, 2020, 8:51 a.m. UTC
From: Magnus Karlsson <magnus.karlsson@intel.com>

Fix a possible memory leak when a bind of an AF_XDP socket fails. When
the fill and completion rings are created, they are tied to the
socket. But when the buffer pool is later created at bind time, the
ownership of these two rings is transferred to the buffer pool as
they might be shared between sockets (and the buffer pool cannot be
created until we know what we are binding to). So, before the buffer
pool is created, these two rings are cleaned up with the socket, and
after they have been transferred they are cleaned up together with
the buffer pool.

The problem is that ownership was transferred before it was certain
that the buffer pool could be created and initialized correctly. When
one of these errors occurred, the fill and completion rings belonged
to neither the socket nor the pool and were therefore leaked. Solve
this by moving the ownership transfer to the point where the buffer
pool has been completely set up and there is no way it can fail.

Fixes: 7361f9c3d719 ("xsk: Move fill and completion rings to buffer pool")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: syzbot+cfa88ddd0655afa88763@syzkaller.appspotmail.com
---
 net/xdp/xsk.c           | 4 ++++
 net/xdp/xsk_buff_pool.c | 2 --
 2 files changed, 4 insertions(+), 2 deletions(-)


base-commit: d9838b1d39283c1200c13f9076474c7624b8ec34
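
The ownership rule described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `ring_model`, `pool_model`, `sock_model`, `bind_model`, `release_model`, and `leaked_rings_after` are hypothetical names invented for illustration, not kernel code. Only the `fq_tmp`/`cq_tmp` fields and the idea of clearing them after the last failure point come from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; only the ownership rules matter. */
struct ring_model { int unused; };
struct pool_model { struct ring_model *fq, *cq; };
struct sock_model {
	struct ring_model *fq_tmp, *cq_tmp; /* non-NULL while the socket owns the rings */
	struct pool_model *pool;
};

static int live_rings; /* allocated-but-not-freed rings, so a leak is observable */

static struct ring_model *ring_create(void)
{
	struct ring_model *r = calloc(1, sizeof(*r));

	if (r)
		live_rings++;
	return r;
}

static void ring_destroy(struct ring_model *r)
{
	if (!r)
		return;
	free(r);
	live_rings--;
}

/* Models bind: the pool references the rings first and owns them only on success. */
static int bind_model(struct sock_model *xs, int setup_fails)
{
	struct pool_model *pool = calloc(1, sizeof(*pool));

	if (!pool)
		return -1;
	pool->fq = xs->fq_tmp;
	pool->cq = xs->cq_tmp;

	if (setup_fails) {
		/* The pool never owned the rings, so only the pool itself is freed. */
		free(pool);
		return -1;
	}

	/* Nothing can fail from here on: hand the rings over to the pool. */
	xs->pool = pool;
	xs->fq_tmp = NULL;
	xs->cq_tmp = NULL;
	return 0;
}

/* Models socket teardown: free whatever the socket still owns, then the pool. */
static void release_model(struct sock_model *xs)
{
	ring_destroy(xs->fq_tmp);
	ring_destroy(xs->cq_tmp);
	xs->fq_tmp = NULL;
	xs->cq_tmp = NULL;
	if (xs->pool) {
		ring_destroy(xs->pool->fq);
		ring_destroy(xs->pool->cq);
		free(xs->pool);
		xs->pool = NULL;
	}
}

/* Returns the number of rings still allocated after a bind attempt and a close. */
static int leaked_rings_after(int setup_fails)
{
	struct sock_model xs = { 0 };

	xs.fq_tmp = ring_create();
	xs.cq_tmp = ring_create();
	assert(xs.fq_tmp && xs.cq_tmp);
	bind_model(&xs, setup_fails);
	release_model(&xs);
	return live_rings;
}
```

In this model, `leaked_rings_after(1)` and `leaked_rings_after(0)` both return 0. Moving the two NULL assignments above the `setup_fails` check reproduces the leak, because the failure path then leaves the rings reachable from neither the socket nor the pool.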

Comments

Björn Töpel Dec. 14, 2020, 9:45 a.m. UTC | #1
On 2020-12-14 09:51, Magnus Karlsson wrote:
> [...]

Acked-by: Björn Töpel <bjorn.topel@intel.com>
Fijalkowski, Maciej Dec. 14, 2020, 11:09 a.m. UTC | #2
On Mon, Dec 14, 2020 at 09:51:27AM +0100, Magnus Karlsson wrote:
> [...]
> diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> index d5adeee9d5d9..46c2ae7d91d1 100644
> --- a/net/xdp/xsk_buff_pool.c
> +++ b/net/xdp/xsk_buff_pool.c
> @@ -75,8 +75,6 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
>  
>  	pool->fq = xs->fq_tmp;
>  	pool->cq = xs->cq_tmp;
> -	xs->fq_tmp = NULL;
> -	xs->cq_tmp = NULL;

Given this change, are there any circumstances that we could hit
xsk_release with xs->{f,c}q_tmp != NULL ?
>  
>  	for (i = 0; i < pool->free_heads_cnt; i++) {
>  		xskb = &pool->heads[i];
> 
> base-commit: d9838b1d39283c1200c13f9076474c7624b8ec34
> -- 
> 2.29.0
>
Magnus Karlsson Dec. 14, 2020, 11:33 a.m. UTC | #3
On Mon, Dec 14, 2020 at 12:19 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Mon, Dec 14, 2020 at 09:51:27AM +0100, Magnus Karlsson wrote:
> > [...]
> > diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> > index d5adeee9d5d9..46c2ae7d91d1 100644
> > --- a/net/xdp/xsk_buff_pool.c
> > +++ b/net/xdp/xsk_buff_pool.c
> > @@ -75,8 +75,6 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
> >
> >       pool->fq = xs->fq_tmp;
> >       pool->cq = xs->cq_tmp;
> > -     xs->fq_tmp = NULL;
> > -     xs->cq_tmp = NULL;
>
> Given this change, are there any circumstances that we could hit
> xsk_release with xs->{f,c}q_tmp != NULL ?

Yes, if the user has not registered any fill or completion ring and
the socket is torn down.

> >
> >       for (i = 0; i < pool->free_heads_cnt; i++) {
> >               xskb = &pool->heads[i];
> >
> > base-commit: d9838b1d39283c1200c13f9076474c7624b8ec34
> > --
> > 2.29.0
> >
Magnus Karlsson Dec. 14, 2020, 12:33 p.m. UTC | #4
On Mon, Dec 14, 2020 at 12:33 PM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
>
> On Mon, Dec 14, 2020 at 12:19 PM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > On Mon, Dec 14, 2020 at 09:51:27AM +0100, Magnus Karlsson wrote:
> > > [...]
> > > diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> > > index d5adeee9d5d9..46c2ae7d91d1 100644
> > > --- a/net/xdp/xsk_buff_pool.c
> > > +++ b/net/xdp/xsk_buff_pool.c
> > > @@ -75,8 +75,6 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
> > >
> > >       pool->fq = xs->fq_tmp;
> > >       pool->cq = xs->cq_tmp;
> > > -     xs->fq_tmp = NULL;
> > > -     xs->cq_tmp = NULL;
> >
> > Given this change, are there any circumstances that we could hit
> > xsk_release with xs->{f,c}q_tmp != NULL ?
>
> Yes, if the user has not registered any fill or completion ring and
> the socket is torn down.

Sorry Maciej. I answered the inverse of your question, i.e. == NULL.
For the != NULL case:

Yes, this is possible if the user registers a fill ring and/or
completion ring but does not bind and then closes the socket.

> > >
> > >       for (i = 0; i < pool->free_heads_cnt; i++) {
> > >               xskb = &pool->heads[i];
> > >
> > > base-commit: d9838b1d39283c1200c13f9076474c7624b8ec34
> > > --
> > > 2.29.0
> > >
patchwork-bot+netdevbpf@kernel.org Dec. 17, 2020, 9:51 p.m. UTC | #5
Hello:

This patch was applied to bpf/bpf.git (refs/heads/master):

On Mon, 14 Dec 2020 09:51:27 +0100 you wrote:
> From: Magnus Karlsson <magnus.karlsson@intel.com>
> 
> Fix a possible memory leak when a bind of an AF_XDP socket fails. When
> the fill and completion rings are created, they are tied to the
> socket. But when the buffer pool is later created at bind time, the
> ownership of these two rings is transferred to the buffer pool as
> they might be shared between sockets (and the buffer pool cannot be
> created until we know what we are binding to). So, before the buffer
> pool is created, these two rings are cleaned up with the socket, and
> after they have been transferred they are cleaned up together with
> the buffer pool.
> 
> [...]

Here is the summary with links:
  - [bpf] xsk: fix memory leak for failed bind
    https://git.kernel.org/bpf/bpf/c/8bee68338408

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 62504471fd20..189cfbbcccc0 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -772,6 +772,10 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 		}
 	}
 
+	/* FQ and CQ are now owned by the buffer pool and cleaned up with it. */
+	xs->fq_tmp = NULL;
+	xs->cq_tmp = NULL;
+
 	xs->dev = dev;
 	xs->zc = xs->umem->zc;
 	xs->queue_id = qid;
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index d5adeee9d5d9..46c2ae7d91d1 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -75,8 +75,6 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
 
 	pool->fq = xs->fq_tmp;
 	pool->cq = xs->cq_tmp;
-	xs->fq_tmp = NULL;
-	xs->cq_tmp = NULL;
 
 	for (i = 0; i < pool->free_heads_cnt; i++) {
 		xskb = &pool->heads[i];