diff mbox series

[v2] audit, io_uring, io-wq: Fix memory leak in io_sq_thread() and io_wqe_worker()

Message ID 20220803222343.31673-1-yepeilin.cs@gmail.com (mailing list archive)
State New
Headers show
Series [v2] audit, io_uring, io-wq: Fix memory leak in io_sq_thread() and io_wqe_worker() | expand

Commit Message

Peilin Ye Aug. 3, 2022, 10:23 p.m. UTC
From: Peilin Ye <peilin.ye@bytedance.com>

Currently @audit_context is allocated twice for io_uring workers:

  1. copy_process() calls audit_alloc();
  2. io_sq_thread() or io_wqe_worker() calls audit_alloc_kernel() (which
     is effectively audit_alloc()) and overwrites @audit_context,
     causing:

  BUG: memory leak
  unreferenced object 0xffff888144547400 (size 1024):
<...>
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<ffffffff8135cfc3>] audit_alloc+0x133/0x210
      [<ffffffff81239e63>] copy_process+0xcd3/0x2340
      [<ffffffff8123b5f3>] create_io_thread+0x63/0x90
      [<ffffffff81686604>] create_io_worker+0xb4/0x230
      [<ffffffff81686f68>] io_wqe_enqueue+0x248/0x3b0
      [<ffffffff8167663a>] io_queue_iowq+0xba/0x200
      [<ffffffff816768b3>] io_queue_async+0x113/0x180
      [<ffffffff816840df>] io_req_task_submit+0x18f/0x1a0
      [<ffffffff816841cd>] io_apoll_task_func+0xdd/0x120
      [<ffffffff8167d49f>] tctx_task_work+0x11f/0x570
      [<ffffffff81272c4e>] task_work_run+0x7e/0xc0
      [<ffffffff8125a688>] get_signal+0xc18/0xf10
      [<ffffffff8111645b>] arch_do_signal_or_restart+0x2b/0x730
      [<ffffffff812ea44e>] exit_to_user_mode_prepare+0x5e/0x180
      [<ffffffff844ae1b2>] syscall_exit_to_user_mode+0x12/0x20
      [<ffffffff844a7e80>] do_syscall_64+0x40/0x80

Then,

  3. io_sq_thread() or io_wqe_worker() frees @audit_context using
     audit_free();
  4. do_exit() eventually calls audit_free() again, which is okay
     because audit_free() does a NULL check.

As suggested by Paul Moore, fix it by deleting audit_alloc_kernel() and
redundant audit_free() calls.

Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
Suggested-by: Paul Moore <paul@paul-moore.com>
Cc: stable@vger.kernel.org
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
Change since v1:
  - Delete audit_alloc_kernel() (Paul Moore)

 fs/io-wq.c            |  3 ---
 fs/io_uring.c         |  4 ----
 include/linux/audit.h |  5 -----
 kernel/auditsc.c      | 25 -------------------------
 4 files changed, 37 deletions(-)

Comments

Paul Moore Aug. 4, 2022, 1:51 p.m. UTC | #1
On Wed, Aug 3, 2022 at 6:24 PM Peilin Ye <yepeilin.cs@gmail.com> wrote:
>
> From: Peilin Ye <peilin.ye@bytedance.com>
>
> Currently @audit_context is allocated twice for io_uring workers:
>
>   1. copy_process() calls audit_alloc();
>   2. io_sq_thread() or io_wqe_worker() calls audit_alloc_kernel() (which
>      is effectively audit_alloc()) and overwrites @audit_context,
>      causing:
>
>   BUG: memory leak
>   unreferenced object 0xffff888144547400 (size 1024):
> <...>
>     hex dump (first 32 bytes):
>       00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     backtrace:
>       [<ffffffff8135cfc3>] audit_alloc+0x133/0x210
>       [<ffffffff81239e63>] copy_process+0xcd3/0x2340
>       [<ffffffff8123b5f3>] create_io_thread+0x63/0x90
>       [<ffffffff81686604>] create_io_worker+0xb4/0x230
>       [<ffffffff81686f68>] io_wqe_enqueue+0x248/0x3b0
>       [<ffffffff8167663a>] io_queue_iowq+0xba/0x200
>       [<ffffffff816768b3>] io_queue_async+0x113/0x180
>       [<ffffffff816840df>] io_req_task_submit+0x18f/0x1a0
>       [<ffffffff816841cd>] io_apoll_task_func+0xdd/0x120
>       [<ffffffff8167d49f>] tctx_task_work+0x11f/0x570
>       [<ffffffff81272c4e>] task_work_run+0x7e/0xc0
>       [<ffffffff8125a688>] get_signal+0xc18/0xf10
>       [<ffffffff8111645b>] arch_do_signal_or_restart+0x2b/0x730
>       [<ffffffff812ea44e>] exit_to_user_mode_prepare+0x5e/0x180
>       [<ffffffff844ae1b2>] syscall_exit_to_user_mode+0x12/0x20
>       [<ffffffff844a7e80>] do_syscall_64+0x40/0x80
>
> Then,
>
>   3. io_sq_thread() or io_wqe_worker() frees @audit_context using
>      audit_free();
>   4. do_exit() eventually calls audit_free() again, which is okay
>      because audit_free() does a NULL check.
>
> As suggested by Paul Moore, fix it by deleting audit_alloc_kernel() and
> redundant audit_free() calls.
>
> Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
> Suggested-by: Paul Moore <paul@paul-moore.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
> ---
> Change since v1:
>   - Delete audit_alloc_kernel() (Paul Moore)
>
>  fs/io-wq.c            |  3 ---
>  fs/io_uring.c         |  4 ----
>  include/linux/audit.h |  5 -----
>  kernel/auditsc.c      | 25 -------------------------
>  4 files changed, 37 deletions(-)

This looks good to me, thanks!  Although it looks like the io_uring
related changes will need to be applied by hand as they are pointing
to the old layout under fs/ as opposed to the newer layout in
io_uring/ introduced during this merge window.

Jens, did you want to take this via the io_uring tree or should I take
it via the audit tree?  If the latter, an ACK would be appreciated, if
the former my ACK is below.

Acked-by: Paul Moore <paul@paul-moore.com>
Jens Axboe Aug. 4, 2022, 2:32 p.m. UTC | #2
On 8/4/22 7:51 AM, Paul Moore wrote:
> On Wed, Aug 3, 2022 at 6:24 PM Peilin Ye <yepeilin.cs@gmail.com> wrote:
>>
>> From: Peilin Ye <peilin.ye@bytedance.com>
>>
>> Currently @audit_context is allocated twice for io_uring workers:
>>
>>   1. copy_process() calls audit_alloc();
>>   2. io_sq_thread() or io_wqe_worker() calls audit_alloc_kernel() (which
>>      is effectively audit_alloc()) and overwrites @audit_context,
>>      causing:
>>
>>   BUG: memory leak
>>   unreferenced object 0xffff888144547400 (size 1024):
>> <...>
>>     hex dump (first 32 bytes):
>>       00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
>>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>     backtrace:
>>       [<ffffffff8135cfc3>] audit_alloc+0x133/0x210
>>       [<ffffffff81239e63>] copy_process+0xcd3/0x2340
>>       [<ffffffff8123b5f3>] create_io_thread+0x63/0x90
>>       [<ffffffff81686604>] create_io_worker+0xb4/0x230
>>       [<ffffffff81686f68>] io_wqe_enqueue+0x248/0x3b0
>>       [<ffffffff8167663a>] io_queue_iowq+0xba/0x200
>>       [<ffffffff816768b3>] io_queue_async+0x113/0x180
>>       [<ffffffff816840df>] io_req_task_submit+0x18f/0x1a0
>>       [<ffffffff816841cd>] io_apoll_task_func+0xdd/0x120
>>       [<ffffffff8167d49f>] tctx_task_work+0x11f/0x570
>>       [<ffffffff81272c4e>] task_work_run+0x7e/0xc0
>>       [<ffffffff8125a688>] get_signal+0xc18/0xf10
>>       [<ffffffff8111645b>] arch_do_signal_or_restart+0x2b/0x730
>>       [<ffffffff812ea44e>] exit_to_user_mode_prepare+0x5e/0x180
>>       [<ffffffff844ae1b2>] syscall_exit_to_user_mode+0x12/0x20
>>       [<ffffffff844a7e80>] do_syscall_64+0x40/0x80
>>
>> Then,
>>
>>   3. io_sq_thread() or io_wqe_worker() frees @audit_context using
>>      audit_free();
>>   4. do_exit() eventually calls audit_free() again, which is okay
>>      because audit_free() does a NULL check.
>>
>> As suggested by Paul Moore, fix it by deleting audit_alloc_kernel() and
>> redundant audit_free() calls.
>>
>> Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
>> Suggested-by: Paul Moore <paul@paul-moore.com>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
>> ---
>> Change since v1:
>>   - Delete audit_alloc_kernel() (Paul Moore)
>>
>>  fs/io-wq.c            |  3 ---
>>  fs/io_uring.c         |  4 ----
>>  include/linux/audit.h |  5 -----
>>  kernel/auditsc.c      | 25 -------------------------
>>  4 files changed, 37 deletions(-)
> 
> This looks good to me, thanks!  Although it looks like the io_uring
> related changes will need to be applied by hand as they are pointing
> to the old layout under fs/ as opposed to the newer layout in
> io_uring/ introduced during this merge window.
> 
> Jens, did you want to take this via the io_uring tree or should I take
> it via the audit tree?  If the latter, an ACK would be appreciated, if
> the former my ACK is below.
> 
> Acked-by: Paul Moore <paul@paul-moore.com>

Probably better if I take it, since I need to massage it into the
current tree anyway. We can then use this one as the base for the stable
backports that are going to be required.
Jens Axboe Aug. 4, 2022, 2:36 p.m. UTC | #3
On Wed, 3 Aug 2022 15:23:43 -0700, Peilin Ye wrote:
> From: Peilin Ye <peilin.ye@bytedance.com>
> 
> Currently @audit_context is allocated twice for io_uring workers:
> 
>   1. copy_process() calls audit_alloc();
>   2. io_sq_thread() or io_wqe_worker() calls audit_alloc_kernel() (which
>      is effectively audit_alloc()) and overwrites @audit_context,
>      causing:
> 
> [...]

Applied, thanks!

[1/1] audit, io_uring, io-wq: Fix memory leak in io_sq_thread() and io_wqe_worker()
      commit: f482aa98652795846cc55da98ebe331eb74f3d0b

Best regards,
Paul Moore Aug. 4, 2022, 2:44 p.m. UTC | #4
On Thu, Aug 4, 2022 at 10:32 AM Jens Axboe <axboe@kernel.dk> wrote:
> On 8/4/22 7:51 AM, Paul Moore wrote:
> > On Wed, Aug 3, 2022 at 6:24 PM Peilin Ye <yepeilin.cs@gmail.com> wrote:
> >>
> >> From: Peilin Ye <peilin.ye@bytedance.com>
> >>
> >> Currently @audit_context is allocated twice for io_uring workers:
> >>
> >>   1. copy_process() calls audit_alloc();
> >>   2. io_sq_thread() or io_wqe_worker() calls audit_alloc_kernel() (which
> >>      is effectively audit_alloc()) and overwrites @audit_context,
> >>      causing:
> >>
> >>   BUG: memory leak
> >>   unreferenced object 0xffff888144547400 (size 1024):
> >> <...>
> >>     hex dump (first 32 bytes):
> >>       00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
> >>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>     backtrace:
> >>       [<ffffffff8135cfc3>] audit_alloc+0x133/0x210
> >>       [<ffffffff81239e63>] copy_process+0xcd3/0x2340
> >>       [<ffffffff8123b5f3>] create_io_thread+0x63/0x90
> >>       [<ffffffff81686604>] create_io_worker+0xb4/0x230
> >>       [<ffffffff81686f68>] io_wqe_enqueue+0x248/0x3b0
> >>       [<ffffffff8167663a>] io_queue_iowq+0xba/0x200
> >>       [<ffffffff816768b3>] io_queue_async+0x113/0x180
> >>       [<ffffffff816840df>] io_req_task_submit+0x18f/0x1a0
> >>       [<ffffffff816841cd>] io_apoll_task_func+0xdd/0x120
> >>       [<ffffffff8167d49f>] tctx_task_work+0x11f/0x570
> >>       [<ffffffff81272c4e>] task_work_run+0x7e/0xc0
> >>       [<ffffffff8125a688>] get_signal+0xc18/0xf10
> >>       [<ffffffff8111645b>] arch_do_signal_or_restart+0x2b/0x730
> >>       [<ffffffff812ea44e>] exit_to_user_mode_prepare+0x5e/0x180
> >>       [<ffffffff844ae1b2>] syscall_exit_to_user_mode+0x12/0x20
> >>       [<ffffffff844a7e80>] do_syscall_64+0x40/0x80
> >>
> >> Then,
> >>
> >>   3. io_sq_thread() or io_wqe_worker() frees @audit_context using
> >>      audit_free();
> >>   4. do_exit() eventually calls audit_free() again, which is okay
> >>      because audit_free() does a NULL check.
> >>
> >> As suggested by Paul Moore, fix it by deleting audit_alloc_kernel() and
> >> redundant audit_free() calls.
> >>
> >> Fixes: 5bd2182d58e9 ("audit,io_uring,io-wq: add some basic audit support to io_uring")
> >> Suggested-by: Paul Moore <paul@paul-moore.com>
> >> Cc: stable@vger.kernel.org
> >> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
> >> ---
> >> Change since v1:
> >>   - Delete audit_alloc_kernel() (Paul Moore)
> >>
> >>  fs/io-wq.c            |  3 ---
> >>  fs/io_uring.c         |  4 ----
> >>  include/linux/audit.h |  5 -----
> >>  kernel/auditsc.c      | 25 -------------------------
> >>  4 files changed, 37 deletions(-)
> >
> > This looks good to me, thanks!  Although it looks like the io_uring
> > related changes will need to be applied by hand as they are pointing
> > to the old layout under fs/ as opposed to the newer layout in
> > io_uring/ introduced during this merge window.
> >
> > Jens, did you want to take this via the io_uring tree or should I take
> > it via the audit tree?  If the latter, an ACK would be appreciated, if
> > the former my ACK is below.
> >
> > Acked-by: Paul Moore <paul@paul-moore.com>
>
> Probably better if I take it, since I need to massage it into the
> current tree anyway. We can then use this one as the base for the stable
> backports that are going to be required.

Sounds good to me, thanks everyone.
diff mbox series

Patch

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 824623bcf1a5..4d8a77a2a150 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -634,8 +634,6 @@  static int io_wqe_worker(void *data)
 	snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid);
 	set_task_comm(current, buf);
 
-	audit_alloc_kernel(current);
-
 	while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
 		long ret;
 
@@ -670,7 +668,6 @@  static int io_wqe_worker(void *data)
 	if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
 		io_worker_handle_work(worker);
 
-	audit_free(current);
 	io_worker_exit(worker);
 	return 0;
 }
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e8e769be9ed0..e0da0d2f71f2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -9208,8 +9208,6 @@  static int io_sq_thread(void *data)
 		set_cpus_allowed_ptr(current, cpu_online_mask);
 	current->flags |= PF_NO_SETAFFINITY;
 
-	audit_alloc_kernel(current);
-
 	mutex_lock(&sqd->lock);
 	while (1) {
 		bool cap_entries, sqt_spin = false;
@@ -9283,8 +9281,6 @@  static int io_sq_thread(void *data)
 	io_run_task_work();
 	mutex_unlock(&sqd->lock);
 
-	audit_free(current);
-
 	complete(&sqd->exited);
 	do_exit(0);
 }
diff --git a/include/linux/audit.h b/include/linux/audit.h
index cece70231138..c313df466e6e 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -287,7 +287,6 @@  static inline int audit_signal_info(int sig, struct task_struct *t)
 /* These are defined in auditsc.c */
 				/* Public API */
 extern int  audit_alloc(struct task_struct *task);
-extern int  audit_alloc_kernel(struct task_struct *task);
 extern void __audit_free(struct task_struct *task);
 extern void __audit_uring_entry(u8 op);
 extern void __audit_uring_exit(int success, long code);
@@ -580,10 +579,6 @@  static inline int audit_alloc(struct task_struct *task)
 {
 	return 0;
 }
-static inline int audit_alloc_kernel(struct task_struct *task)
-{
-	return 0;
-}
 static inline void audit_free(struct task_struct *task)
 { }
 static inline void audit_uring_entry(u8 op)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 3a8c9d744800..dd8d9ab747c3 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1073,31 +1073,6 @@  int audit_alloc(struct task_struct *tsk)
 	return 0;
 }
 
-/**
- * audit_alloc_kernel - allocate an audit_context for a kernel task
- * @tsk: the kernel task
- *
- * Similar to the audit_alloc() function, but intended for kernel private
- * threads.  Returns zero on success, negative values on failure.
- */
-int audit_alloc_kernel(struct task_struct *tsk)
-{
-	/*
-	 * At the moment we are just going to call into audit_alloc() to
-	 * simplify the code, but there two things to keep in mind with this
-	 * approach:
-	 *
-	 * 1. Filtering internal kernel tasks is a bit laughable in almost all
-	 * cases, but there is at least one case where there is a benefit:
-	 * the '-a task,never' case allows the admin to effectively disable
-	 * task auditing at runtime.
-	 *
-	 * 2. The {set,clear}_task_syscall_work() ops likely have zero effect
-	 * on these internal kernel tasks, but they probably don't hurt either.
-	 */
-	return audit_alloc(tsk);
-}
-
 static inline void audit_free_context(struct audit_context *context)
 {
 	/* resetting is extra work, but it is likely just noise */