Message ID | 55888b6a644b4fc490849832fd5c5e5bfed523ef.1730687879.git.asml.silence@gmail.com (mailing list archive) |
---|---|
State | New |
Series | io_uring/cmd: let cmds to know about dying task |
On 11/4/24 9:12 AM, Pavel Begunkov wrote:
> When the task that submitted a request is dying, a task work for that
> request might get run by a kernel thread or even worse by a half
> dismantled task. We can't just cancel the task work without running the
> callback as the cmd might need to do some clean up, so pass a flag
> instead. If set, it's not safe to access any task resources and the
> callback is expected to cancel the cmd ASAP.
>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>
> Made a bit fancier to avoid conflicts. Mark, as before I'd suggest you
> to take it and send together with the fix.

That's fine, or we can just take it through the io_uring tree, it's not
like this matters as both will land before -rc1.

But if it goes through the btrfs tree, we can adjust this to use
io_should_terminate_tw() after the fact.

Reviewed-by: Jens Axboe <axboe@kernel.dk>
On 11/4/24 16:15, Jens Axboe wrote:
> On 11/4/24 9:12 AM, Pavel Begunkov wrote:
>> When the task that submitted a request is dying, a task work for that
>> request might get run by a kernel thread or even worse by a half
>> dismantled task. We can't just cancel the task work without running the
>> callback as the cmd might need to do some clean up, so pass a flag
>> instead. If set, it's not safe to access any task resources and the
>> callback is expected to cancel the cmd ASAP.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>
>> Made a bit fancier to avoid conflicts. Mark, as before I'd suggest you
>> to take it and send together with the fix.
>
> That's fine, or we can just take it through the io_uring tree, it's not
> like this matters as both will land before -rc1.

There should be a btrfs patch that depends on it and I would hope
it gets squashed into the main patchset or at least goes into the
same pull and not delayed to rc2.

> But if it goes through the btrfs tree, we can adjust this to use
> io_should_terminate_tw() after the fact.
>
> Reviewed-by: Jens Axboe <axboe@kernel.dk>
>
On 11/4/24 9:47 AM, Pavel Begunkov wrote:
> On 11/4/24 16:15, Jens Axboe wrote:
>> On 11/4/24 9:12 AM, Pavel Begunkov wrote:
>>> When the task that submitted a request is dying, a task work for that
>>> request might get run by a kernel thread or even worse by a half
>>> dismantled task. We can't just cancel the task work without running the
>>> callback as the cmd might need to do some clean up, so pass a flag
>>> instead. If set, it's not safe to access any task resources and the
>>> callback is expected to cancel the cmd ASAP.
>>>
>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>> ---
>>>
>>> Made a bit fancier to avoid conflicts. Mark, as before I'd suggest you
>>> to take it and send together with the fix.
>>
>> That's fine, or we can just take it through the io_uring tree, it's not
>> like this matters as both will land before -rc1.
>
> There should be a btrfs patch that depends on it and I would hope
> it gets squashed into the main patchset or at least goes into the
> same pull and not delayed to rc2.

Right, all I'm saying is that both will land in -rc1 and it doesn't
really matter. Even if it's -rc2 it's not like a potential breakage with
this for certain exiting conditions is an issue. All that really matters
is that the final release is fine.

But like I said, I don't really care - it can go through the btrfs tree
as-is, or I can take it and it'll land in -rc1. If the latter, then I'd
just modify it to use io_should_terminate_tw() from the get-go, if it
goes via the btrfs tree, then we can do a separate patch for that after
the fact.

I just need to know what the btrfs people intend to do here, so I can
plan accordingly.
On Mon, Nov 04, 2024 at 10:31:19AM -0700, Jens Axboe wrote:
> On 11/4/24 9:47 AM, Pavel Begunkov wrote:
> > On 11/4/24 16:15, Jens Axboe wrote:
> >> On 11/4/24 9:12 AM, Pavel Begunkov wrote:
> >>> When the task that submitted a request is dying, a task work for that
> >>> request might get run by a kernel thread or even worse by a half
> >>> dismantled task. We can't just cancel the task work without running the
> >>> callback as the cmd might need to do some clean up, so pass a flag
> >>> instead. If set, it's not safe to access any task resources and the
> >>> callback is expected to cancel the cmd ASAP.
> >>>
> >>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >>> ---
> >>>
> >>> Made a bit fancier to avoid conflicts. Mark, as before I'd suggest you
> >>> to take it and send together with the fix.
> >>
> >> That's fine, or we can just take it through the io_uring tree, it's not
> >> like this matters as both will land before -rc1.
> >
> > There should be a btrfs patch that depends on it and I would hope
> > it gets squashed into the main patchset or at least goes into the
> > same pull and not delayed to rc2.
>
> Right, all I'm saying is that both will land in -rc1 and it doesn't
> really matter. Even if it's -rc2 it's not like a potential breakage with
> this for certain exiting conditions is an issue. All that really matters
> is that the final release is fine.
>
> But like I said, I don't really care - it can go through the btrfs tree
> as-is, or I can take it and it'll land in -rc1. If the latter, then I'd
> just modify it to use io_should_terminate_tw() from the get-go, if it
> goes via the btrfs tree, then we can do a separate patch for that after
> the fact.
>
> I just need to know what the btrfs people intend to do here, so I can
> plan accordingly.

I'll add it to btrfs tree, branch for-next and it will be in the main
merge window pull request.
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index ad5001102c86..bfdf9cbceda9 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -37,6 +37,7 @@ enum io_uring_cmd_flags {
 	/* set when uring wants to cancel a previously issued command */
 	IO_URING_F_CANCEL		= (1 << 11),
 	IO_URING_F_COMPAT		= (1 << 12),
+	IO_URING_F_TASK_DEAD		= (1 << 13),
 };
 
 struct io_wq_work_node {
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 40b8b777ba12..f0113548ec92 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -119,9 +119,13 @@ EXPORT_SYMBOL_GPL(io_uring_cmd_mark_cancelable);
 static void io_uring_cmd_work(struct io_kiocb *req, struct io_tw_state *ts)
 {
 	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
+	unsigned int flags = IO_URING_F_COMPLETE_DEFER;
+
+	if (current->flags & (PF_EXITING | PF_KTHREAD))
+		flags |= IO_URING_F_TASK_DEAD;
 
 	/* task_work executor checks the deffered list completion */
-	ioucmd->task_work_cb(ioucmd, IO_URING_F_COMPLETE_DEFER);
+	ioucmd->task_work_cb(ioucmd, flags);
 }
 
 void __io_uring_cmd_do_in_task(struct io_uring_cmd *ioucmd,
When the task that submitted a request is dying, a task work for that
request might get run by a kernel thread or even worse by a half
dismantled task. We can't just cancel the task work without running the
callback as the cmd might need to do some clean up, so pass a flag
instead. If set, it's not safe to access any task resources and the
callback is expected to cancel the cmd ASAP.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---

Made a bit fancier to avoid conflicts. Mark, as before I'd suggest you
to take it and send together with the fix.

 include/linux/io_uring_types.h | 1 +
 io_uring/uring_cmd.c           | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)
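
For readers following along, the contract described in the commit message can be modelled in plain userspace C. This is only a sketch, not kernel code. Everything prefixed with mock_ or MOCK_ is hypothetical: the flag and PF_* values are stand-ins (only the TASK_DEAD bit position is copied from the patch), and the callback is an invented example of a command handler, not the btrfs one. It shows the dispatcher deriving the flag from the running task's state, and a callback that still performs its cleanup but cancels instead of touching task resources.

```c
#include <errno.h>

/* Hypothetical stand-ins for the io_uring issue flags. */
enum {
	MOCK_F_COMPLETE_DEFER = 1u << 0,  /* illustrative value only */
	MOCK_F_TASK_DEAD      = 1u << 13, /* same bit as IO_URING_F_TASK_DEAD in the patch */
};

/* Hypothetical stand-ins for the PF_EXITING / PF_KTHREAD task flags. */
enum {
	MOCK_PF_EXITING = 1u << 2,
	MOCK_PF_KTHREAD = 1u << 21,
};

/* Hypothetical command state, used to observe what the callback did. */
struct mock_cmd {
	int result;       /* 0 on success, -ECANCELED when cancelled */
	int touched_task; /* set if the callback used task resources */
	int cleaned_up;   /* set once the command's cleanup has run */
};

/*
 * Models the flag computation added to io_uring_cmd_work(): the callback
 * always runs, but learns whether the executing context is a dying task
 * or a kernel thread.
 */
static unsigned int mock_tw_flags(unsigned int task_flags)
{
	unsigned int flags = MOCK_F_COMPLETE_DEFER;

	if (task_flags & (MOCK_PF_EXITING | MOCK_PF_KTHREAD))
		flags |= MOCK_F_TASK_DEAD;
	return flags;
}

/*
 * Models a command's task_work callback: cleanup happens in both cases,
 * task resources are only touched when the task is known to be alive.
 */
static void mock_task_work_cb(struct mock_cmd *cmd, unsigned int issue_flags)
{
	cmd->cleaned_up = 1;

	if (issue_flags & MOCK_F_TASK_DEAD) {
		/* Not safe to access task resources: cancel ASAP. */
		cmd->result = -ECANCELED;
		return;
	}

	cmd->touched_task = 1; /* e.g. copying results back to userspace */
	cmd->result = 0;
}
```

Calling mock_task_work_cb(&cmd, mock_tw_flags(MOCK_PF_EXITING)) takes the cancel path, while mock_tw_flags(0) takes the normal one. In the real kernel the equivalent check would sit at the top of the driver's task_work_cb.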