Message ID | 20230825161947.GA16871@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | bpf: task_group_seq_get_next: use __next_thread() |
Forgot to mention in the changelog...

In any case this doesn't look right. ->group_leader can exit before other
threads, call exit_files(), and in this case task_group_seq_get_next() will
check task->files == NULL.

On 08/25, Oleg Nesterov wrote:
>
> Unless I am totally confused it is wrong. We are going to return or
> skip next_task so we need to check next_task->files, not task->files.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/bpf/task_iter.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index 1589ec3faded..2264870ae3fc 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -82,7 +82,7 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
>
>  	common->pid_visiting = *tid;
>
> -	if (skip_if_dup_files && task->files == task->group_leader->files) {
> +	if (skip_if_dup_files && next_task->files == next_task->group_leader->files) {
>  		task = next_task;
>  		goto retry;
>  	}
> --
> 2.25.1.362.g51ebf55
On 8/25/23 9:19 AM, Oleg Nesterov wrote:
> Unless I am totally confused it is wrong. We are going to return or
> skip next_task so we need to check next_task->files, not task->files.

Thanks for capturing this. This is indeed an oversight.

Acked-by: Yonghong Song <yonghong.song@linux.dev>

>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/bpf/task_iter.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index 1589ec3faded..2264870ae3fc 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -82,7 +82,7 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
>
>  	common->pid_visiting = *tid;
>
> -	if (skip_if_dup_files && task->files == task->group_leader->files) {
> +	if (skip_if_dup_files && next_task->files == next_task->group_leader->files) {
>  		task = next_task;
>  		goto retry;
>  	}
On 8/25/23 10:04 AM, Oleg Nesterov wrote:
> Forgot to mention in the changelog...
>
> In any case this doesn't look right. ->group_leader can exit before other
> threads, call exit_files(), and in this case task_group_seq_get_next() will
> check task->files == NULL.

It is okay. This won't affect correctness. We will end up
calling the bpf program for 'next_task'.

>
> On 08/25, Oleg Nesterov wrote:
>>
>> Unless I am totally confused it is wrong. We are going to return or
>> skip next_task so we need to check next_task->files, not task->files.
>>
>> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>> ---
>>  kernel/bpf/task_iter.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
>> index 1589ec3faded..2264870ae3fc 100644
>> --- a/kernel/bpf/task_iter.c
>> +++ b/kernel/bpf/task_iter.c
>> @@ -82,7 +82,7 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
>>
>>  	common->pid_visiting = *tid;
>>
>> -	if (skip_if_dup_files && task->files == task->group_leader->files) {
>> +	if (skip_if_dup_files && next_task->files == next_task->group_leader->files) {
>>  		task = next_task;
>>  		goto retry;
>>  	}
>> --
>> 2.25.1.362.g51ebf55
>
>
On 08/25, Yonghong Song wrote:
>
> On 8/25/23 10:04 AM, Oleg Nesterov wrote:
> >Forgot to mention in the changelog...
> >
> >In any case this doesn't look right. ->group_leader can exit before other
> >threads, call exit_files(), and in this case task_group_seq_get_next() will
> >check task->files == NULL.
>
> It is okay. This won't affect correctness. We will end up
> calling the bpf program for 'next_task'.

Well, I didn't mean it is necessarily wrong, I simply do not know.

But let's suppose that we have a thread group with the main thread M + 1000
sub-threads. In the likely case they all have the same ->files, CLONE_THREAD
without CLONE_FILES is not that common.

Let's assume the BPF_TASK_ITER_TGID case for simplicity.

Now let's look at task_file_seq_get_next() which passes skip_if_dup_files == 1
to task_seq_get_next() and thus to task_group_seq_get_next().

Now, in this case task_seq_get_next() will return non-NULL only once (OK, unless
task_file_seq_ops.stop() was called), it will return the group leader M first,
then after task_file_seq_get_next() "reports" all the fd's of M and increments
info->tid, the next task_seq_get_next(&info->tid, true) should return NULL because
of the skip_if_dup_files check in task_group_seq_get_next().

Right?

But if the group leader M exits then M->files == NULL. And in this case
task_seq_get_next() will need to "inspect" all the sub-threads even if they all
have the same ->files pointer.

No?

Again, I am not saying this is a bug and quite possibly I misread this code, but
in any case the skip_if_dup_files logic looks sub-optimal and confusing to me.

Nevermind, please forget. This is minor even if I am right.

Thanks for the review!

Oleg.
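The scenario described in this message can be approximated with a small user-space model. This is only a sketch under stated assumptions: `struct task`, `struct files_struct`, `shares_files_with_leader()` and `simulate()` are hypothetical stand-ins invented for illustration, not the kernel's definitions, and the model ignores locking, reference counting and the seq_file machinery. It only shows how many of 1000 sub-threads pass the "same ->files as the leader" comparison once the leader's ->files has become NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, NOT the kernel structures. */
struct files_struct { int unused; };

struct task {
	struct files_struct *files;	/* modelled as NULL after exit_files() */
	struct task *group_leader;
};

/* Same comparison as in the patch: does this thread share ->files with its leader? */
static bool shares_files_with_leader(const struct task *t)
{
	return t->files == t->group_leader->files;
}

/* Count the sub-threads that the dup-files comparison does NOT skip. */
static size_t count_not_skipped(const struct task *threads, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (!shares_files_with_leader(&threads[i]))
			count++;
	return count;
}

/*
 * Model a leader M plus 1000 sub-threads that all share one files table.
 * If leader_exited is true, M->files is NULL while the sub-threads still
 * point at the shared table.
 */
size_t simulate(bool leader_exited)
{
	enum { NTHREADS = 1000 };
	static struct files_struct shared;
	static struct task leader;
	static struct task threads[NTHREADS];

	leader.files = leader_exited ? NULL : &shared;
	leader.group_leader = &leader;
	for (size_t i = 0; i < NTHREADS; i++) {
		threads[i].files = &shared;
		threads[i].group_leader = &leader;
	}
	return count_not_skipped(threads, NTHREADS);
}
```

In this model `simulate(false)` skips every sub-thread, while `simulate(true)` skips none of them, which matches the concern that an exited leader makes the comparison useless as a de-duplication shortcut.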
On 8/27/23 1:19 PM, Oleg Nesterov wrote:
> On 08/25, Yonghong Song wrote:
>>
>> On 8/25/23 10:04 AM, Oleg Nesterov wrote:
>>> Forgot to mention in the changelog...
>>>
>>> In any case this doesn't look right. ->group_leader can exit before other
>>> threads, call exit_files(), and in this case task_group_seq_get_next() will
>>> check task->files == NULL.
>>
>> It is okay. This won't affect correctness. We will end up
>> calling the bpf program for 'next_task'.
>
> Well, I didn't mean it is necessarily wrong, I simply do not know.
>
> But let's suppose that we have a thread group with the main thread M + 1000
> sub-threads. In the likely case they all have the same ->files, CLONE_THREAD
> without CLONE_FILES is not that common.
>
> Let's assume the BPF_TASK_ITER_TGID case for simplicity.
>
> Now let's look at task_file_seq_get_next() which passes skip_if_dup_files == 1
> to task_seq_get_next() and thus to task_group_seq_get_next().
>
> Now, in this case task_seq_get_next() will return non-NULL only once (OK, unless
> task_file_seq_ops.stop() was called), it will return the group leader M first,
> then after task_file_seq_get_next() "reports" all the fd's of M and increments
> info->tid, the next task_seq_get_next(&info->tid, true) should return NULL because
> of the skip_if_dup_files check in task_group_seq_get_next().
>
> Right?
>
> But if the group leader M exits then M->files == NULL. And in this case
> task_seq_get_next() will need to "inspect" all the sub-threads even if they all
> have the same ->files pointer.

That is correct. I do not have practical experience with how likely
this scenario is. I assume the likelihood is very low.
If this is not the case, we might need to revisit.

>
> No?
>
> Again, I am not saying this is a bug and quite possibly I misread this code, but
> in any case the skip_if_dup_files logic looks sub-optimal and confusing to me.
>
> Nevermind, please forget. This is minor even if I am right.
>
> Thanks for the review!
>
> Oleg.
>
On 08/27, Yonghong Song wrote:
>
> On 8/27/23 1:19 PM, Oleg Nesterov wrote:
> >
> >But if the group leader M exits then M->files == NULL. And in this case
> >task_seq_get_next() will need to "inspect" all the sub-threads even if they all
> >have the same ->files pointer.
>
> That is correct. I do not have practical experience with how likely
> this scenario is. I assume the likelihood is very low.

Yes. I just tried to explain why the ->files check looks confusing to me.
Nevermind.

Could you review 6/6 as well?

Should I fold 1-5 into a single patch? I tried to document every change
and simplify the review, but I do not want to blow the git history.

Oleg.
On 8/28/23 3:54 AM, Oleg Nesterov wrote:
> On 08/27, Yonghong Song wrote:
>>
>> On 8/27/23 1:19 PM, Oleg Nesterov wrote:
>>>
>>> But if the group leader M exits then M->files == NULL. And in this case
>>> task_seq_get_next() will need to "inspect" all the sub-threads even if they all
>>> have the same ->files pointer.
>>
>> That is correct. I do not have practical experience with how likely
>> this scenario is. I assume the likelihood is very low.
>
> Yes. I just tried to explain why the ->files check looks confusing to me.
> Nevermind.
>
> Could you review 6/6 as well?

I think patch 6/6 can wait until
https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/
is merged.

>
> Should I fold 1-5 into a single patch? I tried to document every change
> and simplify the review, but I do not want to blow the git history.

Currently, because of patch 6, the whole patch set cannot be tested by
bpf CI since it has a build failure:
https://github.com/kernel-patches/bpf/pull/5580

I suggest you take patches 1-5 and resubmit with a tag like
  "bpf-next v2"
  [Patch bpf-next v2 x/5] ...
so CI can build with different architectures and compilers to
ensure everything builds and runs fine.

>
> Oleg.
>
On 08/28, Yonghong Song wrote:
>
> On 8/28/23 3:54 AM, Oleg Nesterov wrote:
> >
> >Could you review 6/6 as well?
>
> I think patch 6/6 can wait until
> https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/
> is merged.

OK.

> >Should I fold 1-5 into a single patch? I tried to document every change
> >and simplify the review, but I do not want to blow the git history.
>
> Currently, because of patch 6, the whole patch set cannot be tested by
> bpf CI since it has a build failure:
> https://github.com/kernel-patches/bpf/pull/5580

Heh. I thought this is obvious. I thought you can test 1-5 without 6/6
and _review_ 6/6.

I simply can't understand how can this pull/5580 come when I specially
mentioned

	> 6/6 obviously depends on
	>
	> [PATCH 1/2] introduce __next_thread(), fix next_tid() vs exec() race
	> https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/
	>
	> which was not merged yet.

in 0/6.

> I suggest you take patches 1-5 and resubmit with a tag like
>   "bpf-next v2"
>   [Patch bpf-next v2 x/5] ...
> so CI can build with different architectures and compilers to
> ensure everything builds and runs fine.

I think we can wait for

	https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/

as you suggest above, then I'll send the s/next_thread/__next_thread/
oneliner without 1-5. I no longer think it makes sense to try to clean up
the poor task_group_seq_get_next() when IMHO the whole task_iter logic
needs a complete rewrite. Yes, yes, I know, it is very easy to blame
someone else's code, sorry can't resist ;)

The only "fix" in this series is 3/6, but this code has more serious
bugs, so I guess we can forget it.

Oleg.
On 8/30/23 7:54 PM, Oleg Nesterov wrote:
> On 08/28, Yonghong Song wrote:
>>
>> On 8/28/23 3:54 AM, Oleg Nesterov wrote:
>>>
>>> Could you review 6/6 as well?
>>
>> I think patch 6/6 can wait until
>> https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/
>> is merged.
>
> OK.
>
>>> Should I fold 1-5 into a single patch? I tried to document every change
>>> and simplify the review, but I do not want to blow the git history.
>>
>> Currently, because of patch 6, the whole patch set cannot be tested by
>> bpf CI since it has a build failure:
>> https://github.com/kernel-patches/bpf/pull/5580
>
> Heh. I thought this is obvious. I thought you can test 1-5 without 6/6
> and _review_ 6/6.
>
> I simply can't understand how can this pull/5580 come when I specially
> mentioned
>
> 	> 6/6 obviously depends on
> 	>
> 	> [PATCH 1/2] introduce __next_thread(), fix next_tid() vs exec() race
> 	> https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/
> 	>
> 	> which was not merged yet.
>
> in 0/6.

The process in CI for testing is fully automated, and it does not look
at the commit message. That is why it takes the whole series.
This is true for all other patch sets.

>
>> I suggest you take patches 1-5 and resubmit with a tag like
>>   "bpf-next v2"
>>   [Patch bpf-next v2 x/5] ...
>> so CI can build with different architectures and compilers to
>> ensure everything builds and runs fine.
>
> I think we can wait for
>
> 	https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/
>
> as you suggest above, then I'll send the s/next_thread/__next_thread/
> oneliner without 1-5. I no longer think it makes sense to try to clean up
> the poor task_group_seq_get_next() when IMHO the whole task_iter logic
> needs a complete rewrite. Yes, yes, I know, it is very easy to blame
> someone else's code, sorry can't resist ;)
>
> The only "fix" in this series is 3/6, but this code has more serious
> bugs, so I guess we can forget it.
>
> Oleg.
>
On 08/31, Yonghong Song wrote:
>
> On 8/30/23 7:54 PM, Oleg Nesterov wrote:
> >
> >I simply can't understand how can this pull/5580 come when I specially
> >mentioned
> >
> >	> 6/6 obviously depends on
> >	>
> >	> [PATCH 1/2] introduce __next_thread(), fix next_tid() vs exec() race
> >	> https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/
> >	>
> >	> which was not merged yet.
> >
> >in 0/6.
>
> The process in CI for testing is fully automated,

Ah, OK, sorry then.

> >>I suggest you take patches 1-5 and resubmit with a tag like
> >>  "bpf-next v2"
> >>  [Patch bpf-next v2 x/5] ...
> >>so CI can build with different architectures and compilers to
> >>ensure everything builds and runs fine.

OK, will do when I have time.

Thanks,

Oleg.
diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index 1589ec3faded..2264870ae3fc 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -82,7 +82,7 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
 
 	common->pid_visiting = *tid;
 
-	if (skip_if_dup_files && task->files == task->group_leader->files) {
+	if (skip_if_dup_files && next_task->files == next_task->group_leader->files) {
 		task = next_task;
 		goto retry;
 	}
Unless I am totally confused it is wrong. We are going to return or
skip next_task so we need to check next_task->files, not task->files.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/bpf/task_iter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
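The difference between the two checks can be illustrated with a user-space model. This is a sketch under assumptions, not kernel code: `struct task`, `struct files_struct`, `skip_old()`, `skip_new()` and `would_skip_private_thread()` are hypothetical names made up for the example. It models a group in which the current thread shares ->files with the leader while the next thread has its own table (CLONE_THREAD without CLONE_FILES), and asks whether the next thread would be skipped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, NOT the kernel structures. */
struct files_struct { int unused; };

struct task {
	struct files_struct *files;
	struct task *group_leader;
};

/* Check before the patch: looks at the task we are moving away from. */
static bool skip_old(const struct task *task, const struct task *next_task)
{
	(void)next_task;
	return task->files == task->group_leader->files;
}

/* Check after the patch: looks at the task we are about to return or skip. */
static bool skip_new(const struct task *task, const struct task *next_task)
{
	(void)task;
	return next_task->files == next_task->group_leader->files;
}

/*
 * 'a' shares the leader's files table, 'b' has a private one.
 * Returns whether 'b' (the next task) would be skipped.
 */
bool would_skip_private_thread(bool patched)
{
	struct files_struct shared_files, private_files;
	struct task leader, a, b;

	leader.files = &shared_files;
	leader.group_leader = &leader;
	a.files = &shared_files;
	a.group_leader = &leader;
	b.files = &private_files;
	b.group_leader = &leader;

	return patched ? skip_new(&a, &b) : skip_old(&a, &b);
}
```

In this model the unpatched check skips the thread with the private files table, because it inspects the previous thread, while the patched check returns it.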