[v4,tip/perf/core,0/4] uprobes,mm: speculative lockless VMA-to-uprobe lookup

Message ID	20241028010818.2487581-1-andrii@kernel.org (mailing list archive)
Headers	show Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DDD4A8BEC; Mon, 28 Oct 2024 01:09:06 +0000 (UTC) From: Andrii Nakryiko <andrii@kernel.org> To: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, peterz@infradead.org Cc: oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org, willy@infradead.org, surenb@google.com, mjguzik@gmail.com, brauner@kernel.org, jannh@google.com, mhocko@kernel.org, vbabka@suse.cz, shakeel.butt@linux.dev, hannes@cmpxchg.org, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, arnd@arndb.de, richard.weiyang@gmail.com, zhangpeng.00@bytedance.com, linmiaohe@huawei.com, viro@zeniv.linux.org.uk, hca@linux.ibm.com, Andrii Nakryiko <andrii@kernel.org> Subject: [PATCH v4 tip/perf/core 0/4] uprobes,mm: speculative lockless VMA-to-uprobe lookup Date: Sun, 27 Oct 2024 18:08:14 -0700 Message-ID: <20241028010818.2487581-1-andrii@kernel.org> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit
Series	uprobes,mm: speculative lockless VMA-to-uprobe lookup \| expand [v4,tip/perf/core,0/4] uprobes,mm: speculative lockless VMA-to-uprobe lookup [v4,tip/perf/core,1/4] mm: Convert mm_lock_seq to a proper seqcount [v4,tip/perf/core,2/4] mm: Introduce mmap_lock_speculation_{begin\|end} [v4,tip/perf/core,3/4] uprobes: simplify find_active_uprobe_rcu() VMA checks [v4,tip/perf/core,4/4] uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution

Andrii Nakryiko Oct. 28, 2024, 1:08 a.m. UTC

Implement speculative (lockless) resolution of VMA to inode to uprobe,
bypassing the need to take mmap_lock for reads, if possible. First two patches
by Suren adds mm_struct helpers that help detect whether mm_struct was
changed, which is used by uprobe logic to validate that speculative results
can be trusted after all the lookup logic results in a valid uprobe instance.

Patch #3 is a simplification to uprobe VMA flag checking, suggested by Oleg.

And, finally, patch #4 is the speculative VMA-to-uprobe resolution logic
itself, and is the focal point of this patch set. It makes entry uprobes in
common case scale very well with number of CPUs, as we avoid any locking or
cache line bouncing between CPUs. See corresponding patch for details and
benchmarking results.

Note, this patch set assumes that FMODE_BACKING files were switched to have
SLAB_TYPE_SAFE_BY_RCU semantics, which was recently done by Christian Brauner
in [0]. This change can be pulled into perf/core through stable
tags/vfs-6.13.for-bpf.file tag from [1].

  [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.13.for-bpf.file&id=8b1bc2590af61129b82a189e9dc7c2804c34400e
  [1] git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git

v3->v4:
- rebased and dropped data_race(), given mm_struct uses real seqcount (Peter);
v2->v3:
- dropped kfree_rcu() patch (Christian);
- added data_race() annotations for fields of vma and vma->vm_file which could
  be modified during speculative lookup (Oleg);
- fixed int->long problem in stubs for mmap_lock_speculation_{start,end}(),
  caught by Kernel test robot;
v1->v2:
- adjusted vma_end_write_all() comment to point out it should never be called
  manually now, but I wasn't sure how ACQUIRE/RELEASE comments should be
  reworded (previously requested by Jann), so I'd appreciate some help there
  (Jann);
- int -> long change for mm_lock_seq, as agreed at LPC2024 (Jann, Suren, Liam);
- kfree_rcu_mightsleep() for FMODE_BACKING (Suren, Christian);
- vm_flags simplification in find_active_uprobe_rcu() and
  find_active_uprobe_speculative() (Oleg);
- guard(rcu)() simplified find_active_uprobe_speculative() implementation.

Andrii Nakryiko (2):
  uprobes: simplify find_active_uprobe_rcu() VMA checks
  uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution

Suren Baghdasaryan (2):
  mm: Convert mm_lock_seq to a proper seqcount
  mm: Introduce mmap_lock_speculation_{begin|end}

 include/linux/mm.h               | 12 ++---
 include/linux/mm_types.h         |  7 ++-
 include/linux/mmap_lock.h        | 87 ++++++++++++++++++++++++--------
 kernel/events/uprobes.c          | 47 ++++++++++++++++-
 kernel/fork.c                    |  5 +-
 mm/init-mm.c                     |  2 +-
 tools/testing/vma/vma.c          |  4 +-
 tools/testing/vma/vma_internal.h |  4 +-
 8 files changed, 129 insertions(+), 39 deletions(-)

Andrii Nakryiko Nov. 6, 2024, 2:01 a.m. UTC | #1

On Sun, Oct 27, 2024 at 6:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Implement speculative (lockless) resolution of VMA to inode to uprobe,
> bypassing the need to take mmap_lock for reads, if possible. First two patches
> by Suren adds mm_struct helpers that help detect whether mm_struct was
> changed, which is used by uprobe logic to validate that speculative results
> can be trusted after all the lookup logic results in a valid uprobe instance.
>
> Patch #3 is a simplification to uprobe VMA flag checking, suggested by Oleg.
>
> And, finally, patch #4 is the speculative VMA-to-uprobe resolution logic
> itself, and is the focal point of this patch set. It makes entry uprobes in
> common case scale very well with number of CPUs, as we avoid any locking or
> cache line bouncing between CPUs. See corresponding patch for details and
> benchmarking results.
>
> Note, this patch set assumes that FMODE_BACKING files were switched to have
> SLAB_TYPE_SAFE_BY_RCU semantics, which was recently done by Christian Brauner
> in [0]. This change can be pulled into perf/core through stable
> tags/vfs-6.13.for-bpf.file tag from [1].
>
>   [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.13.for-bpf.file&id=8b1bc2590af61129b82a189e9dc7c2804c34400e
>   [1] git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
>
> v3->v4:
> - rebased and dropped data_race(), given mm_struct uses real seqcount (Peter);
> v2->v3:
> - dropped kfree_rcu() patch (Christian);
> - added data_race() annotations for fields of vma and vma->vm_file which could
>   be modified during speculative lookup (Oleg);
> - fixed int->long problem in stubs for mmap_lock_speculation_{start,end}(),
>   caught by Kernel test robot;
> v1->v2:
> - adjusted vma_end_write_all() comment to point out it should never be called
>   manually now, but I wasn't sure how ACQUIRE/RELEASE comments should be
>   reworded (previously requested by Jann), so I'd appreciate some help there
>   (Jann);
> - int -> long change for mm_lock_seq, as agreed at LPC2024 (Jann, Suren, Liam);
> - kfree_rcu_mightsleep() for FMODE_BACKING (Suren, Christian);
> - vm_flags simplification in find_active_uprobe_rcu() and
>   find_active_uprobe_speculative() (Oleg);
> - guard(rcu)() simplified find_active_uprobe_speculative() implementation.
>
> Andrii Nakryiko (2):
>   uprobes: simplify find_active_uprobe_rcu() VMA checks
>   uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution
>
> Suren Baghdasaryan (2):
>   mm: Convert mm_lock_seq to a proper seqcount
>   mm: Introduce mmap_lock_speculation_{begin|end}
>
>  include/linux/mm.h               | 12 ++---
>  include/linux/mm_types.h         |  7 ++-
>  include/linux/mmap_lock.h        | 87 ++++++++++++++++++++++++--------
>  kernel/events/uprobes.c          | 47 ++++++++++++++++-
>  kernel/fork.c                    |  5 +-
>  mm/init-mm.c                     |  2 +-
>  tools/testing/vma/vma.c          |  4 +-
>  tools/testing/vma/vma_internal.h |  4 +-
>  8 files changed, 129 insertions(+), 39 deletions(-)
>
> --
> 2.43.5
>

Hi!

What's the status of this patch set? Are there any blockers for it to
be applied to perf/core? MM folks are OK with landing the first two
patches in perf/core, so hopefully we should be good to go?

Andrii Nakryiko Nov. 11, 2024, 5:26 p.m. UTC | #2

On Tue, Nov 5, 2024 at 6:01 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Sun, Oct 27, 2024 at 6:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Implement speculative (lockless) resolution of VMA to inode to uprobe,
> > bypassing the need to take mmap_lock for reads, if possible. First two patches
> > by Suren adds mm_struct helpers that help detect whether mm_struct was
> > changed, which is used by uprobe logic to validate that speculative results
> > can be trusted after all the lookup logic results in a valid uprobe instance.
> >
> > Patch #3 is a simplification to uprobe VMA flag checking, suggested by Oleg.
> >
> > And, finally, patch #4 is the speculative VMA-to-uprobe resolution logic
> > itself, and is the focal point of this patch set. It makes entry uprobes in
> > common case scale very well with number of CPUs, as we avoid any locking or
> > cache line bouncing between CPUs. See corresponding patch for details and
> > benchmarking results.
> >
> > Note, this patch set assumes that FMODE_BACKING files were switched to have
> > SLAB_TYPE_SAFE_BY_RCU semantics, which was recently done by Christian Brauner
> > in [0]. This change can be pulled into perf/core through stable
> > tags/vfs-6.13.for-bpf.file tag from [1].
> >
> >   [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.13.for-bpf.file&id=8b1bc2590af61129b82a189e9dc7c2804c34400e
> >   [1] git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> >
> > v3->v4:
> > - rebased and dropped data_race(), given mm_struct uses real seqcount (Peter);
> > v2->v3:
> > - dropped kfree_rcu() patch (Christian);
> > - added data_race() annotations for fields of vma and vma->vm_file which could
> >   be modified during speculative lookup (Oleg);
> > - fixed int->long problem in stubs for mmap_lock_speculation_{start,end}(),
> >   caught by Kernel test robot;
> > v1->v2:
> > - adjusted vma_end_write_all() comment to point out it should never be called
> >   manually now, but I wasn't sure how ACQUIRE/RELEASE comments should be
> >   reworded (previously requested by Jann), so I'd appreciate some help there
> >   (Jann);
> > - int -> long change for mm_lock_seq, as agreed at LPC2024 (Jann, Suren, Liam);
> > - kfree_rcu_mightsleep() for FMODE_BACKING (Suren, Christian);
> > - vm_flags simplification in find_active_uprobe_rcu() and
> >   find_active_uprobe_speculative() (Oleg);
> > - guard(rcu)() simplified find_active_uprobe_speculative() implementation.
> >
> > Andrii Nakryiko (2):
> >   uprobes: simplify find_active_uprobe_rcu() VMA checks
> >   uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution
> >
> > Suren Baghdasaryan (2):
> >   mm: Convert mm_lock_seq to a proper seqcount
> >   mm: Introduce mmap_lock_speculation_{begin|end}
> >
> >  include/linux/mm.h               | 12 ++---
> >  include/linux/mm_types.h         |  7 ++-
> >  include/linux/mmap_lock.h        | 87 ++++++++++++++++++++++++--------
> >  kernel/events/uprobes.c          | 47 ++++++++++++++++-
> >  kernel/fork.c                    |  5 +-
> >  mm/init-mm.c                     |  2 +-
> >  tools/testing/vma/vma.c          |  4 +-
> >  tools/testing/vma/vma_internal.h |  4 +-
> >  8 files changed, 129 insertions(+), 39 deletions(-)
> >
> > --
> > 2.43.5
> >
>
> Hi!
>
> What's the status of this patch set? Are there any blockers for it to
> be applied to perf/core? MM folks are OK with landing the first two
> patches in perf/core, so hopefully we should be good to go?

Another week, another ping. Peter, what can I do to make this land? MM
parts are clearly ok with Andrew Morton, uprobe-side logic didn't
change (modulo inconsequential data_race() back and forth) since at
least August, was approved by Oleg, and seems to be very stable in
testing. I think it's time to let me forget about this patch set and
make actual use of it in production, please.

Andrii Nakryiko Nov. 20, 2024, 3:40 p.m. UTC | #3

Linus,

I'm not sure what's going on here, this patch set seems to be in some
sort of "ignore list" on Peter's side with no indication on its
destiny.

I'd really like for this change to go into the new release with the
rest of uprobe improvements that happened this cycle, as they all
nicely complement each other. This patch set has been done-done since
Oct 24 when Suren sent the final version of mm-side changes ([0]),
which I subsequently resent as part of this mm+uprobe patch set on Oct
27, after coordinating that this will go through uprobe subsystem with
Andrew Morton ([1]). The uprobe part was effectively unchanged since
this summer, when this speculative uprobe lookup logic was posted as
part of an earlier RFC series ([2]). That's just to say that this was
thoroughly reviewed, discussed, and stress-tested, meanwhile, and I
see no reason to delay landing it for so long.

I've even written a separate overview email with a summary of all the
uprobe-related work and how it all fits together ([3]), realizing that
there are a few seemingly independent email threads and patch sets,
trying to engage involved maintainers. The outcome was:
  - two patch sets did land (uretprobe + SRCU and Jiri's uprobe
session prerequisites) after a bunch of extra pings, but that's at
least something;
  - Liao's siglock optimization ([4]) still hasn't landed with no
explanation what's the delay;
  - this patch set is also stuck in limbo for weeks now;
  - there was little engagement on arm64 front for Liao's optimization
of uprobes on STP instructions [5], which is perhaps a separate topic
for another email, but just another instance of maintainers not
engaging in timely fashion.

In short, I hope to get your help with the next steps. What can I do
to help land this patch set (and hopefully also others I mentioned
above)?

More broadly, what should be contributors' expectations on timeliness
of maintainers' engagement? Maintainer record in MAINTAINERS can't be
just a veto power, right? It is also a responsibility before others to
move the kernel development along. I'd like to understand what you
think is reasonable to expect here? Same question for patch handling
(applying, reviewing, rejecting, etc.) latency.

Thank you!


  [0] https://lore.kernel.org/linux-mm/20241024205231.1944747-1-surenb@google.com/
  [1] https://lore.kernel.org/linux-mm/20241028204822.6638f330fad809381eafb49c@linux-foundation.org/
  [2] https://lore.kernel.org/linux-trace-kernel/20240813042917.506057-14-andrii@kernel.org/
  [3] https://lore.kernel.org/linux-trace-kernel/CAEf4BzY-0Eu27jyT_s2kRO1UuUPOkE9_SRrBOqu2gJfmxsv+3A@mail.gmail.com/
  [4] https://lore.kernel.org/linux-trace-kernel/CAEf4BzarhiBHAQXECJzP5e-z0fbSaTpfQNPaSXwdgErz2f0vUA@mail.gmail.com/
  [5] https://lore.kernel.org/linux-trace-kernel/CAEf4BzZ3trjMWjvWX4Zy1GzW5RN1ihXZSnLZax7V-mCzAUg2cg@mail.gmail.com/
  [6] https://lore.kernel.org/all/172074397710.247544.17045299807723238107.stgit@devnote2/


On Mon, Nov 11, 2024 at 9:26 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 5, 2024 at 6:01 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Sun, Oct 27, 2024 at 6:09 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > Implement speculative (lockless) resolution of VMA to inode to uprobe,
> > > bypassing the need to take mmap_lock for reads, if possible. First two patches
> > > by Suren adds mm_struct helpers that help detect whether mm_struct was
> > > changed, which is used by uprobe logic to validate that speculative results
> > > can be trusted after all the lookup logic results in a valid uprobe instance.
> > >
> > > Patch #3 is a simplification to uprobe VMA flag checking, suggested by Oleg.
> > >
> > > And, finally, patch #4 is the speculative VMA-to-uprobe resolution logic
> > > itself, and is the focal point of this patch set. It makes entry uprobes in
> > > common case scale very well with number of CPUs, as we avoid any locking or
> > > cache line bouncing between CPUs. See corresponding patch for details and
> > > benchmarking results.
> > >
> > > Note, this patch set assumes that FMODE_BACKING files were switched to have
> > > SLAB_TYPE_SAFE_BY_RCU semantics, which was recently done by Christian Brauner
> > > in [0]. This change can be pulled into perf/core through stable
> > > tags/vfs-6.13.for-bpf.file tag from [1].
> > >
> > >   [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.13.for-bpf.file&id=8b1bc2590af61129b82a189e9dc7c2804c34400e
> > >   [1] git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> > >
> > > v3->v4:
> > > - rebased and dropped data_race(), given mm_struct uses real seqcount (Peter);
> > > v2->v3:
> > > - dropped kfree_rcu() patch (Christian);
> > > - added data_race() annotations for fields of vma and vma->vm_file which could
> > >   be modified during speculative lookup (Oleg);
> > > - fixed int->long problem in stubs for mmap_lock_speculation_{start,end}(),
> > >   caught by Kernel test robot;
> > > v1->v2:
> > > - adjusted vma_end_write_all() comment to point out it should never be called
> > >   manually now, but I wasn't sure how ACQUIRE/RELEASE comments should be
> > >   reworded (previously requested by Jann), so I'd appreciate some help there
> > >   (Jann);
> > > - int -> long change for mm_lock_seq, as agreed at LPC2024 (Jann, Suren, Liam);
> > > - kfree_rcu_mightsleep() for FMODE_BACKING (Suren, Christian);
> > > - vm_flags simplification in find_active_uprobe_rcu() and
> > >   find_active_uprobe_speculative() (Oleg);
> > > - guard(rcu)() simplified find_active_uprobe_speculative() implementation.
> > >
> > > Andrii Nakryiko (2):
> > >   uprobes: simplify find_active_uprobe_rcu() VMA checks
> > >   uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution
> > >
> > > Suren Baghdasaryan (2):
> > >   mm: Convert mm_lock_seq to a proper seqcount
> > >   mm: Introduce mmap_lock_speculation_{begin|end}
> > >
> > >  include/linux/mm.h               | 12 ++---
> > >  include/linux/mm_types.h         |  7 ++-
> > >  include/linux/mmap_lock.h        | 87 ++++++++++++++++++++++++--------
> > >  kernel/events/uprobes.c          | 47 ++++++++++++++++-
> > >  kernel/fork.c                    |  5 +-
> > >  mm/init-mm.c                     |  2 +-
> > >  tools/testing/vma/vma.c          |  4 +-
> > >  tools/testing/vma/vma_internal.h |  4 +-
> > >  8 files changed, 129 insertions(+), 39 deletions(-)
> > >
> > > --
> > > 2.43.5
> > >
> >
> > Hi!
> >
> > What's the status of this patch set? Are there any blockers for it to
> > be applied to perf/core? MM folks are OK with landing the first two
> > patches in perf/core, so hopefully we should be good to go?
>
> Another week, another ping. Peter, what can I do to make this land? MM
> parts are clearly ok with Andrew Morton, uprobe-side logic didn't
> change (modulo inconsequential data_race() back and forth) since at
> least August, was approved by Oleg, and seems to be very stable in
> testing. I think it's time to let me forget about this patch set and
> make actual use of it in production, please.

Peter Zijlstra Nov. 20, 2024, 3:43 p.m. UTC | #4

On Wed, Nov 20, 2024 at 07:40:15AM -0800, Andrii Nakryiko wrote:
> Linus,
> 
> I'm not sure what's going on here, this patch set seems to be in some
> sort of "ignore list" on Peter's side with no indication on its
> destiny.

*sigh* it is not, but my inbox is like drinking from a firehose :/

Ingo Molnar Nov. 20, 2024, 4:03 p.m. UTC | #5

* Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, Nov 20, 2024 at 07:40:15AM -0800, Andrii Nakryiko wrote:
> > Linus,
> > 
> > I'm not sure what's going on here, this patch set seems to be in some
> > sort of "ignore list" on Peter's side with no indication on its
> > destiny.
> 
> *sigh* it is not, but my inbox is like drinking from a firehose :/

And I've been considering that particular series WIP for two reasons:

 1) Oleg was still unconvinced about patch 5/5 in the v2 discussion. 
    Upon re-reading it I think he might have come around and has agreed 
    to the current approach - but sending a v3 & not seeing Oleg object 
    would ascertain that.

 2) There was a build failure reported against -v2 at:

       https://lore.kernel.org/all/202410050745.2Nuvusy4-lkp@intel.com/t.mbox.gz

    We cannot and will not merge patches with build failures.

Andrii did get some other uprobes scalability work merged in v6.13:

    - Switch to RCU Tasks Trace flavor for better performance (Andrii Nakryiko)

    - Massively increase uretprobe SMP scalability by SRCU-protecting
      the uretprobe lifetime (Andrii Nakryiko)

So we've certainly not been ignoring his patches, to the contrary ...

Thanks,

	Ingo

Andrii Nakryiko Nov. 20, 2024, 5:23 p.m. UTC | #6

On Wed, Nov 20, 2024 at 7:43 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Nov 20, 2024 at 07:40:15AM -0800, Andrii Nakryiko wrote:
> > Linus,
> >
> > I'm not sure what's going on here, this patch set seems to be in some
> > sort of "ignore list" on Peter's side with no indication on its
> > destiny.
>
> *sigh* it is not, but my inbox is like drinking from a firehose :/

Yet, you had time to look at and reply to much more recent patch sets
(e.g., [0] and [1], which landed 5 and 3 days ago).

And to be clear, your reviews and input there is appreciated, but
there has to be some wider timeliness and fairness here. This
particular patch set has been ready for a month, it's not that much
time to apply patches. Liao's patch set is even more stale. And for
the latter one I did give you a ping as well ([2]), just in case it
slipped through the cracks. That wasn't enough, unfortunately.

I'm not going to advise you on handling emails out of respect, sorry.
I'm sure you can figure it out. But if you feel overloaded and
overwhelmed, consider not *gaining* more responsibilities, like what
happened with the uprobe subsystem ([2]). Work can be shared,
delegated, and, sometimes, maybe just be "let go" and trust others to
do the right thing.

  [0] https://lore.kernel.org/bpf/20241116194202.GR22801@noisy.programming.kicks-ass.net/
  [1] https://lore.kernel.org/bpf/20241119111809.GB2328@noisy.programming.kicks-ass.net/
  [2] https://lore.kernel.org/linux-trace-kernel/CAEf4BzY-0Eu27jyT_s2kRO1UuUPOkE9_SRrBOqu2gJfmxsv+3A@mail.gmail.com/
  [3] https://lore.kernel.org/all/172074397710.247544.17045299807723238107.stgit@devnote2/

Andrii Nakryiko Nov. 20, 2024, 5:23 p.m. UTC | #7

On Wed, Nov 20, 2024 at 8:03 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
> > On Wed, Nov 20, 2024 at 07:40:15AM -0800, Andrii Nakryiko wrote:
> > > Linus,
> > >
> > > I'm not sure what's going on here, this patch set seems to be in some
> > > sort of "ignore list" on Peter's side with no indication on its
> > > destiny.
> >
> > *sigh* it is not, but my inbox is like drinking from a firehose :/
>
> And I've been considering that particular series WIP for two reasons:
>
>  1) Oleg was still unconvinced about patch 5/5 in the v2 discussion.
>     Upon re-reading it I think he might have come around and has agreed
>     to the current approach - but sending a v3 & not seeing Oleg object
>     would ascertain that.

Is this about Liao's siglock patch set? We are at v4 (!) already (see
[0]) with Oleg's Acked-by added.

>
>  2) There was a build failure reported against -v2 at:
>
>        https://lore.kernel.org/all/202410050745.2Nuvusy4-lkp@intel.com/t.mbox.gz
>
>     We cannot and will not merge patches with build failures.

This one is about this patch set (speculative uprobe lookup), right?
It is already at v4 ([1]), while you are mentioning v2 as the reason
for this to not yet be applied. Those build failures were fixed *a
long time ago*, v4 itself has been sitting idle for almost a month
(since Oct 27). If there are any other problems, do bring them up,
don't wait for weeks.

>
> Andrii did get some other uprobes scalability work merged in v6.13:
>
>     - Switch to RCU Tasks Trace flavor for better performance (Andrii Nakryiko)
>
>     - Massively increase uretprobe SMP scalability by SRCU-protecting
>       the uretprobe lifetime (Andrii Nakryiko)
>
> So we've certainly not been ignoring his patches, to the contrary ...

Yes, and as I mentioned, this one is a) ready, reviewed, tested and b)
complements the other work you mention. It removes mmap_lock which
limits scalability of the rest of the work. Is there some rule that I
get to land only two patch sets in a single release?

>
> Thanks,
>
>         Ingo


[0] https://lore.kernel.org/linux-trace-kernel/20241022073141.3291245-1-liaochang1@huawei.com/
[1] https://lore.kernel.org/linux-trace-kernel/20241028010818.2487581-1-andrii@kernel.org/
[2] https://lore.kernel.org/linux-trace-kernel/CAEf4BzYPajbgyvcvm7z1EiPgkee1D1r=a8gaqxzd7k13gh9Uzw@mail.gmail.com/
[3] https://lore.kernel.org/linux-trace-kernel/CAEf4Bza=pwrZvd+3dz-a7eiAQMk9rwBDO1Kk_iwXSCM70CAARw@mail.gmail.com/

Ingo Molnar Nov. 21, 2024, 9:33 a.m. UTC | #8

* Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> Is this about Liao's siglock patch set? We are at v4 (!) already (see 
> [0]) with Oleg's Acked-by added.

AFAICS you didn't Cc: me to -v3 and -v4 - and while I'll generally see 
those patches too, eventually, there's a delay.

> > Andrii did get some other uprobes scalability work merged in v6.13:
> >
> >     - Switch to RCU Tasks Trace flavor for better performance 
> >     (Andrii Nakryiko)
> >
> >     - Massively increase uretprobe SMP scalability by 
> >       SRCU-protecting the uretprobe lifetime (Andrii Nakryiko)
> >
> > So we've certainly not been ignoring his patches, to the contrary 
> > ...
> 
> Yes, and as I mentioned, this one is a) ready, reviewed, tested and 
> b) complements the other work you mention.

Sorry, but patchsets that didn't even build a few weeks before the 
development window closed are generally pushed further down the 
backlog. Think of this as rate-limiting the risk of potentially broken 
code entering the kernel. You can avoid this problem by doing more 
testing, or by accepting that sometimes one more cycle is needed to get 
your patchsets merged.

> [...] It removes mmap_lock which limits scalability of the rest of 
> the work. Is there some rule that I get to land only two patch sets 
> in a single release?

Your facetous question and the hostile tone of your emails is not 
appreciated.

Me pointing out that two other patchsets of yours got integrated simply 
demonstrates how your original complaint of an 'ignore list' is not 
just unprofessional on its face, but also demonstrably unfair:

> > > > I'm not sure what's going on here, this patch set seems to be 
> > > > in some sort of "ignore list" on Peter's side with no 
> > > > indication on its destiny.

Trying to pressure maintainers over a patchset that recently had build 
failures isn't going to get your patches applied faster.

Thanks,

	Ingo

Andrii Nakryiko Nov. 21, 2024, 2:43 p.m. UTC | #9

cc'ing Liao for awareness (should have done it earlier, sorry)

On Thu, Nov 21, 2024 at 1:33 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > Is this about Liao's siglock patch set? We are at v4 (!) already (see
> > [0]) with Oleg's Acked-by added.
>
> AFAICS you didn't Cc: me to -v3 and -v4 - and while I'll generally see
> those patches too, eventually, there's a delay.

Ok, I think we are now switching to my patch set here ([0]), because
Liao's v4 ([1]) does have mingo@redhat.com in CC. So, on Liao's
behalf, there wasn't really anything specific pointed out that would
explain a month's delay.

But let's switch to my patch set. Yes, my bad, I didn't CC you
directly, that wasn't in any way intentional, and that's my bad and I
will make sure to CC you on every patch for uprobes subsystem, even
though you are not explicitly listed in UPROBES section of MAINTAINERS
([2]), and Peter was the one who was handling all the uprobe-related
stuff since before this summer.

But let's please not randomly jump between discussing two separate
patch sets here, it's confusing.

>
> > > Andrii did get some other uprobes scalability work merged in v6.13:
> > >
> > >     - Switch to RCU Tasks Trace flavor for better performance
> > >     (Andrii Nakryiko)
> > >
> > >     - Massively increase uretprobe SMP scalability by
> > >       SRCU-protecting the uretprobe lifetime (Andrii Nakryiko)
> > >
> > > So we've certainly not been ignoring his patches, to the contrary
> > > ...
> >
> > Yes, and as I mentioned, this one is a) ready, reviewed, tested and
> > b) complements the other work you mention.
>
> Sorry, but patchsets that didn't even build a few weeks before the
> development window closed are generally pushed further down the
> backlog. Think of this as rate-limiting the risk of potentially broken
> code entering the kernel. You can avoid this problem by doing more
> testing, or by accepting that sometimes one more cycle is needed to get
> your patchsets merged.

Those build failures were happening in stub implementations, which
were used only on !CONFIG_PER_VMA_LOCK configuration, which I'm not
even sure if possible to get on x86-64. When I tried to reproduce this
locally I couldn't even get such configuration. Thankfully we have a
kernel test robot that would test multiple architectures and
configurations and it did catch it on i386 and loongarch64. And was
fixed quickly.

It doesn't seem fair or reasonable to penalize patch sets for *weeks*,
silently and without any notice just for this, IMO.

The patch set was very thoroughly tested, actually. Not just building,
but also running various unit tests (BPF selftests in particular). But
even more so, I built an entire uprobe stress-testing tool just to
test all my uprobe-related. I deployed custom kernels and ran these
stress tests on all uprobe patch sets and their revisions, over many
hours.

Sure, my main platform is x86-64, so that's where all the testing was
done. But you can't accuse me of negligence.

>
> > [...] It removes mmap_lock which limits scalability of the rest of
> > the work. Is there some rule that I get to land only two patch sets
> > in a single release?
>
> Your facetous question and the hostile tone of your emails is not
> appreciated.

I'm sticking to the facts in these emails. And when I get a response
in the style of "you got two patch sets in, why are you complaining",
that's not exactly friendly and fair. I put a lot of effort and time
not just into producing and testing all those patches, but also into
the logistics of it, coordinating with other people working within
uprobes subsystem.

And instead of accusations, I'd like to get an understanding of
expectations I can have in terms of handling patches. Being ignored
for many weeks is not OK. If you don't like something about what I do
or how, procedurally or technically, please call it out and I'll try
to fix whatever the problem might be. Silent treatment is not
productive.

But while on the topic. Those two patch sets you mentioned didn't go
in smoothly and quickly either. "Switch to RCU Tasks Trace flavor for
better performance" in particular should have gone in with the
original patch set one release earlier. But instead that patch was
dropped from the tree after applying it. Silently. I was not notified
at all (5 days that passed before I asked would be plenty of time to
mention this, IMO). It's good I noticed this, inquired with an email
reply (after making sure it's not some transient patch handling
issue), and only after that I got a reply that there was a build
failure I had to fix. You can see the thread at [4].

>
> Me pointing out that two other patchsets of yours got integrated simply
> demonstrates how your original complaint of an 'ignore list' is not
> just unprofessional on its face, but also demonstrably unfair:
>
> > > > > I'm not sure what's going on here, this patch set seems to be
> > > > > in some sort of "ignore list" on Peter's side with no
> > > > > indication on its destiny.

My "ignore list" complaint is specifically about this patch set, which
I explicitly stated above in the quote you provided. So yes, it's a
professional, and demonstrably fair statement, and I provided the
timeline and supporting links.

>
> Trying to pressure maintainers over a patchset that recently had build
> failures isn't going to get your patches applied faster.
>

I'm not asking to apply my patches blindly without critical review or
anything like that. I'm not expecting reviews or even just email
replies within a few days of posting. I *do* expect some sort of
reaction, though, and not after many weeks of inactivity and pinging
from my side, yes. And note, I got replies only after sending an email
straight to Linus.

I'm not pressuring anyone into anything. But as a maintainer myself, I
do think that being a maintainer is not just about having rights, it's
also about responsibilities.

Let's please stop with the excuses and instead discuss constructive solutions.

> Thanks,
>
>         Ingo

[0] https://lore.kernel.org/linux-trace-kernel/20241028010818.2487581-1-andrii@kernel.org/
[1] https://lore.kernel.org/linux-trace-kernel/20241022073141.3291245-1-liaochang1@huawei.com/
[2] https://lore.kernel.org/all/172074397710.247544.17045299807723238107.stgit@devnote2/
[3] https://github.com/libbpf/libbpf-bootstrap/commit/2f88cef90f9728ec8c7bee7bd48fdbcf197806c3
[4] https://lore.kernel.org/all/CAEf4BzZihPPiReE3anhrVOzjoZW5v4vFVouK_Arm8vJexCTT4g@mail.gmail.com/

[v4,tip/perf/core,0/4] uprobes,mm: speculative lockless VMA-to-uprobe lookup

Message

Comments