[net-next,0/5] net: add refcnt tracking for some common objects

Message ID	cover.1638849511.git.lucien.xin@gmail.com (mailing list archive)
Headers	show Return-Path: <netdev-owner@kernel.org> From: Xin Long <lucien.xin@gmail.com> To: network dev <netdev@vger.kernel.org> Cc: Eric Dumazet <edumazet@google.com>, davem@davemloft.net, kuba@kernel.org, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>, Davide Caratti <dcaratti@redhat.com>, Paolo Abeni <pabeni@redhat.com> Subject: [PATCH net-next 0/5] net: add refcnt tracking for some common objects Date: Mon, 6 Dec 2021 23:02:03 -0500 Message-Id: <cover.1638849511.git.lucien.xin@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	net: add refcnt tracking for some common objects \| expand [net-next,0/5] net: add refcnt tracking for some common objects [net-next,1/5] lib: add obj_cnt infrastructure [net-next,2/5] net: track netdev refcnt with obj_cnt [net-next,3/5] net: track dst refcnt with obj_cnt [net-next,4/5] net: track in6_dev refcnt with obj_cnt [net-next,5/5] net: track xfrm_state refcnt with obj_cnt

Message ID

cover.1638849511.git.lucien.xin@gmail.com (mailing list archive)

Headers

From: Xin Long <lucien.xin@gmail.com>
To: network dev <netdev@vger.kernel.org>
Cc: Eric Dumazet <edumazet@google.com>, davem@davemloft.net,
        kuba@kernel.org,
        Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
        Davide Caratti <dcaratti@redhat.com>,
        Paolo Abeni <pabeni@redhat.com>
Subject: [PATCH net-next 0/5] net: add refcnt tracking for some common objects
Date: Mon,  6 Dec 2021 23:02:03 -0500
Message-Id: <cover.1638849511.git.lucien.xin@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

net: add refcnt tracking for some common objects | expand

Message

Xin Long Dec. 7, 2021, 4:02 a.m. UTC

This patchset provides a simple lib(obj_cnt) to count the operatings on any
objects, and saves them into a gobal hashtable. Each node in this hashtable
can be identified with a calltrace and an object pointer. A calltrace could
be a function called from somewhere, like dev_hold() called by:

    inetdev_init+0xff/0x1c0
    inetdev_event+0x4b7/0x600
    raw_notifier_call_chain+0x41/0x50
    register_netdevice+0x481/0x580

and an object pointer would be the dev that this function is accessing:

    dev_hold(dev).

When this call comes to this object, a node including calltrace + object +
counter will be created if it doesn't exist, and the counter in this node
will increment if it already exists. Pretty simple.

So naturally this lib can be used to track the refcnt of any objects, all
it has to do is put obj_cnt_track() to the place where this object is
held or put. It will count how many times this call has operated this
object after checking if this object and this type(hold/put) accessing
are being tracked.

After the 1st lib patch, the other patches add the refcnt tracking for
netdev, dst, in6_dev and xfrm_state, and each has example how to use
in the changelog. The common use is:

    # sysctl -w obj_cnt.control="clear" # clear the old result

    # sysctl -w obj_cnt.type=0x1     # track type 0x1 operating
    # sysctl -w obj_cnt.name=test    # match name == test or
    # sysctl -w obj_cnt.index=1      # match index == 1
    # sysctl -w obj_cnt.nr_entries=4 # save 4 frames' calltrace

    ... (reproduce the issue)

    # sysctl -w obj_cnt.control="scan"  # print the new result

Note that after seeing Eric's another patchset for refcnt tracking I
decided to post this patchset. As in this implemenation, it has some
benefits which I think worth sharing:

  - it runs fast:
    1. it doesn't create nodes for the repeatitive calls to the same
       objects, and it saves memory and time.
    2. the depth of the calltrace to record is configurable, at most
       time small calltrace also saves memory and time, but will not
       affect the analysis.
    3. kmem_cache used also contributes to the performance.

  - easy to use:
    1. it doesn't add any members to the object structure, just place
       an API to the hold/put functions, and it keep the kernel code
       clear and won't break any ABIs.
    2. three types of matching conditions for tracking can be set up,
       int, string by sysctl and API, and pointer by API.

This patchset has helped solve quite some refcnt leaks, from netdev to
dst, in6_dev, xfrm_dst. There are also some difficult cases that we've
addressed with this pathset:

  - some leaks were only reproduciable in customer's environment by running
    for a couple of months, "probe" data was even too huge to save and
    analyse, so saving memory is crucial.

  - some are not able to reproduce if the tracking patch worked slowly,
    like not using kmem cache, so running fast is important.

  - some leak was a chain, such as a leak we see it as a dev leak, but it
    was caused by a dst or in6_dev leak, and this dst or in6_dev leak was
    caused by another object, so tracking multiple types at the same time
    is effective.

Xin Long (5):
  lib: add obj_cnt infrastructure
  net: track netdev refcnt with obj_cnt
  net: track dst refcnt with obj_cnt
  net: track in6_dev refcnt with obj_cnt
  net: track xfrm_state refcnt with obj_cnt

 include/linux/netdevice.h |  11 ++
 include/linux/obj_cnt.h   |  20 +++
 include/net/addrconf.h    |   7 +-
 include/net/dst.h         |   8 +-
 include/net/sock.h        |   3 +-
 include/net/xfrm.h        |  11 ++
 lib/Kconfig.debug         |   7 +
 lib/Makefile              |   1 +
 lib/obj_cnt.c             | 285 ++++++++++++++++++++++++++++++++++++++
 net/core/dst.c            |   2 +
 10 files changed, 352 insertions(+), 3 deletions(-)
 create mode 100644 include/linux/obj_cnt.h
 create mode 100644 lib/obj_cnt.c

Comments

Eric Dumazet Dec. 7, 2021, 4:41 a.m. UTC | #1

On Mon, Dec 6, 2021 at 8:02 PM Xin Long <lucien.xin@gmail.com> wrote:
>
> This patchset provides a simple lib(obj_cnt) to count the operatings on any
> objects, and saves them into a gobal hashtable. Each node in this hashtable
> can be identified with a calltrace and an object pointer. A calltrace could
> be a function called from somewhere, like dev_hold() called by:
>
>     inetdev_init+0xff/0x1c0
>     inetdev_event+0x4b7/0x600
>     raw_notifier_call_chain+0x41/0x50
>     register_netdevice+0x481/0x580
>
> and an object pointer would be the dev that this function is accessing:
>
>     dev_hold(dev).
>
> When this call comes to this object, a node including calltrace + object +
> counter will be created if it doesn't exist, and the counter in this node
> will increment if it already exists. Pretty simple.
>
> So naturally this lib can be used to track the refcnt of any objects, all
> it has to do is put obj_cnt_track() to the place where this object is
> held or put. It will count how many times this call has operated this
> object after checking if this object and this type(hold/put) accessing
> are being tracked.
>
> After the 1st lib patch, the other patches add the refcnt tracking for
> netdev, dst, in6_dev and xfrm_state, and each has example how to use
> in the changelog. The common use is:
>
>     # sysctl -w obj_cnt.control="clear" # clear the old result
>
>     # sysctl -w obj_cnt.type=0x1     # track type 0x1 operating
>     # sysctl -w obj_cnt.name=test    # match name == test or
>     # sysctl -w obj_cnt.index=1      # match index == 1
>     # sysctl -w obj_cnt.nr_entries=4 # save 4 frames' calltrace
>
>     ... (reproduce the issue)
>
>     # sysctl -w obj_cnt.control="scan"  # print the new result
>
> Note that after seeing Eric's another patchset for refcnt tracking I
> decided to post this patchset. As in this implemenation, it has some
> benefits which I think worth sharing:
>

How can your code coexist with ref_tracker ?

>   - it runs fast:
>     1. it doesn't create nodes for the repeatitive calls to the same
>        objects, and it saves memory and time.
>     2. the depth of the calltrace to record is configurable, at most
>        time small calltrace also saves memory and time, but will not
>        affect the analysis.
>     3. kmem_cache used also contributes to the performance.

Points 2/3 can be implemented right away in the ref_tracker infra,
please send patches.

Quite frankly using a global hash table seems wrong, stack_depot
already has this logic, why reimplement it ?
stack_depot is damn fast (no spinlock in fast path)

Seeing that your patches add chunks in lib/obj_cnt.c, I do not see how
you can claim this is generic code.

I don't know, it seems very strange to send this patch series now I
have done about 60 patches on these issues.

And by doing this work, I found already two bugs in our stack.

You can be sure syzbot will send us many reports, most syzbot repros
use a very limited number of objects.

About performance : You use a single spinlock to protect your hash table.
In my implementation, there is a spinlock per 'directory (eg one
spinlock per struct net_device, one spinlock per struct net), it is
more scalable.

My tests have not shown a significant cost of the ref_tracker
(the major cost comes from stack_trace_save() which you also use)

Xin Long Dec. 7, 2021, 6:17 p.m. UTC | #2

On Mon, Dec 6, 2021 at 11:41 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Dec 6, 2021 at 8:02 PM Xin Long <lucien.xin@gmail.com> wrote:
> >
> > This patchset provides a simple lib(obj_cnt) to count the operatings on any
> > objects, and saves them into a gobal hashtable. Each node in this hashtable
> > can be identified with a calltrace and an object pointer. A calltrace could
> > be a function called from somewhere, like dev_hold() called by:
> >
> >     inetdev_init+0xff/0x1c0
> >     inetdev_event+0x4b7/0x600
> >     raw_notifier_call_chain+0x41/0x50
> >     register_netdevice+0x481/0x580
> >
> > and an object pointer would be the dev that this function is accessing:
> >
> >     dev_hold(dev).
> >
> > When this call comes to this object, a node including calltrace + object +
> > counter will be created if it doesn't exist, and the counter in this node
> > will increment if it already exists. Pretty simple.
> >
> > So naturally this lib can be used to track the refcnt of any objects, all
> > it has to do is put obj_cnt_track() to the place where this object is
> > held or put. It will count how many times this call has operated this
> > object after checking if this object and this type(hold/put) accessing
> > are being tracked.
> >
> > After the 1st lib patch, the other patches add the refcnt tracking for
> > netdev, dst, in6_dev and xfrm_state, and each has example how to use
> > in the changelog. The common use is:
> >
> >     # sysctl -w obj_cnt.control="clear" # clear the old result
> >
> >     # sysctl -w obj_cnt.type=0x1     # track type 0x1 operating
> >     # sysctl -w obj_cnt.name=test    # match name == test or
> >     # sysctl -w obj_cnt.index=1      # match index == 1
> >     # sysctl -w obj_cnt.nr_entries=4 # save 4 frames' calltrace
> >
> >     ... (reproduce the issue)
> >
> >     # sysctl -w obj_cnt.control="scan"  # print the new result
> >
> > Note that after seeing Eric's another patchset for refcnt tracking I
> > decided to post this patchset. As in this implemenation, it has some
> > benefits which I think worth sharing:
> >
>
> How can your code coexist with ref_tracker ?
Hi, Eric, Thanks for your checking

It won't affect ref_tracker, one can even use both at the same time.

>
> >   - it runs fast:
> >     1. it doesn't create nodes for the repeatitive calls to the same
> >        objects, and it saves memory and time.
> >     2. the depth of the calltrace to record is configurable, at most
> >        time small calltrace also saves memory and time, but will not
> >        affect the analysis.
> >     3. kmem_cache used also contributes to the performance.
>
> Points 2/3 can be implemented right away in the ref_tracker infra,
> please send patches.
>
> Quite frankly using a global hash table seems wrong, stack_depot
> already has this logic, why reimplement it ?
> stack_depot is damn fast (no spinlock in fast path)
What this patchset is trying to add is a calltrace+object counter.
I was looking at stack_depot after seeing you patch, stack_depot saves
calltrace only, no object(I guess this is okay, I can save object to
to entries[0] if I want to use it), but also it's not a counter.

I'm not sure if it's allowed to do some change and add a counter to
the node of stack_depot, like when it's found in saving, the counter
increments. That will be perfect for this patchset.

This global spinlock will eventually be used only to protect the new
node's insertion. For the fast path (lookup), rcu_read_lock() will take
care of it. I haven't got time to add it. but this won't be a problem.

>
> Seeing that your patches add chunks in lib/obj_cnt.c, I do not see how
> you can claim this is generic code.
I planned it as a obj operating counter, it can be used for counting any
operatings, not just for the refcnt tracker which is only _put and _hold
operatings.

>
> I don't know, it seems very strange to send this patch series now I
> have done about 60 patches on these issues.
This patch is not to do exactly the same things as your patchset, I think your
patch saves more information into the objects in the kernel memory, it will
be useful for vmcore analysis.

This patchset is working in a different way, it's going to target a
specific object with index or name or pointer matched and some types
of function calls to it, we have to plan in advance after we know
which object (like it's name, index or string to match) is leaked.

>
> And by doing this work, I found already two bugs in our stack.
Great effects!
I can see that you must go over all networking stack for dev operations.

>
> You can be sure syzbot will send us many reports, most syzbot repros
> use a very limited number of objects.
>
> About performance : You use a single spinlock to protect your hash table.
> In my implementation, there is a spinlock per 'directory (eg one
> spinlock per struct net_device, one spinlock per struct net), it is
> more scalable.
I used per net spinlock at first, but I want to make the code more generic,
and not only for the network, then I decided to make it not related to net.

After using rcu_lock in the fast path, I think this single spinlock won't
affect much, besides, this single lock can be replaced by a per hlist lock
on each hlist_head, it will also save some.

>
> My tests have not shown a significant cost of the ref_tracker
> (the major cost comes from stack_trace_save() which you also use)
I added "run fast" in cover, mostly because it won't create many nodes
if dev_hold/put are called many times, it only increments the count if it's
the same call to the same object already existing in the hashtable.

dev could be fine, thinking about tracking dst, when sending packets, dst
can be hold/put too many times, creating nodes for each call is not a good
idea, especially for some leak only occurs once for few months which I've
seen quite a few times in our customer envs.


Other things are:
most net_dev leaks I've met are actually dst leak, some are in6_dev leak,
only tracking net_dev is not enough to address it, do you have plans to
add dst track for dst too? That may be a lot of changes too?

I think adding new members into core structures only for debugging
may not be a good choice, as it will bring troubles to downstream for
the backport because of kABI.

Thanks.

Eric Dumazet Dec. 7, 2021, 6:34 p.m. UTC | #3

On Tue, Dec 7, 2021 at 10:17 AM Xin Long <lucien.xin@gmail.com> wrote:
>
> On Mon, Dec 6, 2021 at 11:41 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Dec 6, 2021 at 8:02 PM Xin Long <lucien.xin@gmail.com> wrote:
> > >
> > > This patchset provides a simple lib(obj_cnt) to count the operatings on any
> > > objects, and saves them into a gobal hashtable. Each node in this hashtable
> > > can be identified with a calltrace and an object pointer. A calltrace could
> > > be a function called from somewhere, like dev_hold() called by:
> > >
> > >     inetdev_init+0xff/0x1c0
> > >     inetdev_event+0x4b7/0x600
> > >     raw_notifier_call_chain+0x41/0x50
> > >     register_netdevice+0x481/0x580
> > >
> > > and an object pointer would be the dev that this function is accessing:
> > >
> > >     dev_hold(dev).
> > >
> > > When this call comes to this object, a node including calltrace + object +
> > > counter will be created if it doesn't exist, and the counter in this node
> > > will increment if it already exists. Pretty simple.
> > >
> > > So naturally this lib can be used to track the refcnt of any objects, all
> > > it has to do is put obj_cnt_track() to the place where this object is
> > > held or put. It will count how many times this call has operated this
> > > object after checking if this object and this type(hold/put) accessing
> > > are being tracked.
> > >
> > > After the 1st lib patch, the other patches add the refcnt tracking for
> > > netdev, dst, in6_dev and xfrm_state, and each has example how to use
> > > in the changelog. The common use is:
> > >
> > >     # sysctl -w obj_cnt.control="clear" # clear the old result
> > >
> > >     # sysctl -w obj_cnt.type=0x1     # track type 0x1 operating
> > >     # sysctl -w obj_cnt.name=test    # match name == test or
> > >     # sysctl -w obj_cnt.index=1      # match index == 1
> > >     # sysctl -w obj_cnt.nr_entries=4 # save 4 frames' calltrace
> > >
> > >     ... (reproduce the issue)
> > >
> > >     # sysctl -w obj_cnt.control="scan"  # print the new result
> > >
> > > Note that after seeing Eric's another patchset for refcnt tracking I
> > > decided to post this patchset. As in this implemenation, it has some
> > > benefits which I think worth sharing:
> > >
> >
> > How can your code coexist with ref_tracker ?
> Hi, Eric, Thanks for your checking
>
> It won't affect ref_tracker, one can even use both at the same time.
>
> >
> > >   - it runs fast:
> > >     1. it doesn't create nodes for the repeatitive calls to the same
> > >        objects, and it saves memory and time.
> > >     2. the depth of the calltrace to record is configurable, at most
> > >        time small calltrace also saves memory and time, but will not
> > >        affect the analysis.
> > >     3. kmem_cache used also contributes to the performance.
> >
> > Points 2/3 can be implemented right away in the ref_tracker infra,
> > please send patches.
> >
> > Quite frankly using a global hash table seems wrong, stack_depot
> > already has this logic, why reimplement it ?
> > stack_depot is damn fast (no spinlock in fast path)
> What this patchset is trying to add is a calltrace+object counter.
> I was looking at stack_depot after seeing you patch, stack_depot saves
> calltrace only, no object(I guess this is okay, I can save object to
> to entries[0] if I want to use it), but also it's not a counter.
>
> I'm not sure if it's allowed to do some change and add a counter to
> the node of stack_depot, like when it's found in saving, the counter
> increments. That will be perfect for this patchset.
>
> This global spinlock will eventually be used only to protect the new
> node's insertion. For the fast path (lookup), rcu_read_lock() will take
> care of it. I haven't got time to add it. but this won't be a problem.
>
> >
> > Seeing that your patches add chunks in lib/obj_cnt.c, I do not see how
> > you can claim this is generic code.
> I planned it as a obj operating counter, it can be used for counting any
> operatings, not just for the refcnt tracker which is only _put and _hold
> operatings.
>
> >
> > I don't know, it seems very strange to send this patch series now I
> > have done about 60 patches on these issues.
> This patch is not to do exactly the same things as your patchset, I think your
> patch saves more information into the objects in the kernel memory, it will
> be useful for vmcore analysis.
>
> This patchset is working in a different way, it's going to target a
> specific object with index or name or pointer matched and some types
> of function calls to it, we have to plan in advance after we know
> which object (like it's name, index or string to match) is leaked.
>
> >
> > And by doing this work, I found already two bugs in our stack.
> Great effects!
> I can see that you must go over all networking stack for dev operations.
>
> >
> > You can be sure syzbot will send us many reports, most syzbot repros
> > use a very limited number of objects.
> >
> > About performance : You use a single spinlock to protect your hash table.
> > In my implementation, there is a spinlock per 'directory (eg one
> > spinlock per struct net_device, one spinlock per struct net), it is
> > more scalable.
> I used per net spinlock at first, but I want to make the code more generic,
> and not only for the network, then I decided to make it not related to net.
>
> After using rcu_lock in the fast path, I think this single spinlock won't
> affect much, besides, this single lock can be replaced by a per hlist lock
> on each hlist_head, it will also save some.
>
> >
> > My tests have not shown a significant cost of the ref_tracker
> > (the major cost comes from stack_trace_save() which you also use)
> I added "run fast" in cover, mostly because it won't create many nodes
> if dev_hold/put are called many times, it only increments the count if it's
> the same call to the same object already existing in the hashtable.
>
> dev could be fine, thinking about tracking dst, when sending packets, dst
> can be hold/put too many times, creating nodes for each call is not a good
> idea, especially for some leak only occurs once for few months which I've
> seen quite a few times in our customer envs.

That is why I chose to not automatically track all dev_hold()/dev_put()

For the ones that are contained in a short function, with clear
contract, there is
no need for adding tracking.

But before taking this decision, I am looking at each case, very carefully.

>
>
> Other things are:
> most net_dev leaks I've met are actually dst leak, some are in6_dev leak,
> only tracking net_dev is not enough to address it, do you have plans to
> add dst track for dst too? That may be a lot of changes too?

Yes, the plan is to add trackers when we think there is a high
probability of having leaks.

I think you missed the one major point of the exercise :

By carefully doing the pairing, we can spot bugs already, even without
turning the CONFIG_ options.

Once this (patient) work done, it is done, the infra will catch future
bugs right away.

>
> I think adding new members into core structures only for debugging
> may not be a good choice, as it will bring troubles to downstream for
> the backport because of kABI.

The main point of this stuff, is to allow syzbot to simply turn on a
few CONFIG options and let
it discover past/new issues. No special sysctls magics.

I see you added introspection, to list all active references, I think
this could be added on a per object
basis, to help developers have ways to better understand the kernel behavior.

Again, I think there is no reason your work can not complement mine.
They serve different purposes.

Xin Long Dec. 7, 2021, 9:58 p.m. UTC | #4

On Tue, Dec 7, 2021 at 1:34 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Dec 7, 2021 at 10:17 AM Xin Long <lucien.xin@gmail.com> wrote:
> >
> > On Mon, Dec 6, 2021 at 11:41 PM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Mon, Dec 6, 2021 at 8:02 PM Xin Long <lucien.xin@gmail.com> wrote:
> > > >
> > > > This patchset provides a simple lib(obj_cnt) to count the operatings on any
> > > > objects, and saves them into a gobal hashtable. Each node in this hashtable
> > > > can be identified with a calltrace and an object pointer. A calltrace could
> > > > be a function called from somewhere, like dev_hold() called by:
> > > >
> > > >     inetdev_init+0xff/0x1c0
> > > >     inetdev_event+0x4b7/0x600
> > > >     raw_notifier_call_chain+0x41/0x50
> > > >     register_netdevice+0x481/0x580
> > > >
> > > > and an object pointer would be the dev that this function is accessing:
> > > >
> > > >     dev_hold(dev).
> > > >
> > > > When this call comes to this object, a node including calltrace + object +
> > > > counter will be created if it doesn't exist, and the counter in this node
> > > > will increment if it already exists. Pretty simple.
> > > >
> > > > So naturally this lib can be used to track the refcnt of any objects, all
> > > > it has to do is put obj_cnt_track() to the place where this object is
> > > > held or put. It will count how many times this call has operated this
> > > > object after checking if this object and this type(hold/put) accessing
> > > > are being tracked.
> > > >
> > > > After the 1st lib patch, the other patches add the refcnt tracking for
> > > > netdev, dst, in6_dev and xfrm_state, and each has example how to use
> > > > in the changelog. The common use is:
> > > >
> > > >     # sysctl -w obj_cnt.control="clear" # clear the old result
> > > >
> > > >     # sysctl -w obj_cnt.type=0x1     # track type 0x1 operating
> > > >     # sysctl -w obj_cnt.name=test    # match name == test or
> > > >     # sysctl -w obj_cnt.index=1      # match index == 1
> > > >     # sysctl -w obj_cnt.nr_entries=4 # save 4 frames' calltrace
> > > >
> > > >     ... (reproduce the issue)
> > > >
> > > >     # sysctl -w obj_cnt.control="scan"  # print the new result
> > > >
> > > > Note that after seeing Eric's another patchset for refcnt tracking I
> > > > decided to post this patchset. As in this implemenation, it has some
> > > > benefits which I think worth sharing:
> > > >
> > >
> > > How can your code coexist with ref_tracker ?
> > Hi, Eric, Thanks for your checking
> >
> > It won't affect ref_tracker, one can even use both at the same time.
> >
> > >
> > > >   - it runs fast:
> > > >     1. it doesn't create nodes for the repeatitive calls to the same
> > > >        objects, and it saves memory and time.
> > > >     2. the depth of the calltrace to record is configurable, at most
> > > >        time small calltrace also saves memory and time, but will not
> > > >        affect the analysis.
> > > >     3. kmem_cache used also contributes to the performance.
> > >
> > > Points 2/3 can be implemented right away in the ref_tracker infra,
> > > please send patches.
> > >
> > > Quite frankly using a global hash table seems wrong, stack_depot
> > > already has this logic, why reimplement it ?
> > > stack_depot is damn fast (no spinlock in fast path)
> > What this patchset is trying to add is a calltrace+object counter.
> > I was looking at stack_depot after seeing you patch, stack_depot saves
> > calltrace only, no object(I guess this is okay, I can save object to
> > to entries[0] if I want to use it), but also it's not a counter.
> >
> > I'm not sure if it's allowed to do some change and add a counter to
> > the node of stack_depot, like when it's found in saving, the counter
> > increments. That will be perfect for this patchset.
> >
> > This global spinlock will eventually be used only to protect the new
> > node's insertion. For the fast path (lookup), rcu_read_lock() will take
> > care of it. I haven't got time to add it. but this won't be a problem.
> >
> > >
> > > Seeing that your patches add chunks in lib/obj_cnt.c, I do not see how
> > > you can claim this is generic code.
> > I planned it as a obj operating counter, it can be used for counting any
> > operatings, not just for the refcnt tracker which is only _put and _hold
> > operatings.
> >
> > >
> > > I don't know, it seems very strange to send this patch series now I
> > > have done about 60 patches on these issues.
> > This patch is not to do exactly the same things as your patchset, I think your
> > patch saves more information into the objects in the kernel memory, it will
> > be useful for vmcore analysis.
> >
> > This patchset is working in a different way, it's going to target a
> > specific object with index or name or pointer matched and some types
> > of function calls to it, we have to plan in advance after we know
> > which object (like it's name, index or string to match) is leaked.
> >
> > >
> > > And by doing this work, I found already two bugs in our stack.
> > Great effects!
> > I can see that you must go over all networking stack for dev operations.
> >
> > >
> > > You can be sure syzbot will send us many reports, most syzbot repros
> > > use a very limited number of objects.
> > >
> > > About performance : You use a single spinlock to protect your hash table.
> > > In my implementation, there is a spinlock per 'directory (eg one
> > > spinlock per struct net_device, one spinlock per struct net), it is
> > > more scalable.
> > I used per net spinlock at first, but I want to make the code more generic,
> > and not only for the network, then I decided to make it not related to net.
> >
> > After using rcu_lock in the fast path, I think this single spinlock won't
> > affect much, besides, this single lock can be replaced by a per hlist lock
> > on each hlist_head, it will also save some.
> >
> > >
> > > My tests have not shown a significant cost of the ref_tracker
> > > (the major cost comes from stack_trace_save() which you also use)
> > I added "run fast" in cover, mostly because it won't create many nodes
> > if dev_hold/put are called many times, it only increments the count if it's
> > the same call to the same object already existing in the hashtable.
> >
> > dev could be fine, thinking about tracking dst, when sending packets, dst
> > can be hold/put too many times, creating nodes for each call is not a good
> > idea, especially for some leak only occurs once for few months which I've
> > seen quite a few times in our customer envs.
>
> That is why I chose to not automatically track all dev_hold()/dev_put()
>
> For the ones that are contained in a short function, with clear
> contract, there is
> no need for adding tracking.
>
> But before taking this decision, I am looking at each case, very carefully.
OK, that's quite a lot of effort.
I'm sure you have a strong code review skill on networking code.

>
> >
> >
> > Other things are:
> > most net_dev leaks I've met are actually dst leak, some are in6_dev leak,
> > only tracking net_dev is not enough to address it, do you have plans to
> > add dst track for dst too? That may be a lot of changes too?
>
> Yes, the plan is to add trackers when we think there is a high
> probability of having leaks.
>
> I think you missed the one major point of the exercise :
>
> By carefully doing the pairing, we can spot bugs already, even without
> turning the CONFIG_ options.
>
> Once this (patient) work done, it is done, the infra will catch future
> bugs right away.
It seems to me the phase of your patches' development is more fun.
For this patchset, it's the analysis part, we will do the pairing with the
results, and check which _hold and _put didn't come in pairs.

But I believe that refcnt track for netdev only is not enough, you might
eventually see the leak be caused by a missing dst_destroy() call, for
example.

>
> >
> > I think adding new members into core structures only for debugging
> > may not be a good choice, as it will bring troubles to downstream for
> > the backport because of kABI.
>
> The main point of this stuff, is to allow syzbot to simply turn on a
> few CONFIG options and let
> it discover past/new issues. No special sysctls magics.
I see, collecting data for syzbot, that's one thing I didn't know.

>
> I see you added introspection, to list all active references, I think
> this could be added on a per object
> basis, to help developers have ways to better understand the kernel behavior.
Printing the ‘counting' results will save a lot of time to address the leaks.
Tracking multiple types at the same time will allow to address the problem
by reproducing the issue only once.

>
> Again, I think there is no reason your work can not complement mine.
> They serve different purposes.
Yep, this implementation is used more by developers to address known problems,
not by bot to look for new problems.

Thanks.