Message ID | 20240321163705.3067592-6-surenb@google.com (mailing list archive) |
---|---|
State | New |
Series | Memory allocation profiling |
On Thu, 21 Mar 2024 09:36:27 -0700 Suren Baghdasaryan <surenb@google.com> wrote:

> From: Kent Overstreet <kent.overstreet@linux.dev>
>
> We're introducing alloc tagging, which tracks memory allocations by
> callsite. Converting alloc_inode_sb() to a macro means allocations will
> be tracked by its caller, which is a bit more useful.

I'd have thought that there would be many similar
inlines-which-allocate-memory. Such as, I dunno, jbd2_alloc_inode().
Do we have to go converting things to macros as people report
misleading or less useful results, or is there some more general
solution to this?

> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3083,11 +3083,7 @@ int setattr_should_drop_sgid(struct mnt_idmap *idmap,
>   * This must be used for allocating filesystems specific inodes to set
>   * up the inode reclaim context correctly.
>   */
> -static inline void *
> -alloc_inode_sb(struct super_block *sb, struct kmem_cache *cache, gfp_t gfp)
> -{
> -	return kmem_cache_alloc_lru(cache, &sb->s_inode_lru, gfp);
> -}
> +#define alloc_inode_sb(_sb, _cache, _gfp) kmem_cache_alloc_lru(_cache, &_sb->s_inode_lru, _gfp)

Parenthesizing _sb seems sensible here?
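To illustrate the parenthesization concern, here is a minimal user-space sketch. It is not the kernel code: struct list_lru, struct super_block and fake_alloc_lru() are simplified, hypothetical stand-ins, and the macro takes a single argument only to keep the example short. It shows that with (_sb) the macro still works when the argument is an expression such as sbs + 1.

```c
/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct list_lru { int nr_items; };
struct super_block { int s_id; struct list_lru s_inode_lru; };

/* Stand-in for kmem_cache_alloc_lru(): returns the lru pointer it was given. */
static struct list_lru *fake_alloc_lru(struct list_lru *lru)
{
	return lru;
}

/*
 * Unparenthesized form, as in the posted patch:
 *   #define alloc_inode_sb(_sb) fake_alloc_lru(&_sb->s_inode_lru)
 * With the argument "sbs + 1" that would expand to
 *   &sbs + 1->s_inode_lru
 * which does not compile, because -> binds tighter than +.
 */
#define alloc_inode_sb(_sb) fake_alloc_lru(&(_sb)->s_inode_lru)

/* Returns 1 if the macro selected the lru of the second super_block. */
static int second_sb_selected(void)
{
	static struct super_block sbs[2];

	return alloc_inode_sb(sbs + 1) == &sbs[1].s_inode_lru;
}
```

In-tree callers most likely pass a plain identifier, so the unparenthesized form happens to work today; the parentheses only guard against future callers passing a compound expression.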
On Thu, Mar 21, 2024 at 1:31 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 21 Mar 2024 09:36:27 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > From: Kent Overstreet <kent.overstreet@linux.dev>
> >
> > We're introducing alloc tagging, which tracks memory allocations by
> > callsite. Converting alloc_inode_sb() to a macro means allocations will
> > be tracked by its caller, which is a bit more useful.
>
> I'd have thought that there would be many similar
> inlines-which-allocate-memory. Such as, I dunno, jbd2_alloc_inode().
> Do we have to go converting things to macros as people report
> misleading or less useful results, or is there some more general
> solution to this?

Yeah, that's unfortunately inevitable. Even if we had compiler support
we would have to add annotations for such inlined functions.
For the given example of jbd2_alloc_inode() it's not so bad since it's
used only from one location but in general yes, that's something we
will have to improve as we find more such cases.

>
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -3083,11 +3083,7 @@ int setattr_should_drop_sgid(struct mnt_idmap *idmap,
> >   * This must be used for allocating filesystems specific inodes to set
> >   * up the inode reclaim context correctly.
> >   */
> > -static inline void *
> > -alloc_inode_sb(struct super_block *sb, struct kmem_cache *cache, gfp_t gfp)
> > -{
> > -	return kmem_cache_alloc_lru(cache, &sb->s_inode_lru, gfp);
> > -}
> > +#define alloc_inode_sb(_sb, _cache, _gfp) kmem_cache_alloc_lru(_cache, &_sb->s_inode_lru, _gfp)
>
> Parenthesizing _sb seems sensible here?

Ack.
Let's wait for more comments and then I'll post fixes.
Thanks!
On Thu, Mar 21, 2024 at 01:31:47PM -0700, Andrew Morton wrote:
> On Thu, 21 Mar 2024 09:36:27 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > From: Kent Overstreet <kent.overstreet@linux.dev>
> >
> > We're introducing alloc tagging, which tracks memory allocations by
> > callsite. Converting alloc_inode_sb() to a macro means allocations will
> > be tracked by its caller, which is a bit more useful.
>
> I'd have thought that there would be many similar
> inlines-which-allocate-memory. Such as, I dunno, jbd2_alloc_inode().
> Do we have to go converting things to macros as people report
> misleading or less useful results, or is there some more general
> solution to this?

No, this is just what we have to do.

But a fair number of these helpers shouldn't exist - jbd2_alloc_inode()
is one of those, it looks like it predates kmalloc() being able to use
the page allocator for large allocations.

>
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -3083,11 +3083,7 @@ int setattr_should_drop_sgid(struct mnt_idmap *idmap,
> >   * This must be used for allocating filesystems specific inodes to set
> >   * up the inode reclaim context correctly.
> >   */
> > -static inline void *
> > -alloc_inode_sb(struct super_block *sb, struct kmem_cache *cache, gfp_t gfp)
> > -{
> > -	return kmem_cache_alloc_lru(cache, &sb->s_inode_lru, gfp);
> > -}
> > +#define alloc_inode_sb(_sb, _cache, _gfp) kmem_cache_alloc_lru(_cache, &_sb->s_inode_lru, _gfp)
>
> Parenthesizing _sb seems sensible here?

yeah, we can do that
On Thu, 21 Mar 2024 17:15:39 -0400 Kent Overstreet <kent.overstreet@linux.dev> wrote:

> On Thu, Mar 21, 2024 at 01:31:47PM -0700, Andrew Morton wrote:
> > On Thu, 21 Mar 2024 09:36:27 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > > From: Kent Overstreet <kent.overstreet@linux.dev>
> > >
> > > We're introducing alloc tagging, which tracks memory allocations by
> > > callsite. Converting alloc_inode_sb() to a macro means allocations will
> > > be tracked by its caller, which is a bit more useful.
> >
> > I'd have thought that there would be many similar
> > inlines-which-allocate-memory. Such as, I dunno, jbd2_alloc_inode().
> > Do we have to go converting things to macros as people report
> > misleading or less useful results, or is there some more general
> > solution to this?
>
> No, this is just what we have to do.

Well, this is something we strike in other contexts - kallsyms gives us
an inlined function and it's rarely what we wanted.

I think kallsyms has all the data which is needed to fix this - how
hard can it be to figure out that a particular function address lies
within an outer function? I haven't looked...
On Thu, Mar 21, 2024 at 03:09:08PM -0700, Andrew Morton wrote:
> On Thu, 21 Mar 2024 17:15:39 -0400 Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> > On Thu, Mar 21, 2024 at 01:31:47PM -0700, Andrew Morton wrote:
> > > On Thu, 21 Mar 2024 09:36:27 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> > >
> > > > From: Kent Overstreet <kent.overstreet@linux.dev>
> > > >
> > > > We're introducing alloc tagging, which tracks memory allocations by
> > > > callsite. Converting alloc_inode_sb() to a macro means allocations will
> > > > be tracked by its caller, which is a bit more useful.
> > >
> > > I'd have thought that there would be many similar
> > > inlines-which-allocate-memory. Such as, I dunno, jbd2_alloc_inode().
> > > Do we have to go converting things to macros as people report
> > > misleading or less useful results, or is there some more general
> > > solution to this?
> >
> > No, this is just what we have to do.
>
> Well, this is something we strike in other contexts - kallsyms gives us
> an inlined function and it's rarely what we wanted.
>
> I think kallsyms has all the data which is needed to fix this - how
> hard can it be to figure out that a particular function address lies
> within an outer function? I haven't looked...

This is different, though - even if a function is inlined in multiple
places there's only going to be one instance of a static var defined
within that function.
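The point about static variables can be sketched in user space. This is only an illustration under stated assumptions: struct tag, tag_hit(), tagged_malloc() and helper_alloc() are hypothetical names, far simpler than the kernel's alloc_tag machinery, and everything lives in a single translation unit. A macro that declares a static tag creates one tag per expansion site, while an inline helper that wraps the macro contributes a single tag no matter how many callers it has.

```c
#include <stdlib.h>

/* Hypothetical, simplified tag; the kernel's alloc_tag carries more state. */
struct tag {
	const char *file;
	int line;
	unsigned long calls;
	struct tag *next;
};

static struct tag *all_tags;	/* every tag that has been hit at least once */

/* Count a hit and link the tag into the list on its first use. */
static void tag_hit(struct tag *t)
{
	if (t->calls++ == 0) {
		t->next = all_tags;
		all_tags = t;
	}
}

/* One static tag per expansion site, so each direct user gets its own. */
#define tagged_malloc(out, size) do {                           \
	static struct tag _t = { __FILE__, __LINE__, 0, NULL };  \
	tag_hit(&_t);                                            \
	(out) = malloc(size);                                    \
} while (0)

/* The macro expands inside the helper, so the tag belongs to the helper. */
static inline void *helper_alloc(size_t size)
{
	void *p;

	tagged_malloc(p, size);
	return p;
}

static int count_tags(void)
{
	int n = 0;
	struct tag *t;

	for (t = all_tags; t; t = t->next)
		n++;
	return n;
}

/* Two calls through the helper plus two direct macro uses. */
static int tags_after_demo(void)
{
	void *a, *b, *c, *d;

	a = helper_alloc(8);	/* first caller via the helper */
	b = helper_alloc(8);	/* second caller: shares the helper's tag */
	tagged_malloc(c, 8);	/* direct use: its own tag */
	tagged_malloc(d, 8);	/* direct use: another tag */
	free(a);
	free(b);
	free(c);
	free(d);
	return count_tags();
}
```

Here the two helper callers are indistinguishable in the accounting, which is the motivation for turning such helpers into macros so that the tag is declared at each caller instead.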
On Thu, Mar 21, 2024 at 3:17 PM Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> On Thu, Mar 21, 2024 at 03:09:08PM -0700, Andrew Morton wrote:
> > On Thu, 21 Mar 2024 17:15:39 -0400 Kent Overstreet <kent.overstreet@linux.dev> wrote:
> >
> > > On Thu, Mar 21, 2024 at 01:31:47PM -0700, Andrew Morton wrote:
> > > > On Thu, 21 Mar 2024 09:36:27 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> > > >
> > > > > From: Kent Overstreet <kent.overstreet@linux.dev>
> > > > >
> > > > > We're introducing alloc tagging, which tracks memory allocations by
> > > > > callsite. Converting alloc_inode_sb() to a macro means allocations will
> > > > > be tracked by its caller, which is a bit more useful.
> > > >
> > > > I'd have thought that there would be many similar
> > > > inlines-which-allocate-memory. Such as, I dunno, jbd2_alloc_inode().
> > > > Do we have to go converting things to macros as people report
> > > > misleading or less useful results, or is there some more general
> > > > solution to this?
> > >
> > > No, this is just what we have to do.
> >
> > Well, this is something we strike in other contexts - kallsyms gives us
> > an inlined function and it's rarely what we wanted.
> >
> > I think kallsyms has all the data which is needed to fix this - how
> > hard can it be to figure out that a particular function address lies
> > within an outer function? I haven't looked...
>
> This is different, though - even if a function is inlined in multiple
> places there's only going to be one instance of a static var defined
> within that function.

I guess one simple way to detect the majority of these helpers would
be to filter all entries from /proc/allocinfo which originate from
header files.

~# grep ".*\.h:." /proc/allocinfo
933888 228 include/linux/mm.h:2863 func:pagetable_alloc
   848  53 include/linux/mm_types.h:1175 func:mm_alloc_cid
     0   0 include/linux/bpfptr.h:70 func:kvmemdup_bpfptr
     0   0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
     0   0 include/linux/bpf.h:2256 func:bpf_map_alloc_percpu
     0   0 include/linux/bpf.h:2256 func:bpf_map_alloc_percpu
     0   0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
     0   0 include/linux/bpf.h:2249 func:bpf_map_kvcalloc
     0   0 include/linux/bpf.h:2243 func:bpf_map_kzalloc
     0   0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
     0   0 include/linux/ptr_ring.h:471 func:__ptr_ring_init_queue_alloc
     0   0 include/linux/bpf.h:2256 func:bpf_map_alloc_percpu
     0   0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
     0   0 include/net/tcx.h:80 func:tcx_entry_create
     0   0 arch/x86/include/asm/pgalloc.h:156 func:p4d_alloc_one
487424 119 include/linux/mm.h:2863 func:pagetable_alloc
     0   0 include/linux/mm.h:2863 func:pagetable_alloc
   832  13 include/linux/jbd2.h:1607 func:jbd2_alloc_inode
     0   0 include/linux/jbd2.h:1591 func:jbd2_alloc_handle
     0   0 fs/nfs/iostat.h:51 func:nfs_alloc_iostats
     0   0 include/net/netlabel.h:281 func:netlbl_secattr_cache_alloc
     0   0 include/net/netlabel.h:381 func:netlbl_secattr_alloc
     0   0 include/crypto/internal/acompress.h:76 func:__acomp_request_alloc
  8064  84 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
  1016  74 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
   384   4 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
   704   3 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
    32   1 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
    64   1 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
    40   2 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
    32   1 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
 30000 625 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
   512   1 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
   192   6 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
   192   3 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
 61992 861 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
     0   0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
     0   0 drivers/iommu/amd/amd_iommu.h:141 func:alloc_pgtable_page
     0   0 drivers/iommu/amd/amd_iommu.h:141 func:alloc_pgtable_page
     0   0 drivers/iommu/amd/amd_iommu.h:141 func:alloc_pgtable_page
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/hid_bpf.h:154 func:call_hid_bpf_rdesc_fixup
     0   0 include/linux/skbuff.h:3392 func:__dev_alloc_pages
114688  56 include/linux/ptr_ring.h:471 func:__ptr_ring_init_queue_alloc
     0   0 include/linux/skmsg.h:415 func:sk_psock_init_link
     0   0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
     0   0 include/linux/ptr_ring.h:628 func:ptr_ring_resize_multiple
 24576   3 include/linux/ptr_ring.h:471 func:__ptr_ring_init_queue_alloc
     0   0 include/net/netlink.h:1896 func:nla_memdup
     0   0 include/linux/sockptr.h:97 func:memdup_sockptr
     0   0 include/net/request_sock.h:131 func:reqsk_alloc
     0   0 include/net/tcp.h:2456 func:tcp_v4_save_options
     0   0 include/net/tcp.h:2456 func:tcp_v4_save_options
     0   0 include/crypto/hash.h:586 func:ahash_request_alloc
     0   0 include/linux/sockptr.h:97 func:memdup_sockptr
     0   0 include/linux/sockptr.h:97 func:memdup_sockptr
     0   0 net/sunrpc/auth_gss/auth_gss_internal.h:38 func:simple_get_netobj
     0   0 include/crypto/hash.h:586 func:ahash_request_alloc
     0   0 include/net/netlink.h:1896 func:nla_memdup
     0   0 include/crypto/skcipher.h:869 func:skcipher_request_alloc
     0   0 include/net/fq_impl.h:361 func:fq_init
     0   0 include/net/netlabel.h:316 func:netlbl_catmap_alloc

and it finds our example:

   832  13 include/linux/jbd2.h:1607 func:jbd2_alloc_inode

Interestingly the inlined functions which are called from multiple
places will have multiple entries with the same file+line:

     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
     0   0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc

So, duplicate entries can be also used as an indication of an inlined allocator.
I'll go chase these down and will post a separate patch converting them.
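The two heuristics described above (call site in a header file, and repeated file+line entries) could be checked programmatically along these lines. This is a rough sketch that assumes each entry has the shape "<bytes> <calls> <file>:<line> func:<name>" as in the output shown in this thread; site_in_header() and same_site() are hypothetical helper names, not part of any kernel tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if an allocinfo-style entry names a call site in a ".h" file,
 * 0 otherwise (including when the entry cannot be parsed).
 */
static int site_in_header(const char *entry)
{
	unsigned long long bytes, calls;
	char site[256];
	char *colon;
	size_t len;

	if (sscanf(entry, "%llu %llu %255s", &bytes, &calls, site) != 3)
		return 0;
	colon = strrchr(site, ':');
	if (!colon)
		return 0;
	*colon = '\0';	/* drop the ":<line>" suffix */
	len = strlen(site);
	return len >= 2 && strcmp(site + len - 2, ".h") == 0;
}

/* Returns 1 if two entries share the same "<file>:<line>" field. */
static int same_site(const char *a, const char *b)
{
	char sa[256], sb[256];

	if (sscanf(a, "%*s %*s %255s", sa) != 1 ||
	    sscanf(b, "%*s %*s %255s", sb) != 1)
		return 0;
	return strcmp(sa, sb) == 0;
}
```

Feeding it the jbd2_alloc_inode entry above would flag it as a header call site, and any pair of the dma_fence_chain_alloc entries would compare as the same site.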
On Thu, Mar 21, 2024 at 3:47 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> [...]
>
> So, duplicate entries can be also used as an indication of an inlined allocator.
> I'll go chase these down and will post a separate patch converting them.

I just posted
https://lore.kernel.org/all/20240404165404.3805498-1-surenb@google.com/
to report allocations done from the inlined functions in the headers
to their callers.
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 00fc429b0af0..034f0c918eea 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3083,11 +3083,7 @@ int setattr_should_drop_sgid(struct mnt_idmap *idmap,
  * This must be used for allocating filesystems specific inodes to set
  * up the inode reclaim context correctly.
  */
-static inline void *
-alloc_inode_sb(struct super_block *sb, struct kmem_cache *cache, gfp_t gfp)
-{
-	return kmem_cache_alloc_lru(cache, &sb->s_inode_lru, gfp);
-}
+#define alloc_inode_sb(_sb, _cache, _gfp) kmem_cache_alloc_lru(_cache, &_sb->s_inode_lru, _gfp)
 
 extern void __insert_inode_hash(struct inode *, unsigned long hashval);
 static inline void insert_inode_hash(struct inode *inode)