Message ID | 7302f4a13c1cbf62b07f636878ce25fcca84b6c4.camel@perches.com (mailing list archive) |
---|---|
State | Deferred, archived |
Headers | show |
Series | xfs: Remove noinline from #define STATIC | expand |
On 11/10/18 7:21 PM, Joe Perches wrote: > Reduce total object size quite a bit (~32KB) and presumably > improve performance at the same time. > > Total object size old vs new (x86-64 defconfig with xfs) > > text data bss dec hex filename > - 959351 165573 632 1125556 112cb4 (TOTALS) (old) > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new) And what does it do to maximum stack excursions? -Eric > Signed-off-by: Joe Perches <joe@perches.com> > --- > fs/xfs/xfs_linux.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h > index edbd5a210df2..f33c8b626bca 100644 > --- a/fs/xfs/xfs_linux.h > +++ b/fs/xfs/xfs_linux.h > @@ -242,7 +242,7 @@ static inline uint64_t howmany_64(uint64_t x, uint32_t y) > #endif /* XFS_WARN */ > #endif /* DEBUG */ > > -#define STATIC static noinline > +#define STATIC static > > #ifdef CONFIG_XFS_RT > >
On Mon, 2018-11-12 at 14:12 -0600, Eric Sandeen wrote: > On 11/10/18 7:21 PM, Joe Perches wrote: > > Reduce total object size quite a bit (~32KB) and presumably > > improve performance at the same time. > > > > Total object size old vs new (x86-64 defconfig with xfs) > > > > text data bss dec hex filename > > - 959351 165573 632 1125556 112cb4 (TOTALS) (old) > > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new) > > And what does it do to maximum stack excursions? It seems to add a maximum of 40 bytes in xfs_reclaim_inodes_ag and xfs_readdir, but I didn't do more than visually scan checkstack output. Using scripts/checkstack.pl on defconfig x86-64 with xfs and cutting out absolute addresses diff -urN xfs_cs_old_2 xfs_cs_new_2 --- xfs_cs_old_2 2018-11-12 12:55:03.195282512 -0800 +++ xfs_cs_new_2 2018-11-12 12:55:09.219168923 -0800 @@ -1,18 +1,20 @@ xfs_ag_init_headers [vmlinux]: 464 xfs_ag_init_headers [vmlinux]: 464 -xfs_inode_ag_walk.isra.19 [vmlinux]: 416 -xfs_inode_ag_walk.isra.19 [vmlinux]: 416 +xfs_inode_ag_walk.isra.21 [vmlinux]: 424 +xfs_inode_ag_walk.isra.21 [vmlinux]: 424 +xfs_reclaim_inodes_ag [vmlinux]: 368 +xfs_reclaim_inodes_ag [vmlinux]: 368 xfs_trans_committed_bulk [vmlinux]: 336 -xfs_reclaim_inodes_ag [vmlinux]: 328 -xfs_reclaim_inodes_ag [vmlinux]: 328 xfs_ioc_getfsmap.isra.23 [vmlinux]: 320 xfs_ioc_getfsmap.isra.23 [vmlinux]: 320 +xfs_file_ioctl [vmlinux]: 320 +xfs_file_ioctl [vmlinux]: 320 +xfs_bmapi_write [vmlinux]: 312 xfs_getfsmap [vmlinux]: 304 xfs_qm_dquot_walk.isra.11 [vmlinux]: 296 xfs_qm_dquot_walk.isra.11 [vmlinux]: 296 -xfs_bmapi_write [vmlinux]: 288 -xfs_file_ioctl [vmlinux]: 288 -xfs_file_ioctl [vmlinux]: 288 +xfs_file_compat_ioctl [vmlinux]: 288 +xfs_file_compat_ioctl [vmlinux]: 288 xfs_sb_read_verify [vmlinux]: 280 xfs_sb_read_verify [vmlinux]: 280 xfs_sb_write_verify [vmlinux]: 272 @@ -21,36 +23,38 @@ xfs_rmap_convert [vmlinux]: 264 xfs_rmap_convert_shared [vmlinux]: 256 xfs_rmap_convert_shared [vmlinux]: 256 -xfs_file_compat_ioctl [vmlinux]: 256 -xfs_file_compat_ioctl [vmlinux]: 256 xfs_bmap_extents_to_btree [vmlinux]: 248 xfs_bmap_extents_to_btree [vmlinux]: 248 +xfs_bmap_del_extent_real [vmlinux]: 240 +xfs_alloc_fix_freelist [vmlinux]: 224 +xfs_alloc_fix_freelist [vmlinux]: 224 +xfs_bmap_add_extent_unwritten_real [vmlinux]:224 xfs_symlink [vmlinux]: 224 xfs_symlink [vmlinux]: 224 -xfs_alloc_fix_freelist [vmlinux]: 216 -xfs_alloc_fix_freelist [vmlinux]: 216 -xfs_bmap_local_to_extents.constprop.27 [vmlinux]:208 -xfs_bmap_local_to_extents.constprop.27 [vmlinux]:208 -xfs_ialloc_ag_alloc [vmlinux]: 208 -xfs_ialloc_ag_alloc [vmlinux]: 208 -xfs_bmap_add_extent_unwritten_real [vmlinux]:192 -xfs_bmap_add_extent_unwritten_real [vmlinux]:192 -xfs_bmap_add_extent_delay_real [vmlinux]:192 -xfs_bmap_add_extent_delay_real [vmlinux]:192 -xfs_bmap_del_extent_real [vmlinux]: 192 -xfs_bmap_del_extent_real [vmlinux]: 192 -xfs_bmap_btalloc [vmlinux]: 192 -xfs_bmap_btalloc [vmlinux]: 192 +xfs_bmap_btalloc [vmlinux]: 216 +xfs_bmap_btalloc [vmlinux]: 216 +xfs_attr_rmtval_set [vmlinux]: 208 +xfs_bmap_add_extent_delay_real [vmlinux]:208 +xfs_bmap_local_to_extents.constprop.28 [vmlinux]:208 +xfs_bmap_local_to_extents.constprop.28 [vmlinux]:208 +xfs_bulkstat [vmlinux]: 208 +xfs_ialloc_ag_alloc [vmlinux]: 200 +xfs_ialloc_ag_alloc [vmlinux]: 200 xfs_refcountbt_alloc_block [vmlinux]:192 xfs_refcountbt_alloc_block [vmlinux]:192 +xfs_reflink_remap_blocks [vmlinux]: 192 +xfs_reflink_remap_blocks [vmlinux]: 192 __xfs_bunmapi [vmlinux]: 184 __xfs_bunmapi [vmlinux]: 184 xfs_rmap_map [vmlinux]: 184 xfs_rmap_map [vmlinux]: 184 +xfs_readdir [vmlinux]: 184 +xfs_readdir [vmlinux]: 184 xfs_attr_set [vmlinux]: 176 xfs_attr_set [vmlinux]: 176 xfs_attr3_leaf_to_shortform [vmlinux]:176 xfs_attr3_leaf_to_shortform [vmlinux]:176 +xfs_bmap_add_extent_hole_real [vmlinux]:176 xfs_attr_shortform_to_leaf [vmlinux]:168 xfs_attr_shortform_to_leaf [vmlinux]:168 xfs_btree_delrec [vmlinux]: 168 @@ -59,89 +63,99 @@ xfs_rmap_unmap [vmlinux]: 168 xfs_rmap_map_shared [vmlinux]: 168 xfs_rmap_map_shared [vmlinux]: 168 -xfs_bulkstat [vmlinux]: 168 -xfs_bulkstat [vmlinux]: 168 xfs_scrub_metadata [vmlinux]: 168 xfs_scrub_metadata [vmlinux]: 168 xfs_free_extent_fix_freelist [vmlinux]:160 xfs_free_extent_fix_freelist [vmlinux]:160 -xfs_bmap_add_extent_hole_real [vmlinux]:160 -xfs_bmap_add_extent_hole_real [vmlinux]:160 +xfs_attr3_leaf_split [vmlinux]: 160 +xfs_attr3_leaf_split [vmlinux]: 160 xfs_bmap_insert_extents [vmlinux]: 160 xfs_bmap_insert_extents [vmlinux]: 160 xfs_bmbt_alloc_block [vmlinux]: 160 xfs_bmbt_alloc_block [vmlinux]: 160 xfs_btree_overlapped_query_range [vmlinux]:160 xfs_btree_overlapped_query_range [vmlinux]:160 +xfs_dialloc [vmlinux]: 160 +xfs_dialloc [vmlinux]: 160 __xfs_inobt_alloc_block.isra.9 [vmlinux]:160 __xfs_inobt_alloc_block.isra.9 [vmlinux]:160 xfs_ag_resv_rmapbt_alloc [vmlinux]: 160 xfs_ag_resv_rmapbt_alloc [vmlinux]: 160 +xfs_do_writepage [vmlinux]: 160 +xfs_do_writepage [vmlinux]: 160 xfs_dir2_leaf_readbuf [vmlinux]: 160 xfs_readlink_bmap_ilocked [vmlinux]: 160 xfs_readlink_bmap_ilocked [vmlinux]: 160 xfs_ag_resv_rmapbt_alloc [vmlinux]: 160 xfs_ag_resv_rmapbt_alloc [vmlinux]: 160 +xfs_da3_split [vmlinux]: 152 +xfs_da3_split [vmlinux]: 152 xfs_getbmap [vmlinux]: 152 xfs_getbmap [vmlinux]: 152 trace_raw_output_xfs_loggrant_class [vmlinux]:144 trace_raw_output_xfs_alloc_class [vmlinux]:144 xfs_attr_remove [vmlinux]: 144 xfs_attr_remove [vmlinux]: 144 -xfs_dialloc_ag_inobt [vmlinux]: 144 -xfs_dialloc_ag_inobt [vmlinux]: 144 +xfs_bmap_split_extent_at [vmlinux]: 144 +__xfs_btree_split.isra.45 [vmlinux]: 144 +__xfs_btree_split.isra.45 [vmlinux]: 144 +xfs_btree_insrec [vmlinux]: 144 +xfs_btree_insrec [vmlinux]: 144 +xfs_da_shrink_inode [vmlinux]: 144 +xfs_da_shrink_inode [vmlinux]: 144 +xfs_attr_list_int_ilocked [vmlinux]: 144 +xfs_attr_list_int_ilocked [vmlinux]: 144 xfs_swap_extent_rmap [vmlinux]: 144 xfs_swap_extent_rmap [vmlinux]: 144 -xfs_readdir [vmlinux]: 144 -xfs_readdir [vmlinux]: 144 +xfs_ioc_trim [vmlinux]: 144 +xfs_ioc_trim [vmlinux]: 144 xfs_inactive_symlink_rmt [vmlinux]: 144 xfs_inactive_symlink_rmt [vmlinux]: 144 xfs_attr_get [vmlinux]: 136 xfs_attr_get [vmlinux]: 136 +xfs_attr_rmtval_get [vmlinux]: 136 +xfs_attr_rmtval_get [vmlinux]: 136 xfs_bmap_add_attrfork_local [vmlinux]:136 xfs_bmap_add_attrfork_local [vmlinux]:136 -__xfs_btree_split.isra.38 [vmlinux]: 136 -__xfs_btree_split.isra.38 [vmlinux]: 136 xfs_btree_split [vmlinux]: 136 xfs_btree_split [vmlinux]: 136 -xfs_btree_insrec [vmlinux]: 136 -xfs_btree_insrec [vmlinux]: 136 +xfs_rmap_finish_one [vmlinux]: 136 +xfs_rmap_finish_one [vmlinux]: 136 +xfs_attr3_leaf_inactive [vmlinux]: 136 +xfs_attr3_leaf_inactive [vmlinux]: 136 xfs_rename [vmlinux]: 136 xfs_rename [vmlinux]: 136 xfs_attr3_leaf_unbalance [vmlinux]: 120 xfs_attr3_leaf_unbalance [vmlinux]: 120 -xfs_attr_rmtval_set [vmlinux]: 120 -xfs_attr_rmtval_set [vmlinux]: 120 -xfs_ioc_fsgeometry_v1 [vmlinux]: 120 -xfs_ioc_fsgeometry_v1 [vmlinux]: 120 -xfs_ioc_fsgeometry [vmlinux]: 120 -xfs_ioc_fsgeometry [vmlinux]: 120 +xfs_refcount_merge_extents [vmlinux]:120 +xfs_refcount_merge_extents [vmlinux]:120 xfs_file_iomap_begin [vmlinux]: 120 xfs_file_iomap_begin [vmlinux]: 120 -xfs_compat_ioc_fsgeometry_v1 [vmlinux]:120 -xfs_compat_ioc_fsgeometry_v1 [vmlinux]:120 +xfs_ifree [vmlinux]: 120 +xfs_ifree [vmlinux]: 120 +xfs_qm_reset_dqcounts_buf.part.15 [vmlinux]:120 +xfs_qm_reset_dqcounts_buf.part.15 [vmlinux]:120 +xfs_rtallocate_extent [vmlinux]: 120 +xfs_rtallocate_extent [vmlinux]: 120 +xfs_free_ag_extent [vmlinux]: 112 xfs_attr3_leaf_toosmall [vmlinux]: 112 xfs_attr3_leaf_toosmall [vmlinux]: 112 -xfs_attr3_leaf_rebalance [vmlinux]: 112 -xfs_attr3_leaf_rebalance [vmlinux]: 112 +xfs_bmap_del_extent_delay [vmlinux]: 112 +xfs_bmap_del_extent_delay [vmlinux]: 112 xfs_iread_extents [vmlinux]: 112 xfs_iread_extents [vmlinux]: 112 -xfs_bmap_split_extent_at [vmlinux]: 112 -xfs_bmap_split_extent_at [vmlinux]: 112 xfs_btree_insert [vmlinux]: 112 xfs_btree_query_range [vmlinux]: 112 -xfs_da3_swap_lastblock [vmlinux]: 112 -xfs_da3_swap_lastblock [vmlinux]: 112 xfs_dir2_node_addname [vmlinux]: 112 xfs_dir2_node_addname [vmlinux]: 112 xfs_dir2_node_removename [vmlinux]: 112 xfs_dir2_node_removename [vmlinux]: 112 -xfs_trim_extents [vmlinux]: 112 -xfs_trim_extents [vmlinux]: 112 +xfs_difree [vmlinux]: 112 +xfs_difree [vmlinux]: 112 xfs_growfs_data [vmlinux]: 112 xfs_growfs_data [vmlinux]: 112 -xfs_ifree_cluster.isra.21 [vmlinux]: 112 -xfs_ifree_cluster.isra.21 [vmlinux]: 112 +xfs_iget [vmlinux]: 112 +xfs_iget [vmlinux]: 112 xfs_inumbers [vmlinux]: 112 xfs_inumbers [vmlinux]: 112 xfs_reflink_end_cow [vmlinux]: 112 @@ -149,17 +163,19 @@ xfs_vn_listxattr [vmlinux]: 112 xfs_vn_listxattr [vmlinux]: 112 xfsaild [vmlinux]: 112 +xfs_growfs_rt [vmlinux]: 112 +xfs_growfs_rt [vmlinux]: 112 trace_raw_output_xfs_dquot_class [vmlinux]:104 xfs_bmapi_reserve_delalloc [vmlinux]:104 xfs_bmapi_reserve_delalloc [vmlinux]:104 +xfs_btree_lshift [vmlinux]: 104 +xfs_btree_lshift [vmlinux]: 104 xfs_da_grow_inode_int [vmlinux]: 104 xfs_da_grow_inode_int [vmlinux]: 104 xfs_dir2_leaf_removename [vmlinux]: 104 xfs_dir2_leaf_removename [vmlinux]: 104 xfs_dir2_leafn_split [vmlinux]: 104 xfs_dir2_leafn_split [vmlinux]: 104 -xfs_iget [vmlinux]: 104 -xfs_iget [vmlinux]: 104 xfs_ioc_space [vmlinux]: 104 xfs_ioc_space [vmlinux]: 104 xfs_emerg [vmlinux]: 104 @@ -169,15 +185,9 @@ xfs_warn [vmlinux]: 104 xfs_notice [vmlinux]: 104 xfs_info [vmlinux]: 104 -xfs_reflink_remap_extent [vmlinux]: 104 -xfs_reflink_remap_extent [vmlinux]: 104 -xfs_reflink_dirty_extents [vmlinux]: 104 -xfs_reflink_dirty_extents [vmlinux]: 104 xfs_reflink_cancel_cow_blocks [vmlinux]:104 xfs_reflink_cancel_cow_blocks [vmlinux]:104 xfs_log_quiesce [vmlinux]: 104 xfs_log_quiesce [vmlinux]: 104 -xfs_rtallocate_extent_near [vmlinux]:104 -xfs_rtallocate_extent_near [vmlinux]:104 xfs_set_acl [vmlinux]: 104 xfs_set_acl [vmlinux]: 104
On Mon, Nov 12, 2018 at 02:12:08PM -0600, Eric Sandeen wrote: > On 11/10/18 7:21 PM, Joe Perches wrote: > > Reduce total object size quite a bit (~32KB) and presumably > > improve performance at the same time. > > > > Total object size old vs new (x86-64 defconfig with xfs) > > > > text data bss dec hex filename > > - 959351 165573 632 1125556 112cb4 (TOTALS) (old) > > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new) > > And what does it do to maximum stack excursions? Better yet: what does it do to corruption stack traces and debugging tools like profiling traces? i.e. this noinline directive isn't about stack usage, this is about being able to debug production code. Basically the compiler inliner is so agressive on static functions that it makes it impossible to decipher the stack traces. It flattens them way too much to be able to tell how we got to a specific location in the code. In reality, being able to find problems quickly and efficiently is far more important to us than being able to run everything at ludicrous speed.... Cheers, Dave.
On Tue, 2018-11-13 at 08:45 +1100, Dave Chinner wrote: > On Mon, Nov 12, 2018 at 02:12:08PM -0600, Eric Sandeen wrote: > > On 11/10/18 7:21 PM, Joe Perches wrote: > > > Reduce total object size quite a bit (~32KB) and presumably > > > improve performance at the same time. > > > > > > Total object size old vs new (x86-64 defconfig with xfs) > > > > > > text data bss dec hex filename > > > - 959351 165573 632 1125556 112cb4 (TOTALS) (old) > > > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new) > > > > And what does it do to maximum stack excursions? > > Better yet: what does it do to corruption stack traces and debugging > tools like profiling traces? > > i.e. this noinline directive isn't about stack usage, this is about > being able to debug production code. Basically the compiler inliner > is so agressive on static functions that it makes it impossible to > decipher the stack traces. It flattens them way too much to > be able to tell how we got to a specific location in the code. > > In reality, being able to find problems quickly and efficiently is > far more important to us than being able to run everything at > ludicrous speed.... Is that really a compelling argument given thw ~50:50 split of static/STATIC uses in xfs? $ git grep -w STATIC fs/xfs/ | wc -l 1064 $ git grep -w static fs/xfs/ | wc -l 942
On Mon, Nov 12, 2018 at 02:30:01PM -0800, Joe Perches wrote: > On Tue, 2018-11-13 at 08:45 +1100, Dave Chinner wrote: > > On Mon, Nov 12, 2018 at 02:12:08PM -0600, Eric Sandeen wrote: > > > On 11/10/18 7:21 PM, Joe Perches wrote: > > > > Reduce total object size quite a bit (~32KB) and presumably > > > > improve performance at the same time. > > > > > > > > Total object size old vs new (x86-64 defconfig with xfs) > > > > > > > > text data bss dec hex filename > > > > - 959351 165573 632 1125556 112cb4 (TOTALS) (old) > > > > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new) > > > > > > And what does it do to maximum stack excursions? > > > > Better yet: what does it do to corruption stack traces and debugging > > tools like profiling traces? > > > > i.e. this noinline directive isn't about stack usage, this is about > > being able to debug production code. Basically the compiler inliner > > is so agressive on static functions that it makes it impossible to > > decipher the stack traces. It flattens them way too much to > > be able to tell how we got to a specific location in the code. > > > > In reality, being able to find problems quickly and efficiently is > > far more important to us than being able to run everything at > > ludicrous speed.... > > Is that really a compelling argument given thw ~50:50 > split of static/STATIC uses in xfs? Historically, yes. We're talking about code with call chains that can go 50-60 functions deep here. If that gets flattened to 10-20 functions by compiler inlining (which is the sort of thing that happens) then we lose a huge amount of visibility into the workings of the code. This affects profiling, stack traces on corruption, dynamic debug probes, kernel crash dump analysis, etc. I'm not interested in making code fast if distro support engineers can't debug problems on user systems easily. Optimising for performance over debuggability is a horrible trade off for us to make because it means users and distros end up much more reliant on single points of expertise for debugging problems. And that means the majority of the load of problem triage falls directly on very limited resources - the core XFS development team. A little bit of thought about how to make code easier to triage and debug goes a long, long way.... Indeed, this is not a new problem - we've been using techniques like STATIC in one form or another to stop compiler inlining and/or function hiding since XFS was first ported to linux 20 years ago. In fact, STATIC was inherited from Irix because it helped with debugging via the userspace simulator that the initial XFS code was developed on. i.e. STATIC was present in the initial XFS commit made way back in 1993, and we've been using it ever since... Cheers, Dave.
On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote: > I'm not interested in making code fast if distro support engineers > can't debug problems on user systems easily. Optimising for > performance over debuggability is a horrible trade off for us to > make because it means users and distros end up much more reliant on > single points of expertise for debugging problems. And that means > the majority of the load of problem triage falls directly on very > limited resources - the core XFS development team. A little bit of > thought about how to make code easier to triage and debug goes a > long, long way.... So at least in my experience, if the kernels are compiled with CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED, scripts/decode_stracktrace.sh seems to do a very nice job with inlined functions. Now, ext4 generally only has about 3 or 4 nested inlines, and so I don't know how it works with 20 or 30 nested inlined functions, so perhaps this is not applicable for XFS. But it perhaps toolchain technology has advanced since the Irix days such that it's no longer as necessary to force the non-inlining of functions for easing debugging? - Ted
On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote: > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote: > > I'm not interested in making code fast if distro support engineers > > can't debug problems on user systems easily. Optimising for > > performance over debuggability is a horrible trade off for us to > > make because it means users and distros end up much more reliant on > > single points of expertise for debugging problems. And that means > > the majority of the load of problem triage falls directly on very > > limited resources - the core XFS development team. A little bit of > > thought about how to make code easier to triage and debug goes a > > long, long way.... > > So at least in my experience, if the kernels are compiled with > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED, > scripts/decode_stracktrace.sh seems to do a very nice job with inlined That doesn't help with kernel profiling and other such things that are based on callgraphs... > functions. Now, ext4 generally only has about 3 or 4 nested inlines, > and so I don't know how it works with 20 or 30 nested inlined > functions, so perhaps this is not applicable for XFS. > > But it perhaps toolchain technology has advanced since the Irix days > such that it's no longer as necessary to force the non-inlining of > functions for easing debugging? Not that I've noticed. Indeed, modern toolchains are moving the opposite direction - have you ever tried to debug a binary with gdb that was compiled with LTO enabled? Or maybe even just tried to profile it with perf? Cheers, Dave.
On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote: > On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote: > > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote: > > > I'm not interested in making code fast if distro support engineers > > > can't debug problems on user systems easily. Optimising for > > > performance over debuggability is a horrible trade off for us to > > > make because it means users and distros end up much more reliant on > > > single points of expertise for debugging problems. And that means > > > the majority of the load of problem triage falls directly on very > > > limited resources - the core XFS development team. A little bit of > > > thought about how to make code easier to triage and debug goes a > > > long, long way.... > > > > So at least in my experience, if the kernels are compiled with > > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED, > > scripts/decode_stracktrace.sh seems to do a very nice job with inlined > > That doesn't help with kernel profiling and other such things that > are based on callgraphs... If that's really the case: I rather suspect the xfs static v STATIC function marking is not particularly curated and the marking is somewhat arbitrary. So perhaps given the large number of static, but not STATIC functions, perhaps a sed of s/static/STATIC/ should be done when it's not inline for all xfs functions.
On Mon, Nov 12, 2018 at 08:23:42PM -0800, Joe Perches wrote: > On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote: > > On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote: > > > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote: > > > > I'm not interested in making code fast if distro support engineers > > > > can't debug problems on user systems easily. Optimising for > > > > performance over debuggability is a horrible trade off for us to > > > > make because it means users and distros end up much more reliant on > > > > single points of expertise for debugging problems. And that means > > > > the majority of the load of problem triage falls directly on very > > > > limited resources - the core XFS development team. A little bit of > > > > thought about how to make code easier to triage and debug goes a > > > > long, long way.... > > > > > > So at least in my experience, if the kernels are compiled with > > > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED, > > > scripts/decode_stracktrace.sh seems to do a very nice job with inlined > > > > That doesn't help with kernel profiling and other such things that > > are based on callgraphs... > > If that's really the case: > > I rather suspect the xfs static v STATIC function marking is not > particularly curated and the marking is somewhat arbitrary. That's a common opinion for an outsider to form when they come across something unfamiliar they don't really understand. "I don't understand this, so I must rewrite it" is an unfortunate habit that programmers have. > So perhaps given the large number of static, but not STATIC > functions, perhaps a sed of s/static/STATIC/ should be done > when it's not inline for all xfs functions. That's just as bad as removing them all, if not worse. If you are writing new code or reworking existing code, then we'll consider the usage of STATIC/static in the context of that work. Otherwise, we leave it alone. It if ain't broke, don't fix it. And it sure as hell isn't broken right now. We've got more than enough bugs to fix without having to deal with drive-by bikeshed painting... -Dave.
On Tue, 2018-11-13 at 16:26 +1100, Dave Chinner wrote: > On Mon, Nov 12, 2018 at 08:23:42PM -0800, Joe Perches wrote: > > On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote: > > > On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote: > > > > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote: > > > > > I'm not interested in making code fast if distro support engineers > > > > > can't debug problems on user systems easily. Optimising for > > > > > performance over debuggability is a horrible trade off for us to > > > > > make because it means users and distros end up much more reliant on > > > > > single points of expertise for debugging problems. And that means > > > > > the majority of the load of problem triage falls directly on very > > > > > limited resources - the core XFS development team. A little bit of > > > > > thought about how to make code easier to triage and debug goes a > > > > > long, long way.... > > > > > > > > So at least in my experience, if the kernels are compiled with > > > > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED, > > > > scripts/decode_stracktrace.sh seems to do a very nice job with inlined > > > > > > That doesn't help with kernel profiling and other such things that > > > are based on callgraphs... > > > > If that's really the case: > > > > I rather suspect the xfs static v STATIC function marking is not > > particularly curated and the marking is somewhat arbitrary. > > That's a common opinion for an outsider to form when they come > across something unfamiliar they don't really understand. "I don't > understand this, so I must rewrite it" is an unfortunate habit that > programmers have. Silly. > > So perhaps given the large number of static, but not STATIC > > functions, perhaps a sed of s/static/STATIC/ should be done > > when it's not inline for all xfs functions. > > That's just as bad as removing them all, if not worse. Why? > If you are writing new code or reworking existing code, then we'll > consider the usage of STATIC/static in the context of that work. > Otherwise, we leave it alone. If your statement is as described above, and the STATIC use to enable call stack tracing i useful, why shouldn't it be systemic? > It if ain't broke, don't fix it. A generically lazy statement.
On Mon, Nov 12, 2018 at 09:31:51PM -0800, Joe Perches wrote: > On Tue, 2018-11-13 at 16:26 +1100, Dave Chinner wrote: > > On Mon, Nov 12, 2018 at 08:23:42PM -0800, Joe Perches wrote: > > > On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote: > > > > On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote: > > > > > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote: > > > > > > I'm not interested in making code fast if distro support engineers > > > > > > can't debug problems on user systems easily. Optimising for > > > > > > performance over debuggability is a horrible trade off for us to > > > > > > make because it means users and distros end up much more reliant on > > > > > > single points of expertise for debugging problems. And that means > > > > > > the majority of the load of problem triage falls directly on very > > > > > > limited resources - the core XFS development team. A little bit of > > > > > > thought about how to make code easier to triage and debug goes a > > > > > > long, long way.... > > > > > > > > > > So at least in my experience, if the kernels are compiled with > > > > > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED, > > > > > scripts/decode_stracktrace.sh seems to do a very nice job with inlined > > > > > > > > That doesn't help with kernel profiling and other such things that > > > > are based on callgraphs... > > > > > > If that's really the case: > > > > > > I rather suspect the xfs static v STATIC function marking is not > > > particularly curated and the marking is somewhat arbitrary. I disagree. I've added plenty of code over the past couple of years. Short functions with few or no branches (e.g. converters) are 'static'; longer functions (loops, iterators, "decide what to do with this" functions, etc.) with many branches are STATIC to make it easier for me to ftrace their decisions over a given dataset. > > That's a common opinion for an outsider to form when they come > > across something unfamiliar they don't really understand. "I don't > > understand this, so I must rewrite it" is an unfortunate habit that > > programmers have. > > Silly. Yet frequent. > > > So perhaps given the large number of static, but not STATIC > > > functions, perhaps a sed of s/static/STATIC/ should be done > > > when it's not inline for all xfs functions. > > > > That's just as bad as removing them all, if not worse. > > Why? > > > If you are writing new code or reworking existing code, then we'll > > consider the usage of STATIC/static in the context of that work. > > Otherwise, we leave it alone. > > If your statement is as described above, and > the STATIC use to enable call stack tracing i > useful, why shouldn't it be systemic? > > > It if ain't broke, don't fix it. > > A generically lazy statement. Please everyone let's take a breather from this thread for a few hours. A 3.7% reduction in code size is not worth getting worked up over, IMO. --D > >
On Tue, Nov 13, 2018 at 04:26:51PM +1100, Dave Chinner wrote: > That's just as bad as removing them all, if not worse. > > If you are writing new code or reworking existing code, then we'll > consider the usage of STATIC/static in the context of that work. > Otherwise, we leave it alone. > > It if ain't broke, don't fix it. And it sure as hell isn't broken > right now. We've got more than enough bugs to fix without having to > deal with drive-by bikeshed painting... Agreed. That being said I think we should aim to add manual noline annotations to those functions where we really care while we go through the code instead of the weird STATIC that just seems to cause this kind of confusion. And if Joe or somone else can come up with a few patches where removing the noinline (aka STATIC) makes a huge difference for a small number of functions we should consider it as well.
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h index edbd5a210df2..f33c8b626bca 100644 --- a/fs/xfs/xfs_linux.h +++ b/fs/xfs/xfs_linux.h @@ -242,7 +242,7 @@ static inline uint64_t howmany_64(uint64_t x, uint32_t y) #endif /* XFS_WARN */ #endif /* DEBUG */ -#define STATIC static noinline +#define STATIC static #ifdef CONFIG_XFS_RT
Reduce total object size quite a bit (~32KB) and presumably improve performance at the same time. Total object size old vs new (x86-64 defconfig with xfs) text data bss dec hex filename - 959351 165573 632 1125556 112cb4 (TOTALS) (old) + 924683 165669 632 1090984 10a5a8 (TOTALS) (new) Signed-off-by: Joe Perches <joe@perches.com> --- fs/xfs/xfs_linux.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)