xfs: remove the redundant check in xfs_bmap_first_unused

Message ID	20220909030756.3916297-1-zhangshida@kylinos.cn (mailing list archive)
State	Deferred, archived
Headers	show Return-Path: <linux-xfs-owner@kernel.org> From: Stephen Zhang <starzhangzsd@gmail.com> To: djwong@kernel.org, dchinner@redhat.com, chandan.babu@oracle.com Cc: zhangshida@kylinos.cn, starzhangzsd@gmail.com, linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH] xfs: remove the redundant check in xfs_bmap_first_unused Date: Fri, 9 Sep 2022 11:07:56 +0800 Message-Id: <20220909030756.3916297-1-zhangshida@kylinos.cn> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	xfs: remove the redundant check in xfs_bmap_first_unused \| expand xfs: remove the redundant check in xfs_bmap_first_unused

Stephen Zhang Sept. 9, 2022, 3:07 a.m. UTC

Given that
        max >= lowest,
hence if
        got.br_startoff >= max + len,
then, at the same time,
        got.br_startoff >= lowest + len,

So the check here is redundant, remove it.

Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
---
 fs/xfs/libxfs/xfs_bmap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Carlos Maiolino Sept. 9, 2022, 8:40 a.m. UTC | #1

On Fri, Sep 09, 2022 at 11:07:56AM +0800, Stephen Zhang wrote:
> Given that
>         max >= lowest,
> hence if
>         got.br_startoff >= max + len,
> then, at the same time,
>         got.br_startoff >= lowest + len,
> 
> So the check here is redundant, remove it.
> 
> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>

Change seems ok, I wonder if this wouldn't mask a bit the intention of this
condition, but this does not seem a big deal, so.

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>


P.S.

There is no need to send xfs specific patches to LKML, this just cause extra
noise there without any gain.

> ---
>  fs/xfs/libxfs/xfs_bmap.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index e56723dc9cd5..f8a984c41b01 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1230,8 +1230,7 @@ xfs_bmap_first_unused(
>  		/*
>  		 * See if the hole before this extent will work.
>  		 */
> -		if (got.br_startoff >= lowest + len &&
> -		    got.br_startoff - max >= len)
> +		if (got.br_startoff - max >= len)
>  			break;
>  		lastaddr = got.br_startoff + got.br_blockcount;
>  		max = XFS_FILEOFF_MAX(lastaddr, lowest);
> --
> 2.25.1
>

Dave Chinner Sept. 11, 2022, 11:12 p.m. UTC | #2

On Fri, Sep 09, 2022 at 11:07:56AM +0800, Stephen Zhang wrote:
> Given that
>         max >= lowest,
> hence if
>         got.br_startoff >= max + len,
> then, at the same time,
>         got.br_startoff >= lowest + len,
> 
> So the check here is redundant, remove it.

Check your types: what happens when *first_unused =
XFS_DIR2_LEAF_OFFSET?

> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index e56723dc9cd5..f8a984c41b01 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1230,8 +1230,7 @@ xfs_bmap_first_unused(
>  		/*
>  		 * See if the hole before this extent will work.
>  		 */
> -		if (got.br_startoff >= lowest + len &&
> -		    got.br_startoff - max >= len)
> +		if (got.br_startoff - max >= len)
>  			break;
>  		lastaddr = got.br_startoff + got.br_blockcount;
>  		max = XFS_FILEOFF_MAX(lastaddr, lowest);

This loop does a linear scan of the extent list, so it starts at
extent index zero which will be got.br_startoff = 0 for the
first directory data block.

When we are called from xfs_da_grow_inode_int(), we're trying to add
blocks in the directory leaf btree segment here. Hence the lowest
file offset we want to search for a hole is XFS_DIR2_LEAF_OFFSET.

Given that all the types and comparisons involved are 64 bit
unsigned:

typedef uint64_t        xfs_fileoff_t;  /* block number in a file */ 

#define XFS_FILEOFF_MAX(a,b) max_t(xfs_fileoff_t, (a), (b))

	xfs_fileoff_t br_startoff;

        xfs_fileoff_t           lastaddr = 0;
	xfs_fileoff_t           lowest, max;

We end up with the following calculations (in FSBs, not bytes):

	lowest + len	= 0x800000ULL + 1 
			= 0x800001ULL

	got.br_startoff - max	= 0ULL - 0x800000
				= 0xffffffffff800000ULL

and so the existing check is:

	if (0 >= 0x800001ULL && 0xffffffffff800000 >= 1)

which evaluates as false because the extent that was found is not
beyond the initial offset (first_unused) that we need to start
searching at.

With your modification, this would now evaluate as:

	if (0xffffffffff800000 >= 1)

Because of the underflow, this would then evaluate as true  and we'd
return 0 as the first unused offset. This is incorrect as we do not
have a hole at offset 0, nor is it within the correct directory
offset segment, nor is it within the search bounds we have
specified.

If these were all signed types, then your proposed code might be
correct. But they are unsigned and hence we have to ensure that we
handle overflow/underflow appropriately.

Which leads me to ask: did you test this change before you send
it to the list?

Cheers,

Dave.

Stephen Zhang Sept. 12, 2022, 6:39 a.m. UTC | #3

Dave Chinner <david@fromorbit.com> 于2022年9月12日周一 07:12写道：
> Given that all the types and comparisons involved are 64 bit
> unsigned:
>
> typedef uint64_t        xfs_fileoff_t;  /* block number in a file */
>
> #define XFS_FILEOFF_MAX(a,b) max_t(xfs_fileoff_t, (a), (b))
>
>         xfs_fileoff_t br_startoff;
>
>         xfs_fileoff_t           lastaddr = 0;
>         xfs_fileoff_t           lowest, max;
>
> We end up with the following calculations (in FSBs, not bytes):
>
>         lowest + len    = 0x800000ULL + 1
>                         = 0x800001ULL
>
>         got.br_startoff - max   = 0ULL - 0x800000
>                                 = 0xffffffffff800000ULL
>
> and so the existing check is:
>
>         if (0 >= 0x800001ULL && 0xffffffffff800000 >= 1)
>
> which evaluates as false because the extent that was found is not
> beyond the initial offset (first_unused) that we need to start
> searching at.
>
> With your modification, this would now evaluate as:
>
>         if (0xffffffffff800000 >= 1)
>
> Because of the underflow, this would then evaluate as true  and we'd
> return 0 as the first unused offset. This is incorrect as we do not
> have a hole at offset 0, nor is it within the correct directory
> offset segment, nor is it within the search bounds we have
> specified.
>
> If these were all signed types, then your proposed code might be
> correct. But they are unsigned and hence we have to ensure that we
> handle overflow/underflow appropriately.
>
> Which leads me to ask: did you test this change before you send
> it to the list?
>

I am so sorry about the mistake, and thanks for your elaboration about
this problem. it indeed teaches me a lesson about the necessity of test
even for the simplest change.

By the way, theoretically, in order to solve this, I wonder if we could
change the code in the following way:
====
xfs_bmap_first_unused(
                /*
                 * See if the hole before this extent will work.
                 */
-               if (got.br_startoff >= lowest + len &&
-                   got.br_startoff - max >= len)
+               if (got.br_startoff >= max + len)
                        break;
====

Thanks,

Stephen.

Darrick J. Wong Sept. 14, 2022, 4:42 p.m. UTC | #4

On Mon, Sep 12, 2022 at 02:39:23PM +0800, Stephen Zhang wrote:
> Dave Chinner <david@fromorbit.com> 于2022年9月12日周一 07:12写道：
> > Given that all the types and comparisons involved are 64 bit
> > unsigned:
> >
> > typedef uint64_t        xfs_fileoff_t;  /* block number in a file */
> >
> > #define XFS_FILEOFF_MAX(a,b) max_t(xfs_fileoff_t, (a), (b))
> >
> >         xfs_fileoff_t br_startoff;
> >
> >         xfs_fileoff_t           lastaddr = 0;
> >         xfs_fileoff_t           lowest, max;
> >
> > We end up with the following calculations (in FSBs, not bytes):
> >
> >         lowest + len    = 0x800000ULL + 1
> >                         = 0x800001ULL
> >
> >         got.br_startoff - max   = 0ULL - 0x800000
> >                                 = 0xffffffffff800000ULL
> >
> > and so the existing check is:
> >
> >         if (0 >= 0x800001ULL && 0xffffffffff800000 >= 1)
> >
> > which evaluates as false because the extent that was found is not
> > beyond the initial offset (first_unused) that we need to start
> > searching at.
> >
> > With your modification, this would now evaluate as:
> >
> >         if (0xffffffffff800000 >= 1)
> >
> > Because of the underflow, this would then evaluate as true  and we'd
> > return 0 as the first unused offset. This is incorrect as we do not
> > have a hole at offset 0, nor is it within the correct directory
> > offset segment, nor is it within the search bounds we have
> > specified.
> >
> > If these were all signed types, then your proposed code might be
> > correct. But they are unsigned and hence we have to ensure that we
> > handle overflow/underflow appropriately.
> >
> > Which leads me to ask: did you test this change before you send
> > it to the list?
> >
> 
> I am so sorry about the mistake, and thanks for your elaboration about
> this problem. it indeed teaches me a lesson about the necessity of test
> even for the simplest change.
> 
> By the way, theoretically, in order to solve this, I wonder if we could
> change the code in the following way:
> ====
> xfs_bmap_first_unused(
>                 /*
>                  * See if the hole before this extent will work.
>                  */
> -               if (got.br_startoff >= lowest + len &&
> -                   got.br_startoff - max >= len)
> +               if (got.br_startoff >= max + len)

Er... what problem does this solve?  Is there a defect in this range
checking code?  Why not leave it alone, since that's less retesting for
all of us.

--D

>                         break;
> ====
> 
> Thanks,
> 
> Stephen.

kernel test robot Sept. 15, 2022, 7:41 a.m. UTC | #5

Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 505313cbc048583d7be1ace80e9eff78fe504fa6 ("[PATCH] xfs: remove the redundant check in xfs_bmap_first_unused")
url: https://github.com/intel-lab-lkp/linux/commits/Stephen-Zhang/xfs-remove-the-redundant-check-in-xfs_bmap_first_unused/20220909-111228
patch link: https://lore.kernel.org/lkml/20220909030756.3916297-1-zhangshida@kylinos.cn

in testcase: fsmark
version: fsmark-x86_64-698ee57-1_20220517
with following parameters:

	iterations: 1x
	nr_threads: 1t
	disk: 1HDD
	fs: xfs
	filesize: 4K
	test_size: 100M
	sync_method: fsyncBeforeClose
	nr_files_per_directory: 1fpd
	cpufreq_governor: performance

test-description: The fsmark is a file system benchmark to test synchronous write workloads, for example, mail servers workload.
test-url: https://sourceforge.net/projects/fsmark/


on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/r/202209151542.91c8983f-oliver.sang@intel.com


[   44.710825][ T1533]
[   44.719150][ T1533]   { wakeup_events, wakeup_watermark } 1
[   44.719151][ T1533]
[   44.728051][ T1533] ------------------------------------------------------------
[   44.728052][ T1533]
[   73.443038][ T6040] XFS: Assertion failed: index == leafhdr.count || be32_to_cpu(ents[index].hashval) >= args->hashval, file: fs/xfs/libxfs/xfs_dir2_node.c, line: 537
[   73.458270][ T6040] ------------[ cut here ]------------
[   73.463715][ T6040] kernel BUG at fs/xfs/xfs_message.c:102!
[   73.469417][ T6040] invalid opcode: 0000 [#1] SMP NOPTI
[   73.474773][ T6040] CPU: 100 PID: 6040 Comm: fs_mark Not tainted 6.0.0-rc4-00001-g505313cbc048 #1
[ 73.483780][ T6040] RIP: 0010:assfail (kbuild/src/consumer/fs/xfs/./xfs_trace.h:143) xfs
[ 73.489068][ T6040] Code: 00 00 49 89 d0 41 89 c9 48 c7 c2 08 77 3e a1 48 89 f1 48 89 fe 48 c7 c7 c3 97 3d a1 e8 3a fe ff ff 80 3d e1 4b 0b 00 00 74 02 <0f> 0b 0f 0b c3 cc cc cc cc 48 8d 45 10 48 89 e2 4c 89 e6 48 89 1c
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	49 89 d0             	mov    %rdx,%r8
   5:	41 89 c9             	mov    %ecx,%r9d
   8:	48 c7 c2 08 77 3e a1 	mov    $0xffffffffa13e7708,%rdx
   f:	48 89 f1             	mov    %rsi,%rcx
  12:	48 89 fe             	mov    %rdi,%rsi
  15:	48 c7 c7 c3 97 3d a1 	mov    $0xffffffffa13d97c3,%rdi
  1c:	e8 3a fe ff ff       	callq  0xfffffffffffffe5b
  21:	80 3d e1 4b 0b 00 00 	cmpb   $0x0,0xb4be1(%rip)        # 0xb4c09
  28:	74 02                	je     0x2c
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	0f 0b                	ud2    
  2e:	c3                   	retq   
  2f:	cc                   	int3   
  30:	cc                   	int3   
  31:	cc                   	int3   
  32:	cc                   	int3   
  33:	48 8d 45 10          	lea    0x10(%rbp),%rax
  37:	48 89 e2             	mov    %rsp,%rdx
  3a:	4c 89 e6             	mov    %r12,%rsi
  3d:	48                   	rex.W
  3e:	89                   	.byte 0x89
  3f:	1c                   	.byte 0x1c

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	0f 0b                	ud2    
   4:	c3                   	retq   
   5:	cc                   	int3   
   6:	cc                   	int3   
   7:	cc                   	int3   
   8:	cc                   	int3   
   9:	48 8d 45 10          	lea    0x10(%rbp),%rax
   d:	48 89 e2             	mov    %rsp,%rdx
  10:	4c 89 e6             	mov    %r12,%rsi
  13:	48                   	rex.W
  14:	89                   	.byte 0x89
  15:	1c                   	.byte 0x1c
[   73.508838][ T6040] RSP: 0018:ffa000000ce23ba0 EFLAGS: 00010202
[   73.514929][ T6040] RAX: 0000000000000000 RBX: ff11001085896400 RCX: 000000007fffffff
[   73.522931][ T6040] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa13d97c3
[   73.530934][ T6040] RBP: ff11001084634080 R08: 0000000000000000 R09: 000000000000000a
[   73.538938][ T6040] R10: 000000000000000a R11: f000000000000000 R12: ff11001085b08300
[   73.546943][ T6040] R13: ff1100206f937000 R14: 0000000000000000 R15: ff1100206f937040
[   73.554955][ T6040] FS:  00007f69443de640(0000) GS:ff11002002d00000(0000) knlGS:0000000000000000
[   73.563919][ T6040] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   73.570551][ T6040] CR2: 0000555bd87a3ab2 CR3: 0000002076e68003 CR4: 0000000000771ee0
[   73.578582][ T6040] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   73.586610][ T6040] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   73.594638][ T6040] PKRU: 55555554
[   73.598248][ T6040] Call Trace:
[   73.601607][ T6040]  <TASK>
[ 73.604609][ T6040] xfs_dir2_leafn_add (kbuild/src/consumer/fs/xfs/libxfs/xfs_dir2_node.c:537 (discriminator 2)) xfs
[ 73.610280][ T6040] ? xfs_da3_blk_link (kbuild/src/consumer/fs/xfs/libxfs/xfs_da_btree.c:1890) xfs
[ 73.615933][ T6040] xfs_dir2_leafn_split (kbuild/src/consumer/fs/xfs/libxfs/xfs_dir2_node.c:1463) xfs
[ 73.621744][ T6040] ? xfs_dir2_leafn_add (kbuild/src/consumer/fs/xfs/libxfs/xfs_dir2_node.c:518) xfs


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.

xfs: remove the redundant check in xfs_bmap_first_unused

Commit Message

Comments

Patch