diff mbox series

[1/4] xfs: fix brainos in the refcount scrubber's rmap fragment processor

Message ID 160494585913.772802.17231950418756379430.stgit@magnolia (mailing list archive)
State Accepted, archived
Headers show
Series xfs: fix various scrub problems | expand

Commit Message

Darrick J. Wong Nov. 9, 2020, 6:17 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

Fix some serious WTF in the reference count scrubber's rmap fragment
processing.  The code comment says that this loop is supposed to move
all fragment records starting at or before bno onto the worklist, but
there's no obvious reason why nr (the number of items added) should
increment starting from 1, and breaking the loop when we've added the
target number seems dubious since we could have more rmap fragments that
should have been added to the worklist.

This seems to manifest in xfs/411 when adding one to the refcount field.

Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/refcount.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Christoph Hellwig Nov. 10, 2020, 6:35 p.m. UTC | #1
On Mon, Nov 09, 2020 at 10:17:39AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Fix some serious WTF in the reference count scrubber's rmap fragment
> processing.  The code comment says that this loop is supposed to move
> all fragment records starting at or before bno onto the worklist, but
> there's no obvious reason why nr (the number of items added) should
> increment starting from 1, and breaking the loop when we've added the
> target number seems dubious since we could have more rmap fragments that
> should have been added to the worklist.
> 
> This seems to manifest in xfs/411 when adding one to the refcount field.
> 
> Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt")

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Chandan Babu R Nov. 12, 2020, 12:51 p.m. UTC | #2
On Monday 9 November 2020 11:47:39 PM IST Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Fix some serious WTF in the reference count scrubber's rmap fragment
> processing.  The code comment says that this loop is supposed to move
> all fragment records starting at or before bno onto the worklist, but
> there's no obvious reason why nr (the number of items added) should
> increment starting from 1, and breaking the loop when we've added the
> target number seems dubious since we could have more rmap fragments that
> should have been added to the worklist.
> 
> This seems to manifest in xfs/411 when adding one to the refcount field.
> 
> Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/refcount.c |    8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> 
> diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
> index beaeb6fa3119..dd672e6bbc75 100644
> --- a/fs/xfs/scrub/refcount.c
> +++ b/fs/xfs/scrub/refcount.c
> @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments(
>  	 */
>  	INIT_LIST_HEAD(&worklist);
>  	rbno = NULLAGBLOCK;
> -	nr = 1;
>  
>  	/* Make sure the fragments actually /are/ in agbno order. */
>  	bno = 0;
> @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments(
>  	 * Find all the rmaps that start at or before the refc extent,
>  	 * and put them on the worklist.
>  	 */
> +	nr = 0;
>  	list_for_each_entry_safe(frag, n, &refchk->fragments, list) {
> -		if (frag->rm.rm_startblock > refchk->bno)
> -			goto done;
> +		if (frag->rm.rm_startblock > refchk->bno || nr > target_nr)
> +			break;

In the case of fuzzed refcnt value of 1, The condition "nr > target_nr" causes
"nr != target_nr" condition (appearing after the loop) to evaluate to true
(since atleast two rmap entries would be present for the refcount extent)
which in turn causes xchk_refcountbt_xref_rmap() to flag the data structure as
corrupt. Please let me know if my understanding of the code flow is correct?

>  		bno = frag->rm.rm_startblock + frag->rm.rm_blockcount;
>  		if (bno < rbno)
>  			rbno = bno;
>  		list_move_tail(&frag->list, &worklist);
> -		if (nr == target_nr)
> -			break;
>  		nr++;
>  	}
>  
> 
>
Darrick J. Wong Nov. 12, 2020, 4:05 p.m. UTC | #3
On Thu, Nov 12, 2020 at 06:21:52PM +0530, Chandan Babu R wrote:
> On Monday 9 November 2020 11:47:39 PM IST Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Fix some serious WTF in the reference count scrubber's rmap fragment
> > processing.  The code comment says that this loop is supposed to move
> > all fragment records starting at or before bno onto the worklist, but
> > there's no obvious reason why nr (the number of items added) should
> > increment starting from 1, and breaking the loop when we've added the
> > target number seems dubious since we could have more rmap fragments that
> > should have been added to the worklist.
> > 
> > This seems to manifest in xfs/411 when adding one to the refcount field.
> > 
> > Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt")
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/scrub/refcount.c |    8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
> > index beaeb6fa3119..dd672e6bbc75 100644
> > --- a/fs/xfs/scrub/refcount.c
> > +++ b/fs/xfs/scrub/refcount.c
> > @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments(
> >  	 */
> >  	INIT_LIST_HEAD(&worklist);
> >  	rbno = NULLAGBLOCK;
> > -	nr = 1;
> >  
> >  	/* Make sure the fragments actually /are/ in agbno order. */
> >  	bno = 0;
> > @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments(
> >  	 * Find all the rmaps that start at or before the refc extent,
> >  	 * and put them on the worklist.
> >  	 */
> > +	nr = 0;
> >  	list_for_each_entry_safe(frag, n, &refchk->fragments, list) {
> > -		if (frag->rm.rm_startblock > refchk->bno)
> > -			goto done;
> > +		if (frag->rm.rm_startblock > refchk->bno || nr > target_nr)
> > +			break;
> 
> In the case of fuzzed refcnt value of 1, The condition "nr > target_nr" causes
> "nr != target_nr" condition (appearing after the loop) to evaluate to true
> (since atleast two rmap entries would be present for the refcount extent)
> which in turn causes xchk_refcountbt_xref_rmap() to flag the data structure as
> corrupt. Please let me know if my understanding of the code flow is correct?

Right.

--D

> >  		bno = frag->rm.rm_startblock + frag->rm.rm_blockcount;
> >  		if (bno < rbno)
> >  			rbno = bno;
> >  		list_move_tail(&frag->list, &worklist);
> > -		if (nr == target_nr)
> > -			break;
> >  		nr++;
> >  	}
> >  
> > 
> > 
> 
> 
> -- 
> chandan
> 
> 
>
Chandan Babu R Nov. 13, 2020, 5:11 a.m. UTC | #4
On Thursday 12 November 2020 9:35:26 PM IST Darrick J. Wong wrote:
> On Thu, Nov 12, 2020 at 06:21:52PM +0530, Chandan Babu R wrote:
> > On Monday 9 November 2020 11:47:39 PM IST Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Fix some serious WTF in the reference count scrubber's rmap fragment
> > > processing.  The code comment says that this loop is supposed to move
> > > all fragment records starting at or before bno onto the worklist, but
> > > there's no obvious reason why nr (the number of items added) should
> > > increment starting from 1, and breaking the loop when we've added the
> > > target number seems dubious since we could have more rmap fragments that
> > > should have been added to the worklist.
> > > 
> > > This seems to manifest in xfs/411 when adding one to the refcount field.
> > > 
> > > Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt")
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/scrub/refcount.c |    8 +++-----
> > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > > 
> > > 
> > > diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
> > > index beaeb6fa3119..dd672e6bbc75 100644
> > > --- a/fs/xfs/scrub/refcount.c
> > > +++ b/fs/xfs/scrub/refcount.c
> > > @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments(
> > >  	 */
> > >  	INIT_LIST_HEAD(&worklist);
> > >  	rbno = NULLAGBLOCK;
> > > -	nr = 1;
> > >  
> > >  	/* Make sure the fragments actually /are/ in agbno order. */
> > >  	bno = 0;
> > > @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments(
> > >  	 * Find all the rmaps that start at or before the refc extent,
> > >  	 * and put them on the worklist.
> > >  	 */
> > > +	nr = 0;
> > >  	list_for_each_entry_safe(frag, n, &refchk->fragments, list) {
> > > -		if (frag->rm.rm_startblock > refchk->bno)
> > > -			goto done;
> > > +		if (frag->rm.rm_startblock > refchk->bno || nr > target_nr)
> > > +			break;
> > 
> > In the case of fuzzed refcnt value of 1, The condition "nr > target_nr" causes
> > "nr != target_nr" condition (appearing after the loop) to evaluate to true
> > (since atleast two rmap entries would be present for the refcount extent)
> > which in turn causes xchk_refcountbt_xref_rmap() to flag the data structure as
> > corrupt. Please let me know if my understanding of the code flow is correct?
> 
> Right.
>
Ok. In that case the code change in this patch is handling the erroneous
scenario correctly.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

> --D
> 
> > >  		bno = frag->rm.rm_startblock + frag->rm.rm_blockcount;
> > >  		if (bno < rbno)
> > >  			rbno = bno;
> > >  		list_move_tail(&frag->list, &worklist);
> > > -		if (nr == target_nr)
> > > -			break;
> > >  		nr++;
> > >  	}
> > >  
> > > 
> > > 
> > 
> > 
>
diff mbox series

Patch

diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c
index beaeb6fa3119..dd672e6bbc75 100644
--- a/fs/xfs/scrub/refcount.c
+++ b/fs/xfs/scrub/refcount.c
@@ -170,7 +170,6 @@  xchk_refcountbt_process_rmap_fragments(
 	 */
 	INIT_LIST_HEAD(&worklist);
 	rbno = NULLAGBLOCK;
-	nr = 1;
 
 	/* Make sure the fragments actually /are/ in agbno order. */
 	bno = 0;
@@ -184,15 +183,14 @@  xchk_refcountbt_process_rmap_fragments(
 	 * Find all the rmaps that start at or before the refc extent,
 	 * and put them on the worklist.
 	 */
+	nr = 0;
 	list_for_each_entry_safe(frag, n, &refchk->fragments, list) {
-		if (frag->rm.rm_startblock > refchk->bno)
-			goto done;
+		if (frag->rm.rm_startblock > refchk->bno || nr > target_nr)
+			break;
 		bno = frag->rm.rm_startblock + frag->rm.rm_blockcount;
 		if (bno < rbno)
 			rbno = bno;
 		list_move_tail(&frag->list, &worklist);
-		if (nr == target_nr)
-			break;
 		nr++;
 	}