
[v4,2/3] xfs: introduce in-core global counter of allocbt blocks

Message ID 20210423131050.141140-3-bfoster@redhat.com (mailing list archive)
State Superseded, archived
Series xfs: set aside allocation btree blocks from block reservation

Commit Message

Brian Foster April 23, 2021, 1:10 p.m. UTC
Introduce an in-core counter to track the sum of all allocbt blocks
used by the filesystem. This value is currently tracked per-ag via
the ->agf_btreeblks field in the AGF, which also happens to include
rmapbt blocks. A global, in-core count of allocbt blocks is required
to identify the subset of global ->m_fdblocks that consists of
unavailable blocks currently used for allocation btrees. To support
this calculation at block reservation time, construct a similar
global counter for allocbt blocks, populate it on first read of each
AGF and update it as allocbt blocks are used and released.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
 fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
 fs/xfs/xfs_mount.h              |  6 ++++++
 3 files changed, 20 insertions(+)
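The commit message says the global counter exists so that block reservation can identify the subset of ->m_fdblocks tied up in allocation btrees. That use is implemented in the next patch of the series, not here; the sketch below only illustrates the idea. The function name and parameters are made up for the example and do not appear in the patch.

```c
#include <stdint.h>

/*
 * Illustrative sketch only: how a global allocbt block counter lets
 * block reservation set aside free space that is currently backing
 * allocation btree blocks. All names here are hypothetical; the real
 * logic lands in the block reservation path in patch 3/3.
 */
int64_t usable_free_blocks(int64_t fdblocks, int64_t allocbt_blks,
			   int64_t set_aside)
{
	/*
	 * Blocks in use by the free space btrees are counted as free in
	 * fdblocks but are not actually available for allocation, so
	 * subtract them (and any other set-aside) before reserving.
	 */
	int64_t avail = fdblocks - allocbt_blks - set_aside;

	return avail > 0 ? avail : 0;
}
```

Note that without this subtraction, a reservation could transiently succeed against free space that the allocbt itself is consuming, which is the overreservation case the series is addressing.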

Comments

Chandan Babu R April 27, 2021, 10:28 a.m. UTC | #1
On 23 Apr 2021 at 18:40, Brian Foster wrote:
> Introduce an in-core counter to track the sum of all allocbt blocks
> used by the filesystem. This value is currently tracked per-ag via
> the ->agf_btreeblks field in the AGF, which also happens to include
> rmapbt blocks. A global, in-core count of allocbt blocks is required
> to identify the subset of global ->m_fdblocks that consists of
> unavailable blocks currently used for allocation btrees. To support
> this calculation at block reservation time, construct a similar
> global counter for allocbt blocks, populate it on first read of each
> AGF and update it as allocbt blocks are used and released.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
>  fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
>  fs/xfs/xfs_mount.h              |  6 ++++++
>  3 files changed, 20 insertions(+)
>
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index aaa19101bb2a..144e2d68245c 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
>  	struct xfs_agf		*agf;		/* ag freelist header */
>  	struct xfs_perag	*pag;		/* per allocation group data */
>  	int			error;
> +	uint32_t		allocbt_blks;
>
>  	trace_xfs_alloc_read_agf(mp, agno);
>
> @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
>  		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
>  		pag->pagf_init = 1;
>  		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
> +
> +		/*
> +		 * Update the global in-core allocbt block counter. Filter
> +		 * rmapbt blocks from the on-disk counter because those are
> +		 * managed by perag reservation.
> +		 */
> +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {

pag->pagf_btreeblks gets incremented every time a block is allocated to refill
the AGFL (via xfs_alloc_get_freelist()). Apart from the allocbt trees, blocks for
the rmap btree also get allocated from the AGFL. Hence pag->pagf_btreeblks must be
larger than agf->agf_rmap_blocks.

Can you please describe the scenario in which pag->pagf_btreeblks has a value
that is <= agf->agf_rmap_blocks?

--
chandan
Brian Foster April 27, 2021, 11:33 a.m. UTC | #2
On Tue, Apr 27, 2021 at 03:58:16PM +0530, Chandan Babu R wrote:
> On 23 Apr 2021 at 18:40, Brian Foster wrote:
> > Introduce an in-core counter to track the sum of all allocbt blocks
> > used by the filesystem. This value is currently tracked per-ag via
> > the ->agf_btreeblks field in the AGF, which also happens to include
> > rmapbt blocks. A global, in-core count of allocbt blocks is required
> > to identify the subset of global ->m_fdblocks that consists of
> > unavailable blocks currently used for allocation btrees. To support
> > this calculation at block reservation time, construct a similar
> > global counter for allocbt blocks, populate it on first read of each
> > AGF and update it as allocbt blocks are used and released.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
> >  fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
> >  fs/xfs/xfs_mount.h              |  6 ++++++
> >  3 files changed, 20 insertions(+)
> >
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index aaa19101bb2a..144e2d68245c 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
> >  	struct xfs_agf		*agf;		/* ag freelist header */
> >  	struct xfs_perag	*pag;		/* per allocation group data */
> >  	int			error;
> > +	uint32_t		allocbt_blks;
> >
> >  	trace_xfs_alloc_read_agf(mp, agno);
> >
> > @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
> >  		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> >  		pag->pagf_init = 1;
> >  		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
> > +
> > +		/*
> > +		 * Update the global in-core allocbt block counter. Filter
> > +		 * rmapbt blocks from the on-disk counter because those are
> > +		 * managed by perag reservation.
> > +		 */
> > +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {
> 
> pag->pagf_btreeblks gets incremented every time a block is allocated to refill
> the AGFL (via xfs_alloc_get_freelist()). Apart from the allocbt trees, blocks for
> the rmap btree also get allocated from the AGFL. Hence pag->pagf_btreeblks must be
> larger than agf->agf_rmap_blocks.
> 

This function is actually used to consume a block from the AGFL (as opposed
to refilling it).

> Can you please describe the scenario in which pag->pagf_btreeblks has a value
> that is <= agf->agf_rmap_blocks?
> 

Ah, this was just an initialization quirk. I originally had an assert
here and based the logic on the assumption that pagf_btreeblks >=
agf_rmap_blocks, but alas:

# mkfs.xfs -f -mrmapbt <dev>
...
# xfs_db -c "agf 0" -c "p rmapblocks" -c "p btreeblks" <dev>
rmapblocks = 1
btreeblks = 0
#

Brian

> --
> chandan
>
Chandan Babu R April 27, 2021, 1:22 p.m. UTC | #3
On 27 Apr 2021 at 17:03, Brian Foster wrote:
> On Tue, Apr 27, 2021 at 03:58:16PM +0530, Chandan Babu R wrote:
>> On 23 Apr 2021 at 18:40, Brian Foster wrote:
>> > Introduce an in-core counter to track the sum of all allocbt blocks
>> > used by the filesystem. This value is currently tracked per-ag via
>> > the ->agf_btreeblks field in the AGF, which also happens to include
>> > rmapbt blocks. A global, in-core count of allocbt blocks is required
>> > to identify the subset of global ->m_fdblocks that consists of
>> > unavailable blocks currently used for allocation btrees. To support
>> > this calculation at block reservation time, construct a similar
>> > global counter for allocbt blocks, populate it on first read of each
>> > AGF and update it as allocbt blocks are used and released.
>> >
>> > Signed-off-by: Brian Foster <bfoster@redhat.com>
>> > ---
>> >  fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
>> >  fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
>> >  fs/xfs/xfs_mount.h              |  6 ++++++
>> >  3 files changed, 20 insertions(+)
>> >
>> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
>> > index aaa19101bb2a..144e2d68245c 100644
>> > --- a/fs/xfs/libxfs/xfs_alloc.c
>> > +++ b/fs/xfs/libxfs/xfs_alloc.c
>> > @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
>> >  	struct xfs_agf		*agf;		/* ag freelist header */
>> >  	struct xfs_perag	*pag;		/* per allocation group data */
>> >  	int			error;
>> > +	uint32_t		allocbt_blks;
>> >
>> >  	trace_xfs_alloc_read_agf(mp, agno);
>> >
>> > @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
>> >  		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
>> >  		pag->pagf_init = 1;
>> >  		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
>> > +
>> > +		/*
>> > +		 * Update the global in-core allocbt block counter. Filter
>> > +		 * rmapbt blocks from the on-disk counter because those are
>> > +		 * managed by perag reservation.
>> > +		 */
>> > +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {
>> 
>> pag->pagf_btreeblks gets incremented every time a block is allocated to refill
>> the AGFL (via xfs_alloc_get_freelist()). Apart from the allocbt trees, blocks for
>> the rmap btree also get allocated from the AGFL. Hence pag->pagf_btreeblks must be
>> larger than agf->agf_rmap_blocks.
>> 
>
> This function is actually used to consume a block from the AGFL (as opposed
> to refilling it).

Sorry, I meant to say "allocate a block from AGFL".

>
>> Can you please describe the scenario in which pag->pagf_btreeblks has a value
>> that is <= agf->agf_rmap_blocks?
>> 
>
> Ah, this was just an initialization quirk. I originally had an assert
> here and based the logic on the assumption that pagf_btreeblks >=
> agf_rmap_blocks, but alas:
>
> # mkfs.xfs -f -mrmapbt <dev>
> ...
> # xfs_db -c "agf 0" -c "p rmapblocks" -c "p btreeblks" <dev>
> rmapblocks = 1
> btreeblks = 0
> #

Thanks for clarifying my doubt.

The patch looks good to me.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Allison Henderson April 27, 2021, 9:37 p.m. UTC | #4
On 4/23/21 6:10 AM, Brian Foster wrote:
> Introduce an in-core counter to track the sum of all allocbt blocks
> used by the filesystem. This value is currently tracked per-ag via
> the ->agf_btreeblks field in the AGF, which also happens to include
> rmapbt blocks. A global, in-core count of allocbt blocks is required
> to identify the subset of global ->m_fdblocks that consists of
> unavailable blocks currently used for allocation btrees. To support
> this calculation at block reservation time, construct a similar
> global counter for allocbt blocks, populate it on first read of each
> AGF and update it as allocbt blocks are used and released.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
OK, makes sense
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>

> ---
>   fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
>   fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
>   fs/xfs/xfs_mount.h              |  6 ++++++
>   3 files changed, 20 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index aaa19101bb2a..144e2d68245c 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
>   	struct xfs_agf		*agf;		/* ag freelist header */
>   	struct xfs_perag	*pag;		/* per allocation group data */
>   	int			error;
> +	uint32_t		allocbt_blks;
>   
>   	trace_xfs_alloc_read_agf(mp, agno);
>   
> @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
>   		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
>   		pag->pagf_init = 1;
>   		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
> +
> +		/*
> +		 * Update the global in-core allocbt block counter. Filter
> +		 * rmapbt blocks from the on-disk counter because those are
> +		 * managed by perag reservation.
> +		 */
> +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {
> +			allocbt_blks = pag->pagf_btreeblks -
> +					be32_to_cpu(agf->agf_rmap_blocks);
> +			atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
> +		}
>   	}
>   #ifdef DEBUG
>   	else if (!XFS_FORCED_SHUTDOWN(mp)) {
> diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
> index 8e01231b308e..9f5a45f7baed 100644
> --- a/fs/xfs/libxfs/xfs_alloc_btree.c
> +++ b/fs/xfs/libxfs/xfs_alloc_btree.c
> @@ -71,6 +71,7 @@ xfs_allocbt_alloc_block(
>   		return 0;
>   	}
>   
> +	atomic64_inc(&cur->bc_mp->m_allocbt_blks);
>   	xfs_extent_busy_reuse(cur->bc_mp, cur->bc_ag.agno, bno, 1, false);
>   
>   	xfs_trans_agbtree_delta(cur->bc_tp, 1);
> @@ -95,6 +96,7 @@ xfs_allocbt_free_block(
>   	if (error)
>   		return error;
>   
> +	atomic64_dec(&cur->bc_mp->m_allocbt_blks);
>   	xfs_extent_busy_insert(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1,
>   			      XFS_EXTENT_BUSY_SKIP_DISCARD);
>   	xfs_trans_agbtree_delta(cur->bc_tp, -1);
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index 81829d19596e..bb67274ee23f 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -170,6 +170,12 @@ typedef struct xfs_mount {
>   	 * extents or anything related to the rt device.
>   	 */
>   	struct percpu_counter	m_delalloc_blks;
> +	/*
> +	 * Global count of allocation btree blocks in use across all AGs. Only
> +	 * used when perag reservation is enabled. Helps prevent block
> +	 * reservation from attempting to reserve allocation btree blocks.
> +	 */
> +	atomic64_t		m_allocbt_blks;
>   
>   	struct radix_tree_root	m_perag_tree;	/* per-ag accounting info */
>   	spinlock_t		m_perag_lock;	/* lock for m_perag_tree */
>
Darrick J. Wong April 28, 2021, 4:15 a.m. UTC | #5
On Fri, Apr 23, 2021 at 09:10:49AM -0400, Brian Foster wrote:
> Introduce an in-core counter to track the sum of all allocbt blocks
> used by the filesystem. This value is currently tracked per-ag via
> the ->agf_btreeblks field in the AGF, which also happens to include
> rmapbt blocks. A global, in-core count of allocbt blocks is required
> to identify the subset of global ->m_fdblocks that consists of
> unavailable blocks currently used for allocation btrees. To support
> this calculation at block reservation time, construct a similar
> global counter for allocbt blocks, populate it on first read of each
> AGF and update it as allocbt blocks are used and released.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
>  fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
>  fs/xfs/xfs_mount.h              |  6 ++++++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index aaa19101bb2a..144e2d68245c 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
>  	struct xfs_agf		*agf;		/* ag freelist header */
>  	struct xfs_perag	*pag;		/* per allocation group data */
>  	int			error;
> +	uint32_t		allocbt_blks;
>  
>  	trace_xfs_alloc_read_agf(mp, agno);
>  
> @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
>  		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
>  		pag->pagf_init = 1;
>  		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
> +
> +		/*
> +		 * Update the global in-core allocbt block counter. Filter
> +		 * rmapbt blocks from the on-disk counter because those are
> +		 * managed by perag reservation.
> +		 */
> +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {

As pointed out elsewhere in the thread, agf_rmap_blocks counts the total
number of blocks in the rmapbt (whereas agf_btreeblks counts the number
of non-root blocks in all three free space btrees).  Does this need a
change?

	int delta = (int)pag->pagf_btreeblks - (be32_to_cpu(...) - 1);
	if (delta > 0)
		atomic64_add(delta, &mp->m_allocbt_blks);

--D

> +			allocbt_blks = pag->pagf_btreeblks -
> +					be32_to_cpu(agf->agf_rmap_blocks);
> +			atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
> +		}
>  	}
>  #ifdef DEBUG
>  	else if (!XFS_FORCED_SHUTDOWN(mp)) {
> diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
> index 8e01231b308e..9f5a45f7baed 100644
> --- a/fs/xfs/libxfs/xfs_alloc_btree.c
> +++ b/fs/xfs/libxfs/xfs_alloc_btree.c
> @@ -71,6 +71,7 @@ xfs_allocbt_alloc_block(
>  		return 0;
>  	}
>  
> +	atomic64_inc(&cur->bc_mp->m_allocbt_blks);
>  	xfs_extent_busy_reuse(cur->bc_mp, cur->bc_ag.agno, bno, 1, false);
>  
>  	xfs_trans_agbtree_delta(cur->bc_tp, 1);
> @@ -95,6 +96,7 @@ xfs_allocbt_free_block(
>  	if (error)
>  		return error;
>  
> +	atomic64_dec(&cur->bc_mp->m_allocbt_blks);
>  	xfs_extent_busy_insert(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1,
>  			      XFS_EXTENT_BUSY_SKIP_DISCARD);
>  	xfs_trans_agbtree_delta(cur->bc_tp, -1);
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index 81829d19596e..bb67274ee23f 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -170,6 +170,12 @@ typedef struct xfs_mount {
>  	 * extents or anything related to the rt device.
>  	 */
>  	struct percpu_counter	m_delalloc_blks;
> +	/*
> +	 * Global count of allocation btree blocks in use across all AGs. Only
> +	 * used when perag reservation is enabled. Helps prevent block
> +	 * reservation from attempting to reserve allocation btree blocks.
> +	 */
> +	atomic64_t		m_allocbt_blks;
>  
>  	struct radix_tree_root	m_perag_tree;	/* per-ag accounting info */
>  	spinlock_t		m_perag_lock;	/* lock for m_perag_tree */
> -- 
> 2.26.3
>
Brian Foster April 28, 2021, 3:01 p.m. UTC | #6
On Tue, Apr 27, 2021 at 09:15:09PM -0700, Darrick J. Wong wrote:
> On Fri, Apr 23, 2021 at 09:10:49AM -0400, Brian Foster wrote:
> > Introduce an in-core counter to track the sum of all allocbt blocks
> > used by the filesystem. This value is currently tracked per-ag via
> > the ->agf_btreeblks field in the AGF, which also happens to include
> > rmapbt blocks. A global, in-core count of allocbt blocks is required
> > to identify the subset of global ->m_fdblocks that consists of
> > unavailable blocks currently used for allocation btrees. To support
> > this calculation at block reservation time, construct a similar
> > global counter for allocbt blocks, populate it on first read of each
> > AGF and update it as allocbt blocks are used and released.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
> >  fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
> >  fs/xfs/xfs_mount.h              |  6 ++++++
> >  3 files changed, 20 insertions(+)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index aaa19101bb2a..144e2d68245c 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
> >  	struct xfs_agf		*agf;		/* ag freelist header */
> >  	struct xfs_perag	*pag;		/* per allocation group data */
> >  	int			error;
> > +	uint32_t		allocbt_blks;
> >  
> >  	trace_xfs_alloc_read_agf(mp, agno);
> >  
> > @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
> >  		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> >  		pag->pagf_init = 1;
> >  		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
> > +
> > +		/*
> > +		 * Update the global in-core allocbt block counter. Filter
> > +		 * rmapbt blocks from the on-disk counter because those are
> > +		 * managed by perag reservation.
> > +		 */
> > +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {
> 
> As pointed out elsewhere in the thread, agf_rmap_blocks counts the total
> number of blocks in the rmapbt (whereas agf_btreeblks counts the number
> of non-root blocks in all three free space btrees).  Does this need a
> change?
> 
> 	int delta = (int)pag->pagf_btreeblks - (be32_to_cpu(...) - 1);
> 	if (delta > 0)
> 		atomic64_add(delta, &mp->m_allocbt_blks);
>

Hm yes, this makes more sense. Will fix and update the comment..

Brian
 
> --D
> 
> > +			allocbt_blks = pag->pagf_btreeblks -
> > +					be32_to_cpu(agf->agf_rmap_blocks);
> > +			atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
> > +		}
> >  	}
> >  #ifdef DEBUG
> >  	else if (!XFS_FORCED_SHUTDOWN(mp)) {
> > diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
> > index 8e01231b308e..9f5a45f7baed 100644
> > --- a/fs/xfs/libxfs/xfs_alloc_btree.c
> > +++ b/fs/xfs/libxfs/xfs_alloc_btree.c
> > @@ -71,6 +71,7 @@ xfs_allocbt_alloc_block(
> >  		return 0;
> >  	}
> >  
> > +	atomic64_inc(&cur->bc_mp->m_allocbt_blks);
> >  	xfs_extent_busy_reuse(cur->bc_mp, cur->bc_ag.agno, bno, 1, false);
> >  
> >  	xfs_trans_agbtree_delta(cur->bc_tp, 1);
> > @@ -95,6 +96,7 @@ xfs_allocbt_free_block(
> >  	if (error)
> >  		return error;
> >  
> > +	atomic64_dec(&cur->bc_mp->m_allocbt_blks);
> >  	xfs_extent_busy_insert(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1,
> >  			      XFS_EXTENT_BUSY_SKIP_DISCARD);
> >  	xfs_trans_agbtree_delta(cur->bc_tp, -1);
> > diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> > index 81829d19596e..bb67274ee23f 100644
> > --- a/fs/xfs/xfs_mount.h
> > +++ b/fs/xfs/xfs_mount.h
> > @@ -170,6 +170,12 @@ typedef struct xfs_mount {
> >  	 * extents or anything related to the rt device.
> >  	 */
> >  	struct percpu_counter	m_delalloc_blks;
> > +	/*
> > +	 * Global count of allocation btree blocks in use across all AGs. Only
> > +	 * used when perag reservation is enabled. Helps prevent block
> > +	 * reservation from attempting to reserve allocation btree blocks.
> > +	 */
> > +	atomic64_t		m_allocbt_blks;
> >  
> >  	struct radix_tree_root	m_perag_tree;	/* per-ag accounting info */
> >  	spinlock_t		m_perag_lock;	/* lock for m_perag_tree */
> > -- 
> > 2.26.3
> > 
>
Brian Foster April 28, 2021, 3:29 p.m. UTC | #7
On Wed, Apr 28, 2021 at 11:01:24AM -0400, Brian Foster wrote:
> On Tue, Apr 27, 2021 at 09:15:09PM -0700, Darrick J. Wong wrote:
> > On Fri, Apr 23, 2021 at 09:10:49AM -0400, Brian Foster wrote:
> > > Introduce an in-core counter to track the sum of all allocbt blocks
> > > used by the filesystem. This value is currently tracked per-ag via
> > > the ->agf_btreeblks field in the AGF, which also happens to include
> > > rmapbt blocks. A global, in-core count of allocbt blocks is required
> > > to identify the subset of global ->m_fdblocks that consists of
> > > unavailable blocks currently used for allocation btrees. To support
> > > this calculation at block reservation time, construct a similar
> > > global counter for allocbt blocks, populate it on first read of each
> > > AGF and update it as allocbt blocks are used and released.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
> > >  fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
> > >  fs/xfs/xfs_mount.h              |  6 ++++++
> > >  3 files changed, 20 insertions(+)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > index aaa19101bb2a..144e2d68245c 100644
> > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
> > >  	struct xfs_agf		*agf;		/* ag freelist header */
> > >  	struct xfs_perag	*pag;		/* per allocation group data */
> > >  	int			error;
> > > +	uint32_t		allocbt_blks;
> > >  
> > >  	trace_xfs_alloc_read_agf(mp, agno);
> > >  
> > > @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
> > >  		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> > >  		pag->pagf_init = 1;
> > >  		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
> > > +
> > > +		/*
> > > +		 * Update the global in-core allocbt block counter. Filter
> > > +		 * rmapbt blocks from the on-disk counter because those are
> > > +		 * managed by perag reservation.
> > > +		 */
> > > +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {
> > 
> > As pointed out elsewhere in the thread, agf_rmap_blocks counts the total
> > number of blocks in the rmapbt (whereas agf_btreeblks counts the number
> > of non-root blocks in all three free space btrees).  Does this need a
> > change?
> > 
> > 	int delta = (int)pag->pagf_btreeblks - (be32_to_cpu(...) - 1);
> > 	if (delta > 0)
> > 		atomic64_add(delta, &mp->m_allocbt_blks);
> >
> 
> Hm yes, this makes more sense. Will fix and update the comment..
> 

I ended up with the following:

		...
                /*
                 * Update the in-core allocbt counter. Filter out the rmapbt
                 * subset of the btreeblks counter because the rmapbt is managed
                 * by perag reservation. Subtract one for the rmapbt root block
                 * because the rmap counter includes it while the btreeblks
                 * counter only tracks non-root blocks.
                 */
                allocbt_blks = pag->pagf_btreeblks;
                if (xfs_sb_version_hasrmapbt(&mp->m_sb))
                        allocbt_blks -= be32_to_cpu(agf->agf_rmap_blocks) - 1;
                if (allocbt_blks > 0)
                        atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
		...

Any thoughts before I post v5?

Brian

> Brian
>  
> > --D
> > 
> > > +			allocbt_blks = pag->pagf_btreeblks -
> > > +					be32_to_cpu(agf->agf_rmap_blocks);
> > > +			atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
> > > +		}
> > >  	}
> > >  #ifdef DEBUG
> > >  	else if (!XFS_FORCED_SHUTDOWN(mp)) {
> > > diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
> > > index 8e01231b308e..9f5a45f7baed 100644
> > > --- a/fs/xfs/libxfs/xfs_alloc_btree.c
> > > +++ b/fs/xfs/libxfs/xfs_alloc_btree.c
> > > @@ -71,6 +71,7 @@ xfs_allocbt_alloc_block(
> > >  		return 0;
> > >  	}
> > >  
> > > +	atomic64_inc(&cur->bc_mp->m_allocbt_blks);
> > >  	xfs_extent_busy_reuse(cur->bc_mp, cur->bc_ag.agno, bno, 1, false);
> > >  
> > >  	xfs_trans_agbtree_delta(cur->bc_tp, 1);
> > > @@ -95,6 +96,7 @@ xfs_allocbt_free_block(
> > >  	if (error)
> > >  		return error;
> > >  
> > > +	atomic64_dec(&cur->bc_mp->m_allocbt_blks);
> > >  	xfs_extent_busy_insert(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1,
> > >  			      XFS_EXTENT_BUSY_SKIP_DISCARD);
> > >  	xfs_trans_agbtree_delta(cur->bc_tp, -1);
> > > diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> > > index 81829d19596e..bb67274ee23f 100644
> > > --- a/fs/xfs/xfs_mount.h
> > > +++ b/fs/xfs/xfs_mount.h
> > > @@ -170,6 +170,12 @@ typedef struct xfs_mount {
> > >  	 * extents or anything related to the rt device.
> > >  	 */
> > >  	struct percpu_counter	m_delalloc_blks;
> > > +	/*
> > > +	 * Global count of allocation btree blocks in use across all AGs. Only
> > > +	 * used when perag reservation is enabled. Helps prevent block
> > > +	 * reservation from attempting to reserve allocation btree blocks.
> > > +	 */
> > > +	atomic64_t		m_allocbt_blks;
> > >  
> > >  	struct radix_tree_root	m_perag_tree;	/* per-ag accounting info */
> > >  	spinlock_t		m_perag_lock;	/* lock for m_perag_tree */
> > > -- 
> > > 2.26.3
> > > 
> >
Darrick J. Wong April 28, 2021, 4:12 p.m. UTC | #8
On Wed, Apr 28, 2021 at 11:29:57AM -0400, Brian Foster wrote:
> On Wed, Apr 28, 2021 at 11:01:24AM -0400, Brian Foster wrote:
> > On Tue, Apr 27, 2021 at 09:15:09PM -0700, Darrick J. Wong wrote:
> > > On Fri, Apr 23, 2021 at 09:10:49AM -0400, Brian Foster wrote:
> > > > Introduce an in-core counter to track the sum of all allocbt blocks
> > > > used by the filesystem. This value is currently tracked per-ag via
> > > > the ->agf_btreeblks field in the AGF, which also happens to include
> > > > rmapbt blocks. A global, in-core count of allocbt blocks is required
> > > > to identify the subset of global ->m_fdblocks that consists of
> > > > unavailable blocks currently used for allocation btrees. To support
> > > > this calculation at block reservation time, construct a similar
> > > > global counter for allocbt blocks, populate it on first read of each
> > > > AGF and update it as allocbt blocks are used and released.
> > > > 
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > ---
> > > >  fs/xfs/libxfs/xfs_alloc.c       | 12 ++++++++++++
> > > >  fs/xfs/libxfs/xfs_alloc_btree.c |  2 ++
> > > >  fs/xfs/xfs_mount.h              |  6 ++++++
> > > >  3 files changed, 20 insertions(+)
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > > index aaa19101bb2a..144e2d68245c 100644
> > > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > > @@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
> > > >  	struct xfs_agf		*agf;		/* ag freelist header */
> > > >  	struct xfs_perag	*pag;		/* per allocation group data */
> > > >  	int			error;
> > > > +	uint32_t		allocbt_blks;
> > > >  
> > > >  	trace_xfs_alloc_read_agf(mp, agno);
> > > >  
> > > > @@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
> > > >  		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
> > > >  		pag->pagf_init = 1;
> > > >  		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
> > > > +
> > > > +		/*
> > > > +		 * Update the global in-core allocbt block counter. Filter
> > > > +		 * rmapbt blocks from the on-disk counter because those are
> > > > +		 * managed by perag reservation.
> > > > +		 */
> > > > +		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {
> > > 
> > > As pointed out elsewhere in the thread, agf_rmap_blocks counts the total
> > > number of blocks in the rmapbt (whereas agf_btreeblks counts the number
> > > of non-root blocks in all three free space btrees).  Does this need a
> > > change?
> > > 
> > > 	int delta = (int)pag->pagf_btreeblks - (be32_to_cpu(...) - 1);
> > > 	if (delta > 0)
> > > 		atomic64_add(delta, &mp->m_allocbt_blks);
> > >
> > 
> > Hm yes, this makes more sense. Will fix and update the comment..
> > 
> 
> I ended up with the following:
> 
> 		...
>                 /*
>                  * Update the in-core allocbt counter. Filter out the rmapbt
>                  * subset of the btreeblks counter because the rmapbt is managed
>                  * by perag reservation. Subtract one for the rmapbt root block
>                  * because the rmap counter includes it while the btreeblks
>                  * counter only tracks non-root blocks.
>                  */
>                 allocbt_blks = pag->pagf_btreeblks;
>                 if (xfs_sb_version_hasrmapbt(&mp->m_sb))
>                         allocbt_blks -= be32_to_cpu(agf->agf_rmap_blocks) - 1;
>                 if (allocbt_blks > 0)
>                         atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
> 		...
> 
> Any thoughts before I post v5?

Looks reasonable to me. :)

--D

> Brian
> 
> > Brian
> >  
> > > --D
> > > 
> > > > +			allocbt_blks = pag->pagf_btreeblks -
> > > > +					be32_to_cpu(agf->agf_rmap_blocks);
> > > > +			atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
> > > > +		}
> > > >  	}
> > > >  #ifdef DEBUG
> > > >  	else if (!XFS_FORCED_SHUTDOWN(mp)) {
> > > > diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
> > > > index 8e01231b308e..9f5a45f7baed 100644
> > > > --- a/fs/xfs/libxfs/xfs_alloc_btree.c
> > > > +++ b/fs/xfs/libxfs/xfs_alloc_btree.c
> > > > @@ -71,6 +71,7 @@ xfs_allocbt_alloc_block(
> > > >  		return 0;
> > > >  	}
> > > >  
> > > > +	atomic64_inc(&cur->bc_mp->m_allocbt_blks);
> > > >  	xfs_extent_busy_reuse(cur->bc_mp, cur->bc_ag.agno, bno, 1, false);
> > > >  
> > > >  	xfs_trans_agbtree_delta(cur->bc_tp, 1);
> > > > @@ -95,6 +96,7 @@ xfs_allocbt_free_block(
> > > >  	if (error)
> > > >  		return error;
> > > >  
> > > > +	atomic64_dec(&cur->bc_mp->m_allocbt_blks);
> > > >  	xfs_extent_busy_insert(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1,
> > > >  			      XFS_EXTENT_BUSY_SKIP_DISCARD);
> > > >  	xfs_trans_agbtree_delta(cur->bc_tp, -1);
> > > > diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> > > > index 81829d19596e..bb67274ee23f 100644
> > > > --- a/fs/xfs/xfs_mount.h
> > > > +++ b/fs/xfs/xfs_mount.h
> > > > @@ -170,6 +170,12 @@ typedef struct xfs_mount {
> > > >  	 * extents or anything related to the rt device.
> > > >  	 */
> > > >  	struct percpu_counter	m_delalloc_blks;
> > > > +	/*
> > > > +	 * Global count of allocation btree blocks in use across all AGs. Only
> > > > +	 * used when perag reservation is enabled. Helps prevent block
> > > > +	 * reservation from attempting to reserve allocation btree blocks.
> > > > +	 */
> > > > +	atomic64_t		m_allocbt_blks;
> > > >  
> > > >  	struct radix_tree_root	m_perag_tree;	/* per-ag accounting info */
> > > >  	spinlock_t		m_perag_lock;	/* lock for m_perag_tree */
> > > > -- 
> > > > 2.26.3
> > > > 
> > > 
>

Patch

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index aaa19101bb2a..144e2d68245c 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -3036,6 +3036,7 @@ xfs_alloc_read_agf(
 	struct xfs_agf		*agf;		/* ag freelist header */
 	struct xfs_perag	*pag;		/* per allocation group data */
 	int			error;
+	uint32_t		allocbt_blks;
 
 	trace_xfs_alloc_read_agf(mp, agno);
 
@@ -3066,6 +3067,17 @@ xfs_alloc_read_agf(
 		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
 		pag->pagf_init = 1;
 		pag->pagf_agflreset = xfs_agfl_needs_reset(mp, agf);
+
+		/*
+		 * Update the global in-core allocbt block counter. Filter
+		 * rmapbt blocks from the on-disk counter because those are
+		 * managed by perag reservation.
+		 */
+		if (pag->pagf_btreeblks > be32_to_cpu(agf->agf_rmap_blocks)) {
+			allocbt_blks = pag->pagf_btreeblks -
+					be32_to_cpu(agf->agf_rmap_blocks);
+			atomic64_add(allocbt_blks, &mp->m_allocbt_blks);
+		}
 	}
 #ifdef DEBUG
 	else if (!XFS_FORCED_SHUTDOWN(mp)) {
diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c
index 8e01231b308e..9f5a45f7baed 100644
--- a/fs/xfs/libxfs/xfs_alloc_btree.c
+++ b/fs/xfs/libxfs/xfs_alloc_btree.c
@@ -71,6 +71,7 @@ xfs_allocbt_alloc_block(
 		return 0;
 	}
 
+	atomic64_inc(&cur->bc_mp->m_allocbt_blks);
 	xfs_extent_busy_reuse(cur->bc_mp, cur->bc_ag.agno, bno, 1, false);
 
 	xfs_trans_agbtree_delta(cur->bc_tp, 1);
@@ -95,6 +96,7 @@ xfs_allocbt_free_block(
 	if (error)
 		return error;
 
+	atomic64_dec(&cur->bc_mp->m_allocbt_blks);
 	xfs_extent_busy_insert(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1,
 			      XFS_EXTENT_BUSY_SKIP_DISCARD);
 	xfs_trans_agbtree_delta(cur->bc_tp, -1);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 81829d19596e..bb67274ee23f 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -170,6 +170,12 @@ typedef struct xfs_mount {
 	 * extents or anything related to the rt device.
 	 */
 	struct percpu_counter	m_delalloc_blks;
+	/*
+	 * Global count of allocation btree blocks in use across all AGs. Only
+	 * used when perag reservation is enabled. Helps prevent block
+	 * reservation from attempting to reserve allocation btree blocks.
+	 */
+	atomic64_t		m_allocbt_blks;
 
 	struct radix_tree_root	m_perag_tree;	/* per-ag accounting info */
 	spinlock_t		m_perag_lock;	/* lock for m_perag_tree */
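The per-AG arithmetic the thread converges on (Brian's v5 snippet in the discussion) can be checked in isolation. The following standalone sketch models that computation; it is not kernel code, and the function name is made up for the example. agf_btreeblks counts non-root blocks of the AGF btrees (including the rmapbt), while agf_rmap_blocks counts all rmapbt blocks including the root, hence the "- 1".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Standalone model of the v5 per-AG delta computed at AGF read time.
 * Returns the number of blocks that would be added to the global
 * m_allocbt_blks counter for one AG.
 */
int64_t allocbt_delta(uint32_t btreeblks, uint32_t rmap_blocks,
		      bool has_rmapbt)
{
	int64_t allocbt_blks = btreeblks;

	/*
	 * Filter out the rmapbt subset of btreeblks, since rmapbt blocks
	 * are managed by perag reservation. Subtract one for the rmapbt
	 * root block, which rmap_blocks includes but btreeblks does not.
	 */
	if (has_rmapbt)
		allocbt_blks -= (int64_t)rmap_blocks - 1;

	/* Only a positive delta is added to the global counter. */
	return allocbt_blks > 0 ? allocbt_blks : 0;
}
```

For the freshly made filesystem shown in the thread (btreeblks = 0, rmapblocks = 1) the delta is 0, the case that tripped Brian's original assert; a simple btreeblks > rmap_blocks comparison would also undercount by one block whenever the rmapbt is enabled.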