diff mbox series

[v2,10/13] xfs: Unmap blocks according to forcealign

Message ID 20240705162450.3481169-11-john.g.garry@oracle.com (mailing list archive)
State Superseded, archived
Headers show
Series forcealign for xfs | expand

Commit Message

John Garry July 5, 2024, 4:24 p.m. UTC
For when forcealign is enabled, blocks in an inode need to be unmapped
according to extent alignment, like what is already done for rtvol.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 46 ++++++++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 9 deletions(-)

Comments

Christoph Hellwig July 6, 2024, 7:58 a.m. UTC | #1
> +static xfs_extlen_t
> +xfs_bunmapi_align(
> +	struct xfs_inode	*ip,
> +	xfs_fsblock_t		bno)
> +{
> +	struct xfs_mount	*mp = ip->i_mount;
> +	xfs_agblock_t		agbno;
> +
> +	if (xfs_inode_has_forcealign(ip)) {
> +		if (is_power_of_2(ip->i_extsize))
> +			return bno & (ip->i_extsize - 1);
> +
> +		agbno = XFS_FSB_TO_AGBNO(mp, bno);
> +		return agbno % ip->i_extsize;
> +	}
> +	ASSERT(XFS_IS_REALTIME_INODE(ip));
> +	return xfs_rtb_to_rtxoff(ip->i_mount, bno);

This helper isn't really bunmapi sepcific, is it?

> @@ -5425,6 +5444,7 @@ __xfs_bunmapi(
>  	struct xfs_bmbt_irec	got;		/* current extent record */
>  	struct xfs_ifork	*ifp;		/* inode fork pointer */
>  	int			isrt;		/* freeing in rt area */
> +	int			isforcealign;	/* freeing for inode with forcealign */

This is really a bool.  And while it matches the code around it the
code feels a bit too verbose..
> 
> +		if ((!isrt && !isforcealign) || (flags & XFS_BMAPI_REMAP))
>  			goto delete;
>  
> -		mod = xfs_rtb_to_rtxoff(mp,
> -				del.br_startblock + del.br_blockcount);
> +		mod = xfs_bunmapi_align(ip, del.br_startblock + del.br_blockcount);

Overly long line.

We've been long wanting to split the whole align / convert unwritten /
etc code into a helper outside the main bumapi flow.  And when adding
new logic to it this might indeed be a good time.

> +			if (isforcealign) {
> +				off = ip->i_extsize - mod;
> +			} else {
> +				ASSERT(isrt);
> +				off = mp->m_sb.sb_rextsize - mod;
> +			}

And we'll really need proper helpers so that we don't have to
open code the i_extsize vs sb_rextsize logic all over.
John Garry July 8, 2024, 2:48 p.m. UTC | #2
On 06/07/2024 08:58, Christoph Hellwig wrote:
>> +static xfs_extlen_t
>> +xfs_bunmapi_align(
>> +	struct xfs_inode	*ip,
>> +	xfs_fsblock_t		bno)
>> +{
>> +	struct xfs_mount	*mp = ip->i_mount;
>> +	xfs_agblock_t		agbno;
>> +
>> +	if (xfs_inode_has_forcealign(ip)) {
>> +		if (is_power_of_2(ip->i_extsize))
>> +			return bno & (ip->i_extsize - 1);
>> +
>> +		agbno = XFS_FSB_TO_AGBNO(mp, bno);
>> +		return agbno % ip->i_extsize;
>> +	}
>> +	ASSERT(XFS_IS_REALTIME_INODE(ip));
>> +	return xfs_rtb_to_rtxoff(ip->i_mount, bno);
> 
> This helper isn't really bunmapi sepcific, is it?

Right, it is not really. Apart from the ASSERT to ensure that we are not 
calling from a stray context.

> 
>> @@ -5425,6 +5444,7 @@ __xfs_bunmapi(
>>   	struct xfs_bmbt_irec	got;		/* current extent record */
>>   	struct xfs_ifork	*ifp;		/* inode fork pointer */
>>   	int			isrt;		/* freeing in rt area */
>> +	int			isforcealign;	/* freeing for inode with forcealign */
> 
> This is really a bool.  And while it matches the code around it the
> code feels a bit too verbose..

I can change both to a bool - would that be better?

Using isfa (instead of isforcealign) might be interpreted as something 
else :)

>>
>> +		if ((!isrt && !isforcealign) || (flags & XFS_BMAPI_REMAP))
>>   			goto delete;
>>   
>> -		mod = xfs_rtb_to_rtxoff(mp,
>> -				del.br_startblock + del.br_blockcount);
>> +		mod = xfs_bunmapi_align(ip, del.br_startblock + del.br_blockcount);
> 
> Overly long line.

noted

> 
> We've been long wanting to split the whole align / convert unwritten /
> etc code into a helper outside the main bumapi flow.  And when adding
> new logic to it this might indeed be a good time.

ok, I'll see if can come up with something

> 
>> +			if (isforcealign) {
>> +				off = ip->i_extsize - mod;
>> +			} else {
>> +				ASSERT(isrt);
>> +				off = mp->m_sb.sb_rextsize - mod;
>> +			}
> 
> And we'll really need proper helpers so that we don't have to
> open code the i_extsize vs sb_rextsize logic all over.

sure

>
Christoph Hellwig July 9, 2024, 7:46 a.m. UTC | #3
On Mon, Jul 08, 2024 at 03:48:20PM +0100, John Garry wrote:
>>> +	int			isforcealign;	/* freeing for inode with forcealign */
>>
>> This is really a bool.  And while it matches the code around it the
>> code feels a bit too verbose..
>
> I can change both to a bool - would that be better?
>
> Using isfa (instead of isforcealign) might be interpreted as something else 

The check should be used in one single place where we decided if
we need to to the alignment based adjustments.  So IMHO just killing
it and open coding it there seems way easier.   Yes, it is in a loop,
but compared to all the work done is is really cheap.

>> We've been long wanting to split the whole align / convert unwritten /
>> etc code into a helper outside the main bumapi flow.  And when adding
>> new logic to it this might indeed be a good time.
>
> ok, I'll see if can come up with something

I can take a look too.  There is some real mess in there like trying
to account for cases where the transaction doesn't have a block
reservation, which I think could have happen in truncate until
Zhang Yi fixed it for the 6.11 merge window.
Dave Chinner July 9, 2024, 9:57 a.m. UTC | #4
On Sat, Jul 06, 2024 at 09:58:58AM +0200, Christoph Hellwig wrote:
> 
> > +			if (isforcealign) {
> > +				off = ip->i_extsize - mod;
> > +			} else {
> > +				ASSERT(isrt);
> > +				off = mp->m_sb.sb_rextsize - mod;
> > +			}
> 
> And we'll really need proper helpers so that we don't have to
> open code the i_extsize vs sb_rextsize logic all over.

We already have that: xfs_inode_alloc_unitsize().

Have the code get that value, then do all the alignment based on
whether it is allocation unit size > mp->m_sb.sb_blocksize. Then all
the calculations are generic and not dependent on forcealign or rt,
but on whether the inode requires multi-block contiguous extent
alignment....

i.e.
	alloc_size = xfs_inode_alloc_unitsize(ip);
	if (alloc_size > mp->m_sb.sb_blocksize) {
		/* do aligned allocation setup stuff */
		.....
	}
	....

No code should be doing "if (forcealign) ... else if (realtime) ..."
branching for alignment purposes. All the code should all be
generic based on the value xfs_inode_alloc_unitsize() returns.

-Dave.
Christoph Hellwig July 9, 2024, 11:19 a.m. UTC | #5
On Tue, Jul 09, 2024 at 07:57:22PM +1000, Dave Chinner wrote:
> No code should be doing "if (forcealign) ... else if (realtime) ..."
> branching for alignment purposes. All the code should all be
> generic based on the value xfs_inode_alloc_unitsize() returns.

Yes, please.
John Garry July 17, 2024, 3:24 p.m. UTC | #6
On 09/07/2024 08:46, Christoph Hellwig wrote:

Hi Christoph,

>> Using isfa (instead of isforcealign) might be interpreted as something else
> The check should be used in one single place where we decided if
> we need to to the alignment based adjustments.  So IMHO just killing
> it and open coding it there seems way easier.   Yes, it is in a loop,
> but compared to all the work done is is really cheap.
> 
>>> We've been long wanting to split the whole align / convert unwritten /
>>> etc code into a helper outside the main bumapi flow.  And when adding
>>> new logic to it this might indeed be a good time.
>> ok, I'll see if can come up with something
> I can take a look too. 

I was wondering what you plans are for any clean-up/refactoring here, as 
mentioned?

I was starting to look at the whole "if (forcealign) else if (big rt)" 
flow refactoring in this series to use xfs_inode_alloc_unitsize(); 
however, I figure that you have plans wider in scope, which affects this.

? There is some real mess in there like trying
> to account for cases where the transaction doesn't have a block
> reservation, which I think could have happen in truncate until
> Zhang Yi fixed it for the 6.11 merge window.

Cheers,
John
Christoph Hellwig July 17, 2024, 4:42 p.m. UTC | #7
On Wed, Jul 17, 2024 at 04:24:28PM +0100, John Garry wrote:
>>>> new logic to it this might indeed be a good time.
>>> ok, I'll see if can come up with something
>> I can take a look too. 
>
> I was wondering what you plans are for any clean-up/refactoring here, as 
> mentioned?

Try to split all the convert left over of large rtextent / allocation
size logic out of __xfs_bunmapi and into a separate helper or two.
I started on it after writing that previous mail, but it has been
preempted by more urgent work for now.

> I was starting to look at the whole "if (forcealign) else if (big rt)" flow 
> refactoring in this series to use xfs_inode_alloc_unitsize(); however, I 
> figure that you have plans wider in scope, which affects this.

It will create a bit of conflict, but nothing out of ordinary.
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index db12f006646a..07478c88a51b 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5403,6 +5403,25 @@  xfs_bmap_del_extent_real(
 	return 0;
 }
 
+static xfs_extlen_t
+xfs_bunmapi_align(
+	struct xfs_inode	*ip,
+	xfs_fsblock_t		bno)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+	xfs_agblock_t		agbno;
+
+	if (xfs_inode_has_forcealign(ip)) {
+		if (is_power_of_2(ip->i_extsize))
+			return bno & (ip->i_extsize - 1);
+
+		agbno = XFS_FSB_TO_AGBNO(mp, bno);
+		return agbno % ip->i_extsize;
+	}
+	ASSERT(XFS_IS_REALTIME_INODE(ip));
+	return xfs_rtb_to_rtxoff(ip->i_mount, bno);
+}
+
 /*
  * Unmap (remove) blocks from a file.
  * If nexts is nonzero then the number of extents to remove is limited to
@@ -5425,6 +5444,7 @@  __xfs_bunmapi(
 	struct xfs_bmbt_irec	got;		/* current extent record */
 	struct xfs_ifork	*ifp;		/* inode fork pointer */
 	int			isrt;		/* freeing in rt area */
+	int			isforcealign;	/* freeing for inode with forcealign */
 	int			logflags;	/* transaction logging flags */
 	xfs_extlen_t		mod;		/* rt extent offset */
 	struct xfs_mount	*mp = ip->i_mount;
@@ -5462,6 +5482,8 @@  __xfs_bunmapi(
 	}
 	XFS_STATS_INC(mp, xs_blk_unmap);
 	isrt = xfs_ifork_is_realtime(ip, whichfork);
+	isforcealign = (whichfork != XFS_ATTR_FORK) &&
+			xfs_inode_has_forcealign(ip);
 	end = start + len;
 
 	if (!xfs_iext_lookup_extent_before(ip, ifp, &end, &icur, &got)) {
@@ -5513,14 +5535,13 @@  __xfs_bunmapi(
 		if (del.br_startoff + del.br_blockcount > end + 1)
 			del.br_blockcount = end + 1 - del.br_startoff;
 
-		if (!isrt || (flags & XFS_BMAPI_REMAP))
+		if ((!isrt && !isforcealign) || (flags & XFS_BMAPI_REMAP))
 			goto delete;
 
-		mod = xfs_rtb_to_rtxoff(mp,
-				del.br_startblock + del.br_blockcount);
+		mod = xfs_bunmapi_align(ip, del.br_startblock + del.br_blockcount);
 		if (mod) {
 			/*
-			 * Realtime extent not lined up at the end.
+			 * Not aligned to allocation unit on the end.
 			 * The extent could have been split into written
 			 * and unwritten pieces, or we could just be
 			 * unmapping part of it.  But we can't really
@@ -5565,14 +5586,21 @@  __xfs_bunmapi(
 			goto nodelete;
 		}
 
-		mod = xfs_rtb_to_rtxoff(mp, del.br_startblock);
+		mod = xfs_bunmapi_align(ip, del.br_startblock);
 		if (mod) {
-			xfs_extlen_t off = mp->m_sb.sb_rextsize - mod;
+			xfs_extlen_t off;
+
+			if (isforcealign) {
+				off = ip->i_extsize - mod;
+			} else {
+				ASSERT(isrt);
+				off = mp->m_sb.sb_rextsize - mod;
+			}
 
 			/*
-			 * Realtime extent is lined up at the end but not
-			 * at the front.  We'll get rid of full extents if
-			 * we can.
+			 * Extent is lined up to the allocation unit at the
+			 * end but not at the front.  We'll get rid of full
+			 * extents if we can.
 			 */
 			if (del.br_blockcount > off) {
 				del.br_blockcount -= off;