diff mbox series

[3/3] tmpfs: Add project quota interface support for get/set attr

Message ID 20230925130028.1244740-4-cem@kernel.org (mailing list archive)
State New, archived
Headers show
Series tmpfs: Add tmpfs project quota support | expand

Commit Message

Carlos Maiolino Sept. 25, 2023, 1 p.m. UTC
From: Carlos Maiolino <cem@kernel.org>

Not project quota support is in place, enable users to use it.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 mm/shmem.c | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

Comments

Andreas Dilger Sept. 26, 2023, 8:28 p.m. UTC | #1
I've added Dave to the CC list, since he has a deep understanding
of the projid code since it originated from XFS.

On Sep 25, 2023, at 7:00 AM, cem@kernel.org wrote:
> 
> From: Carlos Maiolino <cem@kernel.org>
> 
> Not project quota support is in place, enable users to use it.

There is a peculiar behavior of project quotas, that rename across
directories with different project IDs and PROJINHERIT set should
cause the project ID to be updated (similar to BSD setgid).

For renaming regular files and other non-directories, it is OK to
change the projid and update the quota for the old and new IDs
to avoid copying all of the data needlessly.  For directories this
(unfortunately) means that the kernel should return -EXDEV if the
project IDs don't match, and then "mv" will create a new target
directory and resume moving the files (which are thankfully still
done with a rename() call if possible).

The reason for this is that just renaming the directory does not
atomically update the projid on all of the (possibly millions of)
sub-files/sub-dirs, which is required for PROJINHERIT directories.


Another option for tmpfs to maintain this PROJINHERIT behavior would
be to rename the directory and then spawn a background kernel thread
to change the projids on the whole tree.  Since tmpfs is an in-memory
filesystem there will be a "limited" number of files and subdirs
to update, and you don't need to worry about error handling if the
system crashes before the projid updates are done.

While not 100% atomic, it is not *less* atomic than having "mv"
recursively copy the whole directory tree to the target name when
the projid on the source and target don't match.  It is also a
*lot* less overhead than doing a million mkdir() + rename() calls.

There would be a risk that the target directory projid could go over
quota, but the alternative (that "mv" is half-way between moving the
directory tree from the source to the destination before it fails,
or fills up the filesystem because it can't hold another full copy
of the tree being renamed) is not better.

> 
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
> mm/shmem.c | 35 +++++++++++++++++++++++++++++++----
> 1 file changed, 31 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 4d2b713bff06..744a39251a31 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa)
> 
> 	fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE);
> 
> +	fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
> +	return 0;
> +}
> +
> +static int shmem_set_project(struct inode *inode, __u32 projid)
> +{
> +	int err = -EOPNOTSUPP;
> +	kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid);
> +
> +	if (projid_eq(kprojid, SHMEM_I(inode)->i_projid))
> +		return 0;
> +
> +	err = dquot_initialize(inode);
> +	if (err)
> +		return err;
> +
> +	SHMEM_I(inode)->i_projid = kprojid;
> 	return 0;
> }

(defect) this also needs to __dquot_transfer() the quota away from the
previous projid, or the accounting will be broken.


I think one hole in project quotas is the fact that any user can set
the projid of their files to any project they want.  Changing the projid/PROJINHERIT is restricted outside of the init namespace by
fileattr_set_prepare(), which is good in itself, but I think it makes
sense for root/CAP_SYS_RESOURCE to be needed to change projid/PROJINHERIT
even in the init namespace. Otherwise project quota is only an estimate
of space usage in a directory, if users are not actively subverting it.

> @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap,
> {
> 	struct inode *inode = d_inode(dentry);
> 	struct shmem_inode_info *info = SHMEM_I(inode);
> +	int err = -EOPNOTSUPP;
> +
> +	if (fa->fsx_valid &&
> +	   ((fa->fsx_xflags & ~FS_XFLAG_COMMON) ||
> +	   fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0))
> +		goto out;
> 
> -	if (fileattr_has_fsx(fa))
> -		return -EOPNOTSUPP;
> 	if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE)
> -		return -EOPNOTSUPP;
> +		goto out;
> 
> 	info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) |
> 		(fa->flags & SHMEM_FL_USER_MODIFIABLE);
> 
> 	shmem_set_inode_flags(inode, info->fsflags);
> +	err = shmem_set_project(inode, fa->fsx_projid);
> +		if (err)
> +			goto out;
> +
> 	inode_set_ctime_current(inode);
> 	inode_inc_iversion(inode);
> -	return 0;
> +
> +out:
> +	return err;
> }



There were also some patches to add projid support to statx() that didn't
quite get merged:

https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-3-git-send-email-wshilong1991@gmail.com/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-2-git-send-email-wshilong1991@gmail.com/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-6-git-send-email-wshilong1991@gmail.com/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-7-git-send-email-wshilong1991@gmail.com/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-8-git-send-email-wshilong1991@gmail.com/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-9-git-send-email-wshilong1991@gmail.com/

They were part of a larger series to allow setting projid directly with
the fchownat(2), but that got bogged down in how the interface should
work, and whether i_projid should be moved to struct inode, but I think
the statx() functionality was uncontroversial and could land as-is.

Cheers, Andreas
kernel test robot Sept. 26, 2023, 9:35 p.m. UTC | #2
Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.6-rc3 next-20230926]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/cem-kernel-org/tmpfs-add-project-ID-support/20230925-210238
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20230925130028.1244740-4-cem%40kernel.org
patch subject: [PATCH 3/3] tmpfs: Add project quota interface support for get/set attr
config: arc-randconfig-001-20230926 (https://download.01.org/0day-ci/archive/20230927/202309270552.A4zwjPrB-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230927/202309270552.A4zwjPrB-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309270552.A4zwjPrB-lkp@intel.com/

All errors (new ones prefixed by >>):

   mm/shmem.c: In function 'shmem_fileattr_get':
>> mm/shmem.c:3574:63: error: 'struct shmem_inode_info' has no member named 'i_projid'
    3574 |         fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
         |                                                               ^~
   mm/shmem.c: In function 'shmem_set_project':
   mm/shmem.c:3583:46: error: 'struct shmem_inode_info' has no member named 'i_projid'
    3583 |         if (projid_eq(kprojid, SHMEM_I(inode)->i_projid))
         |                                              ^~
   mm/shmem.c:3590:23: error: 'struct shmem_inode_info' has no member named 'i_projid'
    3590 |         SHMEM_I(inode)->i_projid = kprojid;
         |                       ^~


vim +3574 mm/shmem.c

  3567	
  3568	static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa)
  3569	{
  3570		struct shmem_inode_info *info = SHMEM_I(d_inode(dentry));
  3571	
  3572		fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE);
  3573	
> 3574		fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
  3575		return 0;
  3576	}
  3577
Dave Chinner Sept. 27, 2023, 1:09 a.m. UTC | #3
On Tue, Sep 26, 2023 at 02:28:06PM -0600, Andreas Dilger wrote:
> I've added Dave to the CC list, since he has a deep understanding
> of the projid code since it originated from XFS.
> 
> On Sep 25, 2023, at 7:00 AM, cem@kernel.org wrote:
> > 
> > From: Carlos Maiolino <cem@kernel.org>
> > 
> > Not project quota support is in place, enable users to use it.
> 
> There is a peculiar behavior of project quotas, that rename across
> directories with different project IDs and PROJINHERIT set should
> cause the project ID to be updated (similar to BSD setgid).
>
> For renaming regular files and other non-directories, it is OK to
> change the projid and update the quota for the old and new IDs
> to avoid copying all of the data needlessly.  For directories this
> (unfortunately) means that the kernel should return -EXDEV if the
> project IDs don't match, and then "mv" will create a new target
> directory and resume moving the files (which are thankfully still
> done with a rename() call if possible).
> 
> The reason for this is that just renaming the directory does not
> atomically update the projid on all of the (possibly millions of)
> sub-files/sub-dirs, which is required for PROJINHERIT directories.

Yup, PROJINHERIT implies that project quotas are being used to
implement directory tree quotas, hence everything in the destination
directory should have the same projid as the parent.  This code in
xfs_rename() implements that restriction:

        /*
         * If we are using project inheritance, we only allow renames
         * into our tree when the project IDs are the same; else the
         * tree quota mechanism would be circumvented.
         */
        if (unlikely((target_dp->i_diflags & XFS_DIFLAG_PROJINHERIT) &&
                     target_dp->i_projid != src_ip->i_projid)) {
                error = -EXDEV;
                goto out_trans_cancel;
        }

> Another option for tmpfs to maintain this PROJINHERIT behavior would
> be to rename the directory and then spawn a background kernel thread
> to change the projids on the whole tree.  Since tmpfs is an in-memory
> filesystem there will be a "limited" number of files and subdirs
> to update, and you don't need to worry about error handling if the
> system crashes before the projid updates are done.

Except that can get EDQUOT half way through and now there's nothing
to report the ENOSPC error to, nor any way that the application can
interrupt and/or recover the situation. I think that's a
non-starter.

And, quite frankly, it's also massive feature creep as well as
premature optimisation. We don't need tmpfs project quotas to be
"smart" like this; we just need the initial support patch set to -do
the right thing-.

-Dave.
Carlos Maiolino Sept. 27, 2023, 12:48 p.m. UTC | #4
On Tue, Sep 26, 2023 at 02:28:06PM -0600, Andreas Dilger wrote:
> I've added Dave to the CC list, since he has a deep understanding
> of the projid code since it originated from XFS.
> 
> On Sep 25, 2023, at 7:00 AM, cem@kernel.org wrote:
> > 
> > From: Carlos Maiolino <cem@kernel.org>
> > 
> > Not project quota support is in place, enable users to use it.
> 
> There is a peculiar behavior of project quotas, that rename across
> directories with different project IDs and PROJINHERIT set should
> cause the project ID to be updated (similar to BSD setgid).
> 
> For renaming regular files and other non-directories, it is OK to
> change the projid and update the quota for the old and new IDs
> to avoid copying all of the data needlessly.  For directories this
> (unfortunately) means that the kernel should return -EXDEV if the
> project IDs don't match, and then "mv" will create a new target
> directory and resume moving the files (which are thankfully still
> done with a rename() call if possible).

Right! I meant to include it on the TODO list of things, but I totally forgot to
do so, thanks for reminding me!

> 
> The reason for this is that just renaming the directory does not
> atomically update the projid on all of the (possibly millions of)
> sub-files/sub-dirs, which is required for PROJINHERIT directories.
> 
> 
> Another option for tmpfs to maintain this PROJINHERIT behavior would
> be to rename the directory and then spawn a background kernel thread
> to change the projids on the whole tree.  Since tmpfs is an in-memory
> filesystem there will be a "limited" number of files and subdirs
> to update, and you don't need to worry about error handling if the
> system crashes before the projid updates are done.
> 
> While not 100% atomic, it is not *less* atomic than having "mv"
> recursively copy the whole directory tree to the target name when
> the projid on the source and target don't match.  It is also a
> *lot* less overhead than doing a million mkdir() + rename() calls.
> 
> There would be a risk that the target directory projid could go over
> quota, but the alternative (that "mv" is half-way between moving the
> directory tree from the source to the destination before it fails,
> or fills up the filesystem because it can't hold another full copy
> of the tree being renamed) is not better.

I will look into these while I work on a non-rfc version of this series.
Thanks Andreas.

Carlos

> 
> > 
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> > mm/shmem.c | 35 +++++++++++++++++++++++++++++++----
> > 1 file changed, 31 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 4d2b713bff06..744a39251a31 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa)
> > 
> > 	fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE);
> > 
> > +	fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
> > +	return 0;
> > +}
> > +
> > +static int shmem_set_project(struct inode *inode, __u32 projid)
> > +{
> > +	int err = -EOPNOTSUPP;
> > +	kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid);
> > +
> > +	if (projid_eq(kprojid, SHMEM_I(inode)->i_projid))
> > +		return 0;
> > +
> > +	err = dquot_initialize(inode);
> > +	if (err)
> > +		return err;
> > +
> > +	SHMEM_I(inode)->i_projid = kprojid;
> > 	return 0;
> > }
> 
> (defect) this also needs to __dquot_transfer() the quota away from the
> previous projid, or the accounting will be broken.
> 
> 
> I think one hole in project quotas is the fact that any user can set
> the projid of their files to any project they want.  Changing the projid/PROJINHERIT is restricted outside of the init namespace by
> fileattr_set_prepare(), which is good in itself, but I think it makes
> sense for root/CAP_SYS_RESOURCE to be needed to change projid/PROJINHERIT
> even in the init namespace. Otherwise project quota is only an estimate
> of space usage in a directory, if users are not actively subverting it.
> 
> > @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap,
> > {
> > 	struct inode *inode = d_inode(dentry);
> > 	struct shmem_inode_info *info = SHMEM_I(inode);
> > +	int err = -EOPNOTSUPP;
> > +
> > +	if (fa->fsx_valid &&
> > +	   ((fa->fsx_xflags & ~FS_XFLAG_COMMON) ||
> > +	   fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0))
> > +		goto out;
> > 
> > -	if (fileattr_has_fsx(fa))
> > -		return -EOPNOTSUPP;
> > 	if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE)
> > -		return -EOPNOTSUPP;
> > +		goto out;
> > 
> > 	info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) |
> > 		(fa->flags & SHMEM_FL_USER_MODIFIABLE);
> > 
> > 	shmem_set_inode_flags(inode, info->fsflags);
> > +	err = shmem_set_project(inode, fa->fsx_projid);
> > +		if (err)
> > +			goto out;
> > +
> > 	inode_set_ctime_current(inode);
> > 	inode_inc_iversion(inode);
> > -	return 0;
> > +
> > +out:
> > +	return err;
> > }
> 
> 
> 
> There were also some patches to add projid support to statx() that didn't
> quite get merged:
> 
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-3-git-send-email-wshilong1991@gmail.com/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-2-git-send-email-wshilong1991@gmail.com/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-6-git-send-email-wshilong1991@gmail.com/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-7-git-send-email-wshilong1991@gmail.com/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-8-git-send-email-wshilong1991@gmail.com/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-9-git-send-email-wshilong1991@gmail.com/
> 
> They were part of a larger series to allow setting projid directly with
> the fchownat(2), but that got bogged down in how the interface should
> work, and whether i_projid should be moved to struct inode, but I think
> the statx() functionality was uncontroversial and could land as-is.
> 
> Cheers, Andreas
> 
> 
> 
>
kernel test robot Oct. 2, 2023, 8:41 a.m. UTC | #5
Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on linus/master v6.6-rc4 next-20230929]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/cem-kernel-org/tmpfs-add-project-ID-support/20230925-210238
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20230925130028.1244740-4-cem%40kernel.org
patch subject: [PATCH 3/3] tmpfs: Add project quota interface support for get/set attr
config: arm-sp7021_defconfig (https://download.01.org/0day-ci/archive/20231002/202310021602.dvZVjyMh-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231002/202310021602.dvZVjyMh-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310021602.dvZVjyMh-lkp@intel.com/

All errors (new ones prefixed by >>):

>> mm/shmem.c:3574:58: error: no member named 'i_projid' in 'struct shmem_inode_info'
    3574 |         fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
         |                                                           ~~~~  ^
   mm/shmem.c:3583:41: error: no member named 'i_projid' in 'struct shmem_inode_info'
    3583 |         if (projid_eq(kprojid, SHMEM_I(inode)->i_projid))
         |                                ~~~~~~~~~~~~~~  ^
   mm/shmem.c:3590:18: error: no member named 'i_projid' in 'struct shmem_inode_info'
    3590 |         SHMEM_I(inode)->i_projid = kprojid;
         |         ~~~~~~~~~~~~~~  ^
   3 errors generated.


vim +3574 mm/shmem.c

  3567	
  3568	static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa)
  3569	{
  3570		struct shmem_inode_info *info = SHMEM_I(d_inode(dentry));
  3571	
  3572		fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE);
  3573	
> 3574		fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
  3575		return 0;
  3576	}
  3577
diff mbox series

Patch

diff --git a/mm/shmem.c b/mm/shmem.c
index 4d2b713bff06..744a39251a31 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3571,6 +3571,23 @@  static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa)
 
 	fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE);
 
+	fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
+	return 0;
+}
+
+static int shmem_set_project(struct inode *inode, __u32 projid)
+{
+	int err = -EOPNOTSUPP;
+	kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid);
+
+	if (projid_eq(kprojid, SHMEM_I(inode)->i_projid))
+		return 0;
+
+	err = dquot_initialize(inode);
+	if (err)
+		return err;
+
+	SHMEM_I(inode)->i_projid = kprojid;
 	return 0;
 }
 
@@ -3579,19 +3596,29 @@  static int shmem_fileattr_set(struct mnt_idmap *idmap,
 {
 	struct inode *inode = d_inode(dentry);
 	struct shmem_inode_info *info = SHMEM_I(inode);
+	int err = -EOPNOTSUPP;
+
+	if (fa->fsx_valid &&
+	   ((fa->fsx_xflags & ~FS_XFLAG_COMMON) ||
+	   fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0))
+		goto out;
 
-	if (fileattr_has_fsx(fa))
-		return -EOPNOTSUPP;
 	if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE)
-		return -EOPNOTSUPP;
+		goto out;
 
 	info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) |
 		(fa->flags & SHMEM_FL_USER_MODIFIABLE);
 
 	shmem_set_inode_flags(inode, info->fsflags);
+	err = shmem_set_project(inode, fa->fsx_projid);
+		if (err)
+			goto out;
+
 	inode_set_ctime_current(inode);
 	inode_inc_iversion(inode);
-	return 0;
+
+out:
+	return err;
 }
 
 /*