Message ID | 20230925130028.1244740-4-cem@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | tmpfs: Add tmpfs project quota support | expand |
I've added Dave to the CC list, since he has a deep understanding of the projid code since it originated from XFS. On Sep 25, 2023, at 7:00 AM, cem@kernel.org wrote: > > From: Carlos Maiolino <cem@kernel.org> > > Not project quota support is in place, enable users to use it. There is a peculiar behavior of project quotas, that rename across directories with different project IDs and PROJINHERIT set should cause the project ID to be updated (similar to BSD setgid). For renaming regular files and other non-directories, it is OK to change the projid and update the quota for the old and new IDs to avoid copying all of the data needlessly. For directories this (unfortunately) means that the kernel should return -EXDEV if the project IDs don't match, and then "mv" will create a new target directory and resume moving the files (which are thankfully still done with a rename() call if possible). The reason for this is that just renaming the directory does not atomically update the projid on all of the (possibly millions of) sub-files/sub-dirs, which is required for PROJINHERIT directories. Another option for tmpfs to maintain this PROJINHERIT behavior would be to rename the directory and then spawn a background kernel thread to change the projids on the whole tree. Since tmpfs is an in-memory filesystem there will be a "limited" number of files and subdirs to update, and you don't need to worry about error handling if the system crashes before the projid updates are done. While not 100% atomic, it is not *less* atomic than having "mv" recursively copy the whole directory tree to the target name when the projid on the source and target don't match. It is also a *lot* less overhead than doing a million mkdir() + rename() calls. There would be a risk that the target directory projid could go over quota, but the alternative (that "mv" is half-way between moving the directory tree from the source to the destination before it fails, or fills up the filesystem because it can't hold another full copy of the tree being renamed) is not better. > > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> > --- > mm/shmem.c | 35 +++++++++++++++++++++++++++++++---- > 1 file changed, 31 insertions(+), 4 deletions(-) > > diff --git a/mm/shmem.c b/mm/shmem.c > index 4d2b713bff06..744a39251a31 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa) > > fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE); > > + fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); > + return 0; > +} > + > +static int shmem_set_project(struct inode *inode, __u32 projid) > +{ > + int err = -EOPNOTSUPP; > + kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid); > + > + if (projid_eq(kprojid, SHMEM_I(inode)->i_projid)) > + return 0; > + > + err = dquot_initialize(inode); > + if (err) > + return err; > + > + SHMEM_I(inode)->i_projid = kprojid; > return 0; > } (defect) this also needs to __dquot_transfer() the quota away from the previous projid, or the accounting will be broken. I think one hole in project quotas is the fact that any user can set the projid of their files to any project they want. Changing the projid/PROJINHERIT is restricted outside of the init namespace by fileattr_set_prepare(), which is good in itself, but I think it makes sense for root/CAP_SYS_RESOURCE to be needed to change projid/PROJINHERIT even in the init namespace. Otherwise project quota is only an estimate of space usage in a directory, if users are not actively subverting it. > @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap, > { > struct inode *inode = d_inode(dentry); > struct shmem_inode_info *info = SHMEM_I(inode); > + int err = -EOPNOTSUPP; > + > + if (fa->fsx_valid && > + ((fa->fsx_xflags & ~FS_XFLAG_COMMON) || > + fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0)) > + goto out; > > - if (fileattr_has_fsx(fa)) > - return -EOPNOTSUPP; > if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE) > - return -EOPNOTSUPP; > + goto out; > > info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) | > (fa->flags & SHMEM_FL_USER_MODIFIABLE); > > shmem_set_inode_flags(inode, info->fsflags); > + err = shmem_set_project(inode, fa->fsx_projid); > + if (err) > + goto out; > + > inode_set_ctime_current(inode); > inode_inc_iversion(inode); > - return 0; > + > +out: > + return err; > } There were also some patches to add projid support to statx() that didn't quite get merged: https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-3-git-send-email-wshilong1991@gmail.com/ https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-2-git-send-email-wshilong1991@gmail.com/ https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-6-git-send-email-wshilong1991@gmail.com/ https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-7-git-send-email-wshilong1991@gmail.com/ https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-8-git-send-email-wshilong1991@gmail.com/ https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-9-git-send-email-wshilong1991@gmail.com/ They were part of a larger series to allow setting projid directly with the fchownat(2), but that got bogged down in how the interface should work, and whether i_projid should be moved to struct inode, but I think the statx() functionality was uncontroversial and could land as-is. Cheers, Andreas
Hi, kernel test robot noticed the following build errors: [auto build test ERROR on akpm-mm/mm-everything] [also build test ERROR on linus/master v6.6-rc3 next-20230926] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/cem-kernel-org/tmpfs-add-project-ID-support/20230925-210238 base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything patch link: https://lore.kernel.org/r/20230925130028.1244740-4-cem%40kernel.org patch subject: [PATCH 3/3] tmpfs: Add project quota interface support for get/set attr config: arc-randconfig-001-20230926 (https://download.01.org/0day-ci/archive/20230927/202309270552.A4zwjPrB-lkp@intel.com/config) compiler: arceb-elf-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230927/202309270552.A4zwjPrB-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202309270552.A4zwjPrB-lkp@intel.com/ All errors (new ones prefixed by >>): mm/shmem.c: In function 'shmem_fileattr_get': >> mm/shmem.c:3574:63: error: 'struct shmem_inode_info' has no member named 'i_projid' 3574 | fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); | ^~ mm/shmem.c: In function 'shmem_set_project': mm/shmem.c:3583:46: error: 'struct shmem_inode_info' has no member named 'i_projid' 3583 | if (projid_eq(kprojid, SHMEM_I(inode)->i_projid)) | ^~ mm/shmem.c:3590:23: error: 'struct shmem_inode_info' has no member named 'i_projid' 3590 | SHMEM_I(inode)->i_projid = kprojid; | ^~ vim +3574 mm/shmem.c 3567 3568 static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa) 3569 { 3570 struct shmem_inode_info *info = SHMEM_I(d_inode(dentry)); 3571 3572 fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE); 3573 > 3574 fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); 3575 return 0; 3576 } 3577
On Tue, Sep 26, 2023 at 02:28:06PM -0600, Andreas Dilger wrote: > I've added Dave to the CC list, since he has a deep understanding > of the projid code since it originated from XFS. > > On Sep 25, 2023, at 7:00 AM, cem@kernel.org wrote: > > > > From: Carlos Maiolino <cem@kernel.org> > > > > Not project quota support is in place, enable users to use it. > > There is a peculiar behavior of project quotas, that rename across > directories with different project IDs and PROJINHERIT set should > cause the project ID to be updated (similar to BSD setgid). > > For renaming regular files and other non-directories, it is OK to > change the projid and update the quota for the old and new IDs > to avoid copying all of the data needlessly. For directories this > (unfortunately) means that the kernel should return -EXDEV if the > project IDs don't match, and then "mv" will create a new target > directory and resume moving the files (which are thankfully still > done with a rename() call if possible). > > The reason for this is that just renaming the directory does not > atomically update the projid on all of the (possibly millions of) > sub-files/sub-dirs, which is required for PROJINHERIT directories. Yup, PROJINHERIT implies that project quotas are being used to implement directory tree quotas, hence everything in the destination directory should have the same projid as the parent. This code in xfs_rename() implements that restriction: /* * If we are using project inheritance, we only allow renames * into our tree when the project IDs are the same; else the * tree quota mechanism would be circumvented. */ if (unlikely((target_dp->i_diflags & XFS_DIFLAG_PROJINHERIT) && target_dp->i_projid != src_ip->i_projid)) { error = -EXDEV; goto out_trans_cancel; } > Another option for tmpfs to maintain this PROJINHERIT behavior would > be to rename the directory and then spawn a background kernel thread > to change the projids on the whole tree. Since tmpfs is an in-memory > filesystem there will be a "limited" number of files and subdirs > to update, and you don't need to worry about error handling if the > system crashes before the projid updates are done. Except that can get EDQUOT half way through and now there's nothing to report the ENOSPC error to, nor any way that the application can interrupt and/or recover the situation. I think that's a non-starter. And, quite frankly, it's also massive feature creep as well as premature optimisation. We don't need tmpfs project quotas to be "smart" like this; we just need the initial support patch set to -do the right thing-. -Dave.
On Tue, Sep 26, 2023 at 02:28:06PM -0600, Andreas Dilger wrote: > I've added Dave to the CC list, since he has a deep understanding > of the projid code since it originated from XFS. > > On Sep 25, 2023, at 7:00 AM, cem@kernel.org wrote: > > > > From: Carlos Maiolino <cem@kernel.org> > > > > Not project quota support is in place, enable users to use it. > > There is a peculiar behavior of project quotas, that rename across > directories with different project IDs and PROJINHERIT set should > cause the project ID to be updated (similar to BSD setgid). > > For renaming regular files and other non-directories, it is OK to > change the projid and update the quota for the old and new IDs > to avoid copying all of the data needlessly. For directories this > (unfortunately) means that the kernel should return -EXDEV if the > project IDs don't match, and then "mv" will create a new target > directory and resume moving the files (which are thankfully still > done with a rename() call if possible). Right! I meant to include it on the TODO list of things, but I totally forgot to do so, thanks for reminding me! > > The reason for this is that just renaming the directory does not > atomically update the projid on all of the (possibly millions of) > sub-files/sub-dirs, which is required for PROJINHERIT directories. > > > Another option for tmpfs to maintain this PROJINHERIT behavior would > be to rename the directory and then spawn a background kernel thread > to change the projids on the whole tree. Since tmpfs is an in-memory > filesystem there will be a "limited" number of files and subdirs > to update, and you don't need to worry about error handling if the > system crashes before the projid updates are done. > > While not 100% atomic, it is not *less* atomic than having "mv" > recursively copy the whole directory tree to the target name when > the projid on the source and target don't match. It is also a > *lot* less overhead than doing a million mkdir() + rename() calls. > > There would be a risk that the target directory projid could go over > quota, but the alternative (that "mv" is half-way between moving the > directory tree from the source to the destination before it fails, > or fills up the filesystem because it can't hold another full copy > of the tree being renamed) is not better. I will look into these while I work on a non-rfc version of this series. Thanks Andreas. Carlos > > > > > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> > > --- > > mm/shmem.c | 35 +++++++++++++++++++++++++++++++---- > > 1 file changed, 31 insertions(+), 4 deletions(-) > > > > diff --git a/mm/shmem.c b/mm/shmem.c > > index 4d2b713bff06..744a39251a31 100644 > > --- a/mm/shmem.c > > +++ b/mm/shmem.c > > @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa) > > > > fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE); > > > > + fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); > > + return 0; > > +} > > + > > +static int shmem_set_project(struct inode *inode, __u32 projid) > > +{ > > + int err = -EOPNOTSUPP; > > + kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid); > > + > > + if (projid_eq(kprojid, SHMEM_I(inode)->i_projid)) > > + return 0; > > + > > + err = dquot_initialize(inode); > > + if (err) > > + return err; > > + > > + SHMEM_I(inode)->i_projid = kprojid; > > return 0; > > } > > (defect) this also needs to __dquot_transfer() the quota away from the > previous projid, or the accounting will be broken. > > > I think one hole in project quotas is the fact that any user can set > the projid of their files to any project they want. Changing the projid/PROJINHERIT is restricted outside of the init namespace by > fileattr_set_prepare(), which is good in itself, but I think it makes > sense for root/CAP_SYS_RESOURCE to be needed to change projid/PROJINHERIT > even in the init namespace. Otherwise project quota is only an estimate > of space usage in a directory, if users are not actively subverting it. > > > @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap, > > { > > struct inode *inode = d_inode(dentry); > > struct shmem_inode_info *info = SHMEM_I(inode); > > + int err = -EOPNOTSUPP; > > + > > + if (fa->fsx_valid && > > + ((fa->fsx_xflags & ~FS_XFLAG_COMMON) || > > + fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0)) > > + goto out; > > > > - if (fileattr_has_fsx(fa)) > > - return -EOPNOTSUPP; > > if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE) > > - return -EOPNOTSUPP; > > + goto out; > > > > info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) | > > (fa->flags & SHMEM_FL_USER_MODIFIABLE); > > > > shmem_set_inode_flags(inode, info->fsflags); > > + err = shmem_set_project(inode, fa->fsx_projid); > > + if (err) > > + goto out; > > + > > inode_set_ctime_current(inode); > > inode_inc_iversion(inode); > > - return 0; > > + > > +out: > > + return err; > > } > > > > There were also some patches to add projid support to statx() that didn't > quite get merged: > > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-3-git-send-email-wshilong1991@gmail.com/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-2-git-send-email-wshilong1991@gmail.com/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-6-git-send-email-wshilong1991@gmail.com/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-7-git-send-email-wshilong1991@gmail.com/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-8-git-send-email-wshilong1991@gmail.com/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-9-git-send-email-wshilong1991@gmail.com/ > > They were part of a larger series to allow setting projid directly with > the fchownat(2), but that got bogged down in how the interface should > work, and whether i_projid should be moved to struct inode, but I think > the statx() functionality was uncontroversial and could land as-is. > > Cheers, Andreas > > > >
Hi, kernel test robot noticed the following build errors: [auto build test ERROR on akpm-mm/mm-everything] [also build test ERROR on linus/master v6.6-rc4 next-20230929] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/cem-kernel-org/tmpfs-add-project-ID-support/20230925-210238 base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything patch link: https://lore.kernel.org/r/20230925130028.1244740-4-cem%40kernel.org patch subject: [PATCH 3/3] tmpfs: Add project quota interface support for get/set attr config: arm-sp7021_defconfig (https://download.01.org/0day-ci/archive/20231002/202310021602.dvZVjyMh-lkp@intel.com/config) compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231002/202310021602.dvZVjyMh-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202310021602.dvZVjyMh-lkp@intel.com/ All errors (new ones prefixed by >>): >> mm/shmem.c:3574:58: error: no member named 'i_projid' in 'struct shmem_inode_info' 3574 | fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); | ~~~~ ^ mm/shmem.c:3583:41: error: no member named 'i_projid' in 'struct shmem_inode_info' 3583 | if (projid_eq(kprojid, SHMEM_I(inode)->i_projid)) | ~~~~~~~~~~~~~~ ^ mm/shmem.c:3590:18: error: no member named 'i_projid' in 'struct shmem_inode_info' 3590 | SHMEM_I(inode)->i_projid = kprojid; | ~~~~~~~~~~~~~~ ^ 3 errors generated. vim +3574 mm/shmem.c 3567 3568 static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa) 3569 { 3570 struct shmem_inode_info *info = SHMEM_I(d_inode(dentry)); 3571 3572 fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE); 3573 > 3574 fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); 3575 return 0; 3576 } 3577
diff --git a/mm/shmem.c b/mm/shmem.c index 4d2b713bff06..744a39251a31 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa) fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE); + fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); + return 0; +} + +static int shmem_set_project(struct inode *inode, __u32 projid) +{ + int err = -EOPNOTSUPP; + kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid); + + if (projid_eq(kprojid, SHMEM_I(inode)->i_projid)) + return 0; + + err = dquot_initialize(inode); + if (err) + return err; + + SHMEM_I(inode)->i_projid = kprojid; return 0; } @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap, { struct inode *inode = d_inode(dentry); struct shmem_inode_info *info = SHMEM_I(inode); + int err = -EOPNOTSUPP; + + if (fa->fsx_valid && + ((fa->fsx_xflags & ~FS_XFLAG_COMMON) || + fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0)) + goto out; - if (fileattr_has_fsx(fa)) - return -EOPNOTSUPP; if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE) - return -EOPNOTSUPP; + goto out; info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) | (fa->flags & SHMEM_FL_USER_MODIFIABLE); shmem_set_inode_flags(inode, info->fsflags); + err = shmem_set_project(inode, fa->fsx_projid); + if (err) + goto out; + inode_set_ctime_current(inode); inode_inc_iversion(inode); - return 0; + +out: + return err; } /*