Message ID | 164251409447.3435901.10092442643336534999.stgit@warthog.procyon.org.uk (mailing list archive) |
---|---|
State | New, archived |
Series | fscache, cachefiles: Rewrite fixes/updates |
On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> common practice for the other inode flags[1].

Please fix the flag to have a sensible name first, as the naming of the
flag and this new helper is utterly wrong as we already discussed.
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > common practice for the other inode flags[1].
>
> Please fix the flag to have a sensible name first, as the naming of the
> flag and this new helper is utterly wrong as we already discussed.

And I suggested a new name, which you didn't comment on.

David
On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
>
> > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > common practice for the other inode flags[1].
> >
> > Please fix the flag to have a sensible name first, as the naming of the
> > flag and this new helper is utterly wrong as we already discussed.
>
> And I suggested a new name, which you didn't comment on.

Again, look at the semantics of the flag: The only thing it does in the
VFS is to prevent a rmdir.  So you might want to name it after that.

Or in fact drop the flag entirely.  We don't have that kind of
protection for other in-kernel file use or important userspace daemons
either.  I can't see why cachefiles is the magic snowflake here that
suddenly needs semantics no one else has.
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> > Christoph Hellwig <hch@infradead.org> wrote:
> >
> > > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > > common practice for the other inode flags[1].
> > >
> > > Please fix the flag to have a sensible name first, as the naming of the
> > > flag and this new helper is utterly wrong as we already discussed.
> >
> > And I suggested a new name, which you didn't comment on.
>
> Again, look at the semantics of the flag: The only thing it does in the
> VFS is to prevent a rmdir.  So you might want to name it after that.
>
> Or in fact drop the flag entirely.  We don't have that kind of
> protection for other in-kernel file use or important userspace daemons
> either.  I can't see why cachefiles is the magic snowflake here that
> suddenly needs semantics no one else has.

The flag cannot just be dropped - it's an important part of the interaction
with cachefilesd with regard to culling.  Culling to free up space is
offloaded to userspace rather than being done within the kernel.

Previously, cachefiles, the kernel module, had to maintain a huge tree of
records of every backing inode that it was currently using so that it could
forbid cachefilesd to cull one when cachefilesd asked.  I've reduced that to a
single bit flag on the inode struct, thereby saving both memory and time.  You
can argue whether it's worth sacrificing an inode flag bit for that, but the
flag can be reused for any other kernel service that wants to similarly mark
an inode in use.

Further, it's used as a mark to prevent cachefiles accidentally using an inode
twice - say someone misconfigures a second cache overlapping the first - and,
again, this works if some other kernel driver wants to mark inode it is using
in use.  Cachefiles will refuse to use them if it ever sees them, so no
problem there.

And it's not true that we don't have that kind of protection for other
in-kernel file use.  See S_SWAPFILE.  I did consider using that, but that has
other side effects.  I mentioned that perhaps I should make swapon set
S_KERNEL_FILE also.  Also blockdevs have some exclusion also, I think.

The rmdir thing should really apply to rename and unlink also.  That's to
prevent someone, cachefilesd included, causing cachefiles to malfunction by
removing the directories it created.  Possibly this should be a separate bit
to S_KERNEL_FILE, maybe S_NO_DELETE.

So I could change S_KERNEL_FILE to S_KERNEL_LOCK, say, or maybe S_EXCLUSIVE.

David
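The in-use marking described in the message above can be modelled in plain user-space C.  This is only a rough sketch under stated assumptions, not the cachefiles code: struct model_inode, MODEL_S_KERNEL_FILE and the model_*() helpers are hypothetical stand-ins, the bit position is arbitrary, and the locking that the real code relies on (see the inode_lock() call in the cachefiles_cull() hunk of the patch) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel flag; the bit position is arbitrary. */
#define MODEL_S_KERNEL_FILE (1u << 0)

/* Hypothetical stand-in for struct inode; only the flags word is modelled. */
struct model_inode {
	unsigned int i_flags;
};

/*
 * Claim an inode for a kernel service.  Returns false if some other user
 * (for example an overlapping second cache) has already claimed it.
 */
static bool model_mark_in_use(struct model_inode *inode)
{
	assert(inode != NULL);
	if (inode->i_flags & MODEL_S_KERNEL_FILE)
		return false;
	inode->i_flags |= MODEL_S_KERNEL_FILE;
	return true;
}

/* Release the claim so that the inode may be culled or reused. */
static void model_unmark_in_use(struct model_inode *inode)
{
	assert(inode != NULL);
	inode->i_flags &= ~MODEL_S_KERNEL_FILE;
}

/*
 * Answer a cull request from the userspace daemon: 0 if the inode may be
 * culled, -EBUSY if it is still marked in use.
 */
static int model_cull_check(const struct model_inode *inode)
{
	assert(inode != NULL);
	return (inode->i_flags & MODEL_S_KERNEL_FILE) ? -EBUSY : 0;
}
```

In this model a single bit answers both questions raised in the thread: whether a cull request must be refused, and whether a second user is trying to claim an inode that is already in use.  No per-inode record is kept outside the inode itself.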
On Wed, Jan 19, 2022 at 09:18:05AM +0000, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
>
> > On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> > > Christoph Hellwig <hch@infradead.org> wrote:
> > >
> > > > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > > > common practice for the other inode flags[1].
> > > >
> > > > Please fix the flag to have a sensible name first, as the naming of the
> > > > flag and this new helper is utterly wrong as we already discussed.
> > >
> > > And I suggested a new name, which you didn't comment on.
> >
> > Again, look at the semantics of the flag: The only thing it does in the
> > VFS is to prevent a rmdir.  So you might want to name it after that.
> >
> > Or in fact drop the flag entirely.  We don't have that kind of
> > protection for other in-kernel file use or important userspace daemons
> > either.  I can't see why cachefiles is the magic snowflake here that
> > suddenly needs semantics no one else has.
>
> The flag cannot just be dropped - it's an important part of the interaction
> with cachefilesd with regard to culling.  Culling to free up space is
> offloaded to userspace rather than being done within the kernel.
>
> Previously, cachefiles, the kernel module, had to maintain a huge tree of
> records of every backing inode that it was currently using so that it could
> forbid cachefilesd to cull one when cachefilesd asked.  I've reduced that to a
> single bit flag on the inode struct, thereby saving both memory and time.  You
> can argue whether it's worth sacrificing an inode flag bit for that, but the
> flag can be reused for any other kernel service that wants to similarly mark
> an inode in use.
>
> Further, it's used as a mark to prevent cachefiles accidentally using an inode
> twice - say someone misconfigures a second cache overlapping the first - and,
> again, this works if some other kernel driver wants to mark inode it is using
> in use.  Cachefiles will refuse to use them if it ever sees them, so no
> problem there.
>
> And it's not true that we don't have that kind of protection for other
> in-kernel file use.  See S_SWAPFILE.  I did consider using that, but that has
> other side effects.  I mentioned that perhaps I should make swapon set
> S_KERNEL_FILE also.  Also blockdevs have some exclusion also, I think.
>
> The rmdir thing should really apply to rename and unlink also.  That's to
> prevent someone, cachefilesd included, causing cachefiles to malfunction by
> removing the directories it created.  Possibly this should be a separate bit
> to S_KERNEL_FILE, maybe S_NO_DELETE.
>
> So I could change S_KERNEL_FILE to S_KERNEL_LOCK, say, or maybe S_EXCLUSIVE.

[ ] S_REMOVE_PROTECTED
[ ] S_UNREMOVABLE
[ ] S_HELD_BUSY
[ ] S_KERNEL_BUSY
[ ] S_BUSY_INTERNAL
[ ] S_BUSY
[ ] S_HELD

?
On Wed, Jan 19, 2022 at 09:18:05AM +0000, David Howells wrote:
> The flag cannot just be dropped - it's an important part of the interaction
> with cachefilesd with regard to culling.  Culling to free up space is
> offloaded to userspace rather than being done within the kernel.
>
> Previously, cachefiles, the kernel module, had to maintain a huge tree of
> records of every backing inode that it was currently using so that it could
> forbid cachefilesd to cull one when cachefilesd asked.  I've reduced that to a
> single bit flag on the inode struct, thereby saving both memory and time.  You
> can argue whether it's worth sacrificing an inode flag bit for that, but the
> flag can be reused for any other kernel service that wants to similarly mark
> an inode in use.

Which is a horrible interface.  But you tricked Linus into merging this
crap, so let's not pretend it is a "kernel file".  We have plenty of
those, basically every caller of filp_open is one.  It is something like
"pinned for fscache/cachefiles", so name it that way and add a big fat
comment explaining the atrocities.
Christoph Hellwig <hch@infradead.org> wrote:
> But you tricked Linus
Tricked? I put a notice explicitly pointing out that I was adding it and
indicating that it might be controversial in the cover note and the pull
request and further explained the use in the patches that handle it. I posted
the patches adding/using it a bunch of times to various mailing lists. TYVM.
David
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index f256c8aff7bb..04563f759e99 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -20,7 +20,7 @@ static bool __cachefiles_mark_inode_in_use(struct cachefiles_object *object,
 	struct inode *inode = d_backing_inode(dentry);
 	bool can_use = false;
 
-	if (!(inode->i_flags & S_KERNEL_FILE)) {
+	if (!IS_KERNEL_FILE(inode)) {
 		inode->i_flags |= S_KERNEL_FILE;
 		trace_cachefiles_mark_active(object, inode);
 		can_use = true;
@@ -746,7 +746,7 @@ static struct dentry *cachefiles_lookup_for_cull(struct cachefiles_cache *cache,
 		goto lookup_error;
 	if (d_is_negative(victim))
 		goto lookup_put;
-	if (d_inode(victim)->i_flags & S_KERNEL_FILE)
+	if (IS_KERNEL_FILE(d_inode(victim)))
 		goto lookup_busy;
 	return victim;
 
@@ -793,7 +793,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	/* check to see if someone is using this object */
 	inode = d_inode(victim);
 	inode_lock(inode);
-	if (inode->i_flags & S_KERNEL_FILE) {
+	if (IS_KERNEL_FILE(inode)) {
 		ret = -EBUSY;
 	} else {
 		/* Stop the cache from picking it back up */
diff --git a/fs/namei.c b/fs/namei.c
index d81f04f8d818..c2175ab3849d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3959,7 +3959,7 @@ int vfs_rmdir(struct user_namespace *mnt_userns, struct inode *dir,
 
 	error = -EBUSY;
 	if (is_local_mountpoint(dentry) ||
-	    (dentry->d_inode->i_flags & S_KERNEL_FILE))
+	    IS_KERNEL_FILE(dentry->d_inode))
 		goto out;
 
 	error = security_inode_rmdir(dir, dentry);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f5d3bf5b69a6..227497793282 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2216,6 +2216,7 @@ static inline bool sb_rdonly(const struct super_block *sb) { return sb->s_flags
 #define IS_ENCRYPTED(inode)	((inode)->i_flags & S_ENCRYPTED)
 #define IS_CASEFOLDED(inode)	((inode)->i_flags & S_CASEFOLD)
 #define IS_VERITY(inode)	((inode)->i_flags & S_VERITY)
+#define IS_KERNEL_FILE(inode)	((inode)->i_flags & S_KERNEL_FILE)
 
 #define IS_WHITEOUT(inode)	(S_ISCHR(inode->i_mode) && \
 				 (inode)->i_rdev == WHITEOUT_DEV)
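
For readers who want to try the shape of the helper outside a kernel tree, here is a small user-space sketch.  Everything prefixed DEMO_/demo_ is a hypothetical stand-in chosen for illustration (including the bit position), not the definitions from include/linux/fs.h.  It shows one property of the IS_*() style used in the patch: the macro evaluates to the masked bit rather than to 0 or 1, so it is meant to be used in a boolean context, as the vfs_rmdir() hunk does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value; the real bit position is not taken from the patch. */
#define DEMO_S_KERNEL_FILE (1u << 5)

/* Hypothetical stand-in for struct inode; only the flags word is modelled. */
struct demo_inode {
	unsigned int i_flags;
};

/* Same shape as the IS_*() helpers: yields the masked bit, not 0 or 1. */
#define DEMO_IS_KERNEL_FILE(inode) ((inode)->i_flags & DEMO_S_KERNEL_FILE)

/*
 * Model of the busy test in the vfs_rmdir() hunk: removal is refused when
 * the directory is a mountpoint or is marked as in use by a kernel service.
 */
static bool demo_rmdir_blocked(const struct demo_inode *inode, bool is_mountpoint)
{
	assert(inode != NULL);
	return is_mountpoint || DEMO_IS_KERNEL_FILE(inode);
}
```

Because the macro's result is the raw masked value, comparing it against 1 or true would be wrong for any flag that is not bit 0; testing it directly, or negating it with !, is the safe usage.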