Message ID | 05a0f4fd-7f62-8fbc-378d-886ccd5b3f11@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | statx: Fix DAX attribute collision and handling | expand |
Eric Sandeen <sandeen@redhat.com> wrote: > - if (IS_DAX(inode)) > - stat->attributes |= STATX_ATTR_DAX; > - This could probably be left in and not distributed amongst the filesytems provided that any filesystem that might turn it on sets the bit in the attributes_mask. I'm presuming that the core doesn't turn it on without the filesystem buying in. David
David Howells <dhowells@redhat.com> wrote: > > - if (IS_DAX(inode)) > > - stat->attributes |= STATX_ATTR_DAX; > > - > > This could probably be left in and not distributed amongst the filesytems > provided that any filesystem that might turn it on sets the bit in the > attributes_mask. On further consideration, it's probably better to split it as you've done it. Reviewed-by: David Howells <dhowells@redhat.com> You do need one or more Fixes: lines, though. David
On Tue, Dec 01, 2020 at 10:59:36AM -0600, Eric Sandeen wrote: > It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS; > while the VFS can detect the current DAX state, it is the filesystem which > actually sets S_DAX on the inode, and the filesystem is the place that > knows whether DAX is something that the "filesystem actually supports" [1] > so that the statx attributes_mask can be properly set. > > So, move STATX_ATTR_DAX attribute setting to the individual dax-capable > filesystems, and update the attributes_mask there as well. > > [1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx > > Signed-off-by: Eric Sandeen <sandeen@redhat.com> > --- > fs/ext2/inode.c | 6 +++++- > fs/ext4/inode.c | 5 ++++- > fs/stat.c | 3 --- > fs/xfs/xfs_iops.c | 5 ++++- > 4 files changed, 13 insertions(+), 6 deletions(-) > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > index 11c5c6fe75bb..3550783a6ea0 100644 > --- a/fs/ext2/inode.c > +++ b/fs/ext2/inode.c > @@ -1653,11 +1653,15 @@ int ext2_getattr(const struct path *path, struct kstat *stat, > stat->attributes |= STATX_ATTR_IMMUTABLE; > if (flags & EXT2_NODUMP_FL) > stat->attributes |= STATX_ATTR_NODUMP; > + if (IS_DAX(inode)) > + stat->attributes |= STATX_ATTR_DAX; > + > stat->attributes_mask |= (STATX_ATTR_APPEND | > STATX_ATTR_COMPRESSED | > STATX_ATTR_ENCRYPTED | > STATX_ATTR_IMMUTABLE | > - STATX_ATTR_NODUMP); > + STATX_ATTR_NODUMP | > + STATX_ATTR_DAX); > > generic_fillattr(inode, stat); > return 0; > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 0d8385aea898..848a0f2b154e 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -5550,13 +5550,16 @@ int ext4_getattr(const struct path *path, struct kstat *stat, > stat->attributes |= STATX_ATTR_NODUMP; > if (flags & EXT4_VERITY_FL) > stat->attributes |= STATX_ATTR_VERITY; > + if (IS_DAX(inode)) > + stat->attributes |= STATX_ATTR_DAX; > > stat->attributes_mask |= (STATX_ATTR_APPEND | > STATX_ATTR_COMPRESSED | > STATX_ATTR_ENCRYPTED | > STATX_ATTR_IMMUTABLE | > STATX_ATTR_NODUMP | > - STATX_ATTR_VERITY); > + STATX_ATTR_VERITY | > + STATX_ATTR_DAX); > > generic_fillattr(inode, stat); > return 0; > diff --git a/fs/stat.c b/fs/stat.c > index dacecdda2e79..5bd90949c69b 100644 > --- a/fs/stat.c > +++ b/fs/stat.c > @@ -80,9 +80,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, > if (IS_AUTOMOUNT(inode)) > stat->attributes |= STATX_ATTR_AUTOMOUNT; > > - if (IS_DAX(inode)) > - stat->attributes |= STATX_ATTR_DAX; > - > if (inode->i_op->getattr) > return inode->i_op->getattr(path, stat, request_mask, > query_flags); > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > index 1414ab79eacf..56deda7042fd 100644 > --- a/fs/xfs/xfs_iops.c > +++ b/fs/xfs/xfs_iops.c > @@ -575,10 +575,13 @@ xfs_vn_getattr( > stat->attributes |= STATX_ATTR_APPEND; > if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP) > stat->attributes |= STATX_ATTR_NODUMP; > + if (IS_DAX(inode)) > + stat->attributes |= STATX_ATTR_DAX; > > stat->attributes_mask |= (STATX_ATTR_IMMUTABLE | > STATX_ATTR_APPEND | > - STATX_ATTR_NODUMP); > + STATX_ATTR_NODUMP | > + STATX_ATTR_DAX); TBH I preferred your previous iteration on this, which only set the DAX bit in the attributes_mask if the underlying storage was pmem and the blocksize was correct, etc. since it made it easier to distinguish between a filesystem where you /could/ have DAX (but it wasn't currently enabled) and a filesystem where it just isn't possible. That might be enough to satisfy any critics who want to be able to detect DAX support from an installer program. --D > > switch (inode->i_mode & S_IFMT) { > case S_IFBLK: > -- > 2.17.0 > >
On 12/1/20 11:39 AM, Darrick J. Wong wrote: >> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c >> index 1414ab79eacf..56deda7042fd 100644 >> --- a/fs/xfs/xfs_iops.c >> +++ b/fs/xfs/xfs_iops.c >> @@ -575,10 +575,13 @@ xfs_vn_getattr( >> stat->attributes |= STATX_ATTR_APPEND; >> if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP) >> stat->attributes |= STATX_ATTR_NODUMP; >> + if (IS_DAX(inode)) >> + stat->attributes |= STATX_ATTR_DAX; >> >> stat->attributes_mask |= (STATX_ATTR_IMMUTABLE | >> STATX_ATTR_APPEND | >> - STATX_ATTR_NODUMP); >> + STATX_ATTR_NODUMP | >> + STATX_ATTR_DAX); > TBH I preferred your previous iteration on this, which only set the DAX > bit in the attributes_mask if the underlying storage was pmem and the > blocksize was correct, etc. since it made it easier to distinguish > between a filesystem where you /could/ have DAX (but it wasn't currently > enabled) and a filesystem where it just isn't possible. > > That might be enough to satisfy any critics who want to be able to > detect DAX support from an installer program. (nb: that previous iteration wasn't in public, just something I talked to Darrick about) I'm sympathetic to that argument, but the exact details of what the mask means in general, and for dax in particular, seems to be subject to ongoing debate. I'd like to just set it with the simplest definition "the fileystem supports the feature" for now, so that we aren't ever setting a feature that's omitted from the mask, and refine the mask-setting for the dax flag in another iteration if/when we reach agreement. -Eric > --D >
On Tue, Dec 1, 2020 at 8:59 AM Eric Sandeen <sandeen@redhat.com> wrote: > > It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS; > while the VFS can detect the current DAX state, it is the filesystem which > actually sets S_DAX on the inode, and the filesystem is the place that > knows whether DAX is something that the "filesystem actually supports" [1] > so that the statx attributes_mask can be properly set. > > So, move STATX_ATTR_DAX attribute setting to the individual dax-capable > filesystems, and update the attributes_mask there as well. I'm not really understanding the logic behind this. The whole IS_DAX(inode) thing exists in various places outside the low-level filesystem, why shouldn't stat() do this? If IS_DAX() is incorrect, then we have much bigger problems than some stat results. We have core functions like generic_file_read_iter() etc all making actual behavioral judgements on IS_DAX(). And if IS_DAX() is correct, then why shouldn't this just be done in generic code? Why move it to every individual filesystem? Linus
On 12/1/20 2:04 PM, Linus Torvalds wrote: > On Tue, Dec 1, 2020 at 8:59 AM Eric Sandeen <sandeen@redhat.com> wrote: >> >> It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS; >> while the VFS can detect the current DAX state, it is the filesystem which >> actually sets S_DAX on the inode, and the filesystem is the place that >> knows whether DAX is something that the "filesystem actually supports" [1] >> so that the statx attributes_mask can be properly set. >> >> So, move STATX_ATTR_DAX attribute setting to the individual dax-capable >> filesystems, and update the attributes_mask there as well. > > I'm not really understanding the logic behind this. > > The whole IS_DAX(inode) thing exists in various places outside the > low-level filesystem, why shouldn't stat() do this? > > If IS_DAX() is incorrect, then we have much bigger problems than some > stat results. We have core functions like generic_file_read_iter() etc > all making actual behavioral judgements on IS_DAX(). It's not incorrect, I didn't mean to imply that. Current code does accurately set the DAX flag in the statx attributes. > And if IS_DAX() is correct, then why shouldn't this just be done in > generic code? Why move it to every individual filesystem? At the end of the day, it's because only the individual filesystems can manage the dax flag in the statx attributes_mask. (That's only place that knows if dax "is available" in general, as opposed to being set on a specific inode) So if they have to do that, they may as well set the actual attribute as well, like they do for every other flag they manage... I mean, we could leave the statx->attributes setting in the vfs, and add the statx->attributes_mask setting to each dax-capable filesystem. That just felt a bit asymmetric vs. the way every other filesystem-specific flag gets handled. -Eric
On Tue, Dec 01, 2020 at 09:39:05AM -0800, Darrick J. Wong wrote: > On Tue, Dec 01, 2020 at 10:59:36AM -0600, Eric Sandeen wrote: > > It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS; > > while the VFS can detect the current DAX state, it is the filesystem which > > actually sets S_DAX on the inode, and the filesystem is the place that > > knows whether DAX is something that the "filesystem actually supports" [1] > > so that the statx attributes_mask can be properly set. > > > > So, move STATX_ATTR_DAX attribute setting to the individual dax-capable > > filesystems, and update the attributes_mask there as well. > > > > [1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx > > > > Signed-off-by: Eric Sandeen <sandeen@redhat.com> > > --- > > fs/ext2/inode.c | 6 +++++- > > fs/ext4/inode.c | 5 ++++- > > fs/stat.c | 3 --- > > fs/xfs/xfs_iops.c | 5 ++++- > > 4 files changed, 13 insertions(+), 6 deletions(-) > > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c > > index 11c5c6fe75bb..3550783a6ea0 100644 > > --- a/fs/ext2/inode.c > > +++ b/fs/ext2/inode.c > > @@ -1653,11 +1653,15 @@ int ext2_getattr(const struct path *path, struct kstat *stat, > > stat->attributes |= STATX_ATTR_IMMUTABLE; > > if (flags & EXT2_NODUMP_FL) > > stat->attributes |= STATX_ATTR_NODUMP; > > + if (IS_DAX(inode)) > > + stat->attributes |= STATX_ATTR_DAX; > > + > > stat->attributes_mask |= (STATX_ATTR_APPEND | > > STATX_ATTR_COMPRESSED | > > STATX_ATTR_ENCRYPTED | > > STATX_ATTR_IMMUTABLE | > > - STATX_ATTR_NODUMP); > > + STATX_ATTR_NODUMP | > > + STATX_ATTR_DAX); > > > > generic_fillattr(inode, stat); > > return 0; > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > index 0d8385aea898..848a0f2b154e 100644 > > --- a/fs/ext4/inode.c > > +++ b/fs/ext4/inode.c > > @@ -5550,13 +5550,16 @@ int ext4_getattr(const struct path *path, struct kstat *stat, > > stat->attributes |= STATX_ATTR_NODUMP; > > if (flags & EXT4_VERITY_FL) > > stat->attributes |= STATX_ATTR_VERITY; > > + if (IS_DAX(inode)) > > + stat->attributes |= STATX_ATTR_DAX; > > > > stat->attributes_mask |= (STATX_ATTR_APPEND | > > STATX_ATTR_COMPRESSED | > > STATX_ATTR_ENCRYPTED | > > STATX_ATTR_IMMUTABLE | > > STATX_ATTR_NODUMP | > > - STATX_ATTR_VERITY); > > + STATX_ATTR_VERITY | > > + STATX_ATTR_DAX); > > > > generic_fillattr(inode, stat); > > return 0; > > diff --git a/fs/stat.c b/fs/stat.c > > index dacecdda2e79..5bd90949c69b 100644 > > --- a/fs/stat.c > > +++ b/fs/stat.c > > @@ -80,9 +80,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, > > if (IS_AUTOMOUNT(inode)) > > stat->attributes |= STATX_ATTR_AUTOMOUNT; > > > > - if (IS_DAX(inode)) > > - stat->attributes |= STATX_ATTR_DAX; > > - > > if (inode->i_op->getattr) > > return inode->i_op->getattr(path, stat, request_mask, > > query_flags); > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > > index 1414ab79eacf..56deda7042fd 100644 > > --- a/fs/xfs/xfs_iops.c > > +++ b/fs/xfs/xfs_iops.c > > @@ -575,10 +575,13 @@ xfs_vn_getattr( > > stat->attributes |= STATX_ATTR_APPEND; > > if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP) > > stat->attributes |= STATX_ATTR_NODUMP; > > + if (IS_DAX(inode)) > > + stat->attributes |= STATX_ATTR_DAX; > > > > stat->attributes_mask |= (STATX_ATTR_IMMUTABLE | > > STATX_ATTR_APPEND | > > - STATX_ATTR_NODUMP); > > + STATX_ATTR_NODUMP | > > + STATX_ATTR_DAX); > > TBH I preferred your previous iteration on this, which only set the DAX > bit in the attributes_mask if the underlying storage was pmem and the > blocksize was correct, etc. since it made it easier to distinguish > between a filesystem where you /could/ have DAX (but it wasn't currently > enabled) and a filesystem where it just isn't possible. I think that's the only thing that makes sense from a userspace perspective. THe man page explicitly says that: stx_attributes_mask A mask indicating which bits in stx_attributes are supported by the VFS and the filesystem. So if DAX can never be turned on for that filesystem instance then, by definition, it does not support DAX and the bit should never be set. e.g. We don't talk about kernels that support reflink - what matters to userspace is whether the filesystem instance supports reflink. Think of the useless mess that xfs_info would be if it reported kernel capabilities instead of filesystem instance capabilities. i.e. we don't report that a filesystem supports reflink just because the kernel supports it - it reports whether the filesystem instance being queried supports reflink. And that also implies the kernel supports it, because the kernel has to support it to mount the filesystem... So, yeah, I think it really does need to be conditional on the filesystem instance being queried to be actually useful to users.... Cheers, Dave.
Linus Torvalds <torvalds@linux-foundation.org> wrote: > And if IS_DAX() is correct, then why shouldn't this just be done in > generic code? Why move it to every individual filesystem? One way of looking at it is that the check is then done for every filesystem - most of which don't support it. Not sure whether that's a big enough problem to worry about. The same is true of the automount test too, I suppose... David
On 12/1/20 2:52 PM, Dave Chinner wrote: > On Tue, Dec 01, 2020 at 09:39:05AM -0800, Darrick J. Wong wrote: >> On Tue, Dec 01, 2020 at 10:59:36AM -0600, Eric Sandeen wrote: >>> It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS; >>> while the VFS can detect the current DAX state, it is the filesystem which >>> actually sets S_DAX on the inode, and the filesystem is the place that >>> knows whether DAX is something that the "filesystem actually supports" [1] >>> so that the statx attributes_mask can be properly set. >>> >>> So, move STATX_ATTR_DAX attribute setting to the individual dax-capable >>> filesystems, and update the attributes_mask there as well. >>> >>> [1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx >>> >>> Signed-off-by: Eric Sandeen <sandeen@redhat.com> >>> --- >>> fs/ext2/inode.c | 6 +++++- >>> fs/ext4/inode.c | 5 ++++- >>> fs/stat.c | 3 --- >>> fs/xfs/xfs_iops.c | 5 ++++- >>> 4 files changed, 13 insertions(+), 6 deletions(-) >>> >>> diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c >>> index 11c5c6fe75bb..3550783a6ea0 100644 >>> --- a/fs/ext2/inode.c >>> +++ b/fs/ext2/inode.c >>> @@ -1653,11 +1653,15 @@ int ext2_getattr(const struct path *path, struct kstat *stat, >>> stat->attributes |= STATX_ATTR_IMMUTABLE; >>> if (flags & EXT2_NODUMP_FL) >>> stat->attributes |= STATX_ATTR_NODUMP; >>> + if (IS_DAX(inode)) >>> + stat->attributes |= STATX_ATTR_DAX; >>> + >>> stat->attributes_mask |= (STATX_ATTR_APPEND | >>> STATX_ATTR_COMPRESSED | >>> STATX_ATTR_ENCRYPTED | >>> STATX_ATTR_IMMUTABLE | >>> - STATX_ATTR_NODUMP); >>> + STATX_ATTR_NODUMP | >>> + STATX_ATTR_DAX); >>> >>> generic_fillattr(inode, stat); >>> return 0; >>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c >>> index 0d8385aea898..848a0f2b154e 100644 >>> --- a/fs/ext4/inode.c >>> +++ b/fs/ext4/inode.c >>> @@ -5550,13 +5550,16 @@ int ext4_getattr(const struct path *path, struct kstat *stat, >>> stat->attributes |= STATX_ATTR_NODUMP; >>> if (flags & EXT4_VERITY_FL) >>> stat->attributes |= STATX_ATTR_VERITY; >>> + if (IS_DAX(inode)) >>> + stat->attributes |= STATX_ATTR_DAX; >>> >>> stat->attributes_mask |= (STATX_ATTR_APPEND | >>> STATX_ATTR_COMPRESSED | >>> STATX_ATTR_ENCRYPTED | >>> STATX_ATTR_IMMUTABLE | >>> STATX_ATTR_NODUMP | >>> - STATX_ATTR_VERITY); >>> + STATX_ATTR_VERITY | >>> + STATX_ATTR_DAX); >>> >>> generic_fillattr(inode, stat); >>> return 0; >>> diff --git a/fs/stat.c b/fs/stat.c >>> index dacecdda2e79..5bd90949c69b 100644 >>> --- a/fs/stat.c >>> +++ b/fs/stat.c >>> @@ -80,9 +80,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, >>> if (IS_AUTOMOUNT(inode)) >>> stat->attributes |= STATX_ATTR_AUTOMOUNT; >>> >>> - if (IS_DAX(inode)) >>> - stat->attributes |= STATX_ATTR_DAX; >>> - >>> if (inode->i_op->getattr) >>> return inode->i_op->getattr(path, stat, request_mask, >>> query_flags); >>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c >>> index 1414ab79eacf..56deda7042fd 100644 >>> --- a/fs/xfs/xfs_iops.c >>> +++ b/fs/xfs/xfs_iops.c >>> @@ -575,10 +575,13 @@ xfs_vn_getattr( >>> stat->attributes |= STATX_ATTR_APPEND; >>> if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP) >>> stat->attributes |= STATX_ATTR_NODUMP; >>> + if (IS_DAX(inode)) >>> + stat->attributes |= STATX_ATTR_DAX; >>> >>> stat->attributes_mask |= (STATX_ATTR_IMMUTABLE | >>> STATX_ATTR_APPEND | >>> - STATX_ATTR_NODUMP); >>> + STATX_ATTR_NODUMP | >>> + STATX_ATTR_DAX); >> >> TBH I preferred your previous iteration on this, which only set the DAX >> bit in the attributes_mask if the underlying storage was pmem and the >> blocksize was correct, etc. since it made it easier to distinguish >> between a filesystem where you /could/ have DAX (but it wasn't currently >> enabled) and a filesystem where it just isn't possible. > > I think that's the only thing that makes sense from a userspace > perspective. THe man page explicitly says that: > > stx_attributes_mask > A mask indicating which bits in stx_attributes are supported > by the VFS and the filesystem. > > So if DAX can never be turned on for that filesystem instance then, > by definition, it does not support DAX and the bit should never be > set. > > e.g. We don't talk about kernels that support reflink - what matters > to userspace is whether the filesystem instance supports reflink. > Think of the useless mess that xfs_info would be if it reported > kernel capabilities instead of filesystem instance capabilities. > i.e. we don't report that a filesystem supports reflink just because > the kernel supports it - it reports whether the filesystem instance > being queried supports reflink. And that also implies the kernel > supports it, because the kernel has to support it to mount the > filesystem... > > So, yeah, I think it really does need to be conditional on the > filesystem instance being queried to be actually useful to users.... So now we're back to "attributes_mask, how does it work?" The original implementation, as written by the statx interface author, added: diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5d02b922afa3..b9ffa9f4191f 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5413,6 +5413,12 @@ int ext4_getattr(const struct path *path, struct kstat *stat, if (flags & EXT4_NODUMP_FL) stat->attributes |= STATX_ATTR_NODUMP; + stat->attributes_mask |= (STATX_ATTR_APPEND | + STATX_ATTR_COMPRESSED | + STATX_ATTR_ENCRYPTED | + STATX_ATTR_IMMUTABLE | + STATX_ATTR_NODUMP); + generic_fillattr(inode, stat); return 0; } setting all those flags /unconditionally/ i.e. STATX_ATTR_ENCRYPTED is always set in the mask, even if CONFIG_FS_ENCRYPTION=n And as for compression, that's even better ... so, um... That's why I was keen to just add DAX unconditionally at this point, and if we want to invent/refine meanings for the mask, we can still try to do that? -Eric
On Tue, Dec 1, 2020 at 1:04 PM David Howells <dhowells@redhat.com> wrote: > > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > And if IS_DAX() is correct, then why shouldn't this just be done in > > generic code? Why move it to every individual filesystem? > > One way of looking at it is that the check is then done for every filesystem - > most of which don't support it. Not sure whether that's a big enough problem > to worry about. The same is true of the automount test too, I suppose... So I'd rather have it in one single place than spread out in the filesystems. Especially when it turns out that the STATX_ATTR_DAX bitmask value was wrong - now clearly it doesn't seem to currently *matter* to anything, but imagine if we had to have some strange compat rule to fix things up with stat() versioning or similar. That's exactly the kind of code we would _not_ want in every filesystem. So basically, the thing that argues against this patch is that it seems to just duplicate things inside filesystems, when the VFS layter already has the information. Now, if the VFS information was possibly stale or wrong, that woudl be one thing. But then we'd have other and bigger problems elsewhere as far as I can tell. IOW - make generic what can be made generic, and try to avoid having filesystems do their own thing. [ Replace "filesystems" by "architectures" or whatever else, this is obviously not a filesystem-specific rule in general. ] And don't get me wrong - I don't _hate_ the patch, and I don't care _that_ deeply, but it just doesn't seem to make any sense to me. My initial query was really about "what am I missing - can you please flesh out the commit message because I don't understand what's wrong". Linus
On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote: > > That's why I was keen to just add DAX unconditionally at this point, and if we want > to invent/refine meanings for the mask, we can still try to do that? Oh Gods. Let's *not* make this some "random filesystem choice" where now semantics depends on what some filesystem decided to do, and different filesystems have very subtly different semantics. This all screams "please keep this in the VFS layer" so that we at least have _one_ place where these kinds of decisions are made. I suspect very very few people actually end up caring about any of the STATX flags at all, of course. The fact that the DAX one was apparently entirely the wrong bit argues that this isn't all that important. Linus
On 12/1/20 4:12 PM, Linus Torvalds wrote: > On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote: >> >> That's why I was keen to just add DAX unconditionally at this point, and if we want >> to invent/refine meanings for the mask, we can still try to do that? > > Oh Gods. Let's *not* make this some "random filesystem choice" where > now semantics depends on what some filesystem decided to do, and > different filesystems have very subtly different semantics. Well, I had hoped that refinement might start with the interface documentation, I'm certainly not suggesting every filesystem should go its own way. > This all screams "please keep this in the VFS layer" so that we at > least have _one_ place where these kinds of decisions are made. Making the "right decision" depends on what the mask actually means, I guess. Today we set a DAX attribute in statx which is not set in the mask. That seems clearly broken. We can either leave that as it is, set DAX in the mask for everyone in the VFS, or delegate that mask-setting task to the individual filesystems so that it reflects <something>, probably "can this inode ever be in dax mode?" I honestly don't care if we keep setting the attribute itself in the VFS; if that's the right thing to do, that's fine. (If so, it seems like IS_IMMUTABLE -> STATX_ATTR_IMMUTABLE etc could be moved there, too.) -Eric
On 12/1/20 4:09 PM, Linus Torvalds wrote: > So basically, the thing that argues against this patch is that it > seems to just duplicate things inside filesystems, when the VFS layter > already has the information. > > Now, if the VFS information was possibly stale or wrong, that woudl be > one thing. But then we'd have other and bigger problems elsewhere as > far as I can tell. > > IOW - make generic what can be made generic, and try to avoid having > filesystems do their own thing. > > [ Replace "filesystems" by "architectures" or whatever else, this is > obviously not a filesystem-specific rule in general. ] > > And don't get me wrong - I don't _hate_ the patch, and I don't care > _that_ deeply, but it just doesn't seem to make any sense to me. My > initial query was really about "what am I missing - can you please > flesh out the commit message because I don't understand what's wrong". Backing way up, my motivation was: Only the filesystem can appropriately set the statx->attributes_mask, so it has to be done there. Since that has to be done in the filesystem, set the actual attribute flag adjacent to it, as is done for ~every other flag. *shrug* In any case I resent the flag value clash fix on a separate thread as V2, hopefully that one is straightforward enough to go in. Thanks, -Eric
On Tue, Dec 01, 2020 at 04:26:43PM -0600, Eric Sandeen wrote: > On 12/1/20 4:12 PM, Linus Torvalds wrote: > > On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote: > >> > >> That's why I was keen to just add DAX unconditionally at this point, and if we want > >> to invent/refine meanings for the mask, we can still try to do that? > > > > Oh Gods. Let's *not* make this some "random filesystem choice" where > > now semantics depends on what some filesystem decided to do, and > > different filesystems have very subtly different semantics. > > Well, I had hoped that refinement might start with the interface > documentation, I'm certainly not suggesting every filesystem should go > its own way. > > This all screams "please keep this in the VFS layer" so that we at > > least have _one_ place where these kinds of decisions are made. > > Making the "right decision" depends on what the mask actually means, > I guess. > > Today we set a DAX attribute in statx which is not set in the mask. > That seems clearly broken. Yes... and no. You can't set the statx DAX flag directly. It is only an indicator of the current file state. That state is affected by the inode mode and the DAX mount option. But I agree that having a bit set when the corresponding mask is 0 is odd. > > We can either leave that as it is, set DAX in the mask for everyone in > the VFS, or delegate that mask-setting task to the individual filesystems > so that it reflects <something>, probably "can this inode ever be in dax > mode?" > > I honestly don't care if we keep setting the attribute itself in the VFS; > if that's the right thing to do, that's fine. (If so, it seems like > IS_IMMUTABLE -> STATX_ATTR_IMMUTABLE etc could be moved there, too.) The reason I put it in the VFS layer was that the statx was intended to be a VFS indication of the state of the inode. This differs from the FS_XFLAG_DAX which is a state of the on-disk inode. The VFS IS_DAX can be altered by the mount option forcing DAX or no-DAX. So there was a reason for having it at that level. But it is true that only FS's which support DAX will ever set IS_DAX() and having the attribute set near the mask probably makes the code a bit more clear. So I'm not opposed to the patch either. But users can't ever set the flag directly so I'm not sure if it being in the mask is going to confuse anyone. Ira > > -Eric
On Tue, Dec 01, 2020 at 02:12:19PM -0800, Linus Torvalds wrote: > On Tue, Dec 1, 2020 at 2:03 PM Eric Sandeen <sandeen@sandeen.net> wrote: > > > > That's why I was keen to just add DAX unconditionally at this point, and if we want > > to invent/refine meanings for the mask, we can still try to do that? > > Oh Gods. Let's *not* make this some "random filesystem choice" where > now semantics depends on what some filesystem decided to do, and > different filesystems have very subtly different semantics. > > This all screams "please keep this in the VFS layer" so that we at > least have _one_ place where these kinds of decisions are made. > > I suspect very very few people actually end up caring about any of the > STATX flags at all, of course. The fact that the DAX one was > apparently entirely the wrong bit argues that this isn't all that > important. Agreed. That whole support interface is just weird. But the only thing that remotely makes (a little bit of) sense is to just set all bits known about by this particular kernel in the VFS. Everything else is going to be a complete mess.
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 11c5c6fe75bb..3550783a6ea0 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -1653,11 +1653,15 @@ int ext2_getattr(const struct path *path, struct kstat *stat, stat->attributes |= STATX_ATTR_IMMUTABLE; if (flags & EXT2_NODUMP_FL) stat->attributes |= STATX_ATTR_NODUMP; + if (IS_DAX(inode)) + stat->attributes |= STATX_ATTR_DAX; + stat->attributes_mask |= (STATX_ATTR_APPEND | STATX_ATTR_COMPRESSED | STATX_ATTR_ENCRYPTED | STATX_ATTR_IMMUTABLE | - STATX_ATTR_NODUMP); + STATX_ATTR_NODUMP | + STATX_ATTR_DAX); generic_fillattr(inode, stat); return 0; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 0d8385aea898..848a0f2b154e 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5550,13 +5550,16 @@ int ext4_getattr(const struct path *path, struct kstat *stat, stat->attributes |= STATX_ATTR_NODUMP; if (flags & EXT4_VERITY_FL) stat->attributes |= STATX_ATTR_VERITY; + if (IS_DAX(inode)) + stat->attributes |= STATX_ATTR_DAX; stat->attributes_mask |= (STATX_ATTR_APPEND | STATX_ATTR_COMPRESSED | STATX_ATTR_ENCRYPTED | STATX_ATTR_IMMUTABLE | STATX_ATTR_NODUMP | - STATX_ATTR_VERITY); + STATX_ATTR_VERITY | + STATX_ATTR_DAX); generic_fillattr(inode, stat); return 0; diff --git a/fs/stat.c b/fs/stat.c index dacecdda2e79..5bd90949c69b 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -80,9 +80,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, if (IS_AUTOMOUNT(inode)) stat->attributes |= STATX_ATTR_AUTOMOUNT; - if (IS_DAX(inode)) - stat->attributes |= STATX_ATTR_DAX; - if (inode->i_op->getattr) return inode->i_op->getattr(path, stat, request_mask, query_flags); diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 1414ab79eacf..56deda7042fd 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -575,10 +575,13 @@ xfs_vn_getattr( stat->attributes |= STATX_ATTR_APPEND; if (ip->i_d.di_flags & XFS_DIFLAG_NODUMP) stat->attributes |= STATX_ATTR_NODUMP; + if (IS_DAX(inode)) + stat->attributes |= STATX_ATTR_DAX; stat->attributes_mask |= (STATX_ATTR_IMMUTABLE | STATX_ATTR_APPEND | - STATX_ATTR_NODUMP); + STATX_ATTR_NODUMP | + STATX_ATTR_DAX); switch (inode->i_mode & S_IFMT) { case S_IFBLK:
It's a bit odd to set STATX_ATTR_DAX into the statx attributes in the VFS; while the VFS can detect the current DAX state, it is the filesystem which actually sets S_DAX on the inode, and the filesystem is the place that knows whether DAX is something that the "filesystem actually supports" [1] so that the statx attributes_mask can be properly set. So, move STATX_ATTR_DAX attribute setting to the individual dax-capable filesystems, and update the attributes_mask there as well. [1] 3209f68b3ca4 statx: Include a mask for stx_attributes in struct statx Signed-off-by: Eric Sandeen <sandeen@redhat.com> --- fs/ext2/inode.c | 6 +++++- fs/ext4/inode.c | 5 ++++- fs/stat.c | 3 --- fs/xfs/xfs_iops.c | 5 ++++- 4 files changed, 13 insertions(+), 6 deletions(-)