Message ID | 1503582115-37220-1-git-send-email-houtao1@huawei.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
On Thu, Aug 24, 2017 at 09:41:55PM +0800, Hou Tao wrote: > It will be useful if there is a corresponding online uevent after > a XFS filesystem has been mounted. A typical usage of the uevent > is setting the error configuration for a specific XFS filesystem > or all XFS filesystems by using udevd. > > The following is an example of udevd rule which will shutdown > any XFS filesystem after the filesystem gets any IO error: > > ACTION=="online", SUBSYSTEM=="xfs", DEVPATH=="/fs/xfs/*", \ > RUN+="/bin/sh -c 'echo 0 > /sys%p/error/metadata/default/max_retries; \ > echo 0 > /sys%p/error/metadata/EIO/max_retries; \ > echo 0 > /sys%p/error/metadata/ENOSPC/max_retries; \ > echo 0 > /sys%p/error/metadata/ENODEV/max_retries'" > > Suggested-by: Dave Chinner <david@fromorbit.com> > Signed-off-by: Hou Tao <houtao1@huawei.com> > --- > fs/xfs/xfs_super.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c > index 38aaacd..695fe85f 100644 > --- a/fs/xfs/xfs_super.c > +++ b/fs/xfs/xfs_super.c > @@ -1530,6 +1530,19 @@ xfs_destroy_percpu_counters( > percpu_counter_destroy(&mp->m_fdblocks); > } > > +static void > +xfs_fs_uevent( > + struct xfs_mount *mp, > + enum kobject_action action) > +{ > + int err; > + > + err = kobject_uevent(&mp->m_kobj.kobject, action); It might be useful to send kobject_uevent_env so that we can squeeze in things like the fs uuid for easier identification? I envision a /etc/xfs/fs.conf file like this: [903db5b1-82cd-4f26-8221-443a4ed0563a] error.metadata.EIO.max_retries = 0 error.fail_at_unmount = 1 config.cowextsize = 1048576 config.inherit_noatime = 1 With a udev helper that uses the fsuuid to find the appropriate section of the ini^Wconfig file and load the appropriate values into sysfs. If we ever decide to expose more config knobs through sysfs then they'll be automatically supported. > + if (err) > + xfs_notice(mp, "Sending XFS uevent %d got error %d", > + action, err); > +} > + > STATIC int > xfs_fs_fill_super( > struct super_block *sb, > @@ -1667,6 +1680,8 @@ xfs_fs_fill_super( > goto out_unmount; > } > > + xfs_fs_uevent(mp, KOBJ_ONLINE); That said, I wonder if we ought to move this discussion to fsdevel to pull in any /other/ filesystems that have sysfs knobs that they might like to be able to persist? ext4 has a bunch of tunables too... --D > + > return 0; > > out_filestream_unmount: > -- > 2.5.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
+ linux-fsdevel On Thu, Aug 24, 2017 at 10:10:28AM -0700, Darrick J. Wong wrote: > On Thu, Aug 24, 2017 at 09:41:55PM +0800, Hou Tao wrote: > > It will be useful if there is a corresponding online uevent after > > a XFS filesystem has been mounted. A typical usage of the uevent > > is setting the error configuration for a specific XFS filesystem > > or all XFS filesystems by using udevd. > > > > The following is an example of udevd rule which will shutdown > > any XFS filesystem after the filesystem gets any IO error: > > > > ACTION=="online", SUBSYSTEM=="xfs", DEVPATH=="/fs/xfs/*", \ > > RUN+="/bin/sh -c 'echo 0 > /sys%p/error/metadata/default/max_retries; \ > > echo 0 > /sys%p/error/metadata/EIO/max_retries; \ > > echo 0 > /sys%p/error/metadata/ENOSPC/max_retries; \ > > echo 0 > /sys%p/error/metadata/ENODEV/max_retries'" > > > > Suggested-by: Dave Chinner <david@fromorbit.com> > > Signed-off-by: Hou Tao <houtao1@huawei.com> > > --- > > fs/xfs/xfs_super.c | 15 +++++++++++++++ > > 1 file changed, 15 insertions(+) > > > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c > > index 38aaacd..695fe85f 100644 > > --- a/fs/xfs/xfs_super.c > > +++ b/fs/xfs/xfs_super.c > > @@ -1530,6 +1530,19 @@ xfs_destroy_percpu_counters( > > percpu_counter_destroy(&mp->m_fdblocks); > > } > > > > +static void > > +xfs_fs_uevent( > > + struct xfs_mount *mp, > > + enum kobject_action action) > > +{ > > + int err; > > + > > + err = kobject_uevent(&mp->m_kobj.kobject, action); > > It might be useful to send kobject_uevent_env so that we can squeeze in > things like the fs uuid for easier identification? I envision a > /etc/xfs/fs.conf file like this: > > [903db5b1-82cd-4f26-8221-443a4ed0563a] > error.metadata.EIO.max_retries = 0 > error.fail_at_unmount = 1 > config.cowextsize = 1048576 > config.inherit_noatime = 1 > > With a udev helper that uses the fsuuid to find the appropriate section > of the ini^Wconfig file and load the appropriate values into sysfs. Interesting, so yet-another possible configuration file for filesystems use here could be of a udev helper scraping for matches. This in turn would mean such udev helpers would need support to parse config files. Reason I think this is important is it may raise the importance of evaluating which file configuration parser we end up using for mkfs.xfs as well. While they don't necessarily need to match, its perhaps ideal if they could. I realize for instance e2fsprogs is using its own libprofile fork, and chances of it moving off are low due to the gains libprofile provides for it: supports different OSes easier, the size is great. However if in any way the more filesystems will tend use adopt configuration files it may be a question to raise once again for that filesystem if the parser of choice used is the best. Or a general community one: do we want to share a common config file parser? On that note FWIW, at this point I've reviewed 4 libraries for mkfs.xfs: o libconfig -- we want to stay away from this [0] o e2fsprogs libprofile -- its a fork of krb5 libparser, size-wise this is [1] a best option, however its also a fork on its own, or o libini_config -- my pick and what I recommend [2] due it supporting uint64_t well, having a good new community traction, and a good set of tests o nfs: has its own incarnation of a library *Iff* we want to share the same library for mkfs.xfs and also a udev helper, then a new requirement here is perhaps also having a respective library on other higher level language like python for instance, if not now perhaps for the future. Unless of course we'd expect such udev heleprs to be C based. [0] https://gitlab.com/mcgrof/libconfig-int64-issues [1] https://gitlab.com/mcgrof/libekprofile [2] https://gitlab.com/mcgrof/libini_config-demo > If > we ever decide to expose more config knobs through sysfs then they'll > be automatically supported. While all this is nice, I'm sure we're all aware of the dangers of setting things in stone through sysfs, its likely already decided for the above tunables, but adding a uevent *is* yet another layer of user interface which userspace can *expect* and we should be certain we want this so we won't regress userspace later. Just saying, we better damn be sure this is the way we want to go. > > + if (err) > > + xfs_notice(mp, "Sending XFS uevent %d got error %d", > > + action, err); > > +} > > + > > STATIC int > > xfs_fs_fill_super( > > struct super_block *sb, > > @@ -1667,6 +1680,8 @@ xfs_fs_fill_super( > > goto out_unmount; > > } > > > > + xfs_fs_uevent(mp, KOBJ_ONLINE); > > That said, I wonder if we ought to move this discussion to fsdevel to > pull in any /other/ filesystems that have sysfs knobs that they might > like to be able to persist? ext4 has a bunch of tunables too... Done :) Luis -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On the other thread we can discuss the configuration file parser used, this other thread we can use for the sysfs knobs and if it makes sense to share anything in consideration for a possible future udev event. More below. On Thu, Aug 24, 2017 at 08:04:41PM +0200, Luis R. Rodriguez wrote: > On Thu, Aug 24, 2017 at 10:10:28AM -0700, Darrick J. Wong wrote: > > It might be useful to send kobject_uevent_env so that we can squeeze in > > things like the fs uuid for easier identification? I envision a > > /etc/xfs/fs.conf file like this: > > > > [903db5b1-82cd-4f26-8221-443a4ed0563a] > > error.metadata.EIO.max_retries = 0 > > error.fail_at_unmount = 1 > > config.cowextsize = 1048576 > > config.inherit_noatime = 1 > > > > With a udev helper that uses the fsuuid to find the appropriate section > > of the ini^Wconfig file and load the appropriate values into sysfs. > > While all this is nice, I'm sure we're all aware of the dangers of setting > things in stone through sysfs, its likely already decided for the above > tunables, <-- udev thing and now snip --> > Just saying, we better damn be sure this is the way we want to go. > > > > + if (err) > > > + xfs_notice(mp, "Sending XFS uevent %d got error %d", > > > + action, err); > > > +} > > > + > > > STATIC int > > > xfs_fs_fill_super( > > > struct super_block *sb, > > > @@ -1667,6 +1680,8 @@ xfs_fs_fill_super( > > > goto out_unmount; > > > } > > > > > > + xfs_fs_uevent(mp, KOBJ_ONLINE); > > > > That said, I wonder if we ought to move this discussion to fsdevel to > > pull in any /other/ filesystems that have sysfs knobs that they might > > like to be able to persist? ext4 has a bunch of tunables too... So were you thinking perhaps if we are going to use udev that it may make sense then to instead consider if we want to share tunables for uevents? Then instead of each filesystem having its own parser, etc, we have set of common knobs for any filesystem possible. Is that what you had in mind to discuss? Luis -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 24, 2017 at 08:04:41PM +0200, Luis R. Rodriguez wrote: > On that note FWIW, at this point I've reviewed 4 libraries for mkfs.xfs: > > o libconfig -- we want to stay away from this [0] > o e2fsprogs libprofile -- its a fork of krb5 libparser, size-wise this is [1] > a best option, however its also a fork on its own, or > o libini_config -- my pick and what I recommend [2] due it supporting > uint64_t well, having a good new community traction, and a good set of > tests > o nfs: has its own incarnation of a library > > *Iff* we want to share the same library for mkfs.xfs and also a udev helper, > then a new requirement here is perhaps also having a respective library on > other higher level language like python for instance, if not now perhaps > for the future. Unless of course we'd expect such udev heleprs to be C > based. > > [0] https://gitlab.com/mcgrof/libconfig-int64-issues > [1] https://gitlab.com/mcgrof/libekprofile > [2] https://gitlab.com/mcgrof/libini_config-demo FWIW -- for a more thorough evaluation folks can read a bit more about configuration file uses for filesystems on this work-in-progress paper: https://gitlab.com/mcgrof/filesystem-configuration-paper/blob/master/paper.pdf It does not take into account the newly mentioned possible use case as this is new and just now considered. Feedback/rants on other points welcomed. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 24 2017, Luis R. Rodriguez wrote: > > Just saying, we better damn be sure this is the way we want to go. > Just to provide contrast, NFS has a config file which can provide certain defaults for mounted filesystems. However it doesn't require a udev event to achieve this. Instead, it provides "/sbin/mount.nfs" which imposes the defaults at mount time, rather than just after mount time. There could, conceivably, be an "/sbin/mount.xfs" which read the config file and set all the defaults concurrently with the mount operation. I'm not necessarily saying this is a better way to go, but I agree with the need to be sure, and this provides evidence that other approaches are possible and worth considering. NeilBrown
On Fri, Aug 25, 2017 at 08:11:05AM +1000, NeilBrown wrote: > On Thu, Aug 24 2017, Luis R. Rodriguez wrote: > > > > Just saying, we better damn be sure this is the way we want to go. > > > > Just to provide contrast, NFS has a config file which can provide > certain defaults for mounted filesystems. However it doesn't require a > udev event to achieve this. > > Instead, it provides "/sbin/mount.nfs" which imposes the defaults at > mount time, rather than just after mount time. > > There could, conceivably, be an "/sbin/mount.xfs" which read the config > file and set all the defaults concurrently with the mount operation. > > I'm not necessarily saying this is a better way to go, but I agree with > the need to be sure, and this provides evidence that other approaches > are possible and worth considering. I like the NFS approach given: o It would mean you can keep things contained within each filesystem userspace tool. This would not escalate the requirements for what parser to use, ie we would not need to pick one for a higher level language for udev rules. Tools which already have a parser like e2fsprogs can keep on chugging with libprofile, meanwhile XFS and others can choose whatever library their own heart pleases and the requirement still remains just simple file parsing, not needing higher level language support. o One advantage of having things like /sbin/mount.<filesystem> is shared filesystems between a type of system can be kept separately from host specific mount points. This could allow sharing *one* file for a type of target system (database, development, etc). There would not be any need to look for your own filesystem in /etc/fstab, or adding entries, then /etc/fstab would just become very host specific. o Custom entries can be supported without having to worry about breaking parsing of the general /etc/fstab, it also removes clutter from complex lines and entries out of /etc/fstab Are there other needs for uevents which could not be addressed through a filesystem configuration file or which prove more efficient or provide more flexibility? Luis -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 25, 2017 at 08:11:05AM +1000, NeilBrown wrote: > > Just to provide contrast, NFS has a config file which can provide > certain defaults for mounted filesystems. However it doesn't require a > udev event to achieve this. I think we're mixing multiple things, and that may be confusing the discussion. A config file parser is a technology. *How* you use the config file parser can be quite different even if it is using the same config file parser or config file format. From what I've seen in this e-mail thread, there are at least three different use cases: 1) How mkfs uses to create the file system 2) How the file system should be configured when/while it is mounted 3) Message marshalling and unmarshalling when communicating with the kernel via uevents > Instead, it provides "/sbin/mount.nfs" which imposes the defaults at > mount time, rather than just after mount time. > > There could, conceivably, be an "/sbin/mount.xfs" which read the config > file and set all the defaults concurrently with the mount operation. For nfs there are a large number of knobs that are of interest at mount time, such that specifying them all as mount options on the command line, or in /etc/fstab, gets spectacularly painful. Which is why /etc/nfsmount.conf makes sense. For ext4 and nfs we have many more knobs that come into play when the file system is created, and so for ext4 that's why I have /etc/mke2fs.conf, where I might have some complex stanzas such as: hugefile = { features = extent,huge_file,bigalloc,flex_bg,uninit_bg,dir_nlink,extra_isize,^resize_inode,sparse_super2 cluster_size = 32768 hash_alg = half_md4 reserved_ratio = 0.0 num_backup_sb = 0 packed_meta_blocks = 1 make_hugefiles = 1 inode_ratio = 4194304 hugefiles_dir = /huge-dir hugefiles_name = huge-data hugefiles_digits = 0 hugefiles_size = 0 hugefiles_align = 256M hugefiles_align_disk = true num_hugefiles = 1 zero_hugefiles = false flex_bg_size = 262144 } ...and then the sysadmin only has to type "mke2fs -T hugefile /dev/sdc3" (for example). The question of using the same parser for uevents is yet another can of worms. And there, I think we need to be careful about what the requirements are for these uevents. Uevents is just the technology. There are many use cases for that technology; are you using them to signal mounts/unmounts? And at the file system level or at the mount namespace level? Are you using them to single file system corruption problems? And if so, are you trying to trying to send a single bit of information --- "the file system is corrupted"? Or are you trying to send much more information --- for example, sending each inode that file system found that was corrupted, and each block number where the SCSI device found a media error? (We have a solution that does this inside Google, and it uses a netlink socket to do the delievery. I'm pretty sure that if you used uevents, udev could melt down if there are a large number of errors reported.) And then the final question here is how much does standardization help? Are there use cases where progams really want to use the same format when updating /etc/mke2fs.conf versus /etc/nfsmount.conf versus /etc/mkfs.xfs.conf? Or is this more of, "let's try to make life slightly easier for the user by keeping things mostly the same as we can"? Or is this just a matter of file system developers sharing their experiences of various different designs, just so that they are aware of what libaries or code bases are available for them to use? Cheers, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 24, 2017 at 07:34:53PM -0400, Theodore Ts'o wrote: > On Fri, Aug 25, 2017 at 08:11:05AM +1000, NeilBrown wrote: > > > > Just to provide contrast, NFS has a config file which can provide > > certain defaults for mounted filesystems. However it doesn't require a > > udev event to achieve this. > > I think we're mixing multiple things, and that may be confusing the > discussion. A config file parser is a technology. *How* you use the > config file parser can be quite different even if it is using the same > config file parser or config file format. > > From what I've seen in this e-mail thread, there are at least three > different use cases: > > 1) How mkfs uses to create the file system > 2) How the file system should be configured when/while it is mounted > 3) Message marshalling and unmarshalling when communicating with the > kernel via uevents Right, its perhaps easier to relate them to the tools used for interfacing with filesystems: a) mkfs.<filesystem> b) mount.<filesystem> c) new but in this particular use case it was to parse a configuration file to tweak a set of sysfs knobs which tune the filesystem c) is somewhat related to b) which begs the question if some of these parameters can instead be set at mount time, and if not why not. This is also why the question came up about making some of these knobs generic. > The question of using the same parser for uevents is yet another can > of worms. And there, I think we need to be careful about what the > requirements are for these uevents. Uevents is just the technology. > There are many use cases for that technology; are you using them to > signal mounts/unmounts? And at the file system level or at the mount > namespace level? In the given use case which brought this up it was to tweak specific filesystems with XFS syfs knobs for error handling, as an example shut down the filesyste on any IO error, for a careful pedantic and paranoid admin/setup. > Are you using them to single file system corruption problems? And if > so, are you trying to trying to send a single bit of information --- > "the file system is corrupted"? Or are you trying to send much more > information --- for example, sending each inode that file system found > that was corrupted, and each block number where the SCSI device found > a media error? (We have a solution that does this inside Google, and > it uses a netlink socket to do the delievery. I'm pretty sure that if > you used uevents, udev could melt down if there are a large number of > errors reported.) Such uses seem rather advanced and perhaps should be considered in the future. I agree netlink is much more suitable in that case (or generic netlink), that was designed to scale, at least at the networking level. > And then the final question here is how much does standardization > help? One aspect here was the syfs knobs and whether or not we wanted a shared set of errors for which we want to grant admins a way to say easily "Hey for these types of errors please shutdown the filesystem". Let's start off with that as it stays a bit more on topic for now. For the rest (mkfs.<filesystem>, mount.<filesystem>) I think that those already have generic attributes figured out, and that custom stuff has been thought out as well. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 24, 2017 at 10:10:28AM -0700, Darrick J. Wong wrote: > On Thu, Aug 24, 2017 at 09:41:55PM +0800, Hou Tao wrote: > > It will be useful if there is a corresponding online uevent after > > a XFS filesystem has been mounted. A typical usage of the uevent > > is setting the error configuration for a specific XFS filesystem > > or all XFS filesystems by using udevd. > > > > The following is an example of udevd rule which will shutdown > > any XFS filesystem after the filesystem gets any IO error: > > > > ACTION=="online", SUBSYSTEM=="xfs", DEVPATH=="/fs/xfs/*", \ > > RUN+="/bin/sh -c 'echo 0 > /sys%p/error/metadata/default/max_retries; \ > > echo 0 > /sys%p/error/metadata/EIO/max_retries; \ > > echo 0 > /sys%p/error/metadata/ENOSPC/max_retries; \ > > echo 0 > /sys%p/error/metadata/ENODEV/max_retries'" > > > > Suggested-by: Dave Chinner <david@fromorbit.com> > > Signed-off-by: Hou Tao <houtao1@huawei.com> > > --- > > fs/xfs/xfs_super.c | 15 +++++++++++++++ > > 1 file changed, 15 insertions(+) > > > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c > > index 38aaacd..695fe85f 100644 > > --- a/fs/xfs/xfs_super.c > > +++ b/fs/xfs/xfs_super.c > > @@ -1530,6 +1530,19 @@ xfs_destroy_percpu_counters( > > percpu_counter_destroy(&mp->m_fdblocks); > > } > > > > +static void > > +xfs_fs_uevent( > > + struct xfs_mount *mp, > > + enum kobject_action action) > > +{ > > + int err; > > + > > + err = kobject_uevent(&mp->m_kobj.kobject, action); > > It might be useful to send kobject_uevent_env so that we can squeeze in > things like the fs uuid for easier identification? I envision a > /etc/xfs/fs.conf file like this: > > [903db5b1-82cd-4f26-8221-443a4ed0563a] > error.metadata.EIO.max_retries = 0 > error.fail_at_unmount = 1 > config.cowextsize = 1048576 > config.inherit_noatime = 1 > > With a udev helper that uses the fsuuid to find the appropriate section > of the ini^Wconfig file and load the appropriate values into sysfs. If > we ever decide to expose more config knobs through sysfs then they'll > be automatically supported. Yup, that's sounds like a good idea. > > + if (err) > > + xfs_notice(mp, "Sending XFS uevent %d got error %d", > > + action, err); > > +} > > + > > STATIC int > > xfs_fs_fill_super( > > struct super_block *sb, > > @@ -1667,6 +1680,8 @@ xfs_fs_fill_super( > > goto out_unmount; > > } > > > > + xfs_fs_uevent(mp, KOBJ_ONLINE); > > That said, I wonder if we ought to move this discussion to fsdevel to > pull in any /other/ filesystems that have sysfs knobs that they might > like to be able to persist? ext4 has a bunch of tunables too... Nope. We'll never get anywhere if we do that, just like the last umpteenth times that generic fs events have been discussed. These are XFS specific kobjs so we can do what we like with them and we have an XFS specific problem that needs solving.... Cheers, Dave.
On Thu, Aug 24, 2017 at 08:04:41PM +0200, Luis R. Rodriguez wrote: > While all this is nice, I'm sure we're all aware of the dangers of setting > things in stone through sysfs, its likely already decided for the above > tunables, but adding a uevent *is* yet another layer of user interface > which userspace can *expect* and we should be certain we want this so > we won't regress userspace later. > > Just saying, we better damn be sure this is the way we want to go. I'm not sure what you are trying to warn us about? :/ These are events on an XFS specific kobj, it's not a generic VFS filesystem event we are generating here. It's no different from a hardware device generating it's own uevents to tell userspace something changed in the device. A quick grep would have told you that GFS2 already has it's own online/offline uevents (e.g. gfs2_online_uevent()), as does the DLM code. Orangefs seems to use quite a few of uevent notifications, too. So it's not like we're doing something controversial or unique here, nor something that locks us out of a non-existent VFS event notification mechanism. Please, talk all you want about a Grand Unified VFS Event Notification infrastructure and filesystem tool libraries, but please don't hijack an unrelated discussion to do it. Cheers, Dave.
On Fri, Aug 25, 2017 at 11:01:55AM +1000, Dave Chinner wrote: > On Thu, Aug 24, 2017 at 08:04:41PM +0200, Luis R. Rodriguez wrote: > > While all this is nice, I'm sure we're all aware of the dangers of setting > > things in stone through sysfs, its likely already decided for the above > > tunables, but adding a uevent *is* yet another layer of user interface > > which userspace can *expect* and we should be certain we want this so > > we won't regress userspace later. > > > > Just saying, we better damn be sure this is the way we want to go. > > I'm not sure what you are trying to warn us about? :/ > > These are events on an XFS specific kobj, it's not a generic VFS > filesystem event we are generating here. It's no different from a > hardware device generating it's own uevents to tell userspace > something changed in the device. I meant that once its sent even if it is XFS specific, it will be something that some userspace can expect and then we'd always have to send it later. > A quick grep would have told you that GFS2 already has it's own > online/offline uevents (e.g. gfs2_online_uevent()), as does the > DLM code. Orangefs seems to use quite a few of uevent notifications, > too. So it's not like we're doing something controversial or unique > here, nor something that locks us out of a non-existent VFS event > notification mechanism. I don't think its controversial *at all*. Just that a mount uevent, with uuid, sounds like something we could at least agree is pretty generic. Regardless of what we end up doing with it or how. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 25, 2017 at 03:20:01AM +0200, Luis R. Rodriguez wrote: > On Fri, Aug 25, 2017 at 11:01:55AM +1000, Dave Chinner wrote: > > On Thu, Aug 24, 2017 at 08:04:41PM +0200, Luis R. Rodriguez wrote: > > > While all this is nice, I'm sure we're all aware of the dangers of setting > > > things in stone through sysfs, its likely already decided for the above > > > tunables, but adding a uevent *is* yet another layer of user interface > > > which userspace can *expect* and we should be certain we want this so > > > we won't regress userspace later. > > > > > > Just saying, we better damn be sure this is the way we want to go. > > > > I'm not sure what you are trying to warn us about? :/ > > > > These are events on an XFS specific kobj, it's not a generic VFS > > filesystem event we are generating here. It's no different from a > > hardware device generating it's own uevents to tell userspace > > something changed in the device. > > I meant that once its sent even if it is XFS specific, it will be > something that some userspace can expect and then we'd always have > to send it later. Well, yes. Just like we have to support ioctls, /proc and /sysfs interfaces and the quota event interface essentially forever. There's nothing new or difficult about that. > > A quick grep would have told you that GFS2 already has it's own > > online/offline uevents (e.g. gfs2_online_uevent()), as does the > > DLM code. Orangefs seems to use quite a few of uevent notifications, > > too. So it's not like we're doing something controversial or unique > > here, nor something that locks us out of a non-existent VFS event > > notification mechanism. > > I don't think its controversial *at all*. Just that a mount uevent, > with uuid, sounds like something we could at least agree is pretty generic. Yeah, right. Call me cynical, but every single time this has been brought up it devolves into a paint-fest where everyone wants something to be added before anything can be done because generic filesystem events is a deep, dark complex hole. e.g. think of bind mounts: go and develop arguments for and against whether we should send generic mount notifications for bind mounts. Generic doesn't mean simple. IOWs, what you describe as "pretty generic", I see as "a great big can of worms". Cheers, Dave.
On Thu, Aug 24, 2017 at 07:34:53PM -0400, Theodore Ts'o wrote: > On Fri, Aug 25, 2017 at 08:11:05AM +1000, NeilBrown wrote: > > > > Just to provide contrast, NFS has a config file which can provide > > certain defaults for mounted filesystems. However it doesn't require a > > udev event to achieve this. > > I think we're mixing multiple things, and that may be confusing the > discussion. A config file parser is a technology. *How* you use the > config file parser can be quite different even if it is using the same > config file parser or config file format. <shrug> Well, they /are/ interlinked, at least -- if XFS were to adopt a program to inject config file values into the kernel and a config file for mkfs, I think it makes sense at least to think about sharing the same parser for both, though we needn't do that in /this/ particular thread. OTOH it's something to keep in the back of one's head while we hash out what, if anything, to put in a mount uevent. > From what I've seen in this e-mail thread, there are at least three > different use cases: > > 1) How mkfs uses to create the file system > 2) How the file system should be configured when/while it is mounted > 3) Message marshalling and unmarshalling when communicating with the > kernel via uevents I wasn't suggesting we try to make uevent formats so fancy that they require parsing; the current environment variable mechanism is just fine. > > Instead, it provides "/sbin/mount.nfs" which imposes the defaults at > > mount time, rather than just after mount time. > > > > There could, conceivably, be an "/sbin/mount.xfs" which read the config > > file and set all the defaults concurrently with the mount operation. > > For nfs there are a large number of knobs that are of interest at > mount time, such that specifying them all as mount options on the > command line, or in /etc/fstab, gets spectacularly painful. Which is > why /etc/nfsmount.conf makes sense. > > For ext4 and nfs we have many more knobs that come into play when the > file system is created, and so for ext4 that's why I have > /etc/mke2fs.conf, where I might have some complex stanzas such as: > > hugefile = { > features = extent,huge_file,bigalloc,flex_bg,uninit_bg,dir_nlink,extra_isize,^resize_inode,sparse_super2 > cluster_size = 32768 > hash_alg = half_md4 > reserved_ratio = 0.0 > num_backup_sb = 0 > packed_meta_blocks = 1 > make_hugefiles = 1 > inode_ratio = 4194304 > hugefiles_dir = /huge-dir > hugefiles_name = huge-data > hugefiles_digits = 0 > hugefiles_size = 0 > hugefiles_align = 256M > hugefiles_align_disk = true > num_hugefiles = 1 > zero_hugefiles = false > flex_bg_size = 262144 > } > > ...and then the sysadmin only has to type "mke2fs -T hugefile > /dev/sdc3" (for example). (FWIW I'm interested in pursuing usage types in a mkfs.xfs config but I'll hold on to that until we start that discussion.) > The question of using the same parser for uevents is yet another can > of worms. And there, I think we need to be careful about what the > requirements are for these uevents. Uevents is just the technology. > There are many use cases for that technology; are you using them to > signal mounts/unmounts? And at the file system level or at the mount > namespace level? > > Are you using them to single file system corruption problems? And if > so, are you trying to trying to send a single bit of information --- > "the file system is corrupted"? Or are you trying to send much more > information --- for example, sending each inode that file system found > that was corrupted, and each block number where the SCSI device found > a media error? (We have a solution that does this inside Google, and > it uses a netlink socket to do the delievery. I'm pretty sure that if > you used uevents, udev could melt down if there are a large number of > errors reported.) For now I'd only send along the fact that the fs shut down. If XFS ever grows the ability to receive notifications that a range of blocks has gone bad we could (ratelimited) push that up too. > And then the final question here is how much does standardization > help? Are there use cases where progams really want to use the same > format when updating /etc/mke2fs.conf versus /etc/nfsmount.conf versus > /etc/mkfs.xfs.conf? Or is this more of, "let's try to make life > slightly easier for the user by keeping things mostly the same as we > can"? Or is this just a matter of file system developers sharing > their experiences of various different designs, just so that they are > aware of what libaries or code bases are available for them to use? /I/ was looking for sharing experiences and trying to make life somewhat easier for administrators by keeping the syntax roughly the same, even if the particular keys and values aren't exactly the same from one fs to the other. Ultimately, we can (and probably will) do whatever we think is a good fit for XFS, but seeing as there are some things that other filesystems have done that we haven't, we could at least benefit from whatever people learned from plumbing support into those filesystms. I thought about the "mount.nfs mounts the filesystem and then plugs in the config as desired" but then realized that it's a suid executable and yeeech. :) --D > > Cheers, > > - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 25, 2017 at 12:18:30PM +1000, Dave Chinner wrote: > On Fri, Aug 25, 2017 at 03:20:01AM +0200, Luis R. Rodriguez wrote: > > On Fri, Aug 25, 2017 at 11:01:55AM +1000, Dave Chinner wrote: > > > On Thu, Aug 24, 2017 at 08:04:41PM +0200, Luis R. Rodriguez wrote: > > > > While all this is nice, I'm sure we're all aware of the dangers of setting > > > > things in stone through sysfs, its likely already decided for the above > > > > tunables, but adding a uevent *is* yet another layer of user interface > > > > which userspace can *expect* and we should be certain we want this so > > > > we won't regress userspace later. > > > > > > > > Just saying, we better damn be sure this is the way we want to go. > > > > > > I'm not sure what you are trying to warn us about? :/ > > > > > > These are events on an XFS specific kobj, it's not a generic VFS > > > filesystem event we are generating here. It's no different from a > > > hardware device generating it's own uevents to tell userspace > > > something changed in the device. > > > > I meant that once its sent even if it is XFS specific, it will be > > something that some userspace can expect and then we'd always have > > to send it later. > > Well, yes. Just like we have to support ioctls, /proc and /sysfs > interfaces and the quota event interface essentially forever. > There's nothing new or difficult about that. > > > > A quick grep would have told you that GFS2 already has it's own > > > online/offline uevents (e.g. gfs2_online_uevent()), as does the > > > DLM code. Orangefs seems to use quite a few of uevent notifications, > > > too. So it's not like we're doing something controversial or unique > > > here, nor something that locks us out of a non-existent VFS event > > > notification mechanism. > > > > I don't think its controversial *at all*. Just that a mount uevent, > > with uuid, sounds like something we could at least agree is pretty generic. It's not a slam dunk -- look at all the rattling that went on when overlayfs tried to use s_uuid. > Yeah, right. Call me cynical, but every single time this has been > brought up it devolves into a paint-fest where everyone wants > something to be added before anything can be done because generic > filesystem events is a deep, dark complex hole. e.g. think of bind > mounts: go and develop arguments for and against whether we should > send generic mount notifications for bind mounts. Generic doesn't > mean simple. > > IOWs, what you describe as "pretty generic", I see as "a great big > can of worms". By my count there are 70 filesystems (in fs/ anyway) so the likelihood of successfully coordinating the development of a common vfs uevent anything is so low I wasn't even going to consider it. I've proposed patches that change things across all the filesystems before; it sucks, and meh. I don't see a problem with continuing to let individual filesystems manage their own kobject attributes and events, but frankly I've found it very useful to absorb a more diverse set of points of view by asking other filesystem developers what they've done and what pain points they might have hit. What I'd really like to see is that XFS retains control of its per-fs kobj. We'll add FS UUID as an environment variable to the uevent patch (XFS has uuids, no controversy there!) and run it by fsdevel to see if anyone from the wider fsdevel audience can make a convincing argument to add/remove/change the variables that get passed to the uevent, or even to consider a different approach. (This is what seems to be happening, if in a somewhat incohesive manner.) Then we do our usual rounds of review and (presumably) merge it along with the appropriate xfsprogs helper. Since uevents can send freeform text to the uevent handler in the form of environment variables, this means that any other fs project that might be considering adding uevents can build on the keys we've created. They can also do their own thing. I've written udev rule files, it's not difficult to adapt the ruleset to format shifts, at least not as difficult as structure changes to an ABI. IOWs, I'm looking for de facto coordination if/where desired, not a heavy top down coordination process. If your project doesn't like it, you're not bound by it. Maybe I should've been more explicit with what I was looking for, but TBH in the email just prior to this thread being cc'd to fsdevel I really was only asking the xfs list "should we take this to fsdevel or just leave it here"? --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 38aaacd..695fe85f 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1530,6 +1530,19 @@ xfs_destroy_percpu_counters( percpu_counter_destroy(&mp->m_fdblocks); } +static void +xfs_fs_uevent( + struct xfs_mount *mp, + enum kobject_action action) +{ + int err; + + err = kobject_uevent(&mp->m_kobj.kobject, action); + if (err) + xfs_notice(mp, "Sending XFS uevent %d got error %d", + action, err); +} + STATIC int xfs_fs_fill_super( struct super_block *sb, @@ -1667,6 +1680,8 @@ xfs_fs_fill_super( goto out_unmount; } + xfs_fs_uevent(mp, KOBJ_ONLINE); + return 0; out_filestream_unmount:
It will be useful if there is a corresponding online uevent after a XFS filesystem has been mounted. A typical usage of the uevent is setting the error configuration for a specific XFS filesystem or all XFS filesystems by using udevd. The following is an example of udevd rule which will shutdown any XFS filesystem after the filesystem gets any IO error: ACTION=="online", SUBSYSTEM=="xfs", DEVPATH=="/fs/xfs/*", \ RUN+="/bin/sh -c 'echo 0 > /sys%p/error/metadata/default/max_retries; \ echo 0 > /sys%p/error/metadata/EIO/max_retries; \ echo 0 > /sys%p/error/metadata/ENOSPC/max_retries; \ echo 0 > /sys%p/error/metadata/ENODEV/max_retries'" Suggested-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Hou Tao <houtao1@huawei.com> --- fs/xfs/xfs_super.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)