Message ID | 1449714846-18174-1-git-send-email-quwenruo@cn.fujitsu.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Thu, Dec 10, 2015 at 10:34:06AM +0800, Qu Wenruo wrote: > Introduce a new mount option "nologreplay" to co-operate with "ro" mount > option to get real readonly mount, like "norecovery" in ext* and xfs. > > Since the new parse_options() need to check new flags at remount time, > so add a new parameter for parse_options(). > > Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> > Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> > Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> I've read the discussions around the change and from the user's POV I'd suggest to add another mount option that would be just an alias for any mount options that would implement the 'hard-ro' semantics. Say it's called 'nowr'. Now it would imply 'nologreplay', but may cover more options in the future. mount -o ro,nowr /dev/sdx /mnt would work when switching kernels. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2015-12-14 12:32, David Sterba wrote: > On Thu, Dec 10, 2015 at 10:34:06AM +0800, Qu Wenruo wrote: >> Introduce a new mount option "nologreplay" to co-operate with "ro" mount >> option to get real readonly mount, like "norecovery" in ext* and xfs. >> >> Since the new parse_options() need to check new flags at remount time, >> so add a new parameter for parse_options(). >> >> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> >> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> >> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> > > I've read the discussions around the change and from the user's POV I'd > suggest to add another mount option that would be just an alias for any > mount options that would implement the 'hard-ro' semantics. > > Say it's called 'nowr'. Now it would imply 'nologreplay', but may cover > more options in the future. It should also imply noatime. I'm not sure how BTRFS handles atime when mounted RO, but I know a lot of old UNIX systems updated atime even on filesystems mounted RO, and I know that at least at one point Linux did too. > > mount -o ro,nowr /dev/sdx /mnt > > would work when switching kernels. I like this idea, but I think that having a name like true-ro or hard-ro and making it imply ro (and noatime) would probably be better (or at least, simpler to use from a user perspective). -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2015-12-14 at 18:32 +0100, David Sterba wrote: > I've read the discussions around the change and from the user's POV > I'd > suggest to add another mount option that would be just an alias for > any > mount options that would implement the 'hard-ro' semantics. Nice to hear... > Say it's called 'nowr' though I'm deeply saddened, that you don't like my proposed "hard-ro" which I though about for nearly 1s ;-) > mount -o ro,nowr /dev/sdx /mnt Sounds reasonable... especially I mean that, as long ro's documentation points to "nowr" and clearly states whether both (ro+nowr) are required to get the desired behaviour, I have no very strong opinion, whether both (ro+nowr) should be required, or whether nowr, should imply ro. Though I think, the later may be better. Thanks, Chris.
On Mon, 2015-12-14 at 12:50 -0500, Austin S. Hemmelgarn wrote: > It should also imply noatime. I'm not sure how BTRFS handles atime > when > mounted RO, but I know a lot of old UNIX systems updated atime even > on > filesystems mounted RO, and I know that at least at one point Linux > did too. I stumbled over that recently myself, and haven't bothered to try it out, yet. But Duncan's argument, why at least ro-snapshots (yes I know, this may not be exactly the same as RO mount option) would need to imply noatime, is pretty convincing. :) Anyway, if it "ro" wouldn't imply noatime, I would ask why, because the atime is definitely something the fs exports normally to userland,... and that's how I'd basically consider hard-ro vs. (soft-)ro: soft-ro: data as visible by the mounted fs must not change (unless perhaps for necessary repair/replay operations to get the filesystem back in a consistent state) hard-ro: soft-ro + nothing on the backing devices may change (bitwise) Cheers, Chris.
On 2015-12-14 14:16, Christoph Anton Mitterer wrote: > On Mon, 2015-12-14 at 12:50 -0500, Austin S. Hemmelgarn wrote: >> It should also imply noatime. I'm not sure how BTRFS handles atime >> when >> mounted RO, but I know a lot of old UNIX systems updated atime even >> on >> filesystems mounted RO, and I know that at least at one point Linux >> did too. > I stumbled over that recently myself, and haven't bothered to try it > out, yet. > But Duncan's argument, why at least ro-snapshots (yes I know, this may > not be exactly the same as RO mount option) would need to imply > noatime, is pretty convincing. :) The traditional reasoning was that read-only meant that users couldn't change anything, not that the actual data on disk wouldn't change. That, and there's been some really brain-dead software over the years that depended on atimes being right (now, the only remaining software I know of that even uses them at all is Mutt). > > Anyway, if it "ro" wouldn't imply noatime, I would ask why, because the > atime is definitely something the fs exports normally to userland,... > and that's how I'd basically consider hard-ro vs. (soft-)ro: > > soft-ro: data as visible by the mounted fs must not change (unless > perhaps for necessary repair/replay operations to get the > filesystem back in a consistent state) > hard-ro: soft-ro + nothing on the backing devices may change (bitwise) This should be 'Nothing on the backing device may change as a result of the FS', nitpicking I know, but we should be specific so that we hopefully avoid ending up in the same situation again. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2015-12-14 at 14:33 -0500, Austin S. Hemmelgarn wrote: > The traditional reasoning was that read-only meant that users > couldn't > change anything Where I'd however count the atime changes to. The atimes wouldn't change magically, but only because the user stared some program, configured some daemon, etc. ... which reads/writes/etc. the file. > , not that the actual data on disk wouldn't change. > That, and there's been some really brain-dead software over the years > that depended on atimes being right (now, the only remaining software > I > know of that even uses them at all is Mutt). Wasn't tmpwatcher anoterh candidate? > This should be 'Nothing on the backing device may change as a result > of > the FS', nitpicking I know, but we should be specific so that we > hopefully avoid ending up in the same situation again. Of course, you're right! :-) (especially when btrfs should ever be formalised in a standards document, this should read like: >hard-ro: Nothing on the backing device may change as a result of the >FS, however, e.g. maleware, may directly destroy the data on the >blockdevice ;-) Chris.
On 2015-12-14 14:44, Christoph Anton Mitterer wrote: > On Mon, 2015-12-14 at 14:33 -0500, Austin S. Hemmelgarn wrote: >> The traditional reasoning was that read-only meant that users >> couldn't >> change anything > Where I'd however count the atime changes to. > The atimes wouldn't change magically, but only because the user stared > some program, configured some daemon, etc. ... which reads/writes/etc. > the file. But reading the file is allowed, which is where this starts to get ambiguous. Reading a file updates the atime (and in fact, this is the way that most stuff that uses them cares about them), but even a ro mount allows reading the file. The traditional meaning of ro on UNIX was (AFAIUI) that directory structure couldn't change, new files couldn't be created, existing files couldn't be deleted, flags on the inodes couldn't be changed, and file data couldn't be changed. TBH, I'm not even certain that atime updates on ro filesystems was even an intentional thing in the first place, it really sounds to me like the type of thing that somebody forgot to put in a permissions check for, and then people thought it was a feature. > > >> , not that the actual data on disk wouldn't change. >> That, and there's been some really brain-dead software over the years >> that depended on atimes being right (now, the only remaining software >> I >> know of that even uses them at all is Mutt). > Wasn't tmpwatcher anoterh candidate? Most such software can use it, but doesn't depend on it. TBH, many people these days run /tmp (and even /var/tmp) on an in memory filesystem, so atime updates aren't as much of an issue there. Also, even with noatime, I'm pretty sure the VFS updates the atime every time the mtime changes (because not doing so would be somewhat stupid, and you're writing the inode anyway), which technically means that stuff could work around this by opening the file, truncating it to the size it already is, and then closing it. > > >> This should be 'Nothing on the backing device may change as a result >> of >> the FS', nitpicking I know, but we should be specific so that we >> hopefully avoid ending up in the same situation again. > Of course, you're right! :-) > > (especially when btrfs should ever be formalised in a standards > document, this should read like: >> hard-ro: Nothing on the backing device may change as a result of the >> FS, however, e.g. maleware, may directly destroy the data on the >> blockdevice ;-) > > > Chris. > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2015-12-14 at 15:20 -0500, Austin S. Hemmelgarn wrote: > On 2015-12-14 14:44, Christoph Anton Mitterer wrote: > > On Mon, 2015-12-14 at 14:33 -0500, Austin S. Hemmelgarn wrote: > > > The traditional reasoning was that read-only meant that users > > > couldn't > > > change anything > > Where I'd however count the atime changes to. > > The atimes wouldn't change magically, but only because the user > > stared > > some program, configured some daemon, etc. ... which > > reads/writes/etc. > > the file. > But reading the file is allowed, which is where this starts to get > ambiguous. Why? > Reading a file updates the atime (and in fact, this is the > way that most stuff that uses them cares about them), but even a ro > mount allows reading the file. As I just wrote in the other post, at least for btrfs (haven't checked ext/xfs due to being... well... lazy O:-) ) ro mount option or ro snapshot seems to mean: no atime updates even if mounted with strictatimes (or maybe I did just something stupid when checking, so better double check) > The traditional meaning of ro on UNIX > was (AFAIUI) that directory structure couldn't change, new files > couldn't be created, existing files couldn't be deleted, flags on the > inodes couldn't be changed, and file data couldn't be changed. TBH, > I'm > not even certain that atime updates on ro filesystems was even an > intentional thing in the first place, it really sounds to me like the > type of thing that somebody forgot to put in a permissions check for, > and then people thought it was a feature. Well in the end it probably doesn't matter how it came to existence,... rather what it should be and what it actually is. As said, I, personally, from the user PoV, would says soft-ro already includes no dates on files being modifiable (including atime), as I'd consider these a property of the file. However anyone else may of course see that differently and at the same time be smarter than I am. > Also, > even with noatime, I'm pretty sure the VFS updates the atime every > time > the mtime changes I've just checked and not it doesn't: File: ‘subvol/FILE’ Size: 8 Blocks: 16 IO Block: 4096 regular file Device: 30h/48d Inode: 257 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2015-12-15 00:01:46.452007798 +0100 Modify: 2015-12-15 00:31:26.579511816 +0100 Change: 2015-12-15 00:31:26.579511816 +0100 (rw,noatime mounted,... mtime, is more recent than atime) > (because not doing so would be somewhat stupid, and > you're writing the inode anyway), which technically means that stuff > could work around this by opening the file, truncating it to the size > it > already is, and then closing it. Hmm I don't have a strong opinion here... it sounds "supid" from the technical point in that it *could* write the atime and that wouldn't cost much. OTOH, that would make things more ambiguous when atimes change and when not... (they'd only change on writes, never on reads,...) So I think it's good as it is... and it matches the name, which is noatime - and not noatime-unless-on-writes ;-) Cheers, Chris.
On 2015-12-14 18:34, Christoph Anton Mitterer wrote: > On Mon, 2015-12-14 at 15:20 -0500, Austin S. Hemmelgarn wrote: >> On 2015-12-14 14:44, Christoph Anton Mitterer wrote: >>> On Mon, 2015-12-14 at 14:33 -0500, Austin S. Hemmelgarn wrote: >>>> The traditional reasoning was that read-only meant that users >>>> couldn't >>>> change anything >>> Where I'd however count the atime changes to. >>> The atimes wouldn't change magically, but only because the user >>> stared >>> some program, configured some daemon, etc. ... which >>> reads/writes/etc. >>> the file. >> But reading the file is allowed, which is where this starts to get >> ambiguous. > Why? Because according to POSIX, when a file gets read, the atime gets updated. Except that POSIX doesn't specify what happens if the filesystem is mounted read-only, but the underlying block device is writable. > >> Reading a file updates the atime (and in fact, this is the >> way that most stuff that uses them cares about them), but even a ro >> mount allows reading the file. > As I just wrote in the other post, at least for btrfs (haven't checked > ext/xfs due to being... well... lazy O:-) ) ro mount option or ro > snapshot seems to mean: no atime updates even if mounted with > strictatimes (or maybe I did just something stupid when checking, so > better double check) > > >> The traditional meaning of ro on UNIX >> was (AFAIUI) that directory structure couldn't change, new files >> couldn't be created, existing files couldn't be deleted, flags on the >> inodes couldn't be changed, and file data couldn't be changed. TBH, >> I'm >> not even certain that atime updates on ro filesystems was even an >> intentional thing in the first place, it really sounds to me like the >> type of thing that somebody forgot to put in a permissions check for, >> and then people thought it was a feature. > Well in the end it probably doesn't matter how it came to existence,... > rather what it should be and what it actually is. Knowing how you got where you are is pretty important for figuring out how to not end up there again :) > As said, I, personally, from the user PoV, would says soft-ro already > includes no dates on files being modifiable (including atime), as I'd > consider these a property of the file. > However anyone else may of course see that differently and at the same > time be smarter than I am. AFAIK, the original versions of UNIX had no touch command or utime() syscall, so ctime, mtime, and atime were these things that just got magically updated by the system (ctime is still this way), and thus wasn't something that was considered user modification to the filesystem. > >> Also, > >> even with noatime, I'm pretty sure the VFS updates the atime every >> time >> the mtime changes > I've just checked and not it doesn't: > File: ‘subvol/FILE’ > Size: 8 Blocks: 16 IO Block: 4096 regular > file > Device: 30h/48d Inode: 257 Links: 1 > Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: > ( 0/ root) > Access: 2015-12-15 00:01:46.452007798 +0100 > Modify: 2015-12-15 00:31:26.579511816 +0100 > Change: 2015-12-15 00:31:26.579511816 +0100 > > (rw,noatime mounted,... mtime, is more recent than atime) Hmm, I could have sworn that updating the mtime on a file would force an atime update. \me checks documentation. OK, I was thinking of relatime, which updates the atime if it's older than mtime or ctime. > >> (because not doing so would be somewhat stupid, and >> you're writing the inode anyway), which technically means that stuff >> could work around this by opening the file, truncating it to the size >> it >> already is, and then closing it. > Hmm I don't have a strong opinion here... it sounds "supid" from the > technical point in that it *could* write the atime and that wouldn't > cost much. > OTOH, that would make things more ambiguous when atimes change and when > not... (they'd only change on writes, never on reads,...) > So I think it's good as it is... and it matches the name, which is > noatime - and not noatime-unless-on-writes ;-) Except there are still ways to update the atime even on a filesystem mounted noatime. For example, on _every_ POSIX compliant system out there (and Linux itself is mostly POSIX compliant, it's primarily the userspace that isn't), you can update the atime using the utime() system call, unless the filesystem is read-only. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Sterba wrote on 2015/12/14 18:32 +0100: > On Thu, Dec 10, 2015 at 10:34:06AM +0800, Qu Wenruo wrote: >> Introduce a new mount option "nologreplay" to co-operate with "ro" mount >> option to get real readonly mount, like "norecovery" in ext* and xfs. >> >> Since the new parse_options() need to check new flags at remount time, >> so add a new parameter for parse_options(). >> >> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> >> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> >> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> > > I've read the discussions around the change and from the user's POV I'd > suggest to add another mount option that would be just an alias for any > mount options that would implement the 'hard-ro' semantics. > > Say it's called 'nowr'. Now it would imply 'nologreplay', but may cover > more options in the future. > > mount -o ro,nowr /dev/sdx /mnt > > would work when switching kernels. > > That would be nice. I'd like to forward the idea/discussion to filesystem ml, not only btrfs maillist. Such behavior should better be coordinated between all(at least xfs and ext4 and btrfs) filesystems. One sad example is, we can't use 'norecovery' mount option to disable log replay in btrfs, as there is 'recovery' mount option already. So I hope we can have a unified mount option between mainline filesystems. Thanks, Qu -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-12-16 at 09:36 +0800, Qu Wenruo wrote: > One sad example is, we can't use 'norecovery' mount option to disable > log replay in btrfs, as there is 'recovery' mount option already. I think "norecovery" would anyway not really fit... the name should rather indicated, that from the filesystem side, nothing changes the underlying device's contents. "norecovery" would just tell, that no recovery options would be tried, however, any other changes (optimisations, etc.) could still go on. David's "nowr" is already, better, though it could be misinterpreted as no write/read (as wr being rw swapped), so perhaps "nowrites" would be better... but that again may be considered just another name for "ro". So perhaps one could do something that includes "dev", like "rodev", "ro-dev", or "immuatable-dev"... or instead of "dev" "devs" to cover multi-device cases. OTOH, the devices aren't really set "ro" (as in blockdev --setro). Maybe "nodevwrites" or "no-dev-writes" or one of these with "device" not abbreviated? Many programs have a "--dry-run" option, but I kinda don't liky drymount or something like that. Guess from the above, I'd personally like "nodevwrites" the most. Oh and Qu's idea of coordinating that with the other filesystems is surely good. Cheers, Chris.
Qu Wenruo posted on Wed, 16 Dec 2015 09:36:23 +0800 as excerpted: > David Sterba wrote on 2015/12/14 18:32 +0100: >> On Thu, Dec 10, 2015 at 10:34:06AM +0800, Qu Wenruo wrote: >>> Introduce a new mount option "nologreplay" to co-operate with "ro" >>> mount option to get real readonly mount, like "norecovery" in ext* and >>> xfs. >>> >>> Since the new parse_options() need to check new flags at remount time, >>> so add a new parameter for parse_options(). >>> >>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> >>> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> >>> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> >> >> I've read the discussions around the change and from the user's POV I'd >> suggest to add another mount option that would be just an alias for any >> mount options that would implement the 'hard-ro' semantics. >> >> Say it's called 'nowr'. Now it would imply 'nologreplay', but may cover >> more options in the future. >> >> mount -o ro,nowr /dev/sdx /mnt >> >> would work when switching kernels. >> >> > That would be nice. > > I'd like to forward the idea/discussion to filesystem ml, not only btrfs > maillist. > > Such behavior should better be coordinated between all(at least xfs and > ext4 and btrfs) filesystems. > > One sad example is, we can't use 'norecovery' mount option to disable > log replay in btrfs, as there is 'recovery' mount option already. > > So I hope we can have a unified mount option between mainline > filesystems. FWIW, I was just reading the mount manpage in connection with a reply for a different thread, and noticed... mount (8) (from util-linux 2.27.1) says noload and norecovery are the same option, for ext3/4 at least. It refers to the xfs (5) manpage, from xfsprogs, for xfs mount options, and that's not installed here, so I can't confirm noload for it, but it's there for ext3/4. And noload doesn't have the namespace collision problem norecovery does on btrfs, so I'd strongly suggest using it, at least as an alias for whatever other btrfs-specific name we might choose.
On Wed, 2015-12-16 at 11:10 +0000, Duncan wrote: > And noload doesn't have the namespace collision problem norecovery > does > on btrfs, so I'd strongly suggest using it, at least as an alias for > whatever other btrfs-specific name we might choose. but noload is, AFAIU, not what's desired here, is it? Per manpage it's "Don't load the journal on mounting",... not only wouldn't that fit for btrfs, it's also not what's really desired, i.e. an option that implies everything necessary to not modify the device. Cheers, Chris.
On 2015-12-16 06:10, Duncan wrote: > Qu Wenruo posted on Wed, 16 Dec 2015 09:36:23 +0800 as excerpted: > >> David Sterba wrote on 2015/12/14 18:32 +0100: >>> On Thu, Dec 10, 2015 at 10:34:06AM +0800, Qu Wenruo wrote: >>>> Introduce a new mount option "nologreplay" to co-operate with "ro" >>>> mount option to get real readonly mount, like "norecovery" in ext* and >>>> xfs. >>>> >>>> Since the new parse_options() need to check new flags at remount time, >>>> so add a new parameter for parse_options(). >>>> >>>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> >>>> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> >>>> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> >>> >>> I've read the discussions around the change and from the user's POV I'd >>> suggest to add another mount option that would be just an alias for any >>> mount options that would implement the 'hard-ro' semantics. >>> >>> Say it's called 'nowr'. Now it would imply 'nologreplay', but may cover >>> more options in the future. >>> >>> mount -o ro,nowr /dev/sdx /mnt >>> >>> would work when switching kernels. >>> >>> >> That would be nice. >> >> I'd like to forward the idea/discussion to filesystem ml, not only btrfs >> maillist. >> >> Such behavior should better be coordinated between all(at least xfs and >> ext4 and btrfs) filesystems. >> >> One sad example is, we can't use 'norecovery' mount option to disable >> log replay in btrfs, as there is 'recovery' mount option already. >> >> So I hope we can have a unified mount option between mainline >> filesystems. > > FWIW, I was just reading the mount manpage in connection with a reply for > a different thread, and noticed... > > mount (8) (from util-linux 2.27.1) says noload and norecovery are the > same option, for ext3/4 at least. It refers to the xfs (5) manpage, from > xfsprogs, for xfs mount options, and that's not installed here, so I > can't confirm noload for it, but it's there for ext3/4. Unless it's undocumented, XFS doesn't have it (as much as I hate XFS, I have to have xfsprogs installed so that I can do recovery for the few systems at work that actually use it if the need arises). > > And noload doesn't have the namespace collision problem norecovery does > on btrfs, so I'd strongly suggest using it, at least as an alias for > whatever other btrfs-specific name we might choose. I kind of agree with Christoph here. I don't think that noload should be the what we actually use, although I do think having it as an alias for whatever name we end up using would be a good thing. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-12-16 at 07:12 -0500, Austin S. Hemmelgarn wrote: > I kind of agree with Christoph here. I don't think that noload > should > be the what we actually use, although I do think having it as an > alias > for whatever name we end up using would be a good thing. No, because people would start using it, getting used to it, and in 4 years we could never change it again,... which may be necessary... noload, seems to mean don't load the journal. Unless btrfs gets a journal in the sense xfs/ext has one, it simply should either not use that name at all... or not try to "map" it to something of it's own which is similar, but in reality not the same. Chris.
On 2015-12-16 07:34, Christoph Anton Mitterer wrote: > On Wed, 2015-12-16 at 07:12 -0500, Austin S. Hemmelgarn wrote: >> I kind of agree with Christoph here. I don't think that noload >> should >> be the what we actually use, although I do think having it as an >> alias >> for whatever name we end up using would be a good thing. > No, because people would start using it, getting used to it, and in 4 > years we could never change it again,... which may be necessary... > No, because we should ease the transition from other filesystems to the greatest extent reasonably possible. It should be clearly documented as an alias for compatibility with ext{3,4}, and that it might go away in the future. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-12-16 at 07:57 -0500, Austin S. Hemmelgarn wrote: > No, because we should ease the transition from other filesystems to > the > greatest extent reasonably possible. It should be clearly documented > as > an alias for compatibility with ext{3,4}, and that it might go away > in > the future. Where's the need for an easy migration path? Such option wouldn't be used by default and even if, people would need to change their fstab/etc. anyway (ext->btrfs). The note that things go away never really work out that easy... people start to use it, rely on it... and sooner or later you have a situation as with atime, that you basically never can get rid of it. IMHO an alias here would be just ambiguous. Cheers, Chris.
On Mon, Dec 14, 2015 at 12:50:37PM -0500, Austin S. Hemmelgarn wrote: > On 2015-12-14 12:32, David Sterba wrote: > > On Thu, Dec 10, 2015 at 10:34:06AM +0800, Qu Wenruo wrote: > >> Introduce a new mount option "nologreplay" to co-operate with "ro" mount > >> option to get real readonly mount, like "norecovery" in ext* and xfs. > >> > >> Since the new parse_options() need to check new flags at remount time, > >> so add a new parameter for parse_options(). > >> > >> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> > >> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> > >> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> > > > > I've read the discussions around the change and from the user's POV I'd > > suggest to add another mount option that would be just an alias for any > > mount options that would implement the 'hard-ro' semantics. > > > > Say it's called 'nowr'. Now it would imply 'nologreplay', but may cover > > more options in the future. > It should also imply noatime. I'm not sure how BTRFS handles atime when > mounted RO, but I know a lot of old UNIX systems updated atime even on > filesystems mounted RO, and I know that at least at one point Linux did too. A mount with -o ro will not touch atimes. At one point the read-only snapshots changed atimes, but this has been fixed since. http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/inode.c#n1602 http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/inode.c#n5973 > > mount -o ro,nowr /dev/sdx /mnt > > > > would work when switching kernels. > > I like this idea, but I think that having a name like true-ro or hard-ro > and making it imply ro (and noatime) would probably be better (or at > least, simpler to use from a user perspective). Ok, a single option to do the real-ro sounds better than ro,something. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Dec 16, 2015 at 09:36:23AM +0800, Qu Wenruo wrote: > > > > mount -o ro,nowr /dev/sdx /mnt > > > > would work when switching kernels. > > > That would be nice. > > I'd like to forward the idea/discussion to filesystem ml, not only btrfs > maillist. Good idea. > Such behavior should better be coordinated between all(at least xfs and > ext4 and btrfs) filesystems. > > One sad example is, we can't use 'norecovery' mount option to disable > log replay in btrfs, as there is 'recovery' mount option already. I think we should pick a name that's not tied to the implementation how the potential writes could happen under a RO mount. Recovery/replay/whatever, the expected use is "avoid any writes". > So I hope we can have a unified mount option between mainline filesystems. That would be a good thing indeed. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Christoph Anton Mitterer posted on Wed, 16 Dec 2015 12:45:00 +0100 as excerpted: > On Wed, 2015-12-16 at 11:10 +0000, Duncan wrote: >> And noload doesn't have the namespace collision problem norecovery does >> on btrfs, so I'd strongly suggest using it, at least as an alias for >> whatever other btrfs-specific name we might choose. > > but noload is, AFAIU, not what's desired here, is it? > Per manpage it's "Don't load the journal on mounting",... not only > wouldn't that fit for btrfs, it's also not what's really desired, i.e. > an option that implies everything necessary to not modify the device. Well, "don't load the journal on mounting" is exactly what the option would do. The journal (aka log) of course has a slightly different meaning, it's only the fsync log, but loading it is exactly what the option would prevent, here. Of course that isn't to say there shouldn't be another option, call it nomodify, for argument, that includes this and perhaps other options that would otherwise trigger filesystem level changes on a normal read-only mount. Too bad we can't simply rename the recovery mount option so norecovery could be used as well, but I guess that could potentially break too many existing deployments. =:^(
On Thu, 2015-12-17 at 01:09 +0000, Duncan wrote: > Well, "don't load the journal on mounting" is exactly what the option > would do. The journal (aka log) of course has a slightly different > meaning, it's only the fsync log, but loading it is exactly what the > option would prevent, here. That's not the point. What David asked for was an option, that has the meaning "do what ever is necessary to mount the fs in such a way, that the device isn't changed". At least that's how I've understood him. *Right now* this is for btrfs the "nologreplay" option, so "nodevwrites" or whatever you call it, would simply imply "nologreply"... and in the future any other options that are necessary to get the above defined semantics. For ext, "nodevwrites" would imply "noload" (AFAIU). If you now make "noload" an alias for "nodevwrites" in btrfs you clearly break semantics here: "noload" from ext4 hasn't the same meaning as "nodevwrites" from btrfs.. it has *just now* while ext doesn't need any other possible future options. Maybe in 10 years, ext has a dozens new features (because btrfs still hasn't stabilised yet, as it misses snapshot-aware defrag and checksums for noCoWed data >:-D O:-D ... sorry, couldn't resist ;) ), one of that new features needs to be disabled for "hard ro mounts", thus ext's "nodevwrites" would in addition imply "noSuperNewExtFeature". Then "noload" from ext4 isn't even *effectively* the same anymore as "nodevwrites" from btrfs. Therefore, it shouldn't be an alias. Cheers, Chris.
diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt index c772b47..0f2ed48 100644 --- a/Documentation/filesystems/btrfs.txt +++ b/Documentation/filesystems/btrfs.txt @@ -168,6 +168,15 @@ Options with (*) are default options and will not show in the mount options. notreelog Enable/disable the tree logging used for fsync and O_SYNC writes. + nologreplay + Disable the log tree replay at mount time to prevent filesystem + from getting modified. + Must be used with 'ro' mount option. + A filesystem mounted with the 'nologreplay' option cannot + transition to a read-write mount via remount,rw - the filesystem + must be unmounted and mounted back again if read-write access is + desired. + recovery Enable autorecovery attempts if a bad tree root is found at mount time. Currently this scans a list of several previous tree roots and tries to diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index a0165c6..c54ff25 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2184,6 +2184,7 @@ struct btrfs_ioctl_defrag_range_args { #define BTRFS_MOUNT_RESCAN_UUID_TREE (1 << 23) #define BTRFS_MOUNT_FRAGMENT_DATA (1 << 24) #define BTRFS_MOUNT_FRAGMENT_METADATA (1 << 25) +#define BTRFS_MOUNT_NOLOGREPLAY (1 << 26) #define BTRFS_DEFAULT_COMMIT_INTERVAL (30) #define BTRFS_DEFAULT_MAX_INLINE (8192) @@ -4070,7 +4071,8 @@ void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info); ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size); /* super.c */ -int btrfs_parse_options(struct btrfs_root *root, char *options); +int btrfs_parse_options(struct btrfs_root *root, char *options, + unsigned long new_flags); int btrfs_sync_fs(struct super_block *sb, int wait); #ifdef CONFIG_PRINTK diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 1eb0839..617bf4f 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2711,7 +2711,7 @@ int open_ctree(struct super_block *sb, */ fs_info->compress_type = BTRFS_COMPRESS_ZLIB; - ret = btrfs_parse_options(tree_root, options); + ret = btrfs_parse_options(tree_root, options, sb->s_flags); if (ret) { err = ret; goto fail_alloc; @@ -3009,8 +3009,9 @@ retry_root_backup: if (ret) goto fail_trans_kthread; - /* do not make disk changes in broken FS */ - if (btrfs_super_log_root(disk_super) != 0) { + /* do not make disk changes in broken FS or nologreplay is given */ + if (btrfs_super_log_root(disk_super) != 0 && + !btrfs_test_opt(tree_root, NOLOGREPLAY)) { ret = btrfs_replay_log(fs_info, fs_devices); if (ret) { err = ret; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 24154e4..55989cb 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -302,7 +302,7 @@ enum { Opt_check_integrity_print_mask, Opt_fatal_errors, Opt_rescan_uuid_tree, Opt_commit_interval, Opt_barrier, Opt_nodefrag, Opt_nodiscard, Opt_noenospc_debug, Opt_noflushoncommit, Opt_acl, Opt_datacow, - Opt_datasum, Opt_treelog, Opt_noinode_cache, + Opt_datasum, Opt_treelog, Opt_noinode_cache, Opt_nologreplay, #ifdef CONFIG_BTRFS_DEBUG Opt_fragment_data, Opt_fragment_metadata, Opt_fragment_all, #endif @@ -334,6 +334,7 @@ static match_table_t tokens = { {Opt_noacl, "noacl"}, {Opt_notreelog, "notreelog"}, {Opt_treelog, "treelog"}, + {Opt_nologreplay, "nologreplay"}, {Opt_flushoncommit, "flushoncommit"}, {Opt_noflushoncommit, "noflushoncommit"}, {Opt_ratio, "metadata_ratio=%d"}, @@ -371,7 +372,8 @@ static match_table_t tokens = { * reading in a new superblock is parsed here. * XXX JDM: This needs to be cleaned up for remount. */ -int btrfs_parse_options(struct btrfs_root *root, char *options) +int btrfs_parse_options(struct btrfs_root *root, char *options, + unsigned long new_flags) { struct btrfs_fs_info *info = root->fs_info; substring_t args[MAX_OPT_ARGS]; @@ -386,8 +388,12 @@ int btrfs_parse_options(struct btrfs_root *root, char *options) if (cache_gen) btrfs_set_opt(info->mount_opt, SPACE_CACHE); + /* + * Even the options are empty, we still need to do extra check + * against new flags + */ if (!options) - goto out; + goto check; /* * strsep changes the string, duplicate it because parse_options @@ -587,6 +593,10 @@ int btrfs_parse_options(struct btrfs_root *root, char *options) btrfs_clear_and_info(root, NOTREELOG, "enabling tree log"); break; + case Opt_nologreplay: + btrfs_set_and_info(root, NOLOGREPLAY, + "disabling log replay at mount time"); + break; case Opt_flushoncommit: btrfs_set_and_info(root, FLUSHONCOMMIT, "turning on flush-on-commit"); @@ -753,6 +763,15 @@ int btrfs_parse_options(struct btrfs_root *root, char *options) break; } } +check: + /* + * Extra check for current option against current flag + */ + if (btrfs_test_opt(root, NOLOGREPLAY) && !(new_flags & MS_RDONLY)) { + btrfs_err(root->fs_info, + "nologreplay must be used with ro mount option"); + ret = -EINVAL; + } out: if (!ret && btrfs_test_opt(root, SPACE_CACHE)) btrfs_info(root->fs_info, "disk space caching is enabled"); @@ -1154,6 +1173,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry) seq_puts(seq, ",ssd"); if (btrfs_test_opt(root, NOTREELOG)) seq_puts(seq, ",notreelog"); + if (btrfs_test_opt(root, NOLOGREPLAY)) + seq_puts(seq, ",nologreplay"); if (btrfs_test_opt(root, FLUSHONCOMMIT)) seq_puts(seq, ",flushoncommit"); if (btrfs_test_opt(root, DISCARD)) @@ -1637,7 +1658,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data) } } - ret = btrfs_parse_options(root, data); + ret = btrfs_parse_options(root, data, *flags); if (ret) { ret = -EINVAL; goto restore;