diff mbox

Btrfs: disallow 'btrfs {balance,replace} cancel' on ro mounts

Message ID 1381426821-31981-1-git-send-email-idryomov@gmail.com (mailing list archive)
State Accepted, archived
Headers show

Commit Message

Ilya Dryomov Oct. 10, 2013, 5:40 p.m. UTC
For both balance and replace, cancelling involves changing the on-disk
state and committing a transaction, which is not a good thing to do on
read-only filesystems.

Cc: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
---
 fs/btrfs/dev-replace.c |    3 +++
 fs/btrfs/volumes.c     |    3 +++
 2 files changed, 6 insertions(+)

Comments

Wang Shilong Oct. 11, 2013, 1:13 a.m. UTC | #1
On 10/11/2013 01:40 AM, Ilya Dryomov wrote:

I have a question in my mind.

Can we reach a state that there is operation in progress when filesystem
has been readonly?If we do cancel operations on a ro filesystem, we should
get "No operations in progress" .

Thanks,
Wang
> For both balance and replace, cancelling involves changing the on-disk
> state and committing a transaction, which is not a good thing to do on
> read-only filesystems.
>
> Cc: Stefan Behrens <sbehrens@giantdisaster.de>
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> ---
>   fs/btrfs/dev-replace.c |    3 +++
>   fs/btrfs/volumes.c     |    3 +++
>   2 files changed, 6 insertions(+)
>
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index 9efb94e..98df261 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -650,6 +650,9 @@ static u64 __btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
>   	u64 result;
>   	int ret;
>   
> +	if (fs_info->sb->s_flags & MS_RDONLY)
> +		return -EROFS;
> +
>   	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
>   	btrfs_dev_replace_lock(dev_replace);
>   	switch (dev_replace->replace_state) {
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index a306db9..2630f38 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3424,6 +3424,9 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
>   
>   int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
>   {
> +	if (fs_info->sb->s_flags & MS_RDONLY)
> +		return -EROFS;
> +
>   	mutex_lock(&fs_info->balance_mutex);
>   	if (!fs_info->balance_ctl) {
>   		mutex_unlock(&fs_info->balance_mutex);

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Behrens Oct. 11, 2013, 9:23 a.m. UTC | #2
On Fri, 11 Oct 2013 09:13:24 +0800, Wang Shilong wrote:
> On 10/11/2013 01:40 AM, Ilya Dryomov wrote:
> 
> I have a question in my mind.
> 
> Can we reach a state that there is operation in progress when filesystem
> has been readonly?If we do cancel operations on a ro filesystem, we should
> get "No operations in progress" .

Well, it's arguable what ro means. No write to the devices at all? Or
replay log on mount (which means to write to the filesystem), but no
write access in addition to that? Or allow filesystem internal things to
be modified and written to disk, like it was done when the balance or
replace control items were modified on disk when the cancelling was
requested?

In any case, don't make it more complicated then necessary IMO, and
return EROFS if someone calls cancel on a ro filesystem regardless of
the state of the operation. It only adds errors to try to distuingish
such things and is of no benefit for anybody IMHO.


>> For both balance and replace, cancelling involves changing the on-disk
>> state and committing a transaction, which is not a good thing to do on
>> read-only filesystems.
>>
>> Cc: Stefan Behrens <sbehrens@giantdisaster.de>
>> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
>> ---
>>   fs/btrfs/dev-replace.c |    3 +++
>>   fs/btrfs/volumes.c     |    3 +++
>>   2 files changed, 6 insertions(+)
>>
>> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
>> index 9efb94e..98df261 100644
>> --- a/fs/btrfs/dev-replace.c
>> +++ b/fs/btrfs/dev-replace.c
>> @@ -650,6 +650,9 @@ static u64 __btrfs_dev_replace_cancel(struct
>> btrfs_fs_info *fs_info)
>>       u64 result;
>>       int ret;
>>   +    if (fs_info->sb->s_flags & MS_RDONLY)
>> +        return -EROFS;
>> +
>>       mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
>>       btrfs_dev_replace_lock(dev_replace);
>>       switch (dev_replace->replace_state) {
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index a306db9..2630f38 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -3424,6 +3424,9 @@ int btrfs_pause_balance(struct btrfs_fs_info
>> *fs_info)
>>     int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
>>   {
>> +    if (fs_info->sb->s_flags & MS_RDONLY)
>> +        return -EROFS;
>> +
>>       mutex_lock(&fs_info->balance_mutex);
>>       if (!fs_info->balance_ctl) {
>>           mutex_unlock(&fs_info->balance_mutex);
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Remco Hosman - Yerf-IT Oct. 11, 2013, 9:28 a.m. UTC | #3
Op 11-10-2013 11:23, Stefan Behrens schreef:
> On Fri, 11 Oct 2013 09:13:24 +0800, Wang Shilong wrote:
>> On 10/11/2013 01:40 AM, Ilya Dryomov wrote:
>>
>> I have a question in my mind.
>>
>> Can we reach a state that there is operation in progress when filesystem
>> has been readonly?If we do cancel operations on a ro filesystem, we should
>> get "No operations in progress" .
> Well, it's arguable what ro means. No write to the devices at all? Or
> replay log on mount (which means to write to the filesystem), but no
> write access in addition to that? Or allow filesystem internal things to
> be modified and written to disk, like it was done when the balance or
> replace control items were modified on disk when the cancelling was
> requested?
>
> In any case, don't make it more complicated then necessary IMO, and
> return EROFS if someone calls cancel on a ro filesystem regardless of
> the state of the operation. It only adds errors to try to distuingish
> such things and is of no benefit for anybody IMHO.
>
just my 2 cents:

I once had to recover a ext3 filesystem from a device that would crash 
if you write anything beyond 2T. The problem is that there was an entry 
in the journal telling the OS to do just that. so the device would just 
crash every time out mount the filesystem. even in RO mode.

The manual speaks about a 'skip journal replay' option, but it was never 
implemented.
kernel source was something like:
case: SKIP_JOURNAL_REPLAY:
     return error;


The result: there was no way to mount the filesystem, even in RO mode. i 
endedup dd'ing the whole thing to another device, and then mounting the 
resulting image. it took a very long time!

i would expect a RO mount never to write anything to a filesystem. not 
even replay a journal (or a seperate option for that).
Its possible that the device is not writable at all, if its a snapshot 
or a RO iscsi device of some kind.

Remco

>>> For both balance and replace, cancelling involves changing the on-disk
>>> state and committing a transaction, which is not a good thing to do on
>>> read-only filesystems.
>>>
>>> Cc: Stefan Behrens <sbehrens@giantdisaster.de>
>>> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
>>> ---
>>>    fs/btrfs/dev-replace.c |    3 +++
>>>    fs/btrfs/volumes.c     |    3 +++
>>>    2 files changed, 6 insertions(+)
>>>
>>> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
>>> index 9efb94e..98df261 100644
>>> --- a/fs/btrfs/dev-replace.c
>>> +++ b/fs/btrfs/dev-replace.c
>>> @@ -650,6 +650,9 @@ static u64 __btrfs_dev_replace_cancel(struct
>>> btrfs_fs_info *fs_info)
>>>        u64 result;
>>>        int ret;
>>>    +    if (fs_info->sb->s_flags & MS_RDONLY)
>>> +        return -EROFS;
>>> +
>>>        mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
>>>        btrfs_dev_replace_lock(dev_replace);
>>>        switch (dev_replace->replace_state) {
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index a306db9..2630f38 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -3424,6 +3424,9 @@ int btrfs_pause_balance(struct btrfs_fs_info
>>> *fs_info)
>>>      int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
>>>    {
>>> +    if (fs_info->sb->s_flags & MS_RDONLY)
>>> +        return -EROFS;
>>> +
>>>        mutex_lock(&fs_info->balance_mutex);
>>>        if (!fs_info->balance_ctl) {
>>>            mutex_unlock(&fs_info->balance_mutex);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roman Mamedov Oct. 11, 2013, 9:35 a.m. UTC | #4
On Fri, 11 Oct 2013 11:23:04 +0200
Stefan Behrens <sbehrens@giantdisaster.de> wrote:

> On Fri, 11 Oct 2013 09:13:24 +0800, Wang Shilong wrote:
> > On 10/11/2013 01:40 AM, Ilya Dryomov wrote:
> > 
> > I have a question in my mind.
> > 
> > Can we reach a state that there is operation in progress when filesystem
> > has been readonly?If we do cancel operations on a ro filesystem, we should
> > get "No operations in progress" .
> 
> Well, it's arguable what ro means. No write to the devices at all?

If I had an FS image and mounted it as -o loop,ro I'd simply expect md5sum of
that image to match before mount and after unmount, i.e. no writes at all.
Really, how can one argue with what "read only" means? If it will mean
something else than a complete absence of writes, then how can we mount
devices or FS images to do forensics, etc? Or do a recovery from a difficult
corruption or try to debug an FS crash.
Ilya Dryomov Oct. 11, 2013, 9:36 a.m. UTC | #5
On Fri, Oct 11, 2013 at 4:13 AM, Wang Shilong
<wangsl.fnst@cn.fujitsu.com> wrote:
> On 10/11/2013 01:40 AM, Ilya Dryomov wrote:
>
> I have a question in my mind.
>
> Can we reach a state that there is operation in progress when filesystem
> has been readonly?If we do cancel operations on a ro filesystem, we should
> get "No operations in progress" .

So if I understand you correctly you are saying that we should return
the "Not in progress" error code if the filesystem is mounted ro and
there is nothing to cancel.  I actually did think about it, but decided
against it because one can extend this argument to starting commands
with something like "If I order a start on a read-only fs and that
operation has already been started, I should get EINPROGRESS.", and
that's not what we are currently doing.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilya Dryomov Oct. 11, 2013, 9:39 a.m. UTC | #6
On Fri, Oct 11, 2013 at 12:28 PM, Remco Hosman - Yerf-IT
<remco@yerf-it.nl> wrote:
> i would expect a RO mount never to write anything to a filesystem. not even
> replay a journal (or a seperate option for that).
> Its possible that the device is not writable at all, if its a snapshot or a
> RO iscsi device of some kind.

I agree, and that's exactly the point of this patch.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hugo Mills Oct. 11, 2013, 9:55 a.m. UTC | #7
On Fri, Oct 11, 2013 at 03:35:46PM +0600, Roman Mamedov wrote:
> On Fri, 11 Oct 2013 11:23:04 +0200
> Stefan Behrens <sbehrens@giantdisaster.de> wrote:
> 
> > On Fri, 11 Oct 2013 09:13:24 +0800, Wang Shilong wrote:
> > > On 10/11/2013 01:40 AM, Ilya Dryomov wrote:
> > > 
> > > I have a question in my mind.
> > > 
> > > Can we reach a state that there is operation in progress when filesystem
> > > has been readonly?If we do cancel operations on a ro filesystem, we should
> > > get "No operations in progress" .
> > 
> > Well, it's arguable what ro means. No write to the devices at all?
> 
> If I had an FS image and mounted it as -o loop,ro I'd simply expect md5sum of
> that image to match before mount and after unmount, i.e. no writes at all.
> Really, how can one argue with what "read only" means? If it will mean
> something else than a complete absence of writes, then how can we mount
> devices or FS images to do forensics, etc? Or do a recovery from a difficult
> corruption or try to debug an FS crash.

   I think the issue here is that with ext3,4 the FS data structures
can be in an inconsistent (i.e. broken) state unless you do the
journal replay, so you're left with the choice of "replay the journal
and have a working filesystem that you can't write to", or "don't
modify anything, and have a useless filesystem with errors you could
fix with the information in the journal".

   Fortunately, we don't (well, shouldn't) have that issue in btrfs --
if we don't replay the log, we've still got a consistent FS, so it's
much easier conceptually to mount an FS read-only (with the proviso
that a RO mount may be missing some data that was saved earlier, but
you'll get that back if you allow the log reply with a rw mount
later).

   Hugo.
diff mbox

Patch

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 9efb94e..98df261 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -650,6 +650,9 @@  static u64 __btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info)
 	u64 result;
 	int ret;
 
+	if (fs_info->sb->s_flags & MS_RDONLY)
+		return -EROFS;
+
 	mutex_lock(&dev_replace->lock_finishing_cancel_unmount);
 	btrfs_dev_replace_lock(dev_replace);
 	switch (dev_replace->replace_state) {
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a306db9..2630f38 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3424,6 +3424,9 @@  int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
 
 int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
 {
+	if (fs_info->sb->s_flags & MS_RDONLY)
+		return -EROFS;
+
 	mutex_lock(&fs_info->balance_mutex);
 	if (!fs_info->balance_ctl) {
 		mutex_unlock(&fs_info->balance_mutex);