Message ID | 6c15e8cf5f832398461e8dbde0a02255c35fdd4b.1397572052.git.dsterba@suse.cz (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
On 04/15/2014 10:41 AM, David Sterba wrote: > The patch "Btrfs: fix protection between send and root deletion" > (18f687d538449373c37c) does not actually prevent to delete the snapshot > and just takes care during background cleaning, but this seems rather > user unfriendly, this patch implements the idea presented in > > http://www.spinics.net/lists/linux-btrfs/msg30813.html > > - add an internal root_item flag to denote a dead root > - check if the send_in_progress is set and refuse to delete, otherwise > set the flag and proceed > - check the flag in send similar to the btrfs_root_readonly checks, for > all involved roots > > The root lookup in send via btrfs_read_fs_root_no_name will check if the > root is really dead or not. If it is, ENOENT, aborted send. If it's > alive, it's protected by send_in_progress, send can continue. I'm worried about the use case where we have: * periodic automated snapshots * periodic automated deletion of old snapshots * periodic send for backup The automated deletion doesn't want to error out if send is in progress, it just wants the deletion to happen in the background. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 15, 2014 at 10:52:14AM -0400, Chris Mason wrote: > On 04/15/2014 10:41 AM, David Sterba wrote: > >The patch "Btrfs: fix protection between send and root deletion" > >(18f687d538449373c37c) does not actually prevent to delete the snapshot > >and just takes care during background cleaning, but this seems rather > >user unfriendly, this patch implements the idea presented in > > > >http://www.spinics.net/lists/linux-btrfs/msg30813.html > > > >- add an internal root_item flag to denote a dead root > >- check if the send_in_progress is set and refuse to delete, otherwise > > set the flag and proceed > >- check the flag in send similar to the btrfs_root_readonly checks, for > > all involved roots > > > >The root lookup in send via btrfs_read_fs_root_no_name will check if the > >root is really dead or not. If it is, ENOENT, aborted send. If it's > >alive, it's protected by send_in_progress, send can continue. > > I'm worried about the use case where we have: > > * periodic automated snapshots > * periodic automated deletion of old snapshots > * periodic send for backup > > The automated deletion doesn't want to error out if send is in progress, it > just wants the deletion to happen in the background. I'd give the precedence to the 'backup' process before the 'clean old snapshots', because it can do more harm if the snapshot is removed meanwhile without any possibility to recover. I understand that send does not have to be done only for the backup purposes, the snapshots can be recreated in case of error, etc. Adding more tunables would lead to confusion and usability mess, eg. somehow mark the snapshot as disposable, or not, wrt the send-in-progress status. I don't want to go that way. The automatic deletion process is external to btrfs and has more context of what to do if the subvolume deletion fails, for example schedule another deletion attempt. I don't think this would cause severe problems if the the snapshots live for a bit longer, but yes it needs more work on the user's side. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/15/2014 11:44 AM, David Sterba wrote: > On Tue, Apr 15, 2014 at 10:52:14AM -0400, Chris Mason wrote: >> On 04/15/2014 10:41 AM, David Sterba wrote: >>> The patch "Btrfs: fix protection between send and root deletion" >>> (18f687d538449373c37c) does not actually prevent to delete the snapshot >>> and just takes care during background cleaning, but this seems rather >>> user unfriendly, this patch implements the idea presented in >>> >>> https://urldefense.proofpoint.com/v1/url?u=http://www.spinics.net/lists/linux-btrfs/msg30813.html&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=6%2FL0lzzDhu0Y1hL9xm%2BQyA%3D%3D%0A&m=MrYh%2F3Q4f2DGE6aitHYI%2FFn%2F9KpbUxrrMnLZ69AX73s%3D%0A&s=5102d03d75a57a38a5f67f4a258b075a849ef8790cad39f8ed81adb40df73980 >>> >>> - add an internal root_item flag to denote a dead root >>> - check if the send_in_progress is set and refuse to delete, otherwise >>> set the flag and proceed >>> - check the flag in send similar to the btrfs_root_readonly checks, for >>> all involved roots >>> >>> The root lookup in send via btrfs_read_fs_root_no_name will check if the >>> root is really dead or not. If it is, ENOENT, aborted send. If it's >>> alive, it's protected by send_in_progress, send can continue. >> >> I'm worried about the use case where we have: >> >> * periodic automated snapshots >> * periodic automated deletion of old snapshots >> * periodic send for backup >> >> The automated deletion doesn't want to error out if send is in progress, it >> just wants the deletion to happen in the background. > > I'd give the precedence to the 'backup' process before the 'clean old > snapshots', because it can do more harm if the snapshot is removed > meanwhile without any possibility to recover. Right, we don't want either process to stop with an error. We just want them to continue happily and do the right thing... -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 15, 2014 at 12:00:49PM -0400, Chris Mason wrote: > >>I'm worried about the use case where we have: > >> > >> * periodic automated snapshots > >> * periodic automated deletion of old snapshots > >> * periodic send for backup > >> > >>The automated deletion doesn't want to error out if send is in progress, it > >>just wants the deletion to happen in the background. > > > >I'd give the precedence to the 'backup' process before the 'clean old > >snapshots', because it can do more harm if the snapshot is removed > >meanwhile without any possibility to recover. > > Right, we don't want either process to stop with an error. We just want > them to continue happily and do the right thing... ... if everything goes without errors. Not like send going out of memory, send through network has a glitch, send to a file runs out of space, and has to be restarted. Is this too unrealistic to happen? -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/15/2014 12:27 PM, David Sterba wrote: > On Tue, Apr 15, 2014 at 12:00:49PM -0400, Chris Mason wrote: >>>> I'm worried about the use case where we have: >>>> >>>> * periodic automated snapshots >>>> * periodic automated deletion of old snapshots >>>> * periodic send for backup >>>> >>>> The automated deletion doesn't want to error out if send is in progress, it >>>> just wants the deletion to happen in the background. >>> >>> I'd give the precedence to the 'backup' process before the 'clean old >>> snapshots', because it can do more harm if the snapshot is removed >>> meanwhile without any possibility to recover. >> >> Right, we don't want either process to stop with an error. We just want >> them to continue happily and do the right thing... > > ... if everything goes without errors. Not like send going out of > memory, send through network has a glitch, send to a file runs out of > space, and has to be restarted. Is this too unrealistic to happen? > It's a good point, a better way to say what I have in mind is that we shouldn't be adding new transient errors to the send process (on purpose ;) -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 15, 2014 at 01:21:46PM -0400, Chris Mason wrote: > > > On 04/15/2014 12:27 PM, David Sterba wrote: > >On Tue, Apr 15, 2014 at 12:00:49PM -0400, Chris Mason wrote: > >>>>I'm worried about the use case where we have: > >>>> > >>>> * periodic automated snapshots > >>>> * periodic automated deletion of old snapshots > >>>> * periodic send for backup > >>>> > >>>>The automated deletion doesn't want to error out if send is in progress, it > >>>>just wants the deletion to happen in the background. > >>> > >>>I'd give the precedence to the 'backup' process before the 'clean old > >>>snapshots', because it can do more harm if the snapshot is removed > >>>meanwhile without any possibility to recover. > >> > >>Right, we don't want either process to stop with an error. We just want > >>them to continue happily and do the right thing... > > > >... if everything goes without errors. Not like send going out of > >memory, send through network has a glitch, send to a file runs out of > >space, and has to be restarted. Is this too unrealistic to happen? > > It's a good point, a better way to say what I have in mind is that we > shouldn't be adding new transient errors to the send process (on purpose ;) Ok, I got a wrong understanding from your previous reply. So the next thing in the list to try is to add tunables affecting delete vs send. Something like $ btrfs send --protect-subvols -c clone1 -c clone2 source that would disallow to delete clone1, clone2 and source, passed to the ioctl as a flag and internally adding another refcount for 'how many times it has been protected'. Sounds ugly, but would cover all possible combinations of sending with or without deletion protection. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/16/2014 09:32 AM, David Sterba wrote: > On Tue, Apr 15, 2014 at 01:21:46PM -0400, Chris Mason wrote: >> >> >> On 04/15/2014 12:27 PM, David Sterba wrote: >>> On Tue, Apr 15, 2014 at 12:00:49PM -0400, Chris Mason wrote: >>>>>> I'm worried about the use case where we have: >>>>>> >>>>>> * periodic automated snapshots >>>>>> * periodic automated deletion of old snapshots >>>>>> * periodic send for backup >>>>>> >>>>>> The automated deletion doesn't want to error out if send is in progress, it >>>>>> just wants the deletion to happen in the background. >>>>> >>>>> I'd give the precedence to the 'backup' process before the 'clean old >>>>> snapshots', because it can do more harm if the snapshot is removed >>>>> meanwhile without any possibility to recover. >>>> >>>> Right, we don't want either process to stop with an error. We just want >>>> them to continue happily and do the right thing... >>> >>> ... if everything goes without errors. Not like send going out of >>> memory, send through network has a glitch, send to a file runs out of >>> space, and has to be restarted. Is this too unrealistic to happen? >> >> It's a good point, a better way to say what I have in mind is that we >> shouldn't be adding new transient errors to the send process (on purpose ;) > > Ok, I got a wrong understanding from your previous reply. > > So the next thing in the list to try is to add tunables affecting delete > vs send. Something like > > $ btrfs send --protect-subvols -c clone1 -c clone2 source > > that would disallow to delete clone1, clone2 and source, passed to the > ioctl as a flag and internally adding another refcount for 'how many > times it has been protected'. Sounds ugly, but would cover all possible > combinations of sending with or without deletion protection. > Ok, I reread the patch and your original point about dealing with a send + delete + network interruption. That's the part I didn't catch the first time around. So in my example with the automated tool, the tool really shouldn't be deleting a snapshot where send is in progress. The tool should be told that snapshot is busy and try to delete it again later. It makes more sense now, 'll queue this up for 3.16 and we can try it out in -next. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2014/04/16 03:40 PM, Chris Mason wrote: > So in my example with the automated tool, the tool really shouldn't be > deleting a snapshot where send is in progress. The tool should be > told that snapshot is busy and try to delete it again later. > > It makes more sense now, 'll queue this up for 3.16 and we can try it > out in -next. > > -chris So ... does this mean the plan is to a) have userland tool give an error; or b) a deletion would be "scheduled" in the background for as soon as the send has completed?
On Wed, Apr 16, 2014 at 09:40:41AM -0400, Chris Mason wrote: > It makes more sense now, 'll queue this up for 3.16 and we can try it out in > -next. Thanks. 3.16 is fine for me. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 16, 2014 at 04:59:09PM +0200, Brendan Hide wrote: > On 2014/04/16 03:40 PM, Chris Mason wrote: > >So in my example with the automated tool, the tool really shouldn't be > >deleting a snapshot where send is in progress. The tool should be told > >that snapshot is busy and try to delete it again later. > > > >It makes more sense now, 'll queue this up for 3.16 and we can try it out > >in -next. > > > >-chris > So ... does this mean the plan is to a) have userland tool give an error; or > b) a deletion would be "scheduled" in the background for as soon as the send > has completed? b) is current state, a) is the plan with the patch, 'btrfs subvol delete' would return EPERM/EBUSY -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2014/04/16 05:22 PM, David Sterba wrote: > On Wed, Apr 16, 2014 at 04:59:09PM +0200, Brendan Hide wrote: >> On 2014/04/16 03:40 PM, Chris Mason wrote: >>> So in my example with the automated tool, the tool really shouldn't be >>> deleting a snapshot where send is in progress. The tool should be told >>> that snapshot is busy and try to delete it again later. >>> >>> It makes more sense now, 'll queue this up for 3.16 and we can try it out >>> in -next. >>> >>> -chris >> So ... does this mean the plan is to a) have userland tool give an error; or >> b) a deletion would be "scheduled" in the background for as soon as the send >> has completed? > b) is current state, a) is the plan > > with the patch, 'btrfs subvol delete' would return EPERM/EBUSY My apologies, I should have followed up on this a while ago already. :-/ Would having something closer to b) be more desirable if the resource simply disappears but continues in the background? This would be as in a lazy umount, where presently-open files are left open and writable but the directory tree has "disappeared". I submit that, with a), the actual status is more obvious/concrete whereas with b+lazy), current issues would flow smoothly with no errors and no foreseeable future issues. I reserve the right to be wrong, of course. ;)
On 04/15/2014 10:41 AM, David Sterba wrote: > The patch "Btrfs: fix protection between send and root deletion" > (18f687d538449373c37c) does not actually prevent to delete the snapshot > and just takes care during background cleaning, but this seems rather > user unfriendly, this patch implements the idea presented in > > https://urldefense.proofpoint.com/v1/url?u=http://www.spinics.net/lists/linux-btrfs/msg30813.html&k=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0A&r=6%2FL0lzzDhu0Y1hL9xm%2BQyA%3D%3D%0A&m=oPq3jsCvKlZXohAj%2B1Mz9EvxoJ1iqsxD4%2Fkm%2By17esY%3D%0A&s=b820d70b0984d1e875bf8e22adadef7d6c2a6e0c1cdb4c3e0913f8f1521fc966 > > - add an internal root_item flag to denote a dead root > - check if the send_in_progress is set and refuse to delete, otherwise > set the flag and proceed > - check the flag in send similar to the btrfs_root_readonly checks, for > all involved roots > > The root lookup in send via btrfs_read_fs_root_no_name will check if the > root is really dead or not. If it is, ENOENT, aborted send. If it's > alive, it's protected by send_in_progress, send can continue. > > diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c > index 9dde9717c1b9..80ec4fdc0b40 100644 > --- a/fs/btrfs/send.c > +++ b/fs/btrfs/send.c > @@ -5263,7 +5263,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) > > /* > * The subvolume must remain read-only during send, protect against > - * making it RW. > + * making it RW. This also protects against deletion. > */ > spin_lock(&send_root->root_item_lock); > send_root->send_in_progress++; > @@ -5284,6 +5284,15 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) > goto out; > } > > + /* > + * Unlikely but possible, if the subvolume is marked for deletion but > + * is slow to remove the directory entry, send can still be started > + */ > + if (btrfs_root_dead(sctx->parent_root)) { > + ret = -EPERM; > + goto out; > + } > + This should be sctx->send_root and it should be lower right? At this point the sctx hasn't been allocated yet. I've changed it here and it'll go into integration shortly, but I can rebase in something different if required. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 2c1a42ca519f..abe3eac6426b 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -754,6 +754,12 @@ struct btrfs_dir_item { #define BTRFS_ROOT_SUBVOL_RDONLY (1ULL << 0) +/* + * Internal in-memory flag that a subvolume has been marked for deletion but + * still visible as a directory + */ +#define BTRFS_ROOT_SUBVOL_DEAD (1ULL << 48) + struct btrfs_root_item { struct btrfs_inode_item inode; __le64 generation; @@ -2749,6 +2755,11 @@ static inline bool btrfs_root_readonly(struct btrfs_root *root) return (root->root_item.flags & cpu_to_le64(BTRFS_ROOT_SUBVOL_RDONLY)) != 0; } +static inline bool btrfs_root_dead(struct btrfs_root *root) +{ + return (root->root_item.flags & cpu_to_le64(BTRFS_ROOT_SUBVOL_DEAD)) != 0; +} + /* struct btrfs_root_backup */ BTRFS_SETGET_STACK_FUNCS(backup_tree_root, struct btrfs_root_backup, tree_root, 64); diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index a6d8efa46bfe..a385d88bd8d5 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2147,6 +2147,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file, struct btrfs_ioctl_vol_args *vol_args; struct btrfs_trans_handle *trans; struct btrfs_block_rsv block_rsv; + u64 root_flags; u64 qgroup_reserved; int namelen; int ret; @@ -2168,6 +2169,7 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file, if (err) goto out; + err = mutex_lock_killable_nested(&dir->i_mutex, I_MUTEX_PARENT); if (err == -EINTR) goto out_drop_write; @@ -2229,6 +2231,27 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file, } mutex_lock(&inode->i_mutex); + + /* + * Don't allow to delete a subvolume with send in progress. This is + * inside the i_mutex so the error handling that has to drop the bit + * again is not run concurrently. + */ + spin_lock(&dest->root_item_lock); + root_flags = btrfs_root_flags(&root->root_item); + if (root->send_in_progress == 0) { + btrfs_set_root_flags(&root->root_item, + root_flags | BTRFS_ROOT_SUBVOL_DEAD); + spin_unlock(&dest->root_item_lock); + } else { + spin_unlock(&dest->root_item_lock); + btrfs_warn(root->fs_info, + "Attempt to delete subvolume %llu during send", + root->root_key.objectid); + err = -EPERM; + goto out_dput; + } + err = d_invalidate(dentry); if (err) goto out_unlock; @@ -2317,6 +2340,13 @@ out_release: out_up_write: up_write(&root->fs_info->subvol_sem); out_unlock: + if (err) { + spin_lock(&dest->root_item_lock); + root_flags = btrfs_root_flags(&root->root_item); + btrfs_set_root_flags(&root->root_item, + root_flags & ~BTRFS_ROOT_SUBVOL_DEAD); + spin_unlock(&dest->root_item_lock); + } mutex_unlock(&inode->i_mutex); if (!err) { shrink_dcache_sb(root->fs_info->sb); diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 9dde9717c1b9..80ec4fdc0b40 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -5263,7 +5263,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) /* * The subvolume must remain read-only during send, protect against - * making it RW. + * making it RW. This also protects against deletion. */ spin_lock(&send_root->root_item_lock); send_root->send_in_progress++; @@ -5284,6 +5284,15 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) goto out; } + /* + * Unlikely but possible, if the subvolume is marked for deletion but + * is slow to remove the directory entry, send can still be started + */ + if (btrfs_root_dead(sctx->parent_root)) { + ret = -EPERM; + goto out; + } + arg = memdup_user(arg_, sizeof(*arg)); if (IS_ERR(arg)) { ret = PTR_ERR(arg); @@ -5411,7 +5420,8 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_) spin_lock(&sctx->parent_root->root_item_lock); sctx->parent_root->send_in_progress++; - if (!btrfs_root_readonly(sctx->parent_root)) { + if (!btrfs_root_readonly(sctx->parent_root) || + btrfs_root_dead(sctx->parent_root)) { spin_unlock(&sctx->parent_root->root_item_lock); srcu_read_unlock(&fs_info->subvol_srcu, index); ret = -EPERM;
The patch "Btrfs: fix protection between send and root deletion" (18f687d538449373c37c) does not actually prevent to delete the snapshot and just takes care during background cleaning, but this seems rather user unfriendly, this patch implements the idea presented in http://www.spinics.net/lists/linux-btrfs/msg30813.html - add an internal root_item flag to denote a dead root - check if the send_in_progress is set and refuse to delete, otherwise set the flag and proceed - check the flag in send similar to the btrfs_root_readonly checks, for all involved roots The root lookup in send via btrfs_read_fs_root_no_name will check if the root is really dead or not. If it is, ENOENT, aborted send. If it's alive, it's protected by send_in_progress, send can continue. CC: Miao Xie <miaox@cn.fujitsu.com> CC: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.cz> --- fs/btrfs/ctree.h | 11 +++++++++++ fs/btrfs/ioctl.c | 30 ++++++++++++++++++++++++++++++ fs/btrfs/send.c | 14 ++++++++++++-- 3 files changed, 53 insertions(+), 2 deletions(-)