Message ID | 12b53ab9683c805873d26a4881308137e0bd324e.1692097274.git.anand.jain@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] btrfs-progs: btrfstune -m|M remove 2-stage commit | expand |
On 2023/8/15 20:53, Anand Jain wrote: > The btrfstune -m|M operation changes the fsid and preserves the original > fsid in the metadata_uuid. > > Changing the fsid happens in two commits in the function > set_metadata_uuid() > - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. > - Stage 2 updates the fsid in fs_info->super_copy (local memory), > resets the CHANGING_FSID_V2 flag (local memory), > and then calls commit. > > The two-stage operation with the CHANGING_FSID flag makes sense for > btrfstune -u|U, where the fsid is changed on each and every tree block, > involving many commits. The CHANGING_FSID flag helps to identify the > transient incomplete operation. > > However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 flag, and > a single commit would have been sufficient. The original git commit that > added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for changing > the metadata uuid"), provides no reasoning or did I miss something? To me, it looks very sane to do all the operations just in one transaction. Furthermore unlike kernel, btrfs-progs has full control on whether to commit the transaction (exclusively owns the fs, thus no one else can modify/commit the fs), and all the operations are just modifying the super block flags. Thus merging the operations should be fine, and I'm a little surprised why we didn't do it in the first place. Thanks, Qu > (So marking this patch as RFC). > > This patch removes the two-stage commit for btrfstune -m|M and instead > performs it in a single commit. > > With this change, the CHANGING_FSID_V2 flag is unused. Nevertheless, for > legacy purposes, we will still reset the CHANGING_FSID_V2 flag during the > btrfstune -m|M commit. Furthermore, the patchset titled: > > [PATCH 00/16] btrfs-progs: recover from failed metadata_uuid > > will help recover from the split brain situation due to two stage > method in the older btrfstune. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > tune/change-metadata-uuid.c | 12 +----------- > 1 file changed, 1 insertion(+), 11 deletions(-) > > diff --git a/tune/change-metadata-uuid.c b/tune/change-metadata-uuid.c > index 371f34e679b4..ac8c2fd398c2 100644 > --- a/tune/change-metadata-uuid.c > +++ b/tune/change-metadata-uuid.c > @@ -33,7 +33,6 @@ int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string) > u64 incompat_flags; > bool fsid_changed; > u64 super_flags; > - int ret; > > disk_super = root->fs_info->super_copy; > super_flags = btrfs_super_flags(disk_super); > @@ -66,14 +65,6 @@ int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string) > > new_fsid = (memcmp(fsid, disk_super->fsid, BTRFS_FSID_SIZE) != 0); > > - /* Step 1 sets the in progress flag */ > - trans = btrfs_start_transaction(root, 1); > - super_flags |= BTRFS_SUPER_FLAG_CHANGING_FSID_V2; > - btrfs_set_super_flags(disk_super, super_flags); > - ret = btrfs_commit_transaction(trans, root); > - if (ret < 0) > - return ret; > - > if (new_fsid && fsid_changed && memcmp(disk_super->metadata_uuid, > fsid, BTRFS_FSID_SIZE) == 0) { > /* > @@ -106,8 +97,7 @@ int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string) > trans = btrfs_start_transaction(root, 1); > > /* > - * Step 2 is to write the metadata_uuid, set the incompat flag and > - * clear the in progress flag > + * For leagcy purpose reset the BTRFS_SUPER_FLAG_CHANGING_FSID_V2 > */ > super_flags &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2; > btrfs_set_super_flags(disk_super, super_flags);
On Tue, Aug 15, 2023 at 08:53:24PM +0800, Anand Jain wrote: > The btrfstune -m|M operation changes the fsid and preserves the original > fsid in the metadata_uuid. > > Changing the fsid happens in two commits in the function > set_metadata_uuid() > - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. > - Stage 2 updates the fsid in fs_info->super_copy (local memory), > resets the CHANGING_FSID_V2 flag (local memory), > and then calls commit. > > The two-stage operation with the CHANGING_FSID flag makes sense for > btrfstune -u|U, where the fsid is changed on each and every tree block, > involving many commits. The CHANGING_FSID flag helps to identify the > transient incomplete operation. > > However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 flag, and > a single commit would have been sufficient. The original git commit that > added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for changing > the metadata uuid"), provides no reasoning or did I miss something? > (So marking this patch as RFC). I remember discussions with Nikolay about the corner cases that could happen due to interrupted conversion and there was a document explaining that. Part of that ended up in kernel in the logic to detect partially enabled metadata_uuid. So there was reason to do it in two phases but I'd have to look it up in mails or if I still have a link or copy of the document.
On 17/08/2023 19:52, David Sterba wrote: > On Tue, Aug 15, 2023 at 08:53:24PM +0800, Anand Jain wrote: >> The btrfstune -m|M operation changes the fsid and preserves the original >> fsid in the metadata_uuid. >> >> Changing the fsid happens in two commits in the function >> set_metadata_uuid() >> - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. >> - Stage 2 updates the fsid in fs_info->super_copy (local memory), >> resets the CHANGING_FSID_V2 flag (local memory), >> and then calls commit. >> >> The two-stage operation with the CHANGING_FSID flag makes sense for >> btrfstune -u|U, where the fsid is changed on each and every tree block, >> involving many commits. The CHANGING_FSID flag helps to identify the >> transient incomplete operation. >> >> However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 flag, and >> a single commit would have been sufficient. The original git commit that >> added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for changing >> the metadata uuid"), provides no reasoning or did I miss something? >> (So marking this patch as RFC). > > I remember discussions with Nikolay about the corner cases that could > happen due to interrupted conversion and there was a document explaining > that. Part of that ended up in kernel in the logic to detect partially > enabled metadata_uuid. So there was reason to do it in two phases but > I'd have to look it up in mails or if I still have a link or copy of the > document. On 18/08/2023 08:21, Qu Wenruo wrote: > Oh, my memory comes back, the original design for the two stage > commitment is to avoid split brain cases when one device is committed > with the new flag, while the remaining one doesn't. > > With the extra stage, even if at stage 1 or 2 the transaction is > interrupted and only one device got the new flag, it can help us to > locate the stage and recover. It is useful for `btrfstune -u` when there are many transaction commits to write. It uses the `CHANGING_FSID` flag for this purpose. Any device with the `CHANGING_FSID` flag fails to mount, and `btrfstune` should be called again to continue rewrite the new FSID. This is a fair process. However, in the case of `CHANGING_FSID_V2`, which the `btrfstune -m` command uses, only one transaction is required. How does this help? Disk1 Disk2 Commit1 CHANGING_FSID_V2 CHANGING_FSID_V2 Commit2 new-fsid new-fsid
On 2023/8/18 17:21, Anand Jain wrote: > On 17/08/2023 19:52, David Sterba wrote: >> On Tue, Aug 15, 2023 at 08:53:24PM +0800, Anand Jain wrote: >>> The btrfstune -m|M operation changes the fsid and preserves the original >>> fsid in the metadata_uuid. >>> >>> Changing the fsid happens in two commits in the function >>> set_metadata_uuid() >>> - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. >>> - Stage 2 updates the fsid in fs_info->super_copy (local memory), >>> resets the CHANGING_FSID_V2 flag (local memory), >>> and then calls commit. >>> >>> The two-stage operation with the CHANGING_FSID flag makes sense for >>> btrfstune -u|U, where the fsid is changed on each and every tree block, >>> involving many commits. The CHANGING_FSID flag helps to identify the >>> transient incomplete operation. >>> >>> However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 flag, >>> and >>> a single commit would have been sufficient. The original git commit that >>> added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for >>> changing >>> the metadata uuid"), provides no reasoning or did I miss something? >>> (So marking this patch as RFC). >> >> I remember discussions with Nikolay about the corner cases that could >> happen due to interrupted conversion and there was a document explaining >> that. Part of that ended up in kernel in the logic to detect partially >> enabled metadata_uuid. So there was reason to do it in two phases but >> I'd have to look it up in mails or if I still have a link or copy of the >> document. > > > On 18/08/2023 08:21, Qu Wenruo wrote: > > > Oh, my memory comes back, the original design for the two stage > > commitment is to avoid split brain cases when one device is committed > > with the new flag, while the remaining one doesn't. > > > > With the extra stage, even if at stage 1 or 2 the transaction is > > interrupted and only one device got the new flag, it can help us to > > locate the stage and recover. > > It is useful for `btrfstune -u` > when there are many transaction commits to write. It uses the > `CHANGING_FSID` flag for this purpose. Any device with the > `CHANGING_FSID` flag fails to mount, and `btrfstune` should be called > again to continue rewrite the new FSID. This is a fair process. > > > However, in the case of `CHANGING_FSID_V2`, which the `btrfstune -m` > command uses, only one transaction is required. How does this help? > > Disk1 Disk2 > > Commit1 CHANGING_FSID_V2 CHANGING_FSID_V2 > Commit2 new-fsid new-fsid > > > Instead if there is only one transaction to enable metadata_uuid feature, we can have the following situation where for the single transaction we only committed on disk 1: Disk 1 Disk 2 Meta uuid Regular UUID Then how do kernel/progs proper recover from this situation? Although I'd say, it's still possible to recover, but significantly more difficult, as we need to properly handle different generations of super blocks. For the two stage one, it's a little simpler but I'm not sure if we have the extra handling. E.g. if commit 1 failed partially: Disk 1 Disk 2 Changing_fsid_v2 no changing_fsid_v2 In this case, we can detects we're trying to start fsid change using metadata uuid. The same thing for the 2nd commit. Thanks, Qu
On 18/08/2023 17:27, Qu Wenruo wrote: > > > On 2023/8/18 17:21, Anand Jain wrote: >> On 17/08/2023 19:52, David Sterba wrote: >>> On Tue, Aug 15, 2023 at 08:53:24PM +0800, Anand Jain wrote: >>>> The btrfstune -m|M operation changes the fsid and preserves the >>>> original >>>> fsid in the metadata_uuid. >>>> >>>> Changing the fsid happens in two commits in the function >>>> set_metadata_uuid() >>>> - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. >>>> - Stage 2 updates the fsid in fs_info->super_copy (local memory), >>>> resets the CHANGING_FSID_V2 flag (local memory), >>>> and then calls commit. >>>> >>>> The two-stage operation with the CHANGING_FSID flag makes sense for >>>> btrfstune -u|U, where the fsid is changed on each and every tree block, >>>> involving many commits. The CHANGING_FSID flag helps to identify the >>>> transient incomplete operation. >>>> >>>> However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 >>>> flag, and >>>> a single commit would have been sufficient. The original git commit >>>> that >>>> added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for >>>> changing >>>> the metadata uuid"), provides no reasoning or did I miss something? >>>> (So marking this patch as RFC). >>> >>> I remember discussions with Nikolay about the corner cases that could >>> happen due to interrupted conversion and there was a document explaining >>> that. Part of that ended up in kernel in the logic to detect partially >>> enabled metadata_uuid. So there was reason to do it in two phases but >>> I'd have to look it up in mails or if I still have a link or copy of the >>> document. >> >> >> On 18/08/2023 08:21, Qu Wenruo wrote: >> >> > Oh, my memory comes back, the original design for the two stage >> > commitment is to avoid split brain cases when one device is committed >> > with the new flag, while the remaining one doesn't. >> > >> > With the extra stage, even if at stage 1 or 2 the transaction is >> > interrupted and only one device got the new flag, it can help us to >> > locate the stage and recover. >> >> It is useful for `btrfstune -u` >> when there are many transaction commits to write. It uses the >> `CHANGING_FSID` flag for this purpose. Any device with the >> `CHANGING_FSID` flag fails to mount, and `btrfstune` should be called >> again to continue rewrite the new FSID. This is a fair process. >> >> >> However, in the case of `CHANGING_FSID_V2`, which the `btrfstune -m` >> command uses, only one transaction is required. How does this help? >> >> Disk1 Disk2 >> >> Commit1 CHANGING_FSID_V2 CHANGING_FSID_V2 >> Commit2 new-fsid new-fsid >> >> >> > > Instead if there is only one transaction to enable metadata_uuid > feature, we can have the following situation where for the single > transaction we only committed on disk 1: > > Disk 1 Disk 2 > Meta uuid Regular UUID > > Then how do kernel/progs proper recover from this situation? > > Although I'd say, it's still possible to recover, but significantly more > difficult, as we need to properly handle different generations of super > blocks. > > For the two stage one, it's a little simpler but I'm not sure if we have > the extra handling. E.g. if commit 1 failed partially: > > Disk 1 Disk 2 > Changing_fsid_v2 no changing_fsid_v2 > > In this case, we can detects we're trying to start fsid change using > metadata uuid. > > The same thing for the 2nd commit. The changing_fsid_v2 flag an unnecessary overhead, right? As it could be something like: . Write new-fsid and Verify. Revert if needed and verify. ------ Summarizing the overall patches: Patchset [1] is a port of kernel changing_fsid_v2 recovery automation functions to the progs. So, hosts with older progs and can't upgrade to find a tmp host with the newer progs to fix the disks? [1] [PATCH 00/16] btrfs-progs: recover from failed metadata_uuid Automation in progs/kernel cannot fix all possible scenarios; we still need user intervention to reunite, as in patchset [2]. It adds --device and --noscan options to reunite. (Needs comments, so that I can rebase). [2] [PATCH 00/10] btrfs-progs: check and tune: add device and noscan options Patchset [3] removes the changing_fsid_v2 recovery code from the kernel, as pointed out- recovery code is robust in general, except in corner cases. So, after this, similar to changing_fsid, disks with changing_fsid_v2 are rejected. But when progs can recover the failed situation and could remove the use of the changing_fsid_v2 flag (patch [4]), we don't actually need the recovery in the kernel. [3] [PATCH] btrfs: reject device with CHANGING_FSID_V2 [4] [PATCH RFC] btrfs-progs: btrfstune -m|M remove 2-stage commit ------- Thanks, Anand
On Sat, Aug 19, 2023 at 07:14:40PM +0800, Anand Jain wrote: > > > On 18/08/2023 17:27, Qu Wenruo wrote: > > > > > > On 2023/8/18 17:21, Anand Jain wrote: > >> On 17/08/2023 19:52, David Sterba wrote: > >>> On Tue, Aug 15, 2023 at 08:53:24PM +0800, Anand Jain wrote: > >>>> The btrfstune -m|M operation changes the fsid and preserves the > >>>> original > >>>> fsid in the metadata_uuid. > >>>> > >>>> Changing the fsid happens in two commits in the function > >>>> set_metadata_uuid() > >>>> - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. > >>>> - Stage 2 updates the fsid in fs_info->super_copy (local memory), > >>>> resets the CHANGING_FSID_V2 flag (local memory), > >>>> and then calls commit. > >>>> > >>>> The two-stage operation with the CHANGING_FSID flag makes sense for > >>>> btrfstune -u|U, where the fsid is changed on each and every tree block, > >>>> involving many commits. The CHANGING_FSID flag helps to identify the > >>>> transient incomplete operation. > >>>> > >>>> However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 > >>>> flag, and > >>>> a single commit would have been sufficient. The original git commit > >>>> that > >>>> added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for > >>>> changing > >>>> the metadata uuid"), provides no reasoning or did I miss something? > >>>> (So marking this patch as RFC). > >>> > >>> I remember discussions with Nikolay about the corner cases that could > >>> happen due to interrupted conversion and there was a document explaining > >>> that. Part of that ended up in kernel in the logic to detect partially > >>> enabled metadata_uuid. So there was reason to do it in two phases but > >>> I'd have to look it up in mails or if I still have a link or copy of the > >>> document. > >> > >> > >> On 18/08/2023 08:21, Qu Wenruo wrote: > >> > >> > Oh, my memory comes back, the original design for the two stage > >> > commitment is to avoid split brain cases when one device is committed > >> > with the new flag, while the remaining one doesn't. > >> > > >> > With the extra stage, even if at stage 1 or 2 the transaction is > >> > interrupted and only one device got the new flag, it can help us to > >> > locate the stage and recover. > >> > >> It is useful for `btrfstune -u` > >> when there are many transaction commits to write. It uses the > >> `CHANGING_FSID` flag for this purpose. Any device with the > >> `CHANGING_FSID` flag fails to mount, and `btrfstune` should be called > >> again to continue rewrite the new FSID. This is a fair process. > >> > >> > >> However, in the case of `CHANGING_FSID_V2`, which the `btrfstune -m` > >> command uses, only one transaction is required. How does this help? > >> > >> Disk1 Disk2 > >> > >> Commit1 CHANGING_FSID_V2 CHANGING_FSID_V2 > >> Commit2 new-fsid new-fsid > >> > >> > >> > > > > Instead if there is only one transaction to enable metadata_uuid > > feature, we can have the following situation where for the single > > transaction we only committed on disk 1: > > > > Disk 1 Disk 2 > > Meta uuid Regular UUID > > > > Then how do kernel/progs proper recover from this situation? > > > > Although I'd say, it's still possible to recover, but significantly more > > difficult, as we need to properly handle different generations of super > > blocks. > > > > For the two stage one, it's a little simpler but I'm not sure if we have > > the extra handling. E.g. if commit 1 failed partially: > > > > Disk 1 Disk 2 > > Changing_fsid_v2 no changing_fsid_v2 > > > > In this case, we can detects we're trying to start fsid change using > > metadata uuid. > > > > The same thing for the 2nd commit. > > > > The changing_fsid_v2 flag an unnecessary overhead, right? As it could be > something like: > > . Write new-fsid and Verify. Revert if needed and verify. > > > ------ > Summarizing the overall patches: > > Patchset [1] is a port of kernel changing_fsid_v2 recovery automation > functions to the progs. So, hosts with older progs and can't upgrade to > find a tmp host with the newer progs to fix the disks? > > [1] [PATCH 00/16] btrfs-progs: recover from failed metadata_uuid > > > Automation in progs/kernel cannot fix all possible scenarios; we still > need user intervention to reunite, as in patchset [2]. It adds --device > and --noscan options to reunite. (Needs comments, so that I can rebase). > > [2] [PATCH 00/10] btrfs-progs: check and tune: add device and noscan > options > > > Patchset [3] removes the changing_fsid_v2 recovery code from the kernel, > as pointed out- recovery code is robust in general, except in corner > cases. So, after this, similar to changing_fsid, disks with > changing_fsid_v2 are rejected. Hm I think this is acceptable though degrading the feature a bit. We can't tell how often the kernel recovery of the metadata_uuid has happened but fixing all cases there might be too complex, more than it already is. The whole conversion in btrfstune is quite fast, crashing in the middle of that is possible but highly unlikely. We don't have a proper documentation for the metadata_uuid use case, there are the option and sysfs descriptions but not really how it's supposed to be used and what to do if setting the metadata_uuid fails. > But when progs can recover the failed situation and could remove the use > of the changing_fsid_v2 flag (patch [4]), we don't actually need the > recovery in the kernel. Ok, let's do it.
On 19/09/2023 06:38, David Sterba wrote: > On Sat, Aug 19, 2023 at 07:14:40PM +0800, Anand Jain wrote: >> >> >> On 18/08/2023 17:27, Qu Wenruo wrote: >>> >>> >>> On 2023/8/18 17:21, Anand Jain wrote: >>>> On 17/08/2023 19:52, David Sterba wrote: >>>>> On Tue, Aug 15, 2023 at 08:53:24PM +0800, Anand Jain wrote: >>>>>> The btrfstune -m|M operation changes the fsid and preserves the >>>>>> original >>>>>> fsid in the metadata_uuid. >>>>>> >>>>>> Changing the fsid happens in two commits in the function >>>>>> set_metadata_uuid() >>>>>> - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. >>>>>> - Stage 2 updates the fsid in fs_info->super_copy (local memory), >>>>>> resets the CHANGING_FSID_V2 flag (local memory), >>>>>> and then calls commit. >>>>>> >>>>>> The two-stage operation with the CHANGING_FSID flag makes sense for >>>>>> btrfstune -u|U, where the fsid is changed on each and every tree block, >>>>>> involving many commits. The CHANGING_FSID flag helps to identify the >>>>>> transient incomplete operation. >>>>>> >>>>>> However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 >>>>>> flag, and >>>>>> a single commit would have been sufficient. The original git commit >>>>>> that >>>>>> added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for >>>>>> changing >>>>>> the metadata uuid"), provides no reasoning or did I miss something? >>>>>> (So marking this patch as RFC). >>>>> >>>>> I remember discussions with Nikolay about the corner cases that could >>>>> happen due to interrupted conversion and there was a document explaining >>>>> that. Part of that ended up in kernel in the logic to detect partially >>>>> enabled metadata_uuid. So there was reason to do it in two phases but >>>>> I'd have to look it up in mails or if I still have a link or copy of the >>>>> document. >>>> >>>> >>>> On 18/08/2023 08:21, Qu Wenruo wrote: >>>> >>>> > Oh, my memory comes back, the original design for the two stage >>>> > commitment is to avoid split brain cases when one device is committed >>>> > with the new flag, while the remaining one doesn't. >>>> > >>>> > With the extra stage, even if at stage 1 or 2 the transaction is >>>> > interrupted and only one device got the new flag, it can help us to >>>> > locate the stage and recover. >>>> >>>> It is useful for `btrfstune -u` >>>> when there are many transaction commits to write. It uses the >>>> `CHANGING_FSID` flag for this purpose. Any device with the >>>> `CHANGING_FSID` flag fails to mount, and `btrfstune` should be called >>>> again to continue rewrite the new FSID. This is a fair process. >>>> >>>> >>>> However, in the case of `CHANGING_FSID_V2`, which the `btrfstune -m` >>>> command uses, only one transaction is required. How does this help? >>>> >>>> Disk1 Disk2 >>>> >>>> Commit1 CHANGING_FSID_V2 CHANGING_FSID_V2 >>>> Commit2 new-fsid new-fsid >>>> >>>> >>>> >>> >>> Instead if there is only one transaction to enable metadata_uuid >>> feature, we can have the following situation where for the single >>> transaction we only committed on disk 1: >>> >>> Disk 1 Disk 2 >>> Meta uuid Regular UUID >>> >>> Then how do kernel/progs proper recover from this situation? >>> >>> Although I'd say, it's still possible to recover, but significantly more >>> difficult, as we need to properly handle different generations of super >>> blocks. >>> >>> For the two stage one, it's a little simpler but I'm not sure if we have >>> the extra handling. E.g. if commit 1 failed partially: >>> >>> Disk 1 Disk 2 >>> Changing_fsid_v2 no changing_fsid_v2 >>> >>> In this case, we can detects we're trying to start fsid change using >>> metadata uuid. >>> >>> The same thing for the 2nd commit. >> >> >> >> The changing_fsid_v2 flag an unnecessary overhead, right? As it could be >> something like: >> >> . Write new-fsid and Verify. Revert if needed and verify. >> >> >> ------ >> Summarizing the overall patches: >> >> Patchset [1] is a port of kernel changing_fsid_v2 recovery automation >> functions to the progs. So, hosts with older progs and can't upgrade to >> find a tmp host with the newer progs to fix the disks? >> >> [1] [PATCH 00/16] btrfs-progs: recover from failed metadata_uuid >> >> >> Automation in progs/kernel cannot fix all possible scenarios; we still >> need user intervention to reunite, as in patchset [2]. It adds --device >> and --noscan options to reunite. (Needs comments, so that I can rebase). >> >> [2] [PATCH 00/10] btrfs-progs: check and tune: add device and noscan >> options >> >> >> Patchset [3] removes the changing_fsid_v2 recovery code from the kernel, >> as pointed out- recovery code is robust in general, except in corner >> cases. So, after this, similar to changing_fsid, disks with >> changing_fsid_v2 are rejected. > > Hm I think this is acceptable though degrading the feature a bit. We > can't tell how often the kernel recovery of the metadata_uuid has > happened but fixing all cases there might be too complex, more than it > already is. The whole conversion in btrfstune is quite fast, crashing in > the middle of that is possible but highly unlikely. > > We don't have a proper documentation for the metadata_uuid use case, > there are the option and sysfs descriptions but not really how it's > supposed to be used and what to do if setting the metadata_uuid fails. > >> But when progs can recover the failed situation and could remove the use >> of the changing_fsid_v2 flag (patch [4]), we don't actually need the >> recovery in the kernel. > > Ok, let's do it. Done. Kernel: [PATCH 0/2 v2] btrfs: reject device with CHANGING_FSID_V2 flag btrfs-progs (in the same order): [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel and [PATCH] btrfs-progs: misc-tests/034-metadata-uuid remove kernel support Thanks, Anand
diff --git a/tune/change-metadata-uuid.c b/tune/change-metadata-uuid.c index 371f34e679b4..ac8c2fd398c2 100644 --- a/tune/change-metadata-uuid.c +++ b/tune/change-metadata-uuid.c @@ -33,7 +33,6 @@ int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string) u64 incompat_flags; bool fsid_changed; u64 super_flags; - int ret; disk_super = root->fs_info->super_copy; super_flags = btrfs_super_flags(disk_super); @@ -66,14 +65,6 @@ int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string) new_fsid = (memcmp(fsid, disk_super->fsid, BTRFS_FSID_SIZE) != 0); - /* Step 1 sets the in progress flag */ - trans = btrfs_start_transaction(root, 1); - super_flags |= BTRFS_SUPER_FLAG_CHANGING_FSID_V2; - btrfs_set_super_flags(disk_super, super_flags); - ret = btrfs_commit_transaction(trans, root); - if (ret < 0) - return ret; - if (new_fsid && fsid_changed && memcmp(disk_super->metadata_uuid, fsid, BTRFS_FSID_SIZE) == 0) { /* @@ -106,8 +97,7 @@ int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string) trans = btrfs_start_transaction(root, 1); /* - * Step 2 is to write the metadata_uuid, set the incompat flag and - * clear the in progress flag + * For leagcy purpose reset the BTRFS_SUPER_FLAG_CHANGING_FSID_V2 */ super_flags &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2; btrfs_set_super_flags(disk_super, super_flags);
The btrfstune -m|M operation changes the fsid and preserves the original fsid in the metadata_uuid. Changing the fsid happens in two commits in the function set_metadata_uuid() - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag. - Stage 2 updates the fsid in fs_info->super_copy (local memory), resets the CHANGING_FSID_V2 flag (local memory), and then calls commit. The two-stage operation with the CHANGING_FSID flag makes sense for btrfstune -u|U, where the fsid is changed on each and every tree block, involving many commits. The CHANGING_FSID flag helps to identify the transient incomplete operation. However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2 flag, and a single commit would have been sufficient. The original git commit that added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for changing the metadata uuid"), provides no reasoning or did I miss something? (So marking this patch as RFC). This patch removes the two-stage commit for btrfstune -m|M and instead performs it in a single commit. With this change, the CHANGING_FSID_V2 flag is unused. Nevertheless, for legacy purposes, we will still reset the CHANGING_FSID_V2 flag during the btrfstune -m|M commit. Furthermore, the patchset titled: [PATCH 00/16] btrfs-progs: recover from failed metadata_uuid will help recover from the split brain situation due to two stage method in the older btrfstune. Signed-off-by: Anand Jain <anand.jain@oracle.com> --- tune/change-metadata-uuid.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-)