Message ID: 20181004212443.26519-1-hans.van.kranenburg@mendix.com (mailing list archive)
Series: Chunk allocator DUP fix and cleanups
On 2018/10/5 上午5:24, Hans van Kranenburg wrote:
> This patch set contains an additional fix for a newly exposed bug after
> the previous attempt to fix a chunk allocator bug for new DUP chunks:
>
> https://lore.kernel.org/linux-btrfs/782f6000-30c0-0085-abd2-74ec5827c903@mendix.com/T/#m609ccb5d32998e8ba5cfa9901c1ab56a38a6f374

For that bug, did you succeed in reproducing it?

I'm adding dev extent overlap checking in btrfs_verify_dev_extents() and
btrfs-progs. I think it could help to detect such problems.

Thanks,
Qu

> The DUP fix is "fix more DUP stripe size handling". I did that one
> before starting to change more things so it can be applied to earlier
> LTS kernels.
>
> Besides that patch, which fixes the bug in the least intrusive way,
> I added a bunch of other patches to help get the chunk allocator code
> into a state that is a bit less error-prone and bug-attracting.
>
> When running this and trying the reproduction scenario, I can now see
> that the created DUP device extent is 827326464 bytes long, which is
> good.
>
> I wrote and tested this on top of Linus' 4.19-rc5. I still need to
> create a list of related use cases and test more things, to at least
> walk through a bunch of obvious use cases and see that nothing explodes
> too quickly with these changes. However, I'm quite confident about it,
> so I'm sharing all of it already.
>
> Any feedback and review is appreciated. Be gentle and keep in mind that
> I'm still very much in a learning stage regarding kernel development.
>
> The stable patches handling workflow is not 100% clear to me yet. I
> guess I have to add a Fixes: in the DUP patch which points to the
> previous commit 92e222df7b.
>
> Moo!,
> Knorrie
>
> Hans van Kranenburg (6):
>   btrfs: alloc_chunk: do not refurbish num_bytes
>   btrfs: alloc_chunk: improve chunk size variable name
>   btrfs: alloc_chunk: fix more DUP stripe size handling
>   btrfs: fix ncopies raid_attr for RAID56
>   btrfs: introduce nparity raid_attr
>   btrfs: alloc_chunk: rework chunk/stripe calculations
>
>  fs/btrfs/volumes.c | 84 +++++++++++++++++++++++-----------------------
>  fs/btrfs/volumes.h |  4 ++-
>  2 files changed, 45 insertions(+), 43 deletions(-)
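Expressed with the python-btrfs calls that show up later in this thread, the
check Qu describes above boils down to: walk the dev extents of each device
in physical order and verify that each extent starts at or after the end of
the previous one. Below is a minimal user-space sketch of that idea, not the
actual btrfs_verify_dev_extents() or btrfs-progs code; the mount path is a
placeholder, and only the fs.dev_extents() iterator and the devid/paddr/length
fields used elsewhere in this thread are assumed.

import btrfs

# Minimal sketch of a dev extent overlap check from user space; not the
# kernel's btrfs_verify_dev_extents() implementation.
fs = btrfs.FileSystem('/mountpoint')  # placeholder path
prev_end = {}  # devid -> end of the previously seen dev extent on that device
for d in fs.dev_extents():  # dev tree items arrive ordered by (devid, paddr)
    end = prev_end.get(d.devid, 0)
    if d.paddr < end:
        print("overlap on devid {}: dev extent at {} starts before {}".format(
            d.devid, d.paddr, end))
    prev_end[d.devid] = d.paddr + d.length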
On 10/05/2018 09:51 AM, Qu Wenruo wrote:
> On 2018/10/5 上午5:24, Hans van Kranenburg wrote:
>> This patch set contains an additional fix for a newly exposed bug after
>> the previous attempt to fix a chunk allocator bug for new DUP chunks:
>>
>> https://lore.kernel.org/linux-btrfs/782f6000-30c0-0085-abd2-74ec5827c903@mendix.com/T/#m609ccb5d32998e8ba5cfa9901c1ab56a38a6f374
>
> For that bug, did you succeed in reproducing it?

Yes, open the above link and scroll to "Steps to reproduce".

o/

> I'm adding dev extent overlap checking in btrfs_verify_dev_extents() and
> btrfs-progs. I think it could help to detect such problems.
>
> Thanks,
> Qu
>
>> [...]
On 2018/10/5 下午6:58, Hans van Kranenburg wrote:
> On 10/05/2018 09:51 AM, Qu Wenruo wrote:
>> On 2018/10/5 上午5:24, Hans van Kranenburg wrote:
>>> This patch set contains an additional fix for a newly exposed bug after
>>> the previous attempt to fix a chunk allocator bug for new DUP chunks:
>>>
>>> https://lore.kernel.org/linux-btrfs/782f6000-30c0-0085-abd2-74ec5827c903@mendix.com/T/#m609ccb5d32998e8ba5cfa9901c1ab56a38a6f374
>>
>> For that bug, did you succeed in reproducing it?
>
> Yes, open the above link and scroll to "Steps to reproduce".

That's the beyond-device-boundary one. Also reproduced it here, and
hand-crafted a super small image as a test case for btrfs-progs.

But I'm a little curious about the dev extent overlapping case.
Have you got one?

Thanks,
Qu

> o/
>
>> I'm adding dev extent overlap checking in btrfs_verify_dev_extents() and
>> btrfs-progs. I think it could help to detect such problems.
>>
>> Thanks,
>> Qu
>>
>>> [...]
On 10/08/2018 08:43 AM, Qu Wenruo wrote:
> On 2018/10/5 下午6:58, Hans van Kranenburg wrote:
>> On 10/05/2018 09:51 AM, Qu Wenruo wrote:
>>> On 2018/10/5 上午5:24, Hans van Kranenburg wrote:
>>>> This patch set contains an additional fix for a newly exposed bug after
>>>> the previous attempt to fix a chunk allocator bug for new DUP chunks:
>>>>
>>>> https://lore.kernel.org/linux-btrfs/782f6000-30c0-0085-abd2-74ec5827c903@mendix.com/T/#m609ccb5d32998e8ba5cfa9901c1ab56a38a6f374
>>>
>>> For that bug, did you succeed in reproducing it?
>>
>> Yes, open the above link and scroll to "Steps to reproduce".
>
> That's the beyond-device-boundary one. Also reproduced it here, and
> hand-crafted a super small image as a test case for btrfs-progs.
>
> But I'm a little curious about the dev extent overlapping case.
> Have you got one?

Ah ok, I see. No, I didn't do that yet.

Using unmodified tooling, I think this can be done with a combination of
a few resizes and a very specific balance, to end up with a filesystem of
exactly 7880MiB again with a single 1578MiB gap inside...

I'll try later today to see if I can come up with a recipe for this.
On 10/08/2018 03:19 PM, Hans van Kranenburg wrote:
> On 10/08/2018 08:43 AM, Qu Wenruo wrote:
>> [...]
>>
>> But I'm a little curious about the dev extent overlapping case.
>> Have you got one?
>
> Ah ok, I see. No, I didn't do that yet.
>
> Using unmodified tooling, I think this can be done with a combination of
> a few resizes and a very specific balance, to end up with a filesystem of
> exactly 7880MiB again with a single 1578MiB gap inside...
>
> I'll try later today to see if I can come up with a recipe for this.

Ok, this is actually pretty simple to do:

---- >8 ----
-# mkdir bork
-# cd bork
-# dd if=/dev/zero of=image bs=1 count=0 seek=1024M
0+0 records in
0+0 records out
0 bytes copied, 0.000183343 s, 0.0 kB/s
-# mkfs.btrfs -d dup -m dup image
-# losetup -f image
-# mount -o space_cache=v2 /dev/loop0 mountpoint
-# dd if=/dev/zero of=mountpoint/flapsie bs=1M
dd: error writing 'mountpoint/flapsie': No space left on device
453+0 records in
452+0 records out
474185728 bytes (474 MB, 452 MiB) copied, 4.07663 s, 116 MB/s
---- >8 ----
-# ./show_usage.py /bork/mountpoint/
Target profile for SYSTEM (chunk tree): DUP
Target profile for METADATA: DUP
Target profile for DATA: DUP
Mixed groups: False

Virtual space usage by block group type:
|
| type          total        used
| ----          -----        ----
| Data      452.31MiB   452.22MiB
| System      8.00MiB    16.00KiB
| Metadata   51.19MiB   656.00KiB

Total raw filesystem size: 1.00GiB
Total raw allocated bytes: 1023.00MiB
Allocatable bytes remaining: 1.00MiB
Unallocatable raw bytes: 0.00B
Unallocatable bytes that can be reclaimed by balancing: 0.00B

Estimated virtual space left to use for metadata: 51.05MiB
Estimated virtual space left to use for data: 96.00KiB

Allocated raw disk bytes by chunk type. Parity is a reserved part of the
allocated bytes, limiting the amount that can be used for data or metadata:
|
| flags            allocated        used   parity
| -----            ---------        ----   ------
| DATA|DUP         904.62MiB   904.44MiB    0.00B
| SYSTEM|DUP        16.00MiB    32.00KiB    0.00B
| METADATA|DUP     102.38MiB     1.28MiB    0.00B

Allocated bytes per device:
|
| devid   total size    allocated   path
| -----   ----------    ---------   ----
|     1      1.00GiB   1023.00MiB   /dev/loop0

Allocated bytes per device, split up per chunk type. Parity bytes are
again part of the total amount of allocated bytes.
|
| Device ID: 1
| |
| | flags          allocated   parity
| | -----          ---------   ------
| | DATA|DUP       904.62MiB    0.00B
| | SYSTEM|DUP      16.00MiB    0.00B
| | METADATA|DUP   102.38MiB    0.00B

Unallocatable raw bytes per device:
|
| devid   unallocatable
| -----   -------------
|     1           0.00B

---- >8 ----

Now we're gonna cause some neat 1578MiB to happen that we're going to
free up later to reproduce the failure:

-# dd if=/dev/zero of=image bs=1 count=0 seek=2602M
0+0 records in
0+0 records out
0 bytes copied, 0.000188621 s, 0.0 kB/s
-# losetup -c /dev/loop0
-# btrfs fi resize max mountpoint/
Resize 'mountpoint/' of 'max'
-# dd if=/dev/zero of=mountpoint/1578MiB bs=1M
dd: error writing 'mountpoint/1578MiB': No space left on device
790+0 records in
789+0 records out
827326464 bytes (827 MB, 789 MiB) copied, 12.2452 s, 67.6 MB/s

---- >8 ----

-# python3
import btrfs
fs = btrfs.FileSystem('/bork/mountpoint/')
for d in fs.dev_extents():
    print("start {} end {} vaddr {}".format(d.paddr, d.paddr + d.length, d.vaddr))

start 1048576 end 11534336 vaddr 547880960
start 11534336 end 22020096 vaddr 547880960
start 22020096 end 30408704 vaddr 22020096
start 30408704 end 38797312 vaddr 22020096
start 38797312 end 92471296 vaddr 30408704
start 92471296 end 146145280 vaddr 30408704
start 146145280 end 213254144 vaddr 84082688
start 213254144 end 280363008 vaddr 84082688
start 280363008 end 397803520 vaddr 151191552
start 397803520 end 515244032 vaddr 151191552
start 515244032 end 632684544 vaddr 268632064
start 632684544 end 750125056 vaddr 268632064
start 750125056 end 867565568 vaddr 386072576
start 867565568 end 985006080 vaddr 386072576
start 985006080 end 1029373952 vaddr 503513088
start 1029373952 end 1073741824 vaddr 503513088
start 1073741824 end 1358954496 vaddr 558366720
start 1358954496 end 1644167168 vaddr 558366720
start 1644167168 end 1929379840 vaddr 843579392
start 1929379840 end 2214592512 vaddr 843579392
start 2214592512 end 2471493632 vaddr 1128792064
start 2471493632 end 2728394752 vaddr 1128792064

The last three block groups were just added, using up the new space:

for vaddr in (558366720, 843579392, 1128792064):
    print(fs.block_group(vaddr))

block group vaddr 558366720 length 285212672 flags DATA|DUP used 285212672 used_pct 100
block group vaddr 843579392 length 285212672 flags DATA|DUP used 285212672 used_pct 100
block group vaddr 1128792064 length 256901120 flags DATA|DUP used 256901120 used_pct 100

By the way.. the first and second time I did this (for the writeup),
extent allocation looked like the paste linked below O_o... I have no
idea where the pattern with decreasing sizes the first time is coming
from, and no idea why the second time all the data is being stored in
144KiB extents...

http://paste.debian.net/plainh/537bdd95

---- >8 ----

Ok, now let's extend the disk to our famous number of 7880M:

-# dd if=/dev/zero of=image bs=1 count=0 seek=7880M
0+0 records in
0+0 records out
0 bytes copied, 0.000185768 s, 0.0 kB/s
-# losetup -c /dev/loop0
-# btrfs fi resize max mountpoint/
Resize 'mountpoint/' of 'max'
-# ./show_usage.py /bork/mountpoint/
[...]
Total raw filesystem size: 7.70GiB
Total raw allocated bytes: 2.54GiB
Allocatable bytes remaining: 5.16GiB
[...]
-# dd if=/dev/zero of=mountpoint/bakkiepleur bs=1M
dd: error writing 'mountpoint/bakkiepleur': No space left on device
2384+0 records in
2383+0 records out
2498756608 bytes (2.5 GB, 2.3 GiB) copied, 34.1946 s, 73.1 MB/s
-# ./show_usage.py /bork/mountpoint/
Target profile for SYSTEM (chunk tree): DUP
Target profile for METADATA: DUP
Target profile for DATA: DUP
Mixed groups: False

Virtual space usage by block group type:
|
| type          total        used
| ----          -----        ----
| Data        3.54GiB     3.54GiB
| System      8.00MiB    16.00KiB
| Metadata  307.19MiB   102.05MiB

Total raw filesystem size: 7.70GiB
Total raw allocated bytes: 7.69GiB
Allocatable bytes remaining: 1.00MiB
Unallocatable raw bytes: 0.00B
Unallocatable bytes that can be reclaimed by balancing: 0.00B

Estimated virtual space left to use for metadata: 205.64MiB
Estimated virtual space left to use for data: 0.00B

Allocated raw disk bytes by chunk type. Parity is a reserved part of the
allocated bytes, limiting the amount that can be used for data or metadata:
|
| flags            allocated        used   parity
| -----            ---------        ----   ------
| DATA|DUP           7.08GiB     7.08GiB    0.00B
| SYSTEM|DUP        16.00MiB    32.00KiB    0.00B
| METADATA|DUP     614.38MiB   204.09MiB    0.00B

Allocated bytes per device:
|
| devid   total size   allocated   path
| -----   ----------   ---------   ----
|     1      7.70GiB     7.69GiB   /dev/loop0

Allocated bytes per device, split up per chunk type. Parity bytes are
again part of the total amount of allocated bytes.
|
| Device ID: 1
| |
| | flags          allocated   parity
| | -----          ---------   ------
| | DATA|DUP         7.08GiB    0.00B
| | SYSTEM|DUP      16.00MiB    0.00B
| | METADATA|DUP   614.38MiB    0.00B

Unallocatable raw bytes per device:
|
| devid   unallocatable
| -----   -------------
|     1           0.00B

---- >8 ----

Now, generating the gap in the physical space is as simple as dropping
the 1578MiB file and triggering a transaction commit to clean up the
empty block groups:

-# rm mountpoint/1578MiB
-# btrfs fi sync mountpoint/

for d in fs.dev_extents():
    print("start {} end {} vaddr {}".format(d.paddr, d.paddr + d.length, d.vaddr))

start 1048576 end 11534336 vaddr 547880960
start 11534336 end 22020096 vaddr 547880960
start 22020096 end 30408704 vaddr 22020096
start 30408704 end 38797312 vaddr 22020096
start 38797312 end 92471296 vaddr 30408704
start 92471296 end 146145280 vaddr 30408704
start 146145280 end 213254144 vaddr 84082688
start 213254144 end 280363008 vaddr 84082688
start 280363008 end 397803520 vaddr 151191552
start 397803520 end 515244032 vaddr 151191552
start 515244032 end 632684544 vaddr 268632064
start 632684544 end 750125056 vaddr 268632064
start 750125056 end 867565568 vaddr 386072576
start 867565568 end 985006080 vaddr 386072576
start 985006080 end 1029373952 vaddr 503513088
start 1029373952 end 1073741824 vaddr 503513088
[558366720, 843579392 and 1128792064 are gone]
start 2728394752 end 3567255552 vaddr 1385693184
start 3567255552 end 4406116352 vaddr 1385693184
start 4406116352 end 5244977152 vaddr 2224553984
start 5244977152 end 6083837952 vaddr 2224553984
start 6083837952 end 6352273408 vaddr 3063414784
start 6352273408 end 6620708864 vaddr 3063414784
start 6620708864 end 7441743872 vaddr 3331850240
start 7441743872 end 8262778880 vaddr 3331850240

-# ./show_usage.py /bork/mountpoint/
[...]
Allocatable bytes remaining: 1.54GiB
[...]

---- >8 ----

And then...
-# dd if=/dev/zero of=mountpoint/omgwtfbbq bs=1M
dd: error writing 'mountpoint/omgwtfbbq': No space left on device
801+0 records in
800+0 records out
838860800 bytes (839 MB, 800 MiB) copied, 12.6689 s, 66.2 MB/s

We happily write 800MiB(!), destroying the first 22MiB of the device
extent at 2728394752...

for d in fs.dev_extents():
    print("start {} end {} vaddr {}".format(d.paddr, d.paddr + d.length, d.vaddr))

[...]
start 1912602624 end 2751463424 vaddr 4152885248
start 2728394752 end 3567255552 vaddr 1385693184
[...]

import btrfs
fs = btrfs.FileSystem('/bork/mountpoint/')
paddr = 1912602624
tree = btrfs.ctree.DEV_TREE_OBJECTID
devid = 1
# The key construction apparently got lost when this mail was archived;
# presumably something like:
key = btrfs.ctree.Key(devid, btrfs.ctree.DEV_EXTENT_KEY, paddr)
for header, data in btrfs.ioctl.search_v2(fs.fd, tree, key, key):
    btrfs.utils.pretty_print(btrfs.ctree.DevExtent(header, data))

<btrfs.ctree.DevExtent (1 DEV_EXTENT 1912602624)>
devid: 1 (key objectid)
paddr: 1912602624 (key offset)
length: 838860800
chunk_offset: 4152885248
uuid: 00000000-0000-0000-0000-000000000000
chunk_objectid: 256
chunk_tree: 3

Ouch!

-# ./show_usage.py /bork/mountpoint/
[...]
Total raw filesystem size: 7.70GiB
Total raw allocated bytes: 7.72GiB
Allocatable bytes remaining: 0.00B
Unallocatable raw bytes: -22020096.00B
[...]

---- >8 ----

And now more fun:

-# btrfs scrub status mountpoint/
scrub status for 1f7a998a-4ea3-4117-8b42-9d0e6f1d3b4c
        scrub started at Tue Oct  9 01:35:30 2018 and finished after 00:00:13
        total bytes scrubbed: 6.52GiB with 0 errors

But that's probably because it's all data from /dev/zero...

-# btrfs check /dev/loop0
Checking filesystem on /dev/loop0
UUID: 1f7a998a-4ea3-4117-8b42-9d0e6f1d3b4c
checking extents
Device extent[1, 2728394752, 838860800] existed.
Chunk[256, 228, 1385693184] stripe[1, 2728394752] dismatch dev extent[1, 1912602624, 838860800]
Dev extent's total-byte(7445938176) is not equal to byte-used(8284798976) in dev[1, 216, 1]
ERROR: errors found in extent allocation tree or chunk allocation
checking free space tree
checking fs roots
checking only csum items (without verifying data)
checking root refs
found 3917824000 bytes used, error(s) found
total csum bytes: 3722464
total tree bytes: 106020864
total fs tree bytes: 48955392
total extent tree bytes: 48644096
btree space waste bytes: 1428937
file data blocks allocated: 3811803136
 referenced 3811803136

This one actually already detects that something is wrong.
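The two btrfs check complaints above are exactly what the planned dev extent
checks are about: the dev extent at 1912602624 overlaps the one starting at
2728394752, and the per-device sum of dev extent lengths no longer matches
the byte-used value recorded in the dev item. As a rough user-space
illustration of that second check, here is a minimal python-btrfs sketch; it
assumes that fs.devices() yields dev items exposing devid and bytes_used
(the dev extent side only uses calls already shown in this thread), so treat
it as an illustration rather than the btrfs check implementation.

import btrfs

# Minimal sketch: compare the per-device sum of dev extent lengths with the
# bytes_used recorded in the dev item, similar in spirit to the
# "total-byte ... is not equal to byte-used" report from btrfs check above.
fs = btrfs.FileSystem('/bork/mountpoint/')
extent_total = {}
for d in fs.dev_extents():
    extent_total[d.devid] = extent_total.get(d.devid, 0) + d.length
for device in fs.devices():  # assumption: dev items expose devid and bytes_used
    total = extent_total.get(device.devid, 0)
    if total != device.bytes_used:
        print("devid {}: dev extents total {} != bytes_used {}".format(
            device.devid, total, device.bytes_used))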
On Thu, Oct 04, 2018 at 11:24:37PM +0200, Hans van Kranenburg wrote:
> This patch set contains an additional fix for a newly exposed bug after
> the previous attempt to fix a chunk allocator bug for new DUP chunks:
>
> https://lore.kernel.org/linux-btrfs/782f6000-30c0-0085-abd2-74ec5827c903@mendix.com/T/#m609ccb5d32998e8ba5cfa9901c1ab56a38a6f374
>
> [...]
>
> Any feedback and review is appreciated. Be gentle and keep in mind that
> I'm still very much in a learning stage regarding kernel development.

The patches look good, thanks. Problem is explained, preparatory work is
separated from the fix itself.

> The stable patches handling workflow is not 100% clear to me yet. I
> guess I have to add a Fixes: in the DUP patch which points to the
> previous commit 92e222df7b.

Almost nobody does it right, no worries. If you can identify a single
patch that introduces a bug then it's for Fixes:, otherwise a CC: stable
with version where it makes sense & applies is enough. I do that check
myself regardless of what's in the patch.

I ran the patches in a VM and hit a division-by-zero in test
fstests/btrfs/011, stacktrace below. First guess is that it's caused by
patch 3/6.
[ 3116.065595] BTRFS: device fsid e3bd8db5-304f-4b1a-8488-7791ea94088f devid 1 transid 5 /dev/vdb
[ 3116.071274] BTRFS: device fsid e3bd8db5-304f-4b1a-8488-7791ea94088f devid 2 transid 5 /dev/vdc
[ 3116.087086] BTRFS info (device vdb): disk space caching is enabled
[ 3116.088644] BTRFS info (device vdb): has skinny extents
[ 3116.089796] BTRFS info (device vdb): flagging fs with big metadata feature
[ 3116.093971] BTRFS info (device vdb): checking UUID tree
[ 3125.853755] BTRFS info (device vdb): dev_replace from /dev/vdb (devid 1) to /dev/vdd started
[ 3125.860269] divide error: 0000 [#1] PREEMPT SMP
[ 3125.861264] CPU: 1 PID: 6477 Comm: btrfs Not tainted 4.19.0-rc7-default+ #288
[ 3125.862841] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 3125.865385] RIP: 0010:__btrfs_alloc_chunk+0x368/0xa70 [btrfs]
[ 3125.870541] RSP: 0018:ffffa4ea0409fa48 EFLAGS: 00010206
[ 3125.871862] RAX: 0000000004000000 RBX: ffff94e074374508 RCX: 0000000000000002
[ 3125.873587] RDX: 0000000000000000 RSI: ffff94e017818c80 RDI: 0000000002000000
[ 3125.874677] RBP: 0000000080800000 R08: 0000000000000000 R09: 0000000000000002
[ 3125.875816] R10: 0000000300000000 R11: 0000000080900000 R12: 0000000000000000
[ 3125.876742] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000002
[ 3125.877657] FS:  00007f6de34208c0(0000) GS:ffff94e07d600000(0000) knlGS:0000000000000000
[ 3125.878862] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3125.880080] CR2: 00007ffe963d5ce8 CR3: 000000007659b000 CR4: 00000000000006e0
[ 3125.881485] Call Trace:
[ 3125.882105]  do_chunk_alloc+0x266/0x3e0 [btrfs]
[ 3125.882841]  btrfs_inc_block_group_ro+0x10e/0x160 [btrfs]
[ 3125.883875]  scrub_enumerate_chunks+0x18b/0x5d0 [btrfs]
[ 3125.884658]  ? is_module_address+0x11/0x30
[ 3125.885271]  ? wait_for_completion+0x160/0x190
[ 3125.885928]  btrfs_scrub_dev+0x1b8/0x5a0 [btrfs]
[ 3125.887767]  ? start_transaction+0xa1/0x470 [btrfs]
[ 3125.888648]  btrfs_dev_replace_start.cold.19+0x155/0x17e [btrfs]
[ 3125.889459]  btrfs_dev_replace_by_ioctl+0x35/0x60 [btrfs]
[ 3125.890251]  btrfs_ioctl+0x2a94/0x31d0 [btrfs]
[ 3125.890885]  ? do_sigaction+0x7c/0x210
[ 3125.891731]  ? do_vfs_ioctl+0xa2/0x6b0
[ 3125.892652]  do_vfs_ioctl+0xa2/0x6b0
[ 3125.893642]  ? do_sigaction+0x1a7/0x210
[ 3125.894665]  ksys_ioctl+0x3a/0x70
[ 3125.895523]  __x64_sys_ioctl+0x16/0x20
[ 3125.896339]  do_syscall_64+0x5a/0x1a0
[ 3125.896949]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3125.897638] RIP: 0033:0x7f6de28ecaa7
[ 3125.901313] RSP: 002b:00007ffe963da9c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 3125.902486] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f6de28ecaa7
[ 3125.903538] RDX: 00007ffe963dae00 RSI: 00000000ca289435 RDI: 0000000000000003
[ 3125.904878] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 3125.905788] R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffe963de26f
[ 3125.906700] R13: 0000000000000001 R14: 0000000000000004 R15: 000055fceeceb2a0
[ 3125.907954] Modules linked in: btrfs libcrc32c xor zstd_decompress zstd_compress xxhash raid6_pq loop
[ 3125.909342] ---[ end trace 5492bb467d3be2da ]---
[ 3125.910031] RIP: 0010:__btrfs_alloc_chunk+0x368/0xa70 [btrfs]
[ 3125.913600] RSP: 0018:ffffa4ea0409fa48 EFLAGS: 00010206
[ 3125.914595] RAX: 0000000004000000 RBX: ffff94e074374508 RCX: 0000000000000002
[ 3125.916209] RDX: 0000000000000000 RSI: ffff94e017818c80 RDI: 0000000002000000
[ 3125.917701] RBP: 0000000080800000 R08: 0000000000000000 R09: 0000000000000002
[ 3125.919209] R10: 0000000300000000 R11: 0000000080900000 R12: 0000000000000000
[ 3125.920782] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000002
[ 3125.922413] FS:  00007f6de34208c0(0000) GS:ffff94e07d600000(0000) knlGS:0000000000000000
[ 3125.924264] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3125.925627] CR2: 00007ffe963d5ce8 CR3: 000000007659b000 CR4: 00000000000006e0
On 10/11/2018 05:13 PM, David Sterba wrote:
> On Thu, Oct 04, 2018 at 11:24:37PM +0200, Hans van Kranenburg wrote:
>> [...]
>
> The patches look good, thanks. Problem is explained, preparatory work is
> separated from the fix itself.

\o/

>> The stable patches handling workflow is not 100% clear to me yet. I
>> guess I have to add a Fixes: in the DUP patch which points to the
>> previous commit 92e222df7b.
>
> Almost nobody does it right, no worries. If you can identify a single
> patch that introduces a bug then it's for Fixes:, otherwise a CC: stable
> with version where it makes sense & applies is enough. I do that check
> myself regardless of what's in the patch.

It's 92e222df7b, and the thing I'm not sure about is whether it will also
catch the same patch where it was already backported to LTS kernels,
since 92e222df7b itself has a Fixes: in it... So by now the new bug is in
4.19, 4.14, 4.9, 4.4, 3.16...

> I ran the patches in a VM and hit a division-by-zero in test
> fstests/btrfs/011, stacktrace below. First guess is that it's caused by
> patch 3/6.

Ah interesting, dev replace.

I'll play around with replace and find out how to run the tests properly
and then reproduce this.

The code introduced in patch 3 is removed again in patch 6, so I don't
suspect that one. :)

But, I'll find out.

Thanks for testing.
Hans

> [ 3116.065595] BTRFS: device fsid e3bd8db5-304f-4b1a-8488-7791ea94088f devid 1 transid 5 /dev/vdb
> [...]
> [ 3125.925627] CR2: 00007ffe963d5ce8 CR3: 000000007659b000 CR4: 00000000000006e0
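To make the Fixes:/CC: stable discussion above a bit more concrete: the tags
David describes would end up at the bottom of the commit message of the DUP
fix, roughly as sketched below. This is only an illustration of the format;
the quoted subject of 92e222df7b is abbreviated here, and the stable version
range is an example, not a statement about which trees need the backport.

    btrfs: alloc_chunk: fix more DUP stripe size handling

    <changelog describing the bug and the fix>

    Fixes: 92e222df7b ("<subject of the earlier DUP stripe size fix>")
    CC: stable@vger.kernel.org # 4.4+
    Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>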
On 10/11/2018 09:40 PM, Hans van Kranenburg wrote:
> On 10/11/2018 05:13 PM, David Sterba wrote:
>> [...]
>>
>> I ran the patches in a VM and hit a division-by-zero in test
>> fstests/btrfs/011, stacktrace below. First guess is that it's caused by
>> patch 3/6.
>
> Ah interesting, dev replace.
>
> I'll play around with replace and find out how to run the tests properly
> and then reproduce this.
>
> The code introduced in patch 3 is removed again in patch 6, so I don't
> suspect that one. :)

Actually, while writing this I realize that this means it should be
tested separately (like, an older kernel with only patch 3), because,
well, obvious I guess.

> But, I'll find out.
>
> Thanks for testing.
>
> Hans
>
>> [ 3116.065595] BTRFS: device fsid e3bd8db5-304f-4b1a-8488-7791ea94088f devid 1 transid 5 /dev/vdb
>> [...]
>> [ 3125.925627] CR2: 00007ffe963d5ce8 CR3: 000000007659b000 CR4: 00000000000006e0
On Thu, Oct 11, 2018 at 07:40:22PM +0000, Hans van Kranenburg wrote:
> On 10/11/2018 05:13 PM, David Sterba wrote:
> > [...]
> >
> > I ran the patches in a VM and hit a division-by-zero in test
> > fstests/btrfs/011, stacktrace below. First guess is that it's caused by
> > patch 3/6.
>
> Ah interesting, dev replace.
>
> I'll play around with replace and find out how to run the tests properly
> and then reproduce this.
>
> The code introduced in patch 3 is removed again in patch 6, so I don't
> suspect that one. :)
>
> But, I'll find out.
>
> Thanks for testing.

I've merged patches 1-5 to misc-next as they seem to be safe and pass
fstests, please let me know when you have updates to the last one.
Thanks.
On 11/13/18 4:03 PM, David Sterba wrote:
> On Thu, Oct 11, 2018 at 07:40:22PM +0000, Hans van Kranenburg wrote:
>> [...]
>>
>> The code introduced in patch 3 is removed again in patch 6, so I don't
>> suspect that one. :)
>>
>> But, I'll find out.
>>
>> Thanks for testing.
>
> I've merged patches 1-5 to misc-next as they seem to be safe and pass
> fstests, please let me know when you have updates to the last one.
> Thanks.

I'll definitely follow up on that. It has not happened yet because of
something about todo lists, ordering work, and not doing too many things
at the same time.

Thanks for already confirming it was not patch 3 but patch 6 that has the
problem; I already suspected that.

For patch 3, the actual fix: how do we get that on top of old stable
kernels where the previous fix (92e222df7b) is in? Because of the Fixes:
line, that one was picked up again for 4.14, 4.9, 4.4 and even 3.16. How
does this work? Does it need all the other commit ids from those branches
in Fixes: lines? Or is there magic somewhere that does this?

From your "development cycle open" mails, I understand that for testing
myself, I would test based on misc-next, for-next or for-x.y, in that
order, depending on where the first 5 patches are at that moment?

Hans