Message ID | 20170714134051.22756-1-dsterba@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Jul 14, 2017 at 9:40 PM, David Sterba <dsterba@suse.com> wrote: > We've switched to cloned bios in btrfs and hit a nasty bug leading to > corruptions, when cloned bios are iterated by bio_for_each_segment_all. No, you simply can't use bio_for_each_segment_all on cloned bio, and the reason is obviously. > > Report and fix: > https://patchwork.kernel.org/patch/9838535/ > > As a matter of precaution, we've added assertions to btrfs code to catch > the bad usage pattern: > > https://patchwork.kernel.org/patch/9839267/ > > The cloned/bi_vcnt behaviour seems tobe implementation dependent and is > not documented, so this patch at least warns about this one particular > case but this might still be insufficient. > > CC: linux-block@vger.kernel.org > Signed-off-by: David Sterba <dsterba@suse.com> > --- > include/linux/bio.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/include/linux/bio.h b/include/linux/bio.h > index 7b1cf4ba0902..f1ac84edcf39 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio) > /* > * drivers should _never_ use the all version - the bio may have been split > * before it got to the driver and the driver won't own all of it > + * > + * Note that cloned bios must not use this as their bi_vcnt may be invalid and > + * this could lead to silent corruptions. > */ > #define bio_for_each_segment_all(bvl, bio, i) \ > for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++) > -- > 2.13.0 > Maybe we can add a warning here if it is a cloned bio.
On Fri, Jul 14, 2017 at 09:47:30PM +0800, Ming Lei wrote: > On Fri, Jul 14, 2017 at 9:40 PM, David Sterba <dsterba@suse.com> wrote: > > We've switched to cloned bios in btrfs and hit a nasty bug leading to > > corruptions, when cloned bios are iterated by bio_for_each_segment_all. > > No, you simply can't use bio_for_each_segment_all on cloned bio, and the > reason is obviously. This was not obvious to us, speaking for the btrfs developers trying to make more use of the of the bio API, so we had to find out the hard way. The proposed WARN_ON, possibly more sanity checks or documentation would help us not to trip over similar problems in the future. I try to take great care when dealing with code changing bios on our side so I read the headers, and partially the implementation, but still can miss something.
On 07/14/2017 04:03 PM, David Sterba wrote: > On Fri, Jul 14, 2017 at 09:47:30PM +0800, Ming Lei wrote: >> On Fri, Jul 14, 2017 at 9:40 PM, David Sterba <dsterba@suse.com> wrote: >>> We've switched to cloned bios in btrfs and hit a nasty bug leading to >>> corruptions, when cloned bios are iterated by bio_for_each_segment_all. >> >> No, you simply can't use bio_for_each_segment_all on cloned bio, and the >> reason is obviously. > > This was not obvious to us, speaking for the btrfs developers trying to > make more use of the of the bio API, so we had to find out the hard way. Yep, it might be obvious to those familiar with the block layer's internals, but for those not so familiar, it's not. There's no mention in bio_clone_fast() that the cloned bio's bi_vcnt shouldn't be used, and after finding that, one has to check which bio APIs use it and which don't. In this specific btrfs issue, it lead to silent write corruptions, making it harder to find (as opposed to crashes or other immediate failures). > > The proposed WARN_ON, possibly more sanity checks or documentation would > help us not to trip over similar problems in the future. I try to take > great care when dealing with code changing bios on our side so I read > the headers, and partially the implementation, but still can miss > something. >
On 07/14/2017 11:56 AM, Filipe Manana wrote: > > > On 07/14/2017 04:03 PM, David Sterba wrote: >> On Fri, Jul 14, 2017 at 09:47:30PM +0800, Ming Lei wrote: >>> On Fri, Jul 14, 2017 at 9:40 PM, David Sterba <dsterba@suse.com> wrote: >>>> We've switched to cloned bios in btrfs and hit a nasty bug leading to >>>> corruptions, when cloned bios are iterated by bio_for_each_segment_all. >>> >>> No, you simply can't use bio_for_each_segment_all on cloned bio, and the >>> reason is obviously. >> >> This was not obvious to us, speaking for the btrfs developers trying to >> make more use of the of the bio API, so we had to find out the hard way. > > Yep, it might be obvious to those familiar with the block layer's > internals, but for those not so familiar, it's not. There's no mention > in bio_clone_fast() that the cloned bio's bi_vcnt shouldn't be used, > and after finding that, one has to check which bio APIs use it and > which don't. In this specific btrfs issue, it lead to silent write > corruptions, making it harder to find (as opposed to crashes or other > immediate failures). It's hard to circulate info like that, but the WARN_ON() should have been there from the get-go. I just need someone to test that patch triggers for the problematic case, then I'd be happy to get it queued up.
diff --git a/include/linux/bio.h b/include/linux/bio.h index 7b1cf4ba0902..f1ac84edcf39 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -156,6 +156,9 @@ static inline void *bio_data(struct bio *bio) /* * drivers should _never_ use the all version - the bio may have been split * before it got to the driver and the driver won't own all of it + * + * Note that cloned bios must not use this as their bi_vcnt may be invalid and + * this could lead to silent corruptions. */ #define bio_for_each_segment_all(bvl, bio, i) \ for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
We've switched to cloned bios in btrfs and hit a nasty bug leading to corruptions, when cloned bios are iterated by bio_for_each_segment_all. Report and fix: https://patchwork.kernel.org/patch/9838535/ As a matter of precaution, we've added assertions to btrfs code to catch the bad usage pattern: https://patchwork.kernel.org/patch/9839267/ The cloned/bi_vcnt behaviour seems tobe implementation dependent and is not documented, so this patch at least warns about this one particular case but this might still be insufficient. CC: linux-block@vger.kernel.org Signed-off-by: David Sterba <dsterba@suse.com> --- include/linux/bio.h | 3 +++ 1 file changed, 3 insertions(+)