Message ID | 1408462393-3291-1-git-send-email-bo.li.liu@oracle.com (mailing list archive) |
---|---|
State | Superseded |
On 08/19/2014 11:33 AM, Liu Bo wrote:
> The crash is
>
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/extent_io.c:2124!
> [...]
> Workqueue: btrfs-endio normal_work_helper [btrfs]
> RIP: 0010:[<ffffffffa02d6055>] [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
>
> This is in fact a regression.
>
> It is because we forgot to increase @offset properly in reading corrupted block,
> so that the @offset remains, and this leads to checksum errors while reading
> left blocks queued up in the same bio, and then ends up with hiting the above
> BUG_ON.

Thanks Chris and Liu, this is queued.

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 8/19/14, 10:33 AM, Liu Bo wrote:
> The crash is
>
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/extent_io.c:2124!
> [...]
> Workqueue: btrfs-endio normal_work_helper [btrfs]
> RIP: 0010:[<ffffffffa02d6055>] [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
>
> This is in fact a regression.

It'd be helpful to identify the commit, or at least kernel release, which caused
the regression.

> It is because we forgot to increase @offset properly in reading corrupted block,
> so that the @offset remains, and this leads to checksum errors while reading
> left blocks queued up in the same bio, and then ends up with hiting the above
> BUG_ON.

So does that mean that any checksum error on this path will crash the kernel?

That sounds like this bug has exposed a more fundamental problem, no?

Thanks,
-Eric

> Reported-by: Chris Murphy <lists@colorremedies.com>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/extent_io.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 3af4966..be41e4d 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
>  					test_bit(BIO_UPTODATE, &bio->bi_flags);
>  				if (err)
>  					uptodate = 0;
> +				offset += len;
>  				continue;
>  			}
>  		}
>

On Tue, Aug 19, 2014 at 04:42:42PM -0500, Eric Sandeen wrote:
> On 8/19/14, 10:33 AM, Liu Bo wrote:
> > The crash is
> >
> > ------------[ cut here ]------------
> > kernel BUG at fs/btrfs/extent_io.c:2124!
> > [...]
> > Workqueue: btrfs-endio normal_work_helper [btrfs]
> > RIP: 0010:[<ffffffffa02d6055>] [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
> >
> > This is in fact a regression.
>
> It'd be helpful to identify the commit, or at least kernel release, which caused
> the regression.

Okay, got it.

>
> > It is because we forgot to increase @offset properly in reading corrupted block,
> > so that the @offset remains, and this leads to checksum errors while reading
> > left blocks queued up in the same bio, and then ends up with hiting the above
> > BUG_ON.
>
> So does that mean that any checksum error on this path will crash the kernel?
>
> That sounds like this bug has exposed a more fundamental problem, no?

Eric, you're right, I was hiding some details, now writing a new commit log...

thanks,
-liubo

>
> Thanks,
> -Eric
>
> > Reported-by: Chris Murphy <lists@colorremedies.com>
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  fs/btrfs/extent_io.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index 3af4966..be41e4d 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> > @@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
> >  					test_bit(BIO_UPTODATE, &bio->bi_flags);
> >  				if (err)
> >  					uptodate = 0;
> > +				offset += len;
> >  				continue;
> >  			}
> >  		}
> >

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3af4966..be41e4d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 					test_bit(BIO_UPTODATE, &bio->bi_flags);
 				if (err)
 					uptodate = 0;
+				offset += len;
 				continue;
 			}
 		}

The crash is

------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:2124!
[...]
Workqueue: btrfs-endio normal_work_helper [btrfs]
RIP: 0010:[<ffffffffa02d6055>] [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]

This is in fact a regression.

It happens because we forgot to advance @offset when reading a corrupted
block, so @offset keeps its old value. This leads to checksum errors while
reading the remaining blocks queued up in the same bio, and we then end up
hitting the above BUG_ON.

Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/extent_io.c | 1 +
 1 file changed, 1 insertion(+)
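
For illustration only, here is a minimal user-space C sketch of the failure
mode described in the commit message. It is not the btrfs code and makes
several assumptions: struct fake_bio, expected_csum(), make_bio(),
walk_segments(), SEG_LEN and MAX_SEGS are hypothetical names, and the
"checksum" is a toy value derived from the byte offset. It only tries to show
why taking the "continue" path without advancing the offset makes every later
segment in the same bio look up the wrong checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_LEN  4096u  /* hypothetical segment length in bytes */
#define MAX_SEGS 8      /* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for a bio holding several queued blocks. */
struct fake_bio {
	size_t nsegs;
	uint32_t data[MAX_SEGS];   /* toy "checksum" of what was read */
	int corrupted[MAX_SEGS];   /* nonzero: segment takes the repair path */
};

/* Toy lookup: the expected checksum is keyed by the byte offset in the bio. */
static uint32_t expected_csum(uint64_t offset)
{
	return (uint32_t)(offset / SEG_LEN) + 100u;
}

/* Build a bio whose data matches expected_csum(); bad_index >= nsegs means none bad. */
static struct fake_bio make_bio(size_t nsegs, size_t bad_index)
{
	struct fake_bio bio = { .nsegs = nsegs };

	assert(nsegs <= MAX_SEGS);
	for (size_t i = 0; i < nsegs; i++) {
		bio.data[i] = (uint32_t)i + 100u;
		bio.corrupted[i] = (i == bad_index);
	}
	return bio;
}

/*
 * Walk all segments and count checksum mismatches on the normal path.
 * advance_on_repair == 0 models the missing "offset += len" before "continue".
 */
static size_t walk_segments(const struct fake_bio *bio, int advance_on_repair)
{
	uint64_t offset = 0;
	size_t mismatches = 0;

	assert(bio->nsegs <= MAX_SEGS);
	for (size_t i = 0; i < bio->nsegs; i++) {
		if (bio->corrupted[i]) {
			/* Repair path: this segment is handled elsewhere. */
			if (advance_on_repair)
				offset += SEG_LEN;  /* the one-line fix */
			continue;
		}
		/* Normal path: the lookup is only correct if offset is in sync. */
		if (bio->data[i] != expected_csum(offset))
			mismatches++;
		offset += SEG_LEN;
	}
	return mismatches;
}
```

With four segments and segment 1 flagged as corrupted, the walk that skips the
advance reports two mismatches (segments 2 and 3, both of them good data),
while the walk that advances the offset reports none. In this toy model the
later blocks are fine; only the stale offset makes them look bad.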