[V2] block: fix .bi_size overflow

Message ID 20190701071446.22028-1-ming.lei@redhat.com (mailing list archive)
State Superseded, archived

Commit Message

Ming Lei July 1, 2019, 7:14 a.m. UTC
'bio->bi_iter.bi_size' is an 'unsigned int', which can hold at most
4G - 1 bytes.

Before 07173c3ec276 ("block: enable multipage bvecs"), one bio could
include only a limited number of pages, usually at most 256, so the fs
bio size was rarely bigger than 1M bytes.

Since multi-page bvecs are supported, in theory more than 1M pages can
be added to one fs bio, especially in the case of hugepages or a big
writeback with many dirty pages. In that case .bi_size may overflow.

Fix this issue by having bio_full() check whether the added segment
would overflow .bi_size.

Cc: Liu Yiding <liuyd.fnst@cn.fujitsu.com>
Cc: kernel test robot <rong.a.chen@intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Fixes: 07173c3ec276 ("block: enable multipage bvecs")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/bio.c         | 10 +++++-----
 fs/iomap.c          |  2 +-
 fs/xfs/xfs_aops.c   |  2 +-
 include/linux/bio.h | 18 ++++++++++++++++--
 4 files changed, 23 insertions(+), 9 deletions(-)
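
Illustration only, not part of the patch: a minimal userspace sketch of
the overflow check that the new bio_full() performs. The names
model_bio, model_bio_full() and model_bio_add() are hypothetical
stand-ins, not kernel API, and the model ignores page merging, so every
successful add consumes one vector. For scale, assuming 4K pages,
256 pages is 1M bytes, while 1M pages is 4G bytes, one byte more than
an 'unsigned int' can hold.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few struct bio fields the check reads. */
struct model_bio {
	unsigned int bi_size;		/* models bio->bi_iter.bi_size */
	unsigned short bi_vcnt;		/* vectors currently in use */
	unsigned short bi_max_vecs;	/* vector capacity */
};

/* Same two conditions as the patched bio_full(). */
static bool model_bio_full(const struct model_bio *bio, unsigned int len)
{
	/* No free vector slot left. */
	if (bio->bi_vcnt >= bio->bi_max_vecs)
		return true;

	/*
	 * Written as a subtraction so the comparison itself cannot wrap:
	 * "bi_size + len > UINT_MAX" would always be false in unsigned
	 * arithmetic.
	 */
	if (bio->bi_size > UINT_MAX - len)
		return true;

	return false;
}

/* Add one segment of len bytes; returns len on success, 0 if refused. */
static unsigned int model_bio_add(struct model_bio *bio, unsigned int len)
{
	if (model_bio_full(bio, len))
		return 0;

	/* Guaranteed by the check above, so the addition cannot wrap. */
	assert(bio->bi_size <= UINT_MAX - len);
	bio->bi_size += len;
	bio->bi_vcnt++;
	return len;
}
```

With len == 0 the size condition can never trigger, so the call reduces
to the old vector-count test. That appears to be why the loop in
bio_iov_iter_get_pages() passes 0.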

Comments

Christoph Hellwig July 1, 2019, 8:25 a.m. UTC | #1
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Jens Axboe July 1, 2019, 2:05 p.m. UTC | #2
On 7/1/19 1:14 AM, Ming Lei wrote:
> 'bio->bi_iter.bi_size' is 'unsigned int', which at most hold 4G - 1
> bytes.
> 
> Before 07173c3ec276 ("block: enable multipage bvecs"), one bio can
> include very limited pages, and usually at most 256, so the fs bio
> size won't be bigger than 1M bytes most of times.
> 
> Since we support multi-page bvec, in theory one fs bio really can
> be added > 1M pages, especially in case of hugepage, or big writeback
> with too many dirty pages. Then there is chance in which .bi_size
> is overflowed.
> 
> Fixes this issue by using bio_full() to check if the added segment may
> overflow .bi_size.

Any objections to queuing this up for 5.3? It's not a new regression
in this series.
Jens Axboe July 1, 2019, 2:14 p.m. UTC | #3
On 7/1/19 8:05 AM, Jens Axboe wrote:
> On 7/1/19 1:14 AM, Ming Lei wrote:
>> 'bio->bi_iter.bi_size' is 'unsigned int', which at most hold 4G - 1
>> bytes.
>>
>> Before 07173c3ec276 ("block: enable multipage bvecs"), one bio can
>> include very limited pages, and usually at most 256, so the fs bio
>> size won't be bigger than 1M bytes most of times.
>>
>> Since we support multi-page bvec, in theory one fs bio really can
>> be added > 1M pages, especially in case of hugepage, or big writeback
>> with too many dirty pages. Then there is chance in which .bi_size
>> is overflowed.
>>
>> Fixes this issue by using bio_full() to check if the added segment may
>> overflow .bi_size.
> 
> Any objections to queuing this up for 5.3? It's not a new regression
> this series.

I took a closer look, and applied for 5.3 and removed the stable tag.
We'll need to apply your patch for stable, and I added an adapted
one for 5.3. I don't want a huge merge hassle because of this.
Jens Axboe July 1, 2019, 2:20 p.m. UTC | #4
On 7/1/19 8:14 AM, Jens Axboe wrote:
> On 7/1/19 8:05 AM, Jens Axboe wrote:
>> On 7/1/19 1:14 AM, Ming Lei wrote:
>>> 'bio->bi_iter.bi_size' is 'unsigned int', which at most hold 4G - 1
>>> bytes.
>>>
>>> Before 07173c3ec276 ("block: enable multipage bvecs"), one bio can
>>> include very limited pages, and usually at most 256, so the fs bio
>>> size won't be bigger than 1M bytes most of times.
>>>
>>> Since we support multi-page bvec, in theory one fs bio really can
>>> be added > 1M pages, especially in case of hugepage, or big writeback
>>> with too many dirty pages. Then there is chance in which .bi_size
>>> is overflowed.
>>>
>>> Fixes this issue by using bio_full() to check if the added segment may
>>> overflow .bi_size.
>>
>> Any objections to queuing this up for 5.3? It's not a new regression
>> this series.
> 
> I took a closer look, and applied for 5.3 and removed the stable tag.
> We'll need to apply your patch for stable, and I added an adapted
> one for 5.3. I don't want a huge merge hassle because of this.

OK, so we still get conflicts with that, due to both the same page
merge fix, and Christoph's 5.3 changes.

I ended up pulling in 5.2-rc6 in for-5.3/block, which resolves at
least most of it, and kept the stable tag since now it's possible
to backport without too much trouble.
Ming Lei July 2, 2019, 1:38 a.m. UTC | #5
On Mon, Jul 01, 2019 at 08:20:13AM -0600, Jens Axboe wrote:
> On 7/1/19 8:14 AM, Jens Axboe wrote:
> > On 7/1/19 8:05 AM, Jens Axboe wrote:
> >> On 7/1/19 1:14 AM, Ming Lei wrote:
> >>> 'bio->bi_iter.bi_size' is 'unsigned int', which at most hold 4G - 1
> >>> bytes.
> >>>
> >>> Before 07173c3ec276 ("block: enable multipage bvecs"), one bio can
> >>> include very limited pages, and usually at most 256, so the fs bio
> >>> size won't be bigger than 1M bytes most of times.
> >>>
> >>> Since we support multi-page bvec, in theory one fs bio really can
> >>> be added > 1M pages, especially in case of hugepage, or big writeback
> >>> with too many dirty pages. Then there is chance in which .bi_size
> >>> is overflowed.
> >>>
> >>> Fixes this issue by using bio_full() to check if the added segment may
> >>> overflow .bi_size.
> >>
> >> Any objections to queuing this up for 5.3? It's not a new regression
> >> this series.
> > 
> > I took a closer look, and applied for 5.3 and removed the stable tag.
> > We'll need to apply your patch for stable, and I added an adapted
> > one for 5.3. I don't want a huge merge hassle because of this.
> 
> OK, so we still get conflicts with that, due to both the same page
> merge fix, and Christophs 5.3 changes.
> 
> I ended up pulling in 5.2-rc6 in for-5.3/block, which resolves at
> least most of it, and kept the stable tag since now it's possible
> to backport without too much trouble.

Thanks for merging it.

BTW, we need the -stable tag, since Yiding has a test case that
reproduces the issue reliably, which just needs one big machine with
enough memory and fast storage, I guess.

thanks, 
Ming
Jens Axboe July 2, 2019, 1:54 a.m. UTC | #6
On 7/1/19 7:38 PM, Ming Lei wrote:
> On Mon, Jul 01, 2019 at 08:20:13AM -0600, Jens Axboe wrote:
>> On 7/1/19 8:14 AM, Jens Axboe wrote:
>>> On 7/1/19 8:05 AM, Jens Axboe wrote:
>>>> On 7/1/19 1:14 AM, Ming Lei wrote:
>>>>> 'bio->bi_iter.bi_size' is 'unsigned int', which at most hold 4G - 1
>>>>> bytes.
>>>>>
>>>>> Before 07173c3ec276 ("block: enable multipage bvecs"), one bio can
>>>>> include very limited pages, and usually at most 256, so the fs bio
>>>>> size won't be bigger than 1M bytes most of times.
>>>>>
>>>>> Since we support multi-page bvec, in theory one fs bio really can
>>>>> be added > 1M pages, especially in case of hugepage, or big writeback
>>>>> with too many dirty pages. Then there is chance in which .bi_size
>>>>> is overflowed.
>>>>>
>>>>> Fixes this issue by using bio_full() to check if the added segment may
>>>>> overflow .bi_size.
>>>>
>>>> Any objections to queuing this up for 5.3? It's not a new regression
>>>> this series.
>>>
>>> I took a closer look, and applied for 5.3 and removed the stable tag.
>>> We'll need to apply your patch for stable, and I added an adapted
>>> one for 5.3. I don't want a huge merge hassle because of this.
>>
>> OK, so we still get conflicts with that, due to both the same page
>> merge fix, and Christophs 5.3 changes.
>>
>> I ended up pulling in 5.2-rc6 in for-5.3/block, which resolves at
>> least most of it, and kept the stable tag since now it's possible
>> to backport without too much trouble.
> 
> Thanks for merging it.
> 
> BTW, we need the -stable tag, since Yiding has test case to reproduce
> the issue reliably, which just needs one big machine with enough memory,
> and fast storage, I guess.

Just to be clear, I wasn't saying it shouldn't go to stable. But it's
pointless to mark something for stable if you know it'll reject, and
won't be easily fixable by the person applying it. For that case, it's
better to NOT CC stable, and just send in an appropriate patch instead.

But that's all moot now, as per last section in the email you are
replying to.
Patch

diff --git a/block/bio.c b/block/bio.c
index ce797d73bb43..67bba12d273b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -731,7 +731,7 @@  static int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		}
 	}
 
-	if (bio_full(bio))
+	if (bio_full(bio, len))
 		return 0;
 
 	if (bio->bi_phys_segments >= queue_max_segments(q))
@@ -807,7 +807,7 @@  void __bio_add_page(struct bio *bio, struct page *page,
 	struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt];
 
 	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
-	WARN_ON_ONCE(bio_full(bio));
+	WARN_ON_ONCE(bio_full(bio, len));
 
 	bv->bv_page = page;
 	bv->bv_offset = off;
@@ -834,7 +834,7 @@  int bio_add_page(struct bio *bio, struct page *page,
 	bool same_page = false;
 
 	if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
-		if (bio_full(bio))
+		if (bio_full(bio, len))
 			return 0;
 		__bio_add_page(bio, page, len, offset);
 	}
@@ -922,7 +922,7 @@  static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 			if (same_page)
 				put_page(page);
 		} else {
-			if (WARN_ON_ONCE(bio_full(bio)))
+			if (WARN_ON_ONCE(bio_full(bio, len)))
                                 return -EINVAL;
 			__bio_add_page(bio, page, len, offset);
 		}
@@ -966,7 +966,7 @@  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 			ret = __bio_iov_bvec_add_pages(bio, iter);
 		else
 			ret = __bio_iov_iter_get_pages(bio, iter);
-	} while (!ret && iov_iter_count(iter) && !bio_full(bio));
+	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
 	if (iov_iter_bvec_no_ref(iter))
 		bio_set_flag(bio, BIO_NO_PAGE_REF);
diff --git a/fs/iomap.c b/fs/iomap.c
index 12654c2e78f8..da961fca3180 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -333,7 +333,7 @@  iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	if (iop)
 		atomic_inc(&iop->read_count);
 
-	if (!ctx->bio || !is_contig || bio_full(ctx->bio)) {
+	if (!ctx->bio || !is_contig || bio_full(ctx->bio, plen)) {
 		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
 		int nr_vecs = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
 
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 8da5e6637771..11f703d4a605 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -782,7 +782,7 @@  xfs_add_to_ioend(
 		atomic_inc(&iop->write_count);
 
 	if (!merged) {
-		if (bio_full(wpc->ioend->io_bio))
+		if (bio_full(wpc->ioend->io_bio, len))
 			xfs_chain_bio(wpc->ioend, wbc, bdev, sector);
 		bio_add_page(wpc->ioend->io_bio, page, len, poff);
 	}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index f87abaa898f0..e36b8fc1b1c3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -102,9 +102,23 @@  static inline void *bio_data(struct bio *bio)
 	return NULL;
 }
 
-static inline bool bio_full(struct bio *bio)
+/**
+ * bio_full - check if the bio is full
+ * @bio:	bio to check
+ * @len:	length of one segment to be added
+ *
+ * Return true if @bio is full and one segment with @len bytes can't be
+ * added to the bio, otherwise return false
+ */
+static inline bool bio_full(struct bio *bio, unsigned len)
 {
-	return bio->bi_vcnt >= bio->bi_max_vecs;
+	if (bio->bi_vcnt >= bio->bi_max_vecs)
+		return true;
+
+	if (bio->bi_iter.bi_size > UINT_MAX - len)
+		return true;
+
+	return false;
 }
 
 static inline bool bio_next_segment(const struct bio *bio,