| Message ID | 1edfa6a2ffd66d55e6345a477df5387d2c1415d0.1626653825.git.asml.silence@gmail.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | [RFC] bio: fix page leak bio_add_hw_page failure |
On Mon, Jul 19, 2021 at 11:53:00AM +0100, Pavel Begunkov wrote:
> __bio_iov_append_get_pages() doesn't put not appended pages on
> bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
> the same for __bio_iov_iter_get_pages(), even though it looks like it
> can't be triggered by userspace in this case.
>
> Fixes: 0512a75b98f8 ("block: Introduce REQ_OP_ZONE_APPEND")
> Cc: stable@vger.kernel.org # 5.8+
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>
> I haven't tested the fail path, thus RFC. Would be great if someone can
> do it or take over the fix.
>
>  block/bio.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/block/bio.c b/block/bio.c
> index 1fab762e079b..d95e3456ba0c 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -979,6 +979,14 @@ static int bio_iov_bvec_set_append(struct bio *bio, struct iov_iter *iter)
>          return 0;
>  }
>
> +static void bio_put_pages(struct page **pages, size_t size, size_t off)
> +{
> +        size_t i, nr = DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
> +
> +        for (i = 0; i < nr; i++)
> +                put_page(pages[i]);
> +}
> +
>  #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
>
>  /**
> @@ -1023,8 +1031,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>                  if (same_page)
>                          put_page(page);
>          } else {
> -                if (WARN_ON_ONCE(bio_full(bio, len)))
> -                        return -EINVAL;
> +                if (WARN_ON_ONCE(bio_full(bio, len))) {
> +                        bio_put_pages(pages + i, left, offset);
> +                        return -EINVAL;
> +                }

It is unlikely to happen:

        unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
        struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
        struct page **pages = (struct page **)bv;

        pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
        size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);

>                  __bio_add_page(bio, page, len, offset);
>          }
>          offset = 0;
> @@ -1069,6 +1079,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
>          len = min_t(size_t, PAGE_SIZE - offset, left);
>          if (bio_add_hw_page(q, bio, page, len, offset,
>                          max_append_sectors, &same_page) != len) {
> +                bio_put_pages(pages + i, left, offset);

Same with above.

Thanks,
Ming
On 7/19/21 4:34 PM, Ming Lei wrote:
> On Mon, Jul 19, 2021 at 11:53:00AM +0100, Pavel Begunkov wrote:
>> __bio_iov_append_get_pages() doesn't put not appended pages on
>> bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
>> the same for __bio_iov_iter_get_pages(), even though it looks like it
>> can't be triggered by userspace in this case.
>>
>> Fixes: 0512a75b98f8 ("block: Introduce REQ_OP_ZONE_APPEND")
>> Cc: stable@vger.kernel.org # 5.8+
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>
>> I haven't tested the fail path, thus RFC. Would be great if someone can
>> do it or take over the fix.
>>
>>  block/bio.c | 15 +++++++++++++--
>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/bio.c b/block/bio.c
>> index 1fab762e079b..d95e3456ba0c 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -979,6 +979,14 @@ static int bio_iov_bvec_set_append(struct bio *bio, struct iov_iter *iter)
>>          return 0;
>>  }
>>
>> +static void bio_put_pages(struct page **pages, size_t size, size_t off)
>> +{
>> +        size_t i, nr = DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
>> +
>> +        for (i = 0; i < nr; i++)
>> +                put_page(pages[i]);
>> +}
>> +
>>  #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
>>
>>  /**
>> @@ -1023,8 +1031,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>>                  if (same_page)
>>                          put_page(page);
>>          } else {
>> -                if (WARN_ON_ONCE(bio_full(bio, len)))
>> -                        return -EINVAL;
>> +                if (WARN_ON_ONCE(bio_full(bio, len))) {
>> +                        bio_put_pages(pages + i, left, offset);
>> +                        return -EINVAL;
>> +                }
>
> It is unlikely to happen:
>
>         unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
>         struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
>         struct page **pages = (struct page **)bv;
>
>         pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
>         size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);

Agree, mentioned in the commit, however ...

>>                  __bio_add_page(bio, page, len, offset);
>>          }
>>          offset = 0;
>> @@ -1069,6 +1079,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
>>          len = min_t(size_t, PAGE_SIZE - offset, left);
>>          if (bio_add_hw_page(q, bio, page, len, offset,
>>                          max_append_sectors, &same_page) != len) {
>> +                bio_put_pages(pages + i, left, offset);
>
> Same with above.

... bio_add_hw_page() is more complex and additionally does checks
against queue_max_zone_append_sectors(), queue_max_segments(), and
queue_virt_boundary() in of bvec_gap_to_prev().

It may be unlikely, but are you sure that those are just safety
checks? It's not so obvious to me, so would be great if you could
point out the other place where the verification is done.
On Mon, Jul 19, 2021 at 06:06:49PM +0100, Pavel Begunkov wrote:
> On 7/19/21 4:34 PM, Ming Lei wrote:
> > On Mon, Jul 19, 2021 at 11:53:00AM +0100, Pavel Begunkov wrote:
> >> __bio_iov_append_get_pages() doesn't put not appended pages on
> >> bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
> >> the same for __bio_iov_iter_get_pages(), even though it looks like it
> >> can't be triggered by userspace in this case.
> >>
> >> Fixes: 0512a75b98f8 ("block: Introduce REQ_OP_ZONE_APPEND")
> >> Cc: stable@vger.kernel.org # 5.8+
> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >> ---
> >>
> >> I haven't tested the fail path, thus RFC. Would be great if someone can
> >> do it or take over the fix.
> >>
> >>  block/bio.c | 15 +++++++++++++--
> >>  1 file changed, 13 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/block/bio.c b/block/bio.c
> >> index 1fab762e079b..d95e3456ba0c 100644
> >> --- a/block/bio.c
> >> +++ b/block/bio.c
> >> @@ -979,6 +979,14 @@ static int bio_iov_bvec_set_append(struct bio *bio, struct iov_iter *iter)
> >>          return 0;
> >>  }
> >>
> >> +static void bio_put_pages(struct page **pages, size_t size, size_t off)
> >> +{
> >> +        size_t i, nr = DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
> >> +
> >> +        for (i = 0; i < nr; i++)
> >> +                put_page(pages[i]);
> >> +}
> >> +
> >>  #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
> >>
> >>  /**
> >> @@ -1023,8 +1031,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> >>                  if (same_page)
> >>                          put_page(page);
> >>          } else {
> >> -                if (WARN_ON_ONCE(bio_full(bio, len)))
> >> -                        return -EINVAL;
> >> +                if (WARN_ON_ONCE(bio_full(bio, len))) {
> >> +                        bio_put_pages(pages + i, left, offset);
> >> +                        return -EINVAL;
> >> +                }
> >
> > It is unlikely to happen:
> >
> >         unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
> >         struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
> >         struct page **pages = (struct page **)bv;
> >
> >         pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
> >         size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
>
> Agree, mentioned in the commit, however ...
>
> >>                  __bio_add_page(bio, page, len, offset);
> >>          }
> >>          offset = 0;
> >> @@ -1069,6 +1079,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
> >>          len = min_t(size_t, PAGE_SIZE - offset, left);
> >>          if (bio_add_hw_page(q, bio, page, len, offset,
> >>                          max_append_sectors, &same_page) != len) {
> >> +                bio_put_pages(pages + i, left, offset);
> >
> > Same with above.
>
> ... bio_add_hw_page() is more complex and additionally does checks
> against queue_max_zone_append_sectors(), queue_max_segments(), and
> queue_virt_boundary() in of bvec_gap_to_prev().
>
> It may be unlikely, but are you sure that those are just safety
> checks? It's not so obvious to me, so would be great if you could
> point out the other place where the verification is done.

OK, bio_add_hw_page() is special, and it needs the handling, but
__bio_iov_iter_get_pages() needn't that since it is so obvious.

Thanks,
Ming
On 7/20/21 3:11 AM, Ming Lei wrote:
> On Mon, Jul 19, 2021 at 06:06:49PM +0100, Pavel Begunkov wrote:
>> On 7/19/21 4:34 PM, Ming Lei wrote:
>>> On Mon, Jul 19, 2021 at 11:53:00AM +0100, Pavel Begunkov wrote:
>>>> __bio_iov_append_get_pages() doesn't put not appended pages on
>>>> bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
>>>> the same for __bio_iov_iter_get_pages(), even though it looks like it
>>>> can't be triggered by userspace in this case.
>>>>
>>>> Fixes: 0512a75b98f8 ("block: Introduce REQ_OP_ZONE_APPEND")
>>>> Cc: stable@vger.kernel.org # 5.8+
>>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>>> ---
>>>>
>>>> I haven't tested the fail path, thus RFC. Would be great if someone can
>>>> do it or take over the fix.
>>>>
>>>>  block/bio.c | 15 +++++++++++++--
>>>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/block/bio.c b/block/bio.c
>>>> index 1fab762e079b..d95e3456ba0c 100644
>>>> --- a/block/bio.c
>>>> +++ b/block/bio.c
>>>> @@ -979,6 +979,14 @@ static int bio_iov_bvec_set_append(struct bio *bio, struct iov_iter *iter)
>>>>          return 0;
>>>>  }
>>>>
>>>> +static void bio_put_pages(struct page **pages, size_t size, size_t off)
>>>> +{
>>>> +        size_t i, nr = DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
>>>> +
>>>> +        for (i = 0; i < nr; i++)
>>>> +                put_page(pages[i]);
>>>> +}
>>>> +
>>>>  #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
>>>>
>>>>  /**
>>>> @@ -1023,8 +1031,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>>>>                  if (same_page)
>>>>                          put_page(page);
>>>>          } else {
>>>> -                if (WARN_ON_ONCE(bio_full(bio, len)))
>>>> -                        return -EINVAL;
>>>> +                if (WARN_ON_ONCE(bio_full(bio, len))) {
>>>> +                        bio_put_pages(pages + i, left, offset);
>>>> +                        return -EINVAL;
>>>> +                }
>>>
>>> It is unlikely to happen:
>>>
>>>         unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
>>>         struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
>>>         struct page **pages = (struct page **)bv;
>>>
>>>         pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
>>>         size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
>>
>> Agree, mentioned in the commit, however ...
>>
>>>>                  __bio_add_page(bio, page, len, offset);
>>>>          }
>>>>          offset = 0;
>>>> @@ -1069,6 +1079,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
>>>>          len = min_t(size_t, PAGE_SIZE - offset, left);
>>>>          if (bio_add_hw_page(q, bio, page, len, offset,
>>>>                          max_append_sectors, &same_page) != len) {
>>>> +                bio_put_pages(pages + i, left, offset);
>>>
>>> Same with above.
>>
>> ... bio_add_hw_page() is more complex and additionally does checks
>> against queue_max_zone_append_sectors(), queue_max_segments(), and
>> queue_virt_boundary() in of bvec_gap_to_prev().
>>
>> It may be unlikely, but are you sure that those are just safety
>> checks? It's not so obvious to me, so would be great if you could
>> point out the other place where the verification is done.
>
> OK, bio_add_hw_page() is special, and it needs the handling, but
> __bio_iov_iter_get_pages() needn't that since it is so obvious.

Right. I don't mind to drop the first chunk, but it doesn't hurt,
and I'd guess the bug came from copy-pasting and editing
__bio_iov_iter_get_pages(). That's the reason I added it in the
first place.
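As a rough user-space sketch of the argument settled in the thread above (this is not kernel code; names such as `free_slots` and `fake_get_pages` are invented for illustration): in `__bio_iov_iter_get_pages()` the page count requested from the iterator is exactly the number of free bvec slots, so the append loop cannot run out of room, whereas `bio_add_hw_page()` can legitimately reject a page against queue limits mid-loop.

```c
/*
 * Hedged model of why bio_full() should not fire in
 * __bio_iov_iter_get_pages(): the number of pages pinned is bounded by
 * the free bvec slots. Illustrative only; all names are made up.
 */
#include <assert.h>
#include <stdio.h>

#define BI_MAX_VECS 256

/* stand-in for iov_iter_get_pages(): never returns more than nr_pages */
static int fake_get_pages(int nr_pages, int iov_len_pages)
{
        return iov_len_pages < nr_pages ? iov_len_pages : nr_pages;
}

int main(void)
{
        int bi_vcnt = 200;                      /* segments already in the bio */
        int free_slots = BI_MAX_VECS - bi_vcnt; /* "nr_pages" in the kernel code */
        int pinned = fake_get_pages(free_slots, 1000 /* oversized request */);

        /* each pinned page consumes at most one free slot in the worst case */
        assert(bi_vcnt + pinned <= BI_MAX_VECS);
        printf("pinned %d pages into %d free slots; the bio cannot overflow\n",
               pinned, free_slots);
        return 0;
}
```

No such bound exists for the hardware-limit checks in the zone-append path, which is why the cleanup there is the essential part of the fix.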
On 7/19/21 11:53 AM, Pavel Begunkov wrote:
> __bio_iov_append_get_pages() doesn't put not appended pages on
> bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
> the same for __bio_iov_iter_get_pages(), even though it looks like it
> can't be triggered by userspace in this case.

Any comments?

>
> Fixes: 0512a75b98f8 ("block: Introduce REQ_OP_ZONE_APPEND")
> Cc: stable@vger.kernel.org # 5.8+
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>
> I haven't tested the fail path, thus RFC. Would be great if someone can
> do it or take over the fix.
>
>  block/bio.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/block/bio.c b/block/bio.c
> index 1fab762e079b..d95e3456ba0c 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -979,6 +979,14 @@ static int bio_iov_bvec_set_append(struct bio *bio, struct iov_iter *iter)
>          return 0;
>  }
>
> +static void bio_put_pages(struct page **pages, size_t size, size_t off)
> +{
> +        size_t i, nr = DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
> +
> +        for (i = 0; i < nr; i++)
> +                put_page(pages[i]);
> +}
> +
>  #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
>
>  /**
> @@ -1023,8 +1031,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>                  if (same_page)
>                          put_page(page);
>          } else {
> -                if (WARN_ON_ONCE(bio_full(bio, len)))
> -                        return -EINVAL;
> +                if (WARN_ON_ONCE(bio_full(bio, len))) {
> +                        bio_put_pages(pages + i, left, offset);
> +                        return -EINVAL;
> +                }
>                  __bio_add_page(bio, page, len, offset);
>          }
>          offset = 0;
> @@ -1069,6 +1079,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
>          len = min_t(size_t, PAGE_SIZE - offset, left);
>          if (bio_add_hw_page(q, bio, page, len, offset,
>                          max_append_sectors, &same_page) != len) {
> +                bio_put_pages(pages + i, left, offset);
>                  ret = -EINVAL;
>                  break;
>          }
>
On 7/19/21 4:53 AM, Pavel Begunkov wrote:
> __bio_iov_append_get_pages() doesn't put not appended pages on
> bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
> the same for __bio_iov_iter_get_pages(), even though it looks like it
> can't be triggered by userspace in this case.

Applied for 5.15, thanks.
diff --git a/block/bio.c b/block/bio.c
index 1fab762e079b..d95e3456ba0c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -979,6 +979,14 @@ static int bio_iov_bvec_set_append(struct bio *bio, struct iov_iter *iter)
         return 0;
 }
 
+static void bio_put_pages(struct page **pages, size_t size, size_t off)
+{
+        size_t i, nr = DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
+
+        for (i = 0; i < nr; i++)
+                put_page(pages[i]);
+}
+
 #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
 
 /**
@@ -1023,8 +1031,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
                 if (same_page)
                         put_page(page);
         } else {
-                if (WARN_ON_ONCE(bio_full(bio, len)))
-                        return -EINVAL;
+                if (WARN_ON_ONCE(bio_full(bio, len))) {
+                        bio_put_pages(pages + i, left, offset);
+                        return -EINVAL;
+                }
                 __bio_add_page(bio, page, len, offset);
         }
         offset = 0;
@@ -1069,6 +1079,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
         len = min_t(size_t, PAGE_SIZE - offset, left);
         if (bio_add_hw_page(q, bio, page, len, offset,
                         max_append_sectors, &same_page) != len) {
+                bio_put_pages(pages + i, left, offset);
                 ret = -EINVAL;
                 break;
         }
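A quick user-space check of the arithmetic in the new `bio_put_pages()` helper: `size` is the number of bytes not yet appended (`left`) and `off` is the offset into the first of those pages, so the helper releases exactly the pages spanned by the remaining data. A minimal sketch, with `PAGE_SIZE`, `PAGE_MASK`, and `DIV_ROUND_UP` redefined locally (this is not kernel code):

```c
/* Standalone check of the page-count math used by bio_put_pages(). */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static size_t pages_to_put(size_t size, size_t off)
{
        /* bytes left plus the offset into the first remaining page,
         * rounded up to whole pages */
        return DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
}

int main(void)
{
        /* failure with 8192 bytes left, starting 512 bytes into a page:
         * the remaining data spans three pages, all three are released */
        assert(pages_to_put(8192, 512) == 3);
        /* failure at a page boundary with exactly one page left */
        assert(pages_to_put(4096, 0) == 1);
        printf("bio_put_pages() page count matches for the sampled cases\n");
        return 0;
}
```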
__bio_iov_append_get_pages() doesn't put not appended pages on
bio_add_hw_page() failure, so potentially leaking them, fix it. Also, do
the same for __bio_iov_iter_get_pages(), even though it looks like it
can't be triggered by userspace in this case.

Fixes: 0512a75b98f8 ("block: Introduce REQ_OP_ZONE_APPEND")
Cc: stable@vger.kernel.org # 5.8+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---

I haven't tested the fail path, thus RFC. Would be great if someone can
do it or take over the fix.

 block/bio.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)
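For readers unfamiliar with the failure mode being fixed, the sketch below simulates it in user space: pages are "pinned" with a refcount, an append loop rejects a page partway through (standing in for `bio_add_hw_page()` hitting a queue limit), and without the cleanup call the refcounts of the not-yet-appended pages would never be dropped. All types and limits here (`dummy_page`, `HW_LIMIT`, and so on) are invented for illustration and are not part of the kernel API.

```c
/*
 * User-space simulation of the leak addressed by this patch: refcounted
 * dummy "pages", an append loop that can fail mid-way, and the cleanup
 * that puts the pages which were pinned but never appended.
 */
#include <stdio.h>

#define NR_PAGES 4
#define HW_LIMIT 2  /* stand-in for a queue limit such as max segments */

struct dummy_page { int refcount; };

static void put_page(struct dummy_page *p) { p->refcount--; }

/* analogue of bio_put_pages(): drop the pages that were never appended */
static void put_remaining(struct dummy_page **pages, int nr)
{
        for (int i = 0; i < nr; i++)
                put_page(pages[i]);
}

int main(void)
{
        struct dummy_page mem[NR_PAGES] = { {0} };
        struct dummy_page *pages[NR_PAGES];
        int appended = 0, i;

        /* "pin" the pages, as iov_iter_get_pages() would */
        for (i = 0; i < NR_PAGES; i++) {
                mem[i].refcount = 1;
                pages[i] = &mem[i];
        }

        /* append loop; a page is rejected once the fake hardware limit is hit */
        for (i = 0; i < NR_PAGES; i++) {
                if (appended == HW_LIMIT) {
                        /* the fix: put pages[i..NR_PAGES) that were pinned
                         * but never appended, otherwise they leak */
                        put_remaining(pages + i, NR_PAGES - i);
                        break;
                }
                appended++;
        }

        /* "completion": appended pages are put when the I/O finishes */
        put_remaining(pages, appended);

        for (i = 0; i < NR_PAGES; i++)
                if (mem[i].refcount != 0)
                        printf("page %d leaked\n", i);
        printf("done, %d of %d pages appended\n", appended, NR_PAGES);
        return 0;
}
```

Commenting out the `put_remaining(pages + i, ...)` call in the loop reproduces the pre-fix behaviour in this toy model: the last two pages keep a non-zero refcount, mirroring the pinned pages the kernel would leak.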