Message ID | 878tewzjz3.fsf@notabene.neil.brown.name (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Nov 23 2017 at 5:52pm -0500, NeilBrown <neilb@suse.com> wrote: > > When we use bio_clone_bioset() to split off the front part of a bio > and chain the two together and submit the remainder to > generic_make_request(), it is important that the newly allocated > bio is used as the head to be processed immediately, and the original > bio gets "bio_advance()"d and sent to generic_make_request() as the > remainder. > > If the newly allocated bio is used as the remainder, and if it then > needs to be split again, then the next bio_clone_bioset() call will > be made while holding a reference a bio (result of the first clone) > from the same bioset. This can potentially exhaust the bioset mempool > and result in a memory allocation deadlock. > > So the result of the bio_clone_bioset() must be attached to the new > dm_io struct, and the original must be resubmitted. The current code > is backwards. > > Note that there is no race caused by reassigning cio.io->bio after already > calling __map_bio(). This bio will only be dereferenced again after > dec_pending() has found io->io_count to be zero, and this cannot happen > before the dec_pending() call at the end of __split_and_process_bio(). > > Reported-by: Mikulas Patocka <mpatocka@redhat.com> > Signed-off-by: NeilBrown <neilb@suse.com> > --- > > Hi, > I think this should resolve the problem Mikulas noticed that the > bios form a deep chain instead of a wide tree. I'm inclined to just fold this into the original commit. I'd update that header to make mention of the details captured in this header. Would you be OK with that? Mike
On Mon, Nov 27 2017, Mike Snitzer wrote: > On Thu, Nov 23 2017 at 5:52pm -0500, > NeilBrown <neilb@suse.com> wrote: > >> >> When we use bio_clone_bioset() to split off the front part of a bio >> and chain the two together and submit the remainder to >> generic_make_request(), it is important that the newly allocated >> bio is used as the head to be processed immediately, and the original >> bio gets "bio_advance()"d and sent to generic_make_request() as the >> remainder. >> >> If the newly allocated bio is used as the remainder, and if it then >> needs to be split again, then the next bio_clone_bioset() call will >> be made while holding a reference a bio (result of the first clone) >> from the same bioset. This can potentially exhaust the bioset mempool >> and result in a memory allocation deadlock. >> >> So the result of the bio_clone_bioset() must be attached to the new >> dm_io struct, and the original must be resubmitted. The current code >> is backwards. >> >> Note that there is no race caused by reassigning cio.io->bio after already >> calling __map_bio(). This bio will only be dereferenced again after >> dec_pending() has found io->io_count to be zero, and this cannot happen >> before the dec_pending() call at the end of __split_and_process_bio(). >> >> Reported-by: Mikulas Patocka <mpatocka@redhat.com> >> Signed-off-by: NeilBrown <neilb@suse.com> >> --- >> >> Hi, >> I think this should resolve the problem Mikulas noticed that the >> bios form a deep chain instead of a wide tree. > > I'm inclined to just fold this into the original commit. > I'd update that header to make mention of the details captured in this > header. > > Would you be OK with that? Perfectly OK with that. Thanks for asking. NeilBrown
diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 99ec215f7dcb..2e0e10a1c030 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1482,12 +1482,19 @@ static void __split_and_process_bio(struct mapped_device *md, * Remainder must be passed to generic_make_request() * so that it gets handled *after* bios already submitted * have been completely processed. + * We take a clone of the original to store in + * ci.io->bio to be used by end_io_acct() and + * for dec_pending to use for completion handling. + * As this path is not used for REQ_OP_ZONE_REPORT, + * the usage of io->bio in dm_remap_zone_report() + * won't be affected by this reassignment. */ struct bio *b = bio_clone_bioset(bio, GFP_NOIO, md->queue->bio_split); - bio_advance(b, (bio_sectors(b) - ci.sector_count) << 9); + ci.io->bio = b; + bio_advance(bio, (bio_sectors(bio) - ci.sector_count) << 9); bio_chain(b, bio); - generic_make_request(b); + generic_make_request(bio); break; } }
When we use bio_clone_bioset() to split off the front part of a bio and chain the two together and submit the remainder to generic_make_request(), it is important that the newly allocated bio is used as the head to be processed immediately, and the original bio gets "bio_advance()"d and sent to generic_make_request() as the remainder. If the newly allocated bio is used as the remainder, and if it then needs to be split again, then the next bio_clone_bioset() call will be made while holding a reference a bio (result of the first clone) from the same bioset. This can potentially exhaust the bioset mempool and result in a memory allocation deadlock. So the result of the bio_clone_bioset() must be attached to the new dm_io struct, and the original must be resubmitted. The current code is backwards. Note that there is no race caused by reassigning cio.io->bio after already calling __map_bio(). This bio will only be dereferenced again after dec_pending() has found io->io_count to be zero, and this cannot happen before the dec_pending() call at the end of __split_and_process_bio(). Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> --- Hi, I think this should resolve the problem Mikulas noticed that the bios form a deep chain instead of a wide tree. Thanks, NeilBrown drivers/md/dm.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)