Message ID | 1550722655-15102-3-git-send-email-dongli.zhang@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Series | loop: fix two issues introduced by prior commit |
On Thu 21-02-19 12:17:35, Dongli Zhang wrote:
> Commit 0da03cab87e6
> ("loop: Fix deadlock when calling blkdev_reread_part()") moves
> blkdev_reread_part() out of the loop_ctl_mutex. However,
> GENHD_FL_NO_PART_SCAN is set before __blkdev_reread_part(). As a result,
> __blkdev_reread_part() will fail the check of GENHD_FL_NO_PART_SCAN and
> will not rescan the loop device to delete all partitions.
>
> Below are steps to reproduce the issue:
>
> step1 # dd if=/dev/zero of=tmp.raw bs=1M count=100
> step2 # losetup -P /dev/loop0 tmp.raw
> step3 # parted /dev/loop0 mklabel gpt
> step4 # parted -a none -s /dev/loop0 mkpart primary 64s 1
> step5 # losetup -d /dev/loop0

Can you perhaps write a blktest for this? Thanks!

> Step5 will not be able to delete /dev/loop0p1 (introduced by step4) and
> there is below kernel warning message:
>
> [ 464.414043] __loop_clr_fd: partition scan of loop0 failed (rc=-22)
>
> This patch sets GENHD_FL_NO_PART_SCAN after blkdev_reread_part().
>
> Fixes: 0da03cab87e6 ("loop: Fix deadlock when calling blkdev_reread_part()")
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  drivers/block/loop.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 7908673..736e55b 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1034,6 +1034,15 @@ loop_init_xfer(struct loop_device *lo, struct loop_func_table *xfer,
>  	return err;
>  }
>  
> +static void loop_disable_partscan(struct loop_device *lo)
> +{
> +	mutex_lock(&loop_ctl_mutex);
> +	lo->lo_flags = 0;
> +	if (!part_shift)
> +		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
> +	mutex_unlock(&loop_ctl_mutex);
> +}
> +
>  static int __loop_clr_fd(struct loop_device *lo, bool release)
>  {
>  	struct file *filp = NULL;
> @@ -1096,9 +1105,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  
>  	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev;
>  	lo_number = lo->lo_number;
> -	lo->lo_flags = 0;
> -	if (!part_shift)
> -		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>  	loop_unprepare_queue(lo);
>  out_unlock:
>  	mutex_unlock(&loop_ctl_mutex);
> @@ -1121,6 +1127,9 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  		/* Device is gone, no point in returning error */
>  		err = 0;
>  	}
> +
> +	loop_disable_partscan(lo);
> +
>  	/*
>  	 * Need not hold loop_ctl_mutex to fput backing file.
>  	 * Calling fput holding loop_ctl_mutex triggers a circular

So I don't think this change is actually correct. The problem is that once
lo->lo_state is set to Lo_unbound and loop_ctl_mutex is unlocked, the loop
device structure can be reused for a new device (bound to a new file). So
you cannot safely manipulate flags on lo->lo_disk anymore. But I think we
can just move the setting of lo->lo_state to Lo_unbound after partscan has
finished as well. There cannot be anybody else entering __loop_clr_fd() as
lo->lo_backing_file is already cleared and Lo_rundown state protects us
from all the other places trying to change the 'lo' device (please make
this last sentence into a comment in the code explaining why setting
lo->lo_state so late is fine). Thanks!

								Honza
On 02/21/2019 07:30 PM, Jan Kara wrote:
> On Thu 21-02-19 12:17:35, Dongli Zhang wrote:
>> Commit 0da03cab87e6
>> ("loop: Fix deadlock when calling blkdev_reread_part()") moves
>> blkdev_reread_part() out of the loop_ctl_mutex. However,
>> GENHD_FL_NO_PART_SCAN is set before __blkdev_reread_part(). As a result,
>> __blkdev_reread_part() will fail the check of GENHD_FL_NO_PART_SCAN and
>> will not rescan the loop device to delete all partitions.
>>
>> Below are steps to reproduce the issue:
>>
>> step1 # dd if=/dev/zero of=tmp.raw bs=1M count=100
>> step2 # losetup -P /dev/loop0 tmp.raw
>> step3 # parted /dev/loop0 mklabel gpt
>> step4 # parted -a none -s /dev/loop0 mkpart primary 64s 1
>> step5 # losetup -d /dev/loop0
>
> Can you perhaps write a blktest for this? Thanks!

I will write a blktest for above case. Thanks for the suggestion.

>
>> Step5 will not be able to delete /dev/loop0p1 (introduced by step4) and
>> there is below kernel warning message:
>>
>> [ 464.414043] __loop_clr_fd: partition scan of loop0 failed (rc=-22)
>>
>> This patch sets GENHD_FL_NO_PART_SCAN after blkdev_reread_part().
>>
>> Fixes: 0da03cab87e6 ("loop: Fix deadlock when calling blkdev_reread_part()")
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>  drivers/block/loop.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
>> index 7908673..736e55b 100644
>> --- a/drivers/block/loop.c
>> +++ b/drivers/block/loop.c
>> @@ -1034,6 +1034,15 @@ loop_init_xfer(struct loop_device *lo, struct loop_func_table *xfer,
>>  	return err;
>>  }
>>  
>> +static void loop_disable_partscan(struct loop_device *lo)
>> +{
>> +	mutex_lock(&loop_ctl_mutex);
>> +	lo->lo_flags = 0;
>> +	if (!part_shift)
>> +		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>> +	mutex_unlock(&loop_ctl_mutex);
>> +}
>> +
>>  static int __loop_clr_fd(struct loop_device *lo, bool release)
>>  {
>>  	struct file *filp = NULL;
>> @@ -1096,9 +1105,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>>  
>>  	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev;
>>  	lo_number = lo->lo_number;
>> -	lo->lo_flags = 0;
>> -	if (!part_shift)
>> -		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
>>  	loop_unprepare_queue(lo);
>>  out_unlock:
>>  	mutex_unlock(&loop_ctl_mutex);
>> @@ -1121,6 +1127,9 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>>  		/* Device is gone, no point in returning error */
>>  		err = 0;
>>  	}
>> +
>> +	loop_disable_partscan(lo);
>> +
>>  	/*
>>  	 * Need not hold loop_ctl_mutex to fput backing file.
>>  	 * Calling fput holding loop_ctl_mutex triggers a circular
>
> So I don't think this change is actually correct. The problem is that once
> lo->lo_state is set to Lo_unbound and loop_ctl_mutex is unlocked, the loop
> device structure can be reused for a new device (bound to a new file). So
> you cannot safely manipulate flags on lo->lo_disk anymore. But I think we
> can just move the setting of lo->lo_state to Lo_unbound after partscan has
> finished as well. There cannot be anybody else entering __loop_clr_fd() as
> lo->lo_backing_file is already cleared and Lo_rundown state protects us
> from all the other places trying to change the 'lo' device (please make
> this last sentence into a comment in the code explaining why setting
> lo->lo_state so late is fine). Thanks!

I will set lo->lo_state to Lo_unbound after partscan in v2.

Thank you very much!

Dongli Zhang

>
> 								Honza
>
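The ordering Jan suggests can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the v2 patch and not kernel code: every `model_*` name, the `lock_held` flag standing in for loop_ctl_mutex, and the `during_rescan` hook standing in for a concurrent caller are hypothetical. It shows that while the state stays at rundown during the unlocked rescan, an attempt to bind the device is refused, so setting the no-partscan flag afterwards cannot touch a reused device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum model_state { MODEL_UNBOUND, MODEL_BOUND, MODEL_RUNDOWN };
#define MODEL_NO_PART_SCAN 0x1

struct model_loop {
	enum model_state state;
	int disk_flags;
	int lock_held;		/* stands in for loop_ctl_mutex */
	void (*during_rescan)(struct model_loop *lo); /* models a concurrent caller */
	int racer_result;
};

void model_lock(struct model_loop *lo)
{
	assert(!lo->lock_held);
	lo->lock_held = 1;
}

void model_unlock(struct model_loop *lo)
{
	assert(lo->lock_held);
	lo->lock_held = 0;
}

/* Binding is only allowed on an unbound device. */
int model_set_fd(struct model_loop *lo)
{
	int err = -EBUSY;

	model_lock(lo);
	if (lo->state == MODEL_UNBOUND) {
		lo->state = MODEL_BOUND;
		lo->disk_flags &= ~MODEL_NO_PART_SCAN;
		err = 0;
	}
	model_unlock(lo);
	return err;
}

int model_clr_fd(struct model_loop *lo)
{
	model_lock(lo);
	if (lo->state != MODEL_BOUND) {
		model_unlock(lo);
		return -ENXIO;
	}
	lo->state = MODEL_RUNDOWN;
	model_unlock(lo);

	/* The partition rescan runs without the lock held. */
	if (lo->during_rescan)
		lo->during_rescan(lo);

	/*
	 * Only now disable partition scanning and mark the device unbound.
	 * The rundown state kept everyone else out in the meantime.
	 */
	model_lock(lo);
	lo->disk_flags |= MODEL_NO_PART_SCAN;
	lo->state = MODEL_UNBOUND;
	model_unlock(lo);
	return 0;
}

/* A caller that tries to bind the device while the rescan is in flight. */
void model_racer(struct model_loop *lo)
{
	lo->racer_result = model_set_fd(lo);
}
```

With `during_rescan` set to `model_racer`, the racing bind attempt is rejected during the rescan, and a bind issued after `model_clr_fd()` returns succeeds again.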
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 7908673..736e55b 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1034,6 +1034,15 @@ loop_init_xfer(struct loop_device *lo, struct loop_func_table *xfer,
 	return err;
 }
 
+static void loop_disable_partscan(struct loop_device *lo)
+{
+	mutex_lock(&loop_ctl_mutex);
+	lo->lo_flags = 0;
+	if (!part_shift)
+		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
+	mutex_unlock(&loop_ctl_mutex);
+}
+
 static int __loop_clr_fd(struct loop_device *lo, bool release)
 {
 	struct file *filp = NULL;
@@ -1096,9 +1105,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 
 	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev;
 	lo_number = lo->lo_number;
-	lo->lo_flags = 0;
-	if (!part_shift)
-		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
 	loop_unprepare_queue(lo);
 out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
@@ -1121,6 +1127,9 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 		/* Device is gone, no point in returning error */
 		err = 0;
 	}
+
+	loop_disable_partscan(lo);
+
 	/*
 	 * Need not hold loop_ctl_mutex to fput backing file.
 	 * Calling fput holding loop_ctl_mutex triggers a circular
Commit 0da03cab87e6
("loop: Fix deadlock when calling blkdev_reread_part()") moves
blkdev_reread_part() out of the loop_ctl_mutex. However,
GENHD_FL_NO_PART_SCAN is set before __blkdev_reread_part(). As a result,
__blkdev_reread_part() will fail the check of GENHD_FL_NO_PART_SCAN and
will not rescan the loop device to delete all partitions.

Below are steps to reproduce the issue:

step1 # dd if=/dev/zero of=tmp.raw bs=1M count=100
step2 # losetup -P /dev/loop0 tmp.raw
step3 # parted /dev/loop0 mklabel gpt
step4 # parted -a none -s /dev/loop0 mkpart primary 64s 1
step5 # losetup -d /dev/loop0

Step5 will not be able to delete /dev/loop0p1 (introduced by step4) and
there is below kernel warning message:

[ 464.414043] __loop_clr_fd: partition scan of loop0 failed (rc=-22)

This patch sets GENHD_FL_NO_PART_SCAN after blkdev_reread_part().

Fixes: 0da03cab87e6 ("loop: Fix deadlock when calling blkdev_reread_part()")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/block/loop.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
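The ordering problem described in the commit message can be reduced to a few lines of userspace C. This is a hedged model, not kernel code: `model_disk`, `MODEL_NO_PART_SCAN` and the `model_*` functions are hypothetical names, and the rescan is reduced to a flag check followed by dropping the partition count. On Linux, EINVAL is 22, which matches the rc=-22 in the warning above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
#define MODEL_NO_PART_SCAN 0x1

struct model_disk {
	int flags;
	int nr_partitions;
};

/*
 * Mimics the check described in the commit message: the rescan is refused
 * when partition scanning has been disabled on the disk.
 */
int model_reread_part(struct model_disk *disk)
{
	if (disk->flags & MODEL_NO_PART_SCAN)
		return -EINVAL;
	disk->nr_partitions = 0;	/* backing file is gone: drop partitions */
	return 0;
}

/* Ordering after commit 0da03cab87e6: flag set first, rescan fails. */
int model_clear_flag_first(struct model_disk *disk)
{
	disk->flags |= MODEL_NO_PART_SCAN;
	return model_reread_part(disk);
}

/* Ordering this patch aims for: rescan first, then disable scanning. */
int model_clear_rescan_first(struct model_disk *disk)
{
	int err = model_reread_part(disk);

	disk->flags |= MODEL_NO_PART_SCAN;
	return err;
}
```

Calling `model_clear_flag_first()` on a disk with one partition returns -EINVAL and leaves the partition in place, as in step5 above; `model_clear_rescan_first()` returns 0 and removes it before the flag is set.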