Message ID | 20230911040217.253905-3-dlemoal@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fix libata suspend/resume handling and code cleanup | expand |
On 9/11/23 06:02, Damien Le Moal wrote: > Whenever an ATA adapter driver is removed (e.g. rmmod), > ata_port_detach() is called repeatedly for all the adapter ports to > remove (unload) the devices attached to the port and delete the port > device itself. Removing of devices is done using libata EH with the > ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute > ata_eh_unload() which disables all devices attached to the port. > > ata_port_detach() finishes by calling scsi_remove_host() to remove the > scsi host associated with the port. This function will trigger the > removal of all scsi devices attached to the host and in the case of > disks, calls to sd_shutdown() which will flush the device write cache > and stop the device. However, given that the devices were already > disabled by ata_eh_unload(), the synchronize write cache command and > start stop unit commands fail. E.g. running "rmmod ahci" with first > removing sd_mod results in error messages like: > > ata13.00: disable device > sd 0:0:0:0: [sda] Synchronizing SCSI cache > sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > sd 0:0:0:0: [sda] Stopping disk > sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > > Fix this by removing all scsi devices of the ata devices connected to > the port before scheduling libata EH to disable the ATA devices. > > Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") > Cc: stable@vger.kernel.org > Signed-off-by: Damien Le Moal <dlemoal@kernel.org> > --- > drivers/ata/libata-core.c | 21 ++++++++++++++++++++- > 1 file changed, 20 insertions(+), 1 deletion(-) > > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > index c4898483d716..693cb3cd70cd 100644 > --- a/drivers/ata/libata-core.c > +++ b/drivers/ata/libata-core.c > @@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap) > struct ata_link *link; > struct ata_device *dev; > > - /* tell EH we're leaving & flush EH */ > + /* Wait for any ongoing EH */ > + ata_port_wait_eh(ap); > + > + mutex_lock(&ap->scsi_scan_mutex); > spin_lock_irqsave(ap->lock, flags); > + > + /* Remove scsi devices */ > + ata_for_each_link(link, ap, HOST_FIRST) { > + ata_for_each_dev(dev, link, ALL) { > + if (dev->sdev) { > + spin_unlock_irqrestore(ap->lock, flags); > + scsi_remove_device(dev->sdev); > + spin_lock_irqsave(ap->lock, flags); > + dev->sdev = NULL; > + } > + } > + } > + > + /* Tell EH to disable all devices */ > ap->pflags |= ATA_PFLAG_UNLOADING; > ata_port_schedule_eh(ap); > + > spin_unlock_irqrestore(ap->lock, flags); > + mutex_unlock(&ap->scsi_scan_mutex); > Are you sure about releasing scan_mutex after ata_port_schedule_eh()? Technically ata_port_schedule_eh() will be calling into SCSI rescan, which would take scan_mutex ... Not that it matter much, seeing that we've disconnected all devices, but still ... Cheers, Hannes
On 9/11/23 15:37, Hannes Reinecke wrote: > On 9/11/23 06:02, Damien Le Moal wrote: >> Whenever an ATA adapter driver is removed (e.g. rmmod), >> ata_port_detach() is called repeatedly for all the adapter ports to >> remove (unload) the devices attached to the port and delete the port >> device itself. Removing of devices is done using libata EH with the >> ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute >> ata_eh_unload() which disables all devices attached to the port. >> >> ata_port_detach() finishes by calling scsi_remove_host() to remove the >> scsi host associated with the port. This function will trigger the >> removal of all scsi devices attached to the host and in the case of >> disks, calls to sd_shutdown() which will flush the device write cache >> and stop the device. However, given that the devices were already >> disabled by ata_eh_unload(), the synchronize write cache command and >> start stop unit commands fail. E.g. running "rmmod ahci" with first >> removing sd_mod results in error messages like: >> >> ata13.00: disable device >> sd 0:0:0:0: [sda] Synchronizing SCSI cache >> sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> sd 0:0:0:0: [sda] Stopping disk >> sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET >> driverbyte=DRIVER_OK >> >> Fix this by removing all scsi devices of the ata devices connected to >> the port before scheduling libata EH to disable the ATA devices. >> >> Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") >> Cc: stable@vger.kernel.org >> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> >> --- >> drivers/ata/libata-core.c | 21 ++++++++++++++++++++- >> 1 file changed, 20 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c >> index c4898483d716..693cb3cd70cd 100644 >> --- a/drivers/ata/libata-core.c >> +++ b/drivers/ata/libata-core.c >> @@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap) >> struct ata_link *link; >> struct ata_device *dev; >> - /* tell EH we're leaving & flush EH */ >> + /* Wait for any ongoing EH */ >> + ata_port_wait_eh(ap); >> + >> + mutex_lock(&ap->scsi_scan_mutex); >> spin_lock_irqsave(ap->lock, flags); >> + >> + /* Remove scsi devices */ >> + ata_for_each_link(link, ap, HOST_FIRST) { >> + ata_for_each_dev(dev, link, ALL) { >> + if (dev->sdev) { >> + spin_unlock_irqrestore(ap->lock, flags); >> + scsi_remove_device(dev->sdev); >> + spin_lock_irqsave(ap->lock, flags); >> + dev->sdev = NULL; >> + } >> + } >> + } >> + >> + /* Tell EH to disable all devices */ >> ap->pflags |= ATA_PFLAG_UNLOADING; >> ata_port_schedule_eh(ap); >> + >> spin_unlock_irqrestore(ap->lock, flags); >> + mutex_unlock(&ap->scsi_scan_mutex); >> > Are you sure about releasing scan_mutex after ata_port_schedule_eh()? > Technically ata_port_schedule_eh() will be calling into SCSI rescan, which > would take scan_mutex ... > Not that it matter much, seeing that we've disconnected all devices, but still ... UNLOADING is set and in that case, EH does nothing else than removing the devices. So I think this is OK. > > Cheers, > > Hannes
On 9/11/23 08:44, Damien Le Moal wrote: > On 9/11/23 15:37, Hannes Reinecke wrote: >> On 9/11/23 06:02, Damien Le Moal wrote: >>> Whenever an ATA adapter driver is removed (e.g. rmmod), >>> ata_port_detach() is called repeatedly for all the adapter ports to >>> remove (unload) the devices attached to the port and delete the port >>> device itself. Removing of devices is done using libata EH with the >>> ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute >>> ata_eh_unload() which disables all devices attached to the port. >>> >>> ata_port_detach() finishes by calling scsi_remove_host() to remove the >>> scsi host associated with the port. This function will trigger the >>> removal of all scsi devices attached to the host and in the case of >>> disks, calls to sd_shutdown() which will flush the device write cache >>> and stop the device. However, given that the devices were already >>> disabled by ata_eh_unload(), the synchronize write cache command and >>> start stop unit commands fail. E.g. running "rmmod ahci" with first >>> removing sd_mod results in error messages like: >>> >>> ata13.00: disable device >>> sd 0:0:0:0: [sda] Synchronizing SCSI cache >>> sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: >>> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >>> sd 0:0:0:0: [sda] Stopping disk >>> sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET >>> driverbyte=DRIVER_OK >>> >>> Fix this by removing all scsi devices of the ata devices connected to >>> the port before scheduling libata EH to disable the ATA devices. >>> >>> Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") >>> Cc: stable@vger.kernel.org >>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> >>> --- >>> drivers/ata/libata-core.c | 21 ++++++++++++++++++++- >>> 1 file changed, 20 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c >>> index c4898483d716..693cb3cd70cd 100644 >>> --- a/drivers/ata/libata-core.c >>> +++ b/drivers/ata/libata-core.c >>> @@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap) >>> struct ata_link *link; >>> struct ata_device *dev; >>> - /* tell EH we're leaving & flush EH */ >>> + /* Wait for any ongoing EH */ >>> + ata_port_wait_eh(ap); >>> + >>> + mutex_lock(&ap->scsi_scan_mutex); >>> spin_lock_irqsave(ap->lock, flags); >>> + >>> + /* Remove scsi devices */ >>> + ata_for_each_link(link, ap, HOST_FIRST) { >>> + ata_for_each_dev(dev, link, ALL) { >>> + if (dev->sdev) { >>> + spin_unlock_irqrestore(ap->lock, flags); >>> + scsi_remove_device(dev->sdev); >>> + spin_lock_irqsave(ap->lock, flags); >>> + dev->sdev = NULL; >>> + } >>> + } >>> + } >>> + >>> + /* Tell EH to disable all devices */ >>> ap->pflags |= ATA_PFLAG_UNLOADING; >>> ata_port_schedule_eh(ap); >>> + >>> spin_unlock_irqrestore(ap->lock, flags); >>> + mutex_unlock(&ap->scsi_scan_mutex); >>> >> Are you sure about releasing scan_mutex after ata_port_schedule_eh()? >> Technically ata_port_schedule_eh() will be calling into SCSI rescan, which >> would take scan_mutex ... >> Not that it matter much, seeing that we've disconnected all devices, but still ... > > UNLOADING is set and in that case, EH does nothing else than removing the > devices. So I think this is OK. > Yeah, thought so. Just wanted to mention it. Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes
On Mon, Sep 11, 2023 at 01:02:00PM +0900, Damien Le Moal wrote: > Whenever an ATA adapter driver is removed (e.g. rmmod), > ata_port_detach() is called repeatedly for all the adapter ports to > remove (unload) the devices attached to the port and delete the port > device itself. Removing of devices is done using libata EH with the > ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute > ata_eh_unload() which disables all devices attached to the port. > > ata_port_detach() finishes by calling scsi_remove_host() to remove the > scsi host associated with the port. This function will trigger the > removal of all scsi devices attached to the host and in the case of > disks, calls to sd_shutdown() which will flush the device write cache > and stop the device. However, given that the devices were already > disabled by ata_eh_unload(), the synchronize write cache command and > start stop unit commands fail. E.g. running "rmmod ahci" with first > removing sd_mod results in error messages like: > > ata13.00: disable device > sd 0:0:0:0: [sda] Synchronizing SCSI cache > sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > sd 0:0:0:0: [sda] Stopping disk > sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > > Fix this by removing all scsi devices of the ata devices connected to > the port before scheduling libata EH to disable the ATA devices. > > Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") > Cc: stable@vger.kernel.org > Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index c4898483d716..693cb3cd70cd 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap) struct ata_link *link; struct ata_device *dev; - /* tell EH we're leaving & flush EH */ + /* Wait for any ongoing EH */ + ata_port_wait_eh(ap); + + mutex_lock(&ap->scsi_scan_mutex); spin_lock_irqsave(ap->lock, flags); + + /* Remove scsi devices */ + ata_for_each_link(link, ap, HOST_FIRST) { + ata_for_each_dev(dev, link, ALL) { + if (dev->sdev) { + spin_unlock_irqrestore(ap->lock, flags); + scsi_remove_device(dev->sdev); + spin_lock_irqsave(ap->lock, flags); + dev->sdev = NULL; + } + } + } + + /* Tell EH to disable all devices */ ap->pflags |= ATA_PFLAG_UNLOADING; ata_port_schedule_eh(ap); + spin_unlock_irqrestore(ap->lock, flags); + mutex_unlock(&ap->scsi_scan_mutex); /* wait till EH commits suicide */ ata_port_wait_eh(ap);
Whenever an ATA adapter driver is removed (e.g. rmmod), ata_port_detach() is called repeatedly for all the adapter ports to remove (unload) the devices attached to the port and delete the port device itself. Removing of devices is done using libata EH with the ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute ata_eh_unload() which disables all devices attached to the port. ata_port_detach() finishes by calling scsi_remove_host() to remove the scsi host associated with the port. This function will trigger the removal of all scsi devices attached to the host and in the case of disks, calls to sd_shutdown() which will flush the device write cache and stop the device. However, given that the devices were already disabled by ata_eh_unload(), the synchronize write cache command and start stop unit commands fail. E.g. running "rmmod ahci" with first removing sd_mod results in error messages like: ata13.00: disable device sd 0:0:0:0: [sda] Synchronizing SCSI cache sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 0:0:0:0: [sda] Stopping disk sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Fix this by removing all scsi devices of the ata devices connected to the port before scheduling libata EH to disable the ATA devices. Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> --- drivers/ata/libata-core.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-)