Message ID | 20210113063103.2698953-1-yebin10@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [PATH,v2] scsi: scsi_dh_rdac: Avoid crash during rdac_bus_attach | expand |
On 2021/1/13 14:31, Ye Bin wrote: > We get follow BUG_ON when rdac scan: > [595952.944297] kernel BUG at drivers/scsi/device_handler/scsi_dh_rdac.c:427! > [595952.951143] Internal error: Oops - BUG: 0 [#1] SMP > ...... > [595953.251065] Call trace: > [595953.259054] check_ownership+0xb0/0x118 > [595953.269794] rdac_bus_attach+0x1f0/0x4b0 > [595953.273787] scsi_dh_handler_attach+0x3c/0xe8 > [595953.278211] scsi_dh_add_device+0xc4/0xe8 > [595953.282291] scsi_sysfs_add_sdev+0x8c/0x2a8 > [595953.286544] scsi_probe_and_add_lun+0x9fc/0xd00 > [595953.291142] __scsi_scan_target+0x598/0x630 > [595953.295395] scsi_scan_target+0x120/0x130 > [595953.299481] fc_user_scan+0x1a0/0x1c0 [scsi_transport_fc] > [595953.304944] store_scan+0xb0/0x108 > [595953.308420] dev_attr_store+0x44/0x60 > [595953.312160] sysfs_kf_write+0x58/0x80 > [595953.315893] kernfs_fop_write+0xe8/0x1f0 > [595953.319888] __vfs_write+0x60/0x190 > [595953.323448] vfs_write+0xac/0x1c0 > [595953.326836] ksys_write+0x74/0xf0 > [595953.330221] __arm64_sys_write+0x24/0x30 > > BUG_ON code is in check_ownership: > list_for_each_entry_rcu(tmp, &h->ctlr->dh_list, node) { > /* h->sdev should always be valid */ > BUG_ON(!tmp->sdev); > tmp->sdev->access_state = access_state; > } > rdac_bus_attach > initialize_controller > list_add_rcu(&h->node, &h->ctlr->dh_list); > h->sdev = sdev; > rdac_bus_detach > list_del_rcu(&h->node); > h->sdev = NULL; > > Test as follow steps: > (1) Find IO error, remove disk; > (2) Insert disk back; > (3) trigger scan disk; > > There is race between rdac_bus_attach and rdac_bus_detach, maybe access > rdac_dh_data which h->sdev has been set NULL when process rdac attach. And also > find that "h->sdev" set value after add list, this may lead to reference NULL ptr. > > Signed-off-by: Ye Bin <yebin10@huawei.com> > --- > drivers/scsi/device_handler/scsi_dh_rdac.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c > index 5efc959493ec..85a71bafaea7 100644 > --- a/drivers/scsi/device_handler/scsi_dh_rdac.c > +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c > @@ -453,8 +453,8 @@ static int initialize_controller(struct scsi_device *sdev, > if (!h->ctlr) > err = SCSI_DH_RES_TEMP_UNAVAIL; > else { > - list_add_rcu(&h->node, &h->ctlr->dh_list); > h->sdev = sdev; > + list_add_rcu(&h->node, &h->ctlr->dh_list); > } > spin_unlock(&list_lock); > err = SCSI_DH_OK; > @@ -778,11 +778,11 @@ static void rdac_bus_detach( struct scsi_device *sdev ) > spin_lock(&list_lock); > if (h->ctlr) { > list_del_rcu(&h->node); > - h->sdev = NULL; > kref_put(&h->ctlr->kref, release_controller); > } > spin_unlock(&list_lock); > sdev->handler_data = NULL; > + synchronize_rcu(); > kfree(h); > } > ping ...
On 1/12/21 10:31 PM, Ye Bin wrote: > sdev->handler_data = NULL; > + synchronize_rcu(); > kfree(h); What is the purpose of the new synchronize_rcu() call? If its purpose is to wait until *h is no longer in use, please use kfree_rcu() instead. Thanks, Bart.
On 2021/7/23 12:04, Bart Van Assche wrote: > On 1/12/21 10:31 PM, Ye Bin wrote: >> sdev->handler_data = NULL; >> + synchronize_rcu(); >> kfree(h); > What is the purpose of the new synchronize_rcu() call? Thanks for your reply. Yes, I add new synchronize_rcu() call is to wait until *h is no longer in use. If free "h" right now , mybe lead to UAF. > If its purpose is > to wait until *h is no longer in use, please use kfree_rcu() instead. struct rdac_dh_data { struct list_head node; ..... } As rdac_dh_data.node type is "struct list_head", but kfree_rcu the first parameter type is "struct rcu_head". So we can only use synchronize_rcu() at here. > > Thanks, > > Bart. > . >
On 7/28/21 7:24 AM, yebin wrote: > On 2021/7/23 12:04, Bart Van Assche wrote: >> On 1/12/21 10:31 PM, Ye Bin wrote: >>> sdev->handler_data = NULL; >>> + synchronize_rcu(); >>> kfree(h); >> What is the purpose of the new synchronize_rcu() call? > Thanks for your reply. > Yes, I add new synchronize_rcu() call is to wait until *h is no longer > in use. If free > "h" right now , mybe lead to UAF. >> If its purpose is >> to wait until *h is no longer in use, please use kfree_rcu() instead. > struct rdac_dh_data { > struct list_head node; > ..... > } > As rdac_dh_data.node type is "struct list_head", but kfree_rcu the > first parameter type is > "struct rcu_head". So we can only use synchronize_rcu() at here. Ah, that's right. Hence: Reviewed-by: Bart Van Assche <bvanassche@acm.org>
On Wed, 13 Jan 2021 14:31:03 +0800, Ye Bin wrote: > We get follow BUG_ON when rdac scan: > [595952.944297] kernel BUG at drivers/scsi/device_handler/scsi_dh_rdac.c:427! > [595952.951143] Internal error: Oops - BUG: 0 [#1] SMP > ...... > [595953.251065] Call trace: > [595953.259054] check_ownership+0xb0/0x118 > [595953.269794] rdac_bus_attach+0x1f0/0x4b0 > [595953.273787] scsi_dh_handler_attach+0x3c/0xe8 > [595953.278211] scsi_dh_add_device+0xc4/0xe8 > [595953.282291] scsi_sysfs_add_sdev+0x8c/0x2a8 > [595953.286544] scsi_probe_and_add_lun+0x9fc/0xd00 > [595953.291142] __scsi_scan_target+0x598/0x630 > [595953.295395] scsi_scan_target+0x120/0x130 > [595953.299481] fc_user_scan+0x1a0/0x1c0 [scsi_transport_fc] > [595953.304944] store_scan+0xb0/0x108 > [595953.308420] dev_attr_store+0x44/0x60 > [595953.312160] sysfs_kf_write+0x58/0x80 > [595953.315893] kernfs_fop_write+0xe8/0x1f0 > [595953.319888] __vfs_write+0x60/0x190 > [595953.323448] vfs_write+0xac/0x1c0 > [595953.326836] ksys_write+0x74/0xf0 > [595953.330221] __arm64_sys_write+0x24/0x30 > > [...] Applied to 5.14/scsi-fixes, thanks! [1/1] scsi: scsi_dh_rdac: Avoid crash during rdac_bus_attach https://git.kernel.org/mkp/scsi/c/fb5d909021b4
diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c index 5efc959493ec..85a71bafaea7 100644 --- a/drivers/scsi/device_handler/scsi_dh_rdac.c +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c @@ -453,8 +453,8 @@ static int initialize_controller(struct scsi_device *sdev, if (!h->ctlr) err = SCSI_DH_RES_TEMP_UNAVAIL; else { - list_add_rcu(&h->node, &h->ctlr->dh_list); h->sdev = sdev; + list_add_rcu(&h->node, &h->ctlr->dh_list); } spin_unlock(&list_lock); err = SCSI_DH_OK; @@ -778,11 +778,11 @@ static void rdac_bus_detach( struct scsi_device *sdev ) spin_lock(&list_lock); if (h->ctlr) { list_del_rcu(&h->node); - h->sdev = NULL; kref_put(&h->ctlr->kref, release_controller); } spin_unlock(&list_lock); sdev->handler_data = NULL; + synchronize_rcu(); kfree(h); }
We get follow BUG_ON when rdac scan: [595952.944297] kernel BUG at drivers/scsi/device_handler/scsi_dh_rdac.c:427! [595952.951143] Internal error: Oops - BUG: 0 [#1] SMP ...... [595953.251065] Call trace: [595953.259054] check_ownership+0xb0/0x118 [595953.269794] rdac_bus_attach+0x1f0/0x4b0 [595953.273787] scsi_dh_handler_attach+0x3c/0xe8 [595953.278211] scsi_dh_add_device+0xc4/0xe8 [595953.282291] scsi_sysfs_add_sdev+0x8c/0x2a8 [595953.286544] scsi_probe_and_add_lun+0x9fc/0xd00 [595953.291142] __scsi_scan_target+0x598/0x630 [595953.295395] scsi_scan_target+0x120/0x130 [595953.299481] fc_user_scan+0x1a0/0x1c0 [scsi_transport_fc] [595953.304944] store_scan+0xb0/0x108 [595953.308420] dev_attr_store+0x44/0x60 [595953.312160] sysfs_kf_write+0x58/0x80 [595953.315893] kernfs_fop_write+0xe8/0x1f0 [595953.319888] __vfs_write+0x60/0x190 [595953.323448] vfs_write+0xac/0x1c0 [595953.326836] ksys_write+0x74/0xf0 [595953.330221] __arm64_sys_write+0x24/0x30 BUG_ON code is in check_ownership: list_for_each_entry_rcu(tmp, &h->ctlr->dh_list, node) { /* h->sdev should always be valid */ BUG_ON(!tmp->sdev); tmp->sdev->access_state = access_state; } rdac_bus_attach initialize_controller list_add_rcu(&h->node, &h->ctlr->dh_list); h->sdev = sdev; rdac_bus_detach list_del_rcu(&h->node); h->sdev = NULL; Test as follow steps: (1) Find IO error, remove disk; (2) Insert disk back; (3) trigger scan disk; There is race between rdac_bus_attach and rdac_bus_detach, maybe access rdac_dh_data which h->sdev has been set NULL when process rdac attach. And also find that "h->sdev" set value after add list, this may lead to reference NULL ptr. Signed-off-by: Ye Bin <yebin10@huawei.com> --- drivers/scsi/device_handler/scsi_dh_rdac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)