diff mbox series

ublk_drv: fix NULL pointer dereference in ublk_ctrl_start_recovery()

Message ID 20240529095313.2568595-1-linan666@huaweicloud.com (mailing list archive)
State New, archived
Headers show
Series ublk_drv: fix NULL pointer dereference in ublk_ctrl_start_recovery() | expand

Commit Message

Li Nan May 29, 2024, 9:53 a.m. UTC
From: Li Nan <linan122@huawei.com>

When two UBLK_CMD_START_USER_RECOVERY commands are submitted, the
first one sets 'ubq->ubq_daemon' to NULL, and the second one triggers
WARN in ublk_queue_reinit() and subsequently a NULL pointer dereference
issue.

Continuing execution after WARN is incorrect, as 'ubq->ubq_daemon' is
known to be NULL. Fix it by return directly if the WARN is triggered.

Note that WARN will still be triggered after the fix if anyone tries to
start recovery twice.

  BUG: kernel NULL pointer dereference, address: 0000000000000028
  RIP: 0010:ublk_ctrl_start_recovery.constprop.0+0x82/0x180
  Call Trace:
   <TASK>
   ? __die+0x20/0x70
   ? page_fault_oops+0x75/0x170
   ? exc_page_fault+0x64/0x140
   ? asm_exc_page_fault+0x22/0x30
   ? ublk_ctrl_start_recovery.constprop.0+0x82/0x180
   ublk_ctrl_uring_cmd+0x4f7/0x6c0
   ? pick_next_task_idle+0x26/0x40
   io_uring_cmd+0x9a/0x1b0
   io_issue_sqe+0x193/0x3f0
   io_wq_submit_work+0x9b/0x390
   io_worker_handle_work+0x165/0x360
   io_wq_worker+0xcb/0x2f0
   ? finish_task_switch.isra.0+0x203/0x290
   ? finish_task_switch.isra.0+0x203/0x290
   ? __pfx_io_wq_worker+0x10/0x10
   ret_from_fork+0x2d/0x50
   ? __pfx_io_wq_worker+0x10/0x10
   ret_from_fork_asm+0x1a/0x30
   </TASK>

Fixes: c732a852b419 ("ublk_drv: add START_USER_RECOVERY and END_USER_RECOVERY support")
Reported-and-tested-by: Changhui Zhong <czhong@redhat.com>
Closes: https://lore.kernel.org/all/CAGVVp+UvLiS+bhNXV-h2icwX1dyybbYHeQUuH7RYqUvMQf6N3w@mail.gmail.com
Signed-off-by: Li Nan <linan122@huawei.com>
---
 drivers/block/ublk_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Ming Lei June 3, 2024, 12:39 a.m. UTC | #1
On Wed, May 29, 2024 at 05:53:13PM +0800, linan666@huaweicloud.com wrote:
> From: Li Nan <linan122@huawei.com>
> 
> When two UBLK_CMD_START_USER_RECOVERY commands are submitted, the
> first one sets 'ubq->ubq_daemon' to NULL, and the second one triggers
> WARN in ublk_queue_reinit() and subsequently a NULL pointer dereference
> issue.
> 
> Continuing execution after WARN is incorrect, as 'ubq->ubq_daemon' is
> known to be NULL. Fix it by return directly if the WARN is triggered.
> 
> Note that WARN will still be triggered after the fix if anyone tries to
> start recovery twice.
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000028
>   RIP: 0010:ublk_ctrl_start_recovery.constprop.0+0x82/0x180
>   Call Trace:
>    <TASK>
>    ? __die+0x20/0x70
>    ? page_fault_oops+0x75/0x170
>    ? exc_page_fault+0x64/0x140
>    ? asm_exc_page_fault+0x22/0x30
>    ? ublk_ctrl_start_recovery.constprop.0+0x82/0x180
>    ublk_ctrl_uring_cmd+0x4f7/0x6c0
>    ? pick_next_task_idle+0x26/0x40
>    io_uring_cmd+0x9a/0x1b0
>    io_issue_sqe+0x193/0x3f0
>    io_wq_submit_work+0x9b/0x390
>    io_worker_handle_work+0x165/0x360
>    io_wq_worker+0xcb/0x2f0
>    ? finish_task_switch.isra.0+0x203/0x290
>    ? finish_task_switch.isra.0+0x203/0x290
>    ? __pfx_io_wq_worker+0x10/0x10
>    ret_from_fork+0x2d/0x50
>    ? __pfx_io_wq_worker+0x10/0x10
>    ret_from_fork_asm+0x1a/0x30
>    </TASK>
> 
> Fixes: c732a852b419 ("ublk_drv: add START_USER_RECOVERY and END_USER_RECOVERY support")
> Reported-and-tested-by: Changhui Zhong <czhong@redhat.com>
> Closes: https://lore.kernel.org/all/CAGVVp+UvLiS+bhNXV-h2icwX1dyybbYHeQUuH7RYqUvMQf6N3w@mail.gmail.com
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>  drivers/block/ublk_drv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 4e159948c912..99b621b2d40f 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
>  {
>  	int i;
>  
> -	WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
> +	if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
> +		return;

Yeah, it is one bug. However, it could be addressed by adding the check in
ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
what do you think about this way?


Thanks,
Ming
Li Nan June 3, 2024, 2:19 a.m. UTC | #2
在 2024/6/3 8:39, Ming Lei 写道:

[...]

>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
>> index 4e159948c912..99b621b2d40f 100644
>> --- a/drivers/block/ublk_drv.c
>> +++ b/drivers/block/ublk_drv.c
>> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
>>   {
>>   	int i;
>>   
>> -	WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
>> +	if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
>> +		return;
> 
> Yeah, it is one bug. However, it could be addressed by adding the check in
> ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
> what do you think about this way?
> 

Check ub->nr_queues_ready seems better. How about:

@@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct 
ublk_device *ub,
         mutex_lock(&ub->mutex);
         if (!ublk_can_use_recovery(ub))
                 goto out_unlock;
+       if (!ub->nr_queues_ready)
+               goto out_unlock;
         /*
          * START_RECOVERY is only allowd after:
          *

> 
> Thanks,
> Ming
Ming Lei June 3, 2024, 2:43 a.m. UTC | #3
On Mon, Jun 03, 2024 at 10:19:50AM +0800, Li Nan wrote:
> 
> 
> 在 2024/6/3 8:39, Ming Lei 写道:
> 
> [...]
> 
> > > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > > index 4e159948c912..99b621b2d40f 100644
> > > --- a/drivers/block/ublk_drv.c
> > > +++ b/drivers/block/ublk_drv.c
> > > @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
> > >   {
> > >   	int i;
> > > -	WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
> > > +	if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
> > > +		return;
> > 
> > Yeah, it is one bug. However, it could be addressed by adding the check in
> > ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
> > what do you think about this way?
> > 
> 
> Check ub->nr_queues_ready seems better. How about:
> 
> @@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct ublk_device
> *ub,
>         mutex_lock(&ub->mutex);
>         if (!ublk_can_use_recovery(ub))
>                 goto out_unlock;
> +       if (!ub->nr_queues_ready)
> +               goto out_unlock;

Looks fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
Changhui Zhong June 4, 2024, 1:32 a.m. UTC | #4
On Mon, Jun 3, 2024 at 10:20 AM Li Nan <linan666@huaweicloud.com> wrote:
>
>
>
> 在 2024/6/3 8:39, Ming Lei 写道:
>
> [...]
>
> >> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> >> index 4e159948c912..99b621b2d40f 100644
> >> --- a/drivers/block/ublk_drv.c
> >> +++ b/drivers/block/ublk_drv.c
> >> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
> >>   {
> >>      int i;
> >>
> >> -    WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
> >> +    if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
> >> +            return;
> >
> > Yeah, it is one bug. However, it could be addressed by adding the check in
> > ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
> > what do you think about this way?
> >
>
> Check ub->nr_queues_ready seems better. How about:
>
> @@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct
> ublk_device *ub,
>          mutex_lock(&ub->mutex);
>          if (!ublk_can_use_recovery(ub))
>                  goto out_unlock;
> +       if (!ub->nr_queues_ready)
> +               goto out_unlock;
>          /*
>           * START_RECOVERY is only allowd after:
>           *
>
> >
> > Thanks,
> > Ming
>
> --
> Thanks,
> Nan
>


Hi,Nan

After applying your new patch, I did not trigger "NULL pointer
dereference" and "Warning",
but hit task hung "Call Trace" info, please check

[13617.812306] running generic/004
[13622.293674] blk_print_req_error: 91 callbacks suppressed
[13622.293681] I/O error, dev ublkb4, sector 233256 op 0x1:(WRITE)
flags 0x8800 phys_seg 1 prio class 0
[13622.308145] I/O error, dev ublkb4, sector 233256 op 0x0:(READ)
flags 0x0 phys_seg 2 prio class 0
[13622.316923] I/O error, dev ublkb4, sector 233264 op 0x1:(WRITE)
flags 0x8800 phys_seg 1 prio class 0
[13622.326048] I/O error, dev ublkb4, sector 233272 op 0x0:(READ)
flags 0x0 phys_seg 1 prio class 0
[13622.334828] I/O error, dev ublkb4, sector 233272 op 0x1:(WRITE)
flags 0x8800 phys_seg 1 prio class 0
[13622.343954] I/O error, dev ublkb4, sector 233312 op 0x0:(READ)
flags 0x0 phys_seg 1 prio class 0
[13622.352733] I/O error, dev ublkb4, sector 233008 op 0x0:(READ)
flags 0x0 phys_seg 1 prio class 0
[13622.361514] I/O error, dev ublkb4, sector 233112 op 0x0:(READ)
flags 0x0 phys_seg 1 prio class 0
[13622.370292] I/O error, dev ublkb4, sector 233192 op 0x1:(WRITE)
flags 0x8800 phys_seg 1 prio class 0
[13622.379419] I/O error, dev ublkb4, sector 233120 op 0x0:(READ)
flags 0x0 phys_seg 1 prio class 0
[13641.069695] INFO: task fio:174413 blocked for more than 122 seconds.
[13641.076061]       Not tainted 6.10.0-rc1+ #1
[13641.080338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[13641.088164] task:fio             state:D stack:0     pid:174413
tgid:174413 ppid:174386 flags:0x00004002
[13641.088168] Call Trace:
[13641.088170]  <TASK>
[13641.088171]  __schedule+0x221/0x670
[13641.088177]  schedule+0x23/0xa0
[13641.088179]  io_schedule+0x42/0x70
[13641.088181]  blk_mq_get_tag+0x118/0x2b0
[13641.088185]  ? gup_fast_pgd_range+0x280/0x370
[13641.088188]  ? __pfx_autoremove_wake_function+0x10/0x10
[13641.088192]  __blk_mq_alloc_requests+0x194/0x3a0
[13641.088194]  blk_mq_submit_bio+0x241/0x6c0
[13641.088196]  __submit_bio+0x8a/0x1f0
[13641.088199]  submit_bio_noacct_nocheck+0x168/0x250
[13641.088201]  ? submit_bio_noacct+0x45/0x560
[13641.088203]  __blkdev_direct_IO_async+0x167/0x1a0
[13641.088206]  blkdev_write_iter+0x1c8/0x270
[13641.088208]  aio_write+0x11c/0x240
[13641.088212]  ? __rq_qos_issue+0x21/0x40
[13641.088214]  ? blk_mq_start_request+0x34/0x1a0
[13641.088216]  ? io_submit_one+0x68/0x380
[13641.088218]  ? kmem_cache_alloc_noprof+0x4e/0x320
[13641.088221]  ? fget+0x7c/0xc0
[13641.088224]  ? io_submit_one+0xde/0x380
[13641.088226]  io_submit_one+0xde/0x380
[13641.088228]  __x64_sys_io_submit+0x80/0x160
[13641.088229]  do_syscall_64+0x79/0x150
[13641.088233]  ? syscall_exit_to_user_mode+0x6c/0x1f0
[13641.088237]  ? do_io_getevents+0x8b/0xe0
[13641.088238]  ? syscall_exit_work+0xf3/0x120
[13641.088241]  ? syscall_exit_to_user_mode+0x6c/0x1f0
[13641.088243]  ? do_syscall_64+0x85/0x150
[13641.088245]  ? do_syscall_64+0x85/0x150
[13641.088247]  ? blk_mq_flush_plug_list.part.0+0x108/0x160
[13641.088249]  ? rseq_get_rseq_cs+0x1d/0x220
[13641.088252]  ? rseq_ip_fixup+0x6d/0x1d0
[13641.088254]  ? blk_finish_plug+0x24/0x40
[13641.088256]  ? syscall_exit_to_user_mode+0x6c/0x1f0
[13641.088258]  ? do_syscall_64+0x85/0x150
[13641.088260]  ? syscall_exit_to_user_mode+0x6c/0x1f0
[13641.088262]  ? do_syscall_64+0x85/0x150
[13641.088264]  ? syscall_exit_to_user_mode+0x6c/0x1f0
[13641.088266]  ? do_syscall_64+0x85/0x150
[13641.088268]  ? do_syscall_64+0x85/0x150
[13641.088270]  ? do_syscall_64+0x85/0x150
[13641.088272]  ? clear_bhb_loop+0x45/0xa0
[13641.088275]  ? clear_bhb_loop+0x45/0xa0
[13641.088277]  ? clear_bhb_loop+0x45/0xa0
[13641.088279]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[13641.088281] RIP: 0033:0x7ff92150713d
[13641.088283] RSP: 002b:00007ffca1ef81f8 EFLAGS: 00000246 ORIG_RAX:
00000000000000d1
[13641.088285] RAX: ffffffffffffffda RBX: 00007ff9217e2f70 RCX: 00007ff92150713d
[13641.088286] RDX: 000055863b694fe0 RSI: 0000000000000010 RDI: 00007ff92164d000
[13641.088287] RBP: 00007ff92164d000 R08: 00007ff91936d000 R09: 0000000000000180
[13641.088288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[13641.088289] R13: 0000000000000000 R14: 000055863b694fe0 R15: 000055863b6970c0
[13641.088291]  </TASK>

Thanks,
Changhui
Li Nan June 4, 2024, 1:14 p.m. UTC | #5
在 2024/6/4 9:32, Changhui Zhong 写道:
> On Mon, Jun 3, 2024 at 10:20 AM Li Nan <linan666@huaweicloud.com> wrote:
>>
>>
>>
>> 在 2024/6/3 8:39, Ming Lei 写道:
>>
>> [...]
>>
>>>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
>>>> index 4e159948c912..99b621b2d40f 100644
>>>> --- a/drivers/block/ublk_drv.c
>>>> +++ b/drivers/block/ublk_drv.c
>>>> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
>>>>    {
>>>>       int i;
>>>>
>>>> -    WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
>>>> +    if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
>>>> +            return;
>>>
>>> Yeah, it is one bug. However, it could be addressed by adding the check in
>>> ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
>>> what do you think about this way?
>>>
>>
>> Check ub->nr_queues_ready seems better. How about:
>>
>> @@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct
>> ublk_device *ub,
>>           mutex_lock(&ub->mutex);
>>           if (!ublk_can_use_recovery(ub))
>>                   goto out_unlock;
>> +       if (!ub->nr_queues_ready)
>> +               goto out_unlock;
>>           /*
>>            * START_RECOVERY is only allowd after:
>>            *
>>
>>>
>>> Thanks,
>>> Ming
>>
>> --
>> Thanks,
>> Nan
>>
> 
> 
> Hi,Nan
> 
> After applying your new patch, I did not trigger "NULL pointer
> dereference" and "Warning",
> but hit task hung "Call Trace" info, please check
> 
> [13617.812306] running generic/004
> [13622.293674] blk_print_req_error: 91 callbacks suppressed
> [13622.293681] I/O error, dev ublkb4, sector 233256 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.308145] I/O error, dev ublkb4, sector 233256 op 0x0:(READ)
> flags 0x0 phys_seg 2 prio class 0
> [13622.316923] I/O error, dev ublkb4, sector 233264 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.326048] I/O error, dev ublkb4, sector 233272 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.334828] I/O error, dev ublkb4, sector 233272 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.343954] I/O error, dev ublkb4, sector 233312 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.352733] I/O error, dev ublkb4, sector 233008 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.361514] I/O error, dev ublkb4, sector 233112 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.370292] I/O error, dev ublkb4, sector 233192 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.379419] I/O error, dev ublkb4, sector 233120 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13641.069695] INFO: task fio:174413 blocked for more than 122 seconds.
> [13641.076061]       Not tainted 6.10.0-rc1+ #1
> [13641.080338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [13641.088164] task:fio             state:D stack:0     pid:174413
> tgid:174413 ppid:174386 flags:0x00004002
> [13641.088168] Call Trace:
> [13641.088170]  <TASK>
> [13641.088171]  __schedule+0x221/0x670
> [13641.088177]  schedule+0x23/0xa0
> [13641.088179]  io_schedule+0x42/0x70
> [13641.088181]  blk_mq_get_tag+0x118/0x2b0
> [13641.088185]  ? gup_fast_pgd_range+0x280/0x370
> [13641.088188]  ? __pfx_autoremove_wake_function+0x10/0x10
> [13641.088192]  __blk_mq_alloc_requests+0x194/0x3a0
> [13641.088194]  blk_mq_submit_bio+0x241/0x6c0
> [13641.088196]  __submit_bio+0x8a/0x1f0
> [13641.088199]  submit_bio_noacct_nocheck+0x168/0x250
> [13641.088201]  ? submit_bio_noacct+0x45/0x560
> [13641.088203]  __blkdev_direct_IO_async+0x167/0x1a0
> [13641.088206]  blkdev_write_iter+0x1c8/0x270
> [13641.088208]  aio_write+0x11c/0x240
> [13641.088212]  ? __rq_qos_issue+0x21/0x40
> [13641.088214]  ? blk_mq_start_request+0x34/0x1a0
> [13641.088216]  ? io_submit_one+0x68/0x380
> [13641.088218]  ? kmem_cache_alloc_noprof+0x4e/0x320
> [13641.088221]  ? fget+0x7c/0xc0
> [13641.088224]  ? io_submit_one+0xde/0x380
> [13641.088226]  io_submit_one+0xde/0x380
> [13641.088228]  __x64_sys_io_submit+0x80/0x160
> [13641.088229]  do_syscall_64+0x79/0x150
> [13641.088233]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088237]  ? do_io_getevents+0x8b/0xe0
> [13641.088238]  ? syscall_exit_work+0xf3/0x120
> [13641.088241]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088243]  ? do_syscall_64+0x85/0x150
> [13641.088245]  ? do_syscall_64+0x85/0x150
> [13641.088247]  ? blk_mq_flush_plug_list.part.0+0x108/0x160
> [13641.088249]  ? rseq_get_rseq_cs+0x1d/0x220
> [13641.088252]  ? rseq_ip_fixup+0x6d/0x1d0
> [13641.088254]  ? blk_finish_plug+0x24/0x40
> [13641.088256]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088258]  ? do_syscall_64+0x85/0x150
> [13641.088260]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088262]  ? do_syscall_64+0x85/0x150
> [13641.088264]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088266]  ? do_syscall_64+0x85/0x150
> [13641.088268]  ? do_syscall_64+0x85/0x150
> [13641.088270]  ? do_syscall_64+0x85/0x150
> [13641.088272]  ? clear_bhb_loop+0x45/0xa0
> [13641.088275]  ? clear_bhb_loop+0x45/0xa0
> [13641.088277]  ? clear_bhb_loop+0x45/0xa0
> [13641.088279]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [13641.088281] RIP: 0033:0x7ff92150713d
> [13641.088283] RSP: 002b:00007ffca1ef81f8 EFLAGS: 00000246 ORIG_RAX:
> 00000000000000d1
> [13641.088285] RAX: ffffffffffffffda RBX: 00007ff9217e2f70 RCX: 00007ff92150713d
> [13641.088286] RDX: 000055863b694fe0 RSI: 0000000000000010 RDI: 00007ff92164d000
> [13641.088287] RBP: 00007ff92164d000 R08: 00007ff91936d000 R09: 0000000000000180
> [13641.088288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
> [13641.088289] R13: 0000000000000000 R14: 000055863b694fe0 R15: 000055863b6970c0
> [13641.088291]  </TASK>
> 
> Thanks,
> Changhui

Thanks for your test, I am trying to find the cause of this new issue.
Li Nan June 5, 2024, 1:40 a.m. UTC | #6
在 2024/6/4 9:32, Changhui Zhong 写道:
> On Mon, Jun 3, 2024 at 10:20 AM Li Nan <linan666@huaweicloud.com> wrote:
>>
>>
>>
>> 在 2024/6/3 8:39, Ming Lei 写道:
>>
>> [...]
>>
>>>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
>>>> index 4e159948c912..99b621b2d40f 100644
>>>> --- a/drivers/block/ublk_drv.c
>>>> +++ b/drivers/block/ublk_drv.c
>>>> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
>>>>    {
>>>>       int i;
>>>>
>>>> -    WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
>>>> +    if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
>>>> +            return;
>>>
>>> Yeah, it is one bug. However, it could be addressed by adding the check in
>>> ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
>>> what do you think about this way?
>>>
>>
>> Check ub->nr_queues_ready seems better. How about:
>>
>> @@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct
>> ublk_device *ub,
>>           mutex_lock(&ub->mutex);
>>           if (!ublk_can_use_recovery(ub))
>>                   goto out_unlock;
>> +       if (!ub->nr_queues_ready)
>> +               goto out_unlock;
>>           /*
>>            * START_RECOVERY is only allowd after:
>>            *
>>
>>>
>>> Thanks,
>>> Ming
>>
>> --
>> Thanks,
>> Nan
>>
> 
> 
> Hi,Nan
> 
> After applying your new patch, I did not trigger "NULL pointer
> dereference" and "Warning",
> but hit task hung "Call Trace" info, please check
> 
> [13617.812306] running generic/004
> [13622.293674] blk_print_req_error: 91 callbacks suppressed
> [13622.293681] I/O error, dev ublkb4, sector 233256 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.308145] I/O error, dev ublkb4, sector 233256 op 0x0:(READ)
> flags 0x0 phys_seg 2 prio class 0
> [13622.316923] I/O error, dev ublkb4, sector 233264 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.326048] I/O error, dev ublkb4, sector 233272 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.334828] I/O error, dev ublkb4, sector 233272 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.343954] I/O error, dev ublkb4, sector 233312 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.352733] I/O error, dev ublkb4, sector 233008 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.361514] I/O error, dev ublkb4, sector 233112 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13622.370292] I/O error, dev ublkb4, sector 233192 op 0x1:(WRITE)
> flags 0x8800 phys_seg 1 prio class 0
> [13622.379419] I/O error, dev ublkb4, sector 233120 op 0x0:(READ)
> flags 0x0 phys_seg 1 prio class 0
> [13641.069695] INFO: task fio:174413 blocked for more than 122 seconds.
> [13641.076061]       Not tainted 6.10.0-rc1+ #1
> [13641.080338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [13641.088164] task:fio             state:D stack:0     pid:174413
> tgid:174413 ppid:174386 flags:0x00004002
> [13641.088168] Call Trace:
> [13641.088170]  <TASK>
> [13641.088171]  __schedule+0x221/0x670
> [13641.088177]  schedule+0x23/0xa0
> [13641.088179]  io_schedule+0x42/0x70
> [13641.088181]  blk_mq_get_tag+0x118/0x2b0
> [13641.088185]  ? gup_fast_pgd_range+0x280/0x370
> [13641.088188]  ? __pfx_autoremove_wake_function+0x10/0x10
> [13641.088192]  __blk_mq_alloc_requests+0x194/0x3a0
> [13641.088194]  blk_mq_submit_bio+0x241/0x6c0
> [13641.088196]  __submit_bio+0x8a/0x1f0
> [13641.088199]  submit_bio_noacct_nocheck+0x168/0x250
> [13641.088201]  ? submit_bio_noacct+0x45/0x560
> [13641.088203]  __blkdev_direct_IO_async+0x167/0x1a0
> [13641.088206]  blkdev_write_iter+0x1c8/0x270
> [13641.088208]  aio_write+0x11c/0x240
> [13641.088212]  ? __rq_qos_issue+0x21/0x40
> [13641.088214]  ? blk_mq_start_request+0x34/0x1a0
> [13641.088216]  ? io_submit_one+0x68/0x380
> [13641.088218]  ? kmem_cache_alloc_noprof+0x4e/0x320
> [13641.088221]  ? fget+0x7c/0xc0
> [13641.088224]  ? io_submit_one+0xde/0x380
> [13641.088226]  io_submit_one+0xde/0x380
> [13641.088228]  __x64_sys_io_submit+0x80/0x160
> [13641.088229]  do_syscall_64+0x79/0x150
> [13641.088233]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088237]  ? do_io_getevents+0x8b/0xe0
> [13641.088238]  ? syscall_exit_work+0xf3/0x120
> [13641.088241]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088243]  ? do_syscall_64+0x85/0x150
> [13641.088245]  ? do_syscall_64+0x85/0x150
> [13641.088247]  ? blk_mq_flush_plug_list.part.0+0x108/0x160
> [13641.088249]  ? rseq_get_rseq_cs+0x1d/0x220
> [13641.088252]  ? rseq_ip_fixup+0x6d/0x1d0
> [13641.088254]  ? blk_finish_plug+0x24/0x40
> [13641.088256]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088258]  ? do_syscall_64+0x85/0x150
> [13641.088260]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088262]  ? do_syscall_64+0x85/0x150
> [13641.088264]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> [13641.088266]  ? do_syscall_64+0x85/0x150
> [13641.088268]  ? do_syscall_64+0x85/0x150
> [13641.088270]  ? do_syscall_64+0x85/0x150
> [13641.088272]  ? clear_bhb_loop+0x45/0xa0
> [13641.088275]  ? clear_bhb_loop+0x45/0xa0
> [13641.088277]  ? clear_bhb_loop+0x45/0xa0
> [13641.088279]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [13641.088281] RIP: 0033:0x7ff92150713d
> [13641.088283] RSP: 002b:00007ffca1ef81f8 EFLAGS: 00000246 ORIG_RAX:
> 00000000000000d1
> [13641.088285] RAX: ffffffffffffffda RBX: 00007ff9217e2f70 RCX: 00007ff92150713d
> [13641.088286] RDX: 000055863b694fe0 RSI: 0000000000000010 RDI: 00007ff92164d000
> [13641.088287] RBP: 00007ff92164d000 R08: 00007ff91936d000 R09: 0000000000000180
> [13641.088288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
> [13641.088289] R13: 0000000000000000 R14: 000055863b694fe0 R15: 000055863b6970c0
> [13641.088291]  </TASK>
> 
> Thanks,
> Changhui
> 

After applying the previous patch, will the test environment continue to
execute test cases after WARN? I am not sure whether this issue has always
existed but was not tested becasue of WARN, or whether the new patch
introduced it.
Changhui Zhong June 5, 2024, 7:20 a.m. UTC | #7
On Wed, Jun 5, 2024 at 9:41 AM Li Nan <linan666@huaweicloud.com> wrote:
>
>
>
> 在 2024/6/4 9:32, Changhui Zhong 写道:
> > On Mon, Jun 3, 2024 at 10:20 AM Li Nan <linan666@huaweicloud.com> wrote:
> >>
> >>
> >>
> >> 在 2024/6/3 8:39, Ming Lei 写道:
> >>
> >> [...]
> >>
> >>>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> >>>> index 4e159948c912..99b621b2d40f 100644
> >>>> --- a/drivers/block/ublk_drv.c
> >>>> +++ b/drivers/block/ublk_drv.c
> >>>> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
> >>>>    {
> >>>>       int i;
> >>>>
> >>>> -    WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
> >>>> +    if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
> >>>> +            return;
> >>>
> >>> Yeah, it is one bug. However, it could be addressed by adding the check in
> >>> ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
> >>> what do you think about this way?
> >>>
> >>
> >> Check ub->nr_queues_ready seems better. How about:
> >>
> >> @@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct
> >> ublk_device *ub,
> >>           mutex_lock(&ub->mutex);
> >>           if (!ublk_can_use_recovery(ub))
> >>                   goto out_unlock;
> >> +       if (!ub->nr_queues_ready)
> >> +               goto out_unlock;
> >>           /*
> >>            * START_RECOVERY is only allowd after:
> >>            *
> >>
> >>>
> >>> Thanks,
> >>> Ming
> >>
> >> --
> >> Thanks,
> >> Nan
> >>
> >
> >
> > Hi,Nan
> >
> > After applying your new patch, I did not trigger "NULL pointer
> > dereference" and "Warning",
> > but hit task hung "Call Trace" info, please check
> >
> > [13617.812306] running generic/004
> > [13622.293674] blk_print_req_error: 91 callbacks suppressed
> > [13622.293681] I/O error, dev ublkb4, sector 233256 op 0x1:(WRITE)
> > flags 0x8800 phys_seg 1 prio class 0
> > [13622.308145] I/O error, dev ublkb4, sector 233256 op 0x0:(READ)
> > flags 0x0 phys_seg 2 prio class 0
> > [13622.316923] I/O error, dev ublkb4, sector 233264 op 0x1:(WRITE)
> > flags 0x8800 phys_seg 1 prio class 0
> > [13622.326048] I/O error, dev ublkb4, sector 233272 op 0x0:(READ)
> > flags 0x0 phys_seg 1 prio class 0
> > [13622.334828] I/O error, dev ublkb4, sector 233272 op 0x1:(WRITE)
> > flags 0x8800 phys_seg 1 prio class 0
> > [13622.343954] I/O error, dev ublkb4, sector 233312 op 0x0:(READ)
> > flags 0x0 phys_seg 1 prio class 0
> > [13622.352733] I/O error, dev ublkb4, sector 233008 op 0x0:(READ)
> > flags 0x0 phys_seg 1 prio class 0
> > [13622.361514] I/O error, dev ublkb4, sector 233112 op 0x0:(READ)
> > flags 0x0 phys_seg 1 prio class 0
> > [13622.370292] I/O error, dev ublkb4, sector 233192 op 0x1:(WRITE)
> > flags 0x8800 phys_seg 1 prio class 0
> > [13622.379419] I/O error, dev ublkb4, sector 233120 op 0x0:(READ)
> > flags 0x0 phys_seg 1 prio class 0
> > [13641.069695] INFO: task fio:174413 blocked for more than 122 seconds.
> > [13641.076061]       Not tainted 6.10.0-rc1+ #1
> > [13641.080338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> > [13641.088164] task:fio             state:D stack:0     pid:174413
> > tgid:174413 ppid:174386 flags:0x00004002
> > [13641.088168] Call Trace:
> > [13641.088170]  <TASK>
> > [13641.088171]  __schedule+0x221/0x670
> > [13641.088177]  schedule+0x23/0xa0
> > [13641.088179]  io_schedule+0x42/0x70
> > [13641.088181]  blk_mq_get_tag+0x118/0x2b0
> > [13641.088185]  ? gup_fast_pgd_range+0x280/0x370
> > [13641.088188]  ? __pfx_autoremove_wake_function+0x10/0x10
> > [13641.088192]  __blk_mq_alloc_requests+0x194/0x3a0
> > [13641.088194]  blk_mq_submit_bio+0x241/0x6c0
> > [13641.088196]  __submit_bio+0x8a/0x1f0
> > [13641.088199]  submit_bio_noacct_nocheck+0x168/0x250
> > [13641.088201]  ? submit_bio_noacct+0x45/0x560
> > [13641.088203]  __blkdev_direct_IO_async+0x167/0x1a0
> > [13641.088206]  blkdev_write_iter+0x1c8/0x270
> > [13641.088208]  aio_write+0x11c/0x240
> > [13641.088212]  ? __rq_qos_issue+0x21/0x40
> > [13641.088214]  ? blk_mq_start_request+0x34/0x1a0
> > [13641.088216]  ? io_submit_one+0x68/0x380
> > [13641.088218]  ? kmem_cache_alloc_noprof+0x4e/0x320
> > [13641.088221]  ? fget+0x7c/0xc0
> > [13641.088224]  ? io_submit_one+0xde/0x380
> > [13641.088226]  io_submit_one+0xde/0x380
> > [13641.088228]  __x64_sys_io_submit+0x80/0x160
> > [13641.088229]  do_syscall_64+0x79/0x150
> > [13641.088233]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > [13641.088237]  ? do_io_getevents+0x8b/0xe0
> > [13641.088238]  ? syscall_exit_work+0xf3/0x120
> > [13641.088241]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > [13641.088243]  ? do_syscall_64+0x85/0x150
> > [13641.088245]  ? do_syscall_64+0x85/0x150
> > [13641.088247]  ? blk_mq_flush_plug_list.part.0+0x108/0x160
> > [13641.088249]  ? rseq_get_rseq_cs+0x1d/0x220
> > [13641.088252]  ? rseq_ip_fixup+0x6d/0x1d0
> > [13641.088254]  ? blk_finish_plug+0x24/0x40
> > [13641.088256]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > [13641.088258]  ? do_syscall_64+0x85/0x150
> > [13641.088260]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > [13641.088262]  ? do_syscall_64+0x85/0x150
> > [13641.088264]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > [13641.088266]  ? do_syscall_64+0x85/0x150
> > [13641.088268]  ? do_syscall_64+0x85/0x150
> > [13641.088270]  ? do_syscall_64+0x85/0x150
> > [13641.088272]  ? clear_bhb_loop+0x45/0xa0
> > [13641.088275]  ? clear_bhb_loop+0x45/0xa0
> > [13641.088277]  ? clear_bhb_loop+0x45/0xa0
> > [13641.088279]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [13641.088281] RIP: 0033:0x7ff92150713d
> > [13641.088283] RSP: 002b:00007ffca1ef81f8 EFLAGS: 00000246 ORIG_RAX:
> > 00000000000000d1
> > [13641.088285] RAX: ffffffffffffffda RBX: 00007ff9217e2f70 RCX: 00007ff92150713d
> > [13641.088286] RDX: 000055863b694fe0 RSI: 0000000000000010 RDI: 00007ff92164d000
> > [13641.088287] RBP: 00007ff92164d000 R08: 00007ff91936d000 R09: 0000000000000180
> > [13641.088288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
> > [13641.088289] R13: 0000000000000000 R14: 000055863b694fe0 R15: 000055863b6970c0
> > [13641.088291]  </TASK>
> >
> > Thanks,
> > Changhui
> >
>
> After applying the previous patch, will the test environment continue to
> execute test cases after WARN?

a few days ago,test with the previous patch, the test environment
continued to execute test cases after WARN,
and I terminated the test when I observed a WARN,so I did not observe
the subsequent situation.

> I am not sure whether this issue has always
> existed but was not tested becasue of WARN, or whether the new patch
> introduced it.

today, I re-test previous patch, and let it run for a long time,I
observed WARN and task hung,
looks this issue already existed and not introduced by new patch.

[ 3443.311492] ------------[ cut here ]------------
[ 3443.316128] WARNING: CPU: 0 PID: 35703 at
drivers/block/ublk_drv.c:2633
ublk_ctrl_start_recovery.constprop.0+0x60/0x1a0
[ 3443.326911] Modules linked in: ext4 mbcache jbd2 loop tls rfkill
sunrpc dm_multipath intel_rapl_msr intel_rapl_common
intel_uncore_frequency intel_uncore_frequency_common isst_if_common
skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp
kvm_intel kvm mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper
dcdbas rapl ipmi_ssif iTCO_wdt iTCO_vendor_support intel_cstate
intel_uncore wmi_bmof i2c_i801 acpi_power_meter mei_me ipmi_si
dell_smbios acpi_ipmi mei ipmi_devintf intel_pch_thermal
dell_wmi_descriptor i2c_smbus ipmi_msghandler pcspkr lpc_ich drm fuse
xfs libcrc32c sd_mod sr_mod t10_pi cdrom sg ahci crct10dif_pclmul
crc32_pclmul libahci crc32c_intel tg3 libata ghash_clmulni_intel
megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: null_blk]
[ 3443.395899] CPU: 0 PID: 35703 Comm: iou-wrk-35689 Not tainted 6.10.0-rc1+ #1
[ 3443.402951] Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS
2.13.3 12/13/2021
[ 3443.410516] RIP: 0010:ublk_ctrl_start_recovery.constprop.0+0x60/0x1a0
[ 3443.416964] Code: 85 48 01 00 00 66 41 83 7c 24 1c 02 0f 85 3b 01
00 00 66 41 83 7c 24 18 00 0f 84 b8 00 00 00 45 31 ed 41 be ff ff ff
ff eb 15 <0f> 0b 41 0f b7 44 24 18 41 83 c5 01 41 39 c5 0f 8d 98 00 00
00 44
[ 3443.435711] RSP: 0018:ffffb66a45517ce0 EFLAGS: 00010246
[ 3443.440936] RAX: 0000000000000002 RBX: ffff9c59054ce000 RCX: 0000000000000000
[ 3443.448069] RDX: ffff9c58c84f0000 RSI: ffffffffb644ee00 RDI: 0000000000000000
[ 3443.455203] RBP: ffff9c5905b65468 R08: 0000000000000000 R09: ffffffffb68e35e0
[ 3443.462334] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c5905b65000
[ 3443.469468] R13: 0000000000000000 R14: 00000000ffffffff R15: ffff9c58c99d4080
[ 3443.476601] FS:  00007f00c7e51740(0000) GS:ffff9c5c2fe00000(0000)
knlGS:0000000000000000
[ 3443.484688] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3443.490433] CR2: 00007f2379adb584 CR3: 0000000109a36005 CR4: 00000000007706f0
[ 3443.497567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3443.504702] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3443.511840] PKRU: 55555554
[ 3443.514552] Call Trace:
[ 3443.517007]  <TASK>
[ 3443.519112]  ? __warn+0x7f/0x120
[ 3443.522352]  ? ublk_ctrl_start_recovery.constprop.0+0x60/0x1a0
[ 3443.528184]  ? report_bug+0x18a/0x1a0
[ 3443.531851]  ? handle_bug+0x3c/0x70
[ 3443.535344]  ? exc_invalid_op+0x14/0x70
[ 3443.539182]  ? asm_exc_invalid_op+0x16/0x20
[ 3443.543371]  ? ublk_ctrl_start_recovery.constprop.0+0x60/0x1a0
[ 3443.549210]  ublk_ctrl_uring_cmd+0x4f7/0x6c0
[ 3443.553484]  ? pick_next_task_fair+0x41/0x520
[ 3443.557843]  ? put_prev_entity+0x1c/0xa0
[ 3443.561778]  io_uring_cmd+0x9a/0x1b0
[ 3443.565367]  io_issue_sqe+0x18f/0x3f0
[ 3443.569030]  io_wq_submit_work+0x9b/0x390
[ 3443.573045]  io_worker_handle_work+0x165/0x360
[ 3443.577499]  io_wq_worker+0xcb/0x2f0
[ 3443.581077]  ? finish_task_switch.isra.0+0x203/0x290
[ 3443.586045]  ? finish_task_switch.isra.0+0x203/0x290
[ 3443.591018]  ? __pfx_io_wq_worker+0x10/0x10
[ 3443.595204]  ret_from_fork+0x2d/0x50
[ 3443.598786]  ? __pfx_io_wq_worker+0x10/0x10
[ 3443.602980]  ret_from_fork_asm+0x1a/0x30
[ 3443.606906]  </TASK>
[ 3443.609097] ---[ end trace 0000000000000000 ]---




[ 3933.596384] INFO: task fio:35336 blocked for more than 491 seconds.
[ 3933.602659]       Tainted: G        W          6.10.0-rc1+ #1
[ 3933.608405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 3933.616233] task:fio             state:D stack:0     pid:35336
tgid:35336 ppid:35327  flags:0x00004002
[ 3933.616239] Call Trace:
[ 3933.616241]  <TASK>
[ 3933.616244]  __schedule+0x221/0x670
[ 3933.616253]  schedule+0x23/0xa0
[ 3933.616257]  io_schedule+0x42/0x70
[ 3933.616261]  blk_mq_get_tag+0x118/0x2b0
[ 3933.616268]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 3933.616275]  __blk_mq_alloc_requests+0x194/0x3a0
[ 3933.616280]  blk_mq_submit_bio+0x241/0x6c0
[ 3933.616284]  __submit_bio+0x8a/0x1f0
[ 3933.616288]  ? bio_associate_blkg_from_css+0xca/0x320
[ 3933.616294]  submit_bio_noacct_nocheck+0x168/0x250
[ 3933.616298]  __blkdev_direct_IO_async+0x167/0x1a0
[ 3933.616303]  blkdev_read_iter+0xa2/0x130
[ 3933.616308]  aio_read+0xf2/0x1b0
[ 3933.616315]  ? rseq_get_rseq_cs+0x1d/0x220
[ 3933.616320]  ? rseq_ip_fixup+0x6d/0x1d0
[ 3933.616324]  ? kmem_cache_alloc_noprof+0x4e/0x320
[ 3933.616329]  ? fget+0x7c/0xc0
[ 3933.616335]  ? io_submit_one+0xde/0x380
[ 3933.616338]  io_submit_one+0xde/0x380
[ 3933.616341]  __x64_sys_io_submit+0x80/0x160
[ 3933.616345]  do_syscall_64+0x79/0x150
[ 3933.616352]  ? clear_bhb_loop+0x45/0xa0
[ 3933.616358]  ? clear_bhb_loop+0x45/0xa0
[ 3933.616361]  ? clear_bhb_loop+0x45/0xa0
[ 3933.616364]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 3933.616368] RIP: 0033:0x7fb92c70713d
[ 3933.616372] RSP: 002b:00007ffee155c528 EFLAGS: 00000246 ORIG_RAX:
00000000000000d1
[ 3933.616375] RAX: ffffffffffffffda RBX: 00007fb92c941f70 RCX: 00007fb92c70713d
[ 3933.616377] RDX: 000055b09201e360 RSI: 0000000000000010 RDI: 00007fb92c89f000
[ 3933.616379] RBP: 00007fb92c89f000 R08: 00007fb92c8d4000 R09: 0000000000000080
[ 3933.616381] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[ 3933.616382] R13: 0000000000000000 R14: 000055b09201e360 R15: 000055b09201d240
[ 3933.616386]  </TASK>
Ming Lei June 5, 2024, 9:47 a.m. UTC | #8
On Wed, Jun 05, 2024 at 03:20:34PM +0800, Changhui Zhong wrote:
> On Wed, Jun 5, 2024 at 9:41 AM Li Nan <linan666@huaweicloud.com> wrote:
> >
> >
> >
> > 在 2024/6/4 9:32, Changhui Zhong 写道:
> > > On Mon, Jun 3, 2024 at 10:20 AM Li Nan <linan666@huaweicloud.com> wrote:
> > >>
> > >>
> > >>
> > >> 在 2024/6/3 8:39, Ming Lei 写道:
> > >>
> > >> [...]
> > >>
> > >>>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > >>>> index 4e159948c912..99b621b2d40f 100644
> > >>>> --- a/drivers/block/ublk_drv.c
> > >>>> +++ b/drivers/block/ublk_drv.c
> > >>>> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
> > >>>>    {
> > >>>>       int i;
> > >>>>
> > >>>> -    WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
> > >>>> +    if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
> > >>>> +            return;
> > >>>
> > >>> Yeah, it is one bug. However, it could be addressed by adding the check in
> > >>> ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
> > >>> what do you think about this way?
> > >>>
> > >>
> > >> Check ub->nr_queues_ready seems better. How about:
> > >>
> > >> @@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct
> > >> ublk_device *ub,
> > >>           mutex_lock(&ub->mutex);
> > >>           if (!ublk_can_use_recovery(ub))
> > >>                   goto out_unlock;
> > >> +       if (!ub->nr_queues_ready)
> > >> +               goto out_unlock;
> > >>           /*
> > >>            * START_RECOVERY is only allowd after:
> > >>            *
> > >>
> > >>>
> > >>> Thanks,
> > >>> Ming
> > >>
> > >> --
> > >> Thanks,
> > >> Nan
> > >>
> > >
> > >
> > > Hi,Nan
> > >
> > > After applying your new patch, I did not trigger "NULL pointer
> > > dereference" and "Warning",
> > > but hit task hung "Call Trace" info, please check
> > >
> > > [13617.812306] running generic/004
> > > [13622.293674] blk_print_req_error: 91 callbacks suppressed
> > > [13622.293681] I/O error, dev ublkb4, sector 233256 op 0x1:(WRITE)
> > > flags 0x8800 phys_seg 1 prio class 0
> > > [13622.308145] I/O error, dev ublkb4, sector 233256 op 0x0:(READ)
> > > flags 0x0 phys_seg 2 prio class 0
> > > [13622.316923] I/O error, dev ublkb4, sector 233264 op 0x1:(WRITE)
> > > flags 0x8800 phys_seg 1 prio class 0
> > > [13622.326048] I/O error, dev ublkb4, sector 233272 op 0x0:(READ)
> > > flags 0x0 phys_seg 1 prio class 0
> > > [13622.334828] I/O error, dev ublkb4, sector 233272 op 0x1:(WRITE)
> > > flags 0x8800 phys_seg 1 prio class 0
> > > [13622.343954] I/O error, dev ublkb4, sector 233312 op 0x0:(READ)
> > > flags 0x0 phys_seg 1 prio class 0
> > > [13622.352733] I/O error, dev ublkb4, sector 233008 op 0x0:(READ)
> > > flags 0x0 phys_seg 1 prio class 0
> > > [13622.361514] I/O error, dev ublkb4, sector 233112 op 0x0:(READ)
> > > flags 0x0 phys_seg 1 prio class 0
> > > [13622.370292] I/O error, dev ublkb4, sector 233192 op 0x1:(WRITE)
> > > flags 0x8800 phys_seg 1 prio class 0
> > > [13622.379419] I/O error, dev ublkb4, sector 233120 op 0x0:(READ)
> > > flags 0x0 phys_seg 1 prio class 0
> > > [13641.069695] INFO: task fio:174413 blocked for more than 122 seconds.
> > > [13641.076061]       Not tainted 6.10.0-rc1+ #1
> > > [13641.080338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > > [13641.088164] task:fio             state:D stack:0     pid:174413
> > > tgid:174413 ppid:174386 flags:0x00004002
> > > [13641.088168] Call Trace:
> > > [13641.088170]  <TASK>
> > > [13641.088171]  __schedule+0x221/0x670
> > > [13641.088177]  schedule+0x23/0xa0
> > > [13641.088179]  io_schedule+0x42/0x70
> > > [13641.088181]  blk_mq_get_tag+0x118/0x2b0
> > > [13641.088185]  ? gup_fast_pgd_range+0x280/0x370
> > > [13641.088188]  ? __pfx_autoremove_wake_function+0x10/0x10
> > > [13641.088192]  __blk_mq_alloc_requests+0x194/0x3a0
> > > [13641.088194]  blk_mq_submit_bio+0x241/0x6c0
> > > [13641.088196]  __submit_bio+0x8a/0x1f0
> > > [13641.088199]  submit_bio_noacct_nocheck+0x168/0x250
> > > [13641.088201]  ? submit_bio_noacct+0x45/0x560
> > > [13641.088203]  __blkdev_direct_IO_async+0x167/0x1a0
> > > [13641.088206]  blkdev_write_iter+0x1c8/0x270
> > > [13641.088208]  aio_write+0x11c/0x240
> > > [13641.088212]  ? __rq_qos_issue+0x21/0x40
> > > [13641.088214]  ? blk_mq_start_request+0x34/0x1a0
> > > [13641.088216]  ? io_submit_one+0x68/0x380
> > > [13641.088218]  ? kmem_cache_alloc_noprof+0x4e/0x320
> > > [13641.088221]  ? fget+0x7c/0xc0
> > > [13641.088224]  ? io_submit_one+0xde/0x380
> > > [13641.088226]  io_submit_one+0xde/0x380
> > > [13641.088228]  __x64_sys_io_submit+0x80/0x160
> > > [13641.088229]  do_syscall_64+0x79/0x150
> > > [13641.088233]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > [13641.088237]  ? do_io_getevents+0x8b/0xe0
> > > [13641.088238]  ? syscall_exit_work+0xf3/0x120
> > > [13641.088241]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > [13641.088243]  ? do_syscall_64+0x85/0x150
> > > [13641.088245]  ? do_syscall_64+0x85/0x150
> > > [13641.088247]  ? blk_mq_flush_plug_list.part.0+0x108/0x160
> > > [13641.088249]  ? rseq_get_rseq_cs+0x1d/0x220
> > > [13641.088252]  ? rseq_ip_fixup+0x6d/0x1d0
> > > [13641.088254]  ? blk_finish_plug+0x24/0x40
> > > [13641.088256]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > [13641.088258]  ? do_syscall_64+0x85/0x150
> > > [13641.088260]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > [13641.088262]  ? do_syscall_64+0x85/0x150
> > > [13641.088264]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > [13641.088266]  ? do_syscall_64+0x85/0x150
> > > [13641.088268]  ? do_syscall_64+0x85/0x150
> > > [13641.088270]  ? do_syscall_64+0x85/0x150
> > > [13641.088272]  ? clear_bhb_loop+0x45/0xa0
> > > [13641.088275]  ? clear_bhb_loop+0x45/0xa0
> > > [13641.088277]  ? clear_bhb_loop+0x45/0xa0
> > > [13641.088279]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [13641.088281] RIP: 0033:0x7ff92150713d
> > > [13641.088283] RSP: 002b:00007ffca1ef81f8 EFLAGS: 00000246 ORIG_RAX:
> > > 00000000000000d1
> > > [13641.088285] RAX: ffffffffffffffda RBX: 00007ff9217e2f70 RCX: 00007ff92150713d
> > > [13641.088286] RDX: 000055863b694fe0 RSI: 0000000000000010 RDI: 00007ff92164d000
> > > [13641.088287] RBP: 00007ff92164d000 R08: 00007ff91936d000 R09: 0000000000000180
> > > [13641.088288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
> > > [13641.088289] R13: 0000000000000000 R14: 000055863b694fe0 R15: 000055863b6970c0
> > > [13641.088291]  </TASK>
> > >
> > > Thanks,
> > > Changhui
> > >
> >
> > After applying the previous patch, will the test environment continue to
> > execute test cases after WARN?
> 
> a few days ago,test with the previous patch, the test environment
> continued to execute test cases after WARN,
> and I terminated the test when I observed a WARN,so I did not observe
> the subsequent situation.
> 
> > I am not sure whether this issue has always
> > existed but was not tested becasue of WARN, or whether the new patch
> > introduced it.
> 
> today, I re-test previous patch, and let it run for a long time,I
> observed WARN and task hung,
> looks this issue already existed and not introduced by new patch.

Hi Changhui,

The hang is actually expected because recovery fails.

Please pull the latest ublksrv and check if the issue can still be
reproduced:

https://github.com/ublk-org/ublksrv

BTW, one ublksrv segfault and two test cleanup issues are fixed.

Thanks,
Ming
Changhui Zhong June 6, 2024, 4:48 a.m. UTC | #9
On Wed, Jun 5, 2024 at 5:48 PM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Wed, Jun 05, 2024 at 03:20:34PM +0800, Changhui Zhong wrote:
> > On Wed, Jun 5, 2024 at 9:41 AM Li Nan <linan666@huaweicloud.com> wrote:
> > >
> > >
> > >
> > > 在 2024/6/4 9:32, Changhui Zhong 写道:
> > > > On Mon, Jun 3, 2024 at 10:20 AM Li Nan <linan666@huaweicloud.com> wrote:
> > > >>
> > > >>
> > > >>
> > > >> 在 2024/6/3 8:39, Ming Lei 写道:
> > > >>
> > > >> [...]
> > > >>
> > > >>>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > > >>>> index 4e159948c912..99b621b2d40f 100644
> > > >>>> --- a/drivers/block/ublk_drv.c
> > > >>>> +++ b/drivers/block/ublk_drv.c
> > > >>>> @@ -2630,7 +2630,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
> > > >>>>    {
> > > >>>>       int i;
> > > >>>>
> > > >>>> -    WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
> > > >>>> +    if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
> > > >>>> +            return;
> > > >>>
> > > >>> Yeah, it is one bug. However, it could be addressed by adding the check in
> > > >>> ublk_ctrl_start_recovery() and return immediately in case of NULL ubq->ubq_daemon,
> > > >>> what do you think about this way?
> > > >>>
> > > >>
> > > >> Check ub->nr_queues_ready seems better. How about:
> > > >>
> > > >> @@ -2662,6 +2662,8 @@ static int ublk_ctrl_start_recovery(struct
> > > >> ublk_device *ub,
> > > >>           mutex_lock(&ub->mutex);
> > > >>           if (!ublk_can_use_recovery(ub))
> > > >>                   goto out_unlock;
> > > >> +       if (!ub->nr_queues_ready)
> > > >> +               goto out_unlock;
> > > >>           /*
> > > >>            * START_RECOVERY is only allowd after:
> > > >>            *
> > > >>
> > > >>>
> > > >>> Thanks,
> > > >>> Ming
> > > >>
> > > >> --
> > > >> Thanks,
> > > >> Nan
> > > >>
> > > >
> > > >
> > > > Hi,Nan
> > > >
> > > > After applying your new patch, I did not trigger "NULL pointer
> > > > dereference" and "Warning",
> > > > but hit task hung "Call Trace" info, please check
> > > >
> > > > [13617.812306] running generic/004
> > > > [13622.293674] blk_print_req_error: 91 callbacks suppressed
> > > > [13622.293681] I/O error, dev ublkb4, sector 233256 op 0x1:(WRITE)
> > > > flags 0x8800 phys_seg 1 prio class 0
> > > > [13622.308145] I/O error, dev ublkb4, sector 233256 op 0x0:(READ)
> > > > flags 0x0 phys_seg 2 prio class 0
> > > > [13622.316923] I/O error, dev ublkb4, sector 233264 op 0x1:(WRITE)
> > > > flags 0x8800 phys_seg 1 prio class 0
> > > > [13622.326048] I/O error, dev ublkb4, sector 233272 op 0x0:(READ)
> > > > flags 0x0 phys_seg 1 prio class 0
> > > > [13622.334828] I/O error, dev ublkb4, sector 233272 op 0x1:(WRITE)
> > > > flags 0x8800 phys_seg 1 prio class 0
> > > > [13622.343954] I/O error, dev ublkb4, sector 233312 op 0x0:(READ)
> > > > flags 0x0 phys_seg 1 prio class 0
> > > > [13622.352733] I/O error, dev ublkb4, sector 233008 op 0x0:(READ)
> > > > flags 0x0 phys_seg 1 prio class 0
> > > > [13622.361514] I/O error, dev ublkb4, sector 233112 op 0x0:(READ)
> > > > flags 0x0 phys_seg 1 prio class 0
> > > > [13622.370292] I/O error, dev ublkb4, sector 233192 op 0x1:(WRITE)
> > > > flags 0x8800 phys_seg 1 prio class 0
> > > > [13622.379419] I/O error, dev ublkb4, sector 233120 op 0x0:(READ)
> > > > flags 0x0 phys_seg 1 prio class 0
> > > > [13641.069695] INFO: task fio:174413 blocked for more than 122 seconds.
> > > > [13641.076061]       Not tainted 6.10.0-rc1+ #1
> > > > [13641.080338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > > disables this message.
> > > > [13641.088164] task:fio             state:D stack:0     pid:174413
> > > > tgid:174413 ppid:174386 flags:0x00004002
> > > > [13641.088168] Call Trace:
> > > > [13641.088170]  <TASK>
> > > > [13641.088171]  __schedule+0x221/0x670
> > > > [13641.088177]  schedule+0x23/0xa0
> > > > [13641.088179]  io_schedule+0x42/0x70
> > > > [13641.088181]  blk_mq_get_tag+0x118/0x2b0
> > > > [13641.088185]  ? gup_fast_pgd_range+0x280/0x370
> > > > [13641.088188]  ? __pfx_autoremove_wake_function+0x10/0x10
> > > > [13641.088192]  __blk_mq_alloc_requests+0x194/0x3a0
> > > > [13641.088194]  blk_mq_submit_bio+0x241/0x6c0
> > > > [13641.088196]  __submit_bio+0x8a/0x1f0
> > > > [13641.088199]  submit_bio_noacct_nocheck+0x168/0x250
> > > > [13641.088201]  ? submit_bio_noacct+0x45/0x560
> > > > [13641.088203]  __blkdev_direct_IO_async+0x167/0x1a0
> > > > [13641.088206]  blkdev_write_iter+0x1c8/0x270
> > > > [13641.088208]  aio_write+0x11c/0x240
> > > > [13641.088212]  ? __rq_qos_issue+0x21/0x40
> > > > [13641.088214]  ? blk_mq_start_request+0x34/0x1a0
> > > > [13641.088216]  ? io_submit_one+0x68/0x380
> > > > [13641.088218]  ? kmem_cache_alloc_noprof+0x4e/0x320
> > > > [13641.088221]  ? fget+0x7c/0xc0
> > > > [13641.088224]  ? io_submit_one+0xde/0x380
> > > > [13641.088226]  io_submit_one+0xde/0x380
> > > > [13641.088228]  __x64_sys_io_submit+0x80/0x160
> > > > [13641.088229]  do_syscall_64+0x79/0x150
> > > > [13641.088233]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > > [13641.088237]  ? do_io_getevents+0x8b/0xe0
> > > > [13641.088238]  ? syscall_exit_work+0xf3/0x120
> > > > [13641.088241]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > > [13641.088243]  ? do_syscall_64+0x85/0x150
> > > > [13641.088245]  ? do_syscall_64+0x85/0x150
> > > > [13641.088247]  ? blk_mq_flush_plug_list.part.0+0x108/0x160
> > > > [13641.088249]  ? rseq_get_rseq_cs+0x1d/0x220
> > > > [13641.088252]  ? rseq_ip_fixup+0x6d/0x1d0
> > > > [13641.088254]  ? blk_finish_plug+0x24/0x40
> > > > [13641.088256]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > > [13641.088258]  ? do_syscall_64+0x85/0x150
> > > > [13641.088260]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > > [13641.088262]  ? do_syscall_64+0x85/0x150
> > > > [13641.088264]  ? syscall_exit_to_user_mode+0x6c/0x1f0
> > > > [13641.088266]  ? do_syscall_64+0x85/0x150
> > > > [13641.088268]  ? do_syscall_64+0x85/0x150
> > > > [13641.088270]  ? do_syscall_64+0x85/0x150
> > > > [13641.088272]  ? clear_bhb_loop+0x45/0xa0
> > > > [13641.088275]  ? clear_bhb_loop+0x45/0xa0
> > > > [13641.088277]  ? clear_bhb_loop+0x45/0xa0
> > > > [13641.088279]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > > [13641.088281] RIP: 0033:0x7ff92150713d
> > > > [13641.088283] RSP: 002b:00007ffca1ef81f8 EFLAGS: 00000246 ORIG_RAX:
> > > > 00000000000000d1
> > > > [13641.088285] RAX: ffffffffffffffda RBX: 00007ff9217e2f70 RCX: 00007ff92150713d
> > > > [13641.088286] RDX: 000055863b694fe0 RSI: 0000000000000010 RDI: 00007ff92164d000
> > > > [13641.088287] RBP: 00007ff92164d000 R08: 00007ff91936d000 R09: 0000000000000180
> > > > [13641.088288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
> > > > [13641.088289] R13: 0000000000000000 R14: 000055863b694fe0 R15: 000055863b6970c0
> > > > [13641.088291]  </TASK>
> > > >
> > > > Thanks,
> > > > Changhui
> > > >
> > >
> > > After applying the previous patch, will the test environment continue to
> > > execute test cases after WARN?
> >
> > a few days ago,test with the previous patch, the test environment
> > continued to execute test cases after WARN,
> > and I terminated the test when I observed a WARN,so I did not observe
> > the subsequent situation.
> >
> > > I am not sure whether this issue has always
> > > existed but was not tested becasue of WARN, or whether the new patch
> > > introduced it.
> >
> > today, I re-test previous patch, and let it run for a long time,I
> > observed WARN and task hung,
> > looks this issue already existed and not introduced by new patch.
>
> Hi Changhui,
>
> The hang is actually expected because recovery fails.
>
> Please pull the latest ublksrv and check if the issue can still be
> reproduced:
>
> https://github.com/ublk-org/ublksrv
>
> BTW, one ublksrv segfault and two test cleanup issues are fixed.
>
> Thanks,
> Ming
>

Hi,Ming and Nan

after applying the new patch and pulling the latest ublksrv,
I ran the test for 4 hours and did not observe any task hang.
the test results looks good!

Thanks,
Changhui
Li Nan June 6, 2024, 8:05 a.m. UTC | #10
在 2024/6/6 12:48, Changhui Zhong 写道:

[...]

>>
>> Hi Changhui,
>>
>> The hang is actually expected because recovery fails.
>>
>> Please pull the latest ublksrv and check if the issue can still be
>> reproduced:
>>
>> https://github.com/ublk-org/ublksrv
>>
>> BTW, one ublksrv segfault and two test cleanup issues are fixed.
>>
>> Thanks,
>> Ming
>>
> 
> Hi,Ming and Nan
> 
> after applying the new patch and pulling the latest ublksrv,
> I ran the test for 4 hours and did not observe any task hang.
> the test results looks good!
> 
> Thanks,
> Changhui
> 
> 
> .

Thanks for you test!

However, I got a NULL pointer dereference bug with ublksrv. It is not
introduced by this patch. It seems io was issued after deleting disk. And
it can be reproduced by:

   while true; do make test T=generic/004; done

[ 1524.286485] running generic/004
[ 1529.110875] blk_print_req_error: 109 callbacks suppressed
[ 1529.110881] I/O error, dev ublkb0, sector 164512 op 0x0:(READ) flags 0x0 
phys_seg 9 prio class 0
[ 1529.113801] I/O error, dev ublkb0, sector 161000 op 0x0:(READ) flags 0x0 
phys_seg 6 prio class 0
[ 1529.114711] I/O error, dev ublkb0, sector 164576 op 0x1:(WRITE) flags 
0x8800 phys_seg 7 prio class 0
[ 1529.117441] I/O error, dev ublkb0, sector 164632 op 0x1:(WRITE) flags 
0x8800 phys_seg 8 prio class 0
[ 1529.118400] I/O error, dev ublkb0, sector 164584 op 0x0:(READ) flags 0x0 
phys_seg 8 prio class 0
[ 1529.119314] I/O error, dev ublkb0, sector 161176 op 0x1:(WRITE) flags 
0x8800 phys_seg 10 prio class 0
[ 1529.120274] I/O error, dev ublkb0, sector 165136 op 0x1:(WRITE) flags 
0x8800 phys_seg 6 prio class 0
[ 1529.121213] I/O error, dev ublkb0, sector 165184 op 0x1:(WRITE) flags 
0x8800 phys_seg 10 prio class 0
[ 1529.122166] I/O error, dev ublkb0, sector 161256 op 0x1:(WRITE) flags 
0x8800 phys_seg 5 prio class 0
[ 1529.123101] I/O error, dev ublkb0, sector 161048 op 0x0:(READ) flags 0x0 
phys_seg 11 prio class 0
[ 1536.366869] blk_print_req_error: 181 callbacks suppressed
[ 1536.366885] I/O error, dev ublkb0, sector 181496 op 0x1:(WRITE) flags 
0x8800 phys_seg 9 prio class 0
[ 1536.368449] I/O error, dev ublkb0, sector 181568 op 0x1:(WRITE) flags 
0x8800 phys_seg 5 prio class 0
[ 1536.369398] I/O error, dev ublkb0, sector 181608 op 0x1:(WRITE) flags 
0x8800 phys_seg 10 prio class 0
[ 1536.370351] I/O error, dev ublkb0, sector 180976 op 0x0:(READ) flags 0x0 
phys_seg 7 prio class 0
[ 1536.371266] I/O error, dev ublkb0, sector 183696 op 0x1:(WRITE) flags 
0x8800 phys_seg 9 prio class 0
[ 1536.372217] I/O error, dev ublkb0, sector 175112 op 0x1:(WRITE) flags 
0x8800 phys_seg 5 prio class 0
[ 1536.373168] I/O error, dev ublkb0, sector 183768 op 0x1:(WRITE) flags 
0x8800 phys_seg 8 prio class 0
[ 1536.374120] I/O error, dev ublkb0, sector 175152 op 0x1:(WRITE) flags 
0x8800 phys_seg 5 prio class 0
[ 1536.375070] I/O error, dev ublkb0, sector 181032 op 0x0:(READ) flags 0x0 
phys_seg 8 prio class 0
[ 1536.375977] I/O error, dev ublkb0, sector 181096 op 0x0:(READ) flags 0x0 
phys_seg 8 prio class 0
[ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1541.171734] #PF: supervisor write access in kernel mode
[ 1541.172271] #PF: error_code(0x0002) - not-present page
[ 1541.172798] PGD 0 P4D 0
[ 1541.173065] Oops: Oops: 0002 [#1] PREEMPT SMP
[ 1541.173515] CPU: 0 PID: 43707 Comm: ublk Not tainted 
6.9.0-next-20240523-00004-g9bc7e95c7323 #454
[ 1541.174417] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.16.1-2.fc37 04/01/2014
[ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300
[ 1541.175808] Code: ff 48 83 05 ef 7f 45 0c 01 e9 f7 fe ff ff 5b 48 83 05 
31 80 45 0c 01 5d 41 5c 41 5d c3 48 83 05 0b 4d 45 0c 01 49 8b 44 24 20 
<3e> 48 83 0c
[ 1541.177682] RSP: 0018:ffffc900036bbd00 EFLAGS: 00010202
[ 1541.178221] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000100001
[ 1541.178947] RDX: ffff888177180000 RSI: 0000000000000001 RDI: 
ffff88817727d000
[ 1541.179669] RBP: ffff8881345da290 R08: 0000000000100001 R09: 
0000000000000000
[ 1541.180393] R10: 0000000000000000 R11: 0000000000000000 R12: 
ffff888173324800
[ 1541.181120] R13: 0000000000000001 R14: ffff888177180000 R15: 
0000000000000008
[ 1541.181852] FS:  0000000000000000(0000) GS:ffff88842fc00000(0000) 
knlGS:0000000000000000
[ 1541.182666] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1541.183256] CR2: 0000000000000000 CR3: 00000001718bc004 CR4: 
0000000000770ef0
[ 1541.183978] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[ 1541.184683] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[ 1541.185392] PKRU: 55555554
[ 1541.185668] Call Trace:
[ 1541.185927]  <TASK>
[ 1541.186148]  ? show_regs+0x83/0x90
[ 1541.186497]  ? __die_body+0x22/0x90
[ 1541.186862]  ? __die+0x35/0x50
[ 1541.187184]  ? page_fault_oops+0x1cc/0x630
[ 1541.187604]  ? pick_eevdf+0x65/0x210
[ 1541.187979]  ? check_preempt_wakeup_fair+0x283/0x300
[ 1541.188483]  ? xfs_iext_lookup_extent+0xa4/0x2e0
[ 1541.188967]  ? do_user_addr_fault+0x118/0xab0
[ 1541.189418]  ? exc_page_fault+0xde/0x390
[ 1541.189825]  ? asm_exc_page_fault+0x22/0x30
[ 1541.190247]  ? io_fallback_tw+0x252/0x300
[ 1541.190652]  tctx_task_work_run+0x181/0x1c0
[ 1541.191090]  tctx_task_work+0x3f/0x80
[ 1541.191473]  task_work_run+0x81/0xf0
[ 1541.191841]  do_exit+0x53e/0x1360
[ 1541.192179]  ? tctx_task_work+0x3f/0x80
[ 1541.192578]  do_group_exit+0x34/0xc0
[ 1541.192953]  get_signal+0xe10/0xe20
[ 1541.193317]  ? blk_finish_plug+0x30/0x50
[ 1541.193730]  ? io_submit_sqes+0x9e0/0xd70
[ 1541.194133]  arch_do_signal_or_restart+0x32/0x400
[ 1541.194607]  ? __do_sys_io_uring_enter+0x170/0x8d0
[ 1541.195097]  syscall_exit_to_user_mode+0x2a1/0x430
[ 1541.195591]  do_syscall_64+0xb9/0x240
[ 1541.195973]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 1541.196489] RIP: 0033:0x7fcc6c8b34a5
[ 1541.196863] Code: Unable to access opcode bytes at 0x7fcc6c8b347b.
[ 1541.197490] RSP: 002b:00007fcc6c1ffd38 EFLAGS: 00000202 ORIG_RAX: 
00000000000001aa
[ 1541.198257] RAX: 000000000000003c RBX: 0000000000000000 RCX: 
00007fcc6c8b34a5
[ 1541.198984] RDX: 0000000000000001 RSI: 000000000000003c RDI: 
0000000000000000
[ 1541.199711] RBP: 00007fcc64002f68 R08: 00007fcc6c1ffd70 R09: 
0000000000000018
[ 1541.200438] R10: 0000000000000019 R11: 0000000000000202 R12: 
00007fcc6c1ffd90
[ 1541.201165] R13: 0000000000000000 R14: 0000000000000000 R15: 
00000000000001aa
[ 1541.202228]  </TASK>
[ 1541.202465] Modules linked in:
[ 1541.202798] CR2: 0000000000000000
[ 1541.203143] ---[ end trace 0000000000000000 ]---
Ming Lei June 6, 2024, 9:52 a.m. UTC | #11
On Thu, Jun 06, 2024 at 04:05:33PM +0800, Li Nan wrote:
> 
> 
> 在 2024/6/6 12:48, Changhui Zhong 写道:
> 
> [...]
> 
> > > 
> > > Hi Changhui,
> > > 
> > > The hang is actually expected because recovery fails.
> > > 
> > > Please pull the latest ublksrv and check if the issue can still be
> > > reproduced:
> > > 
> > > https://github.com/ublk-org/ublksrv
> > > 
> > > BTW, one ublksrv segfault and two test cleanup issues are fixed.
> > > 
> > > Thanks,
> > > Ming
> > > 
> > 
> > Hi,Ming and Nan
> > 
> > after applying the new patch and pulling the latest ublksrv,
> > I ran the test for 4 hours and did not observe any task hang.
> > the test results looks good!
> > 
> > Thanks,
> > Changhui
> > 
> > 
> > .
> 
> Thanks for you test!
> 
> However, I got a NULL pointer dereference bug with ublksrv. It is not

BTW, your patch isn't related with generic/004 which won't touch
recovery code path.

> introduced by this patch. It seems io was issued after deleting disk. And
> it can be reproduced by:
> 
>   while true; do make test T=generic/004; done

We didn't see that when running such test with linus tree, and usually
Changhui run generic test for hours.

> 
> [ 1524.286485] running generic/004
> [ 1529.110875] blk_print_req_error: 109 callbacks suppressed
...
> [ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 1541.171734] #PF: supervisor write access in kernel mode
> [ 1541.172271] #PF: error_code(0x0002) - not-present page
> [ 1541.172798] PGD 0 P4D 0
> [ 1541.173065] Oops: Oops: 0002 [#1] PREEMPT SMP
> [ 1541.173515] CPU: 0 PID: 43707 Comm: ublk Not tainted
> 6.9.0-next-20240523-00004-g9bc7e95c7323 #454
> [ 1541.174417] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> [ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300

This one looks one io_uring issue.

Care to provide which line of source code points to by 'io_fallback_tw+0x252'?

gdb> l *(io_fallback_tw+0x252)


Thanks,
Ming
Changhui Zhong June 6, 2024, 1:43 p.m. UTC | #12
On Thu, Jun 6, 2024 at 4:05 PM Li Nan <linan666@huaweicloud.com> wrote:
>
>
>
> 在 2024/6/6 12:48, Changhui Zhong 写道:
>
> [...]
>
> >>
> >> Hi Changhui,
> >>
> >> The hang is actually expected because recovery fails.
> >>
> >> Please pull the latest ublksrv and check if the issue can still be
> >> reproduced:
> >>
> >> https://github.com/ublk-org/ublksrv
> >>
> >> BTW, one ublksrv segfault and two test cleanup issues are fixed.
> >>
> >> Thanks,
> >> Ming
> >>
> >
> > Hi,Ming and Nan
> >
> > after applying the new patch and pulling the latest ublksrv,
> > I ran the test for 4 hours and did not observe any task hang.
> > the test results looks good!
> >
> > Thanks,
> > Changhui
> >
> >
> > .
>
> Thanks for you test!
>
> However, I got a NULL pointer dereference bug with ublksrv. It is not
> introduced by this patch. It seems io was issued after deleting disk. And
> it can be reproduced by:
>
>    while true; do make test T=generic/004; done
>

Hi,Nan

I can not reproduce this issue with repo
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-next
,
which kernel repo do you using for test?

Thanks,
Changhui

> [ 1524.286485] running generic/004
> [ 1529.110875] blk_print_req_error: 109 callbacks suppressed
> [ 1529.110881] I/O error, dev ublkb0, sector 164512 op 0x0:(READ) flags 0x0
> phys_seg 9 prio class 0
> [ 1529.113801] I/O error, dev ublkb0, sector 161000 op 0x0:(READ) flags 0x0
> phys_seg 6 prio class 0
> [ 1529.114711] I/O error, dev ublkb0, sector 164576 op 0x1:(WRITE) flags
> 0x8800 phys_seg 7 prio class 0
> [ 1529.117441] I/O error, dev ublkb0, sector 164632 op 0x1:(WRITE) flags
> 0x8800 phys_seg 8 prio class 0
> [ 1529.118400] I/O error, dev ublkb0, sector 164584 op 0x0:(READ) flags 0x0
> phys_seg 8 prio class 0
> [ 1529.119314] I/O error, dev ublkb0, sector 161176 op 0x1:(WRITE) flags
> 0x8800 phys_seg 10 prio class 0
> [ 1529.120274] I/O error, dev ublkb0, sector 165136 op 0x1:(WRITE) flags
> 0x8800 phys_seg 6 prio class 0
> [ 1529.121213] I/O error, dev ublkb0, sector 165184 op 0x1:(WRITE) flags
> 0x8800 phys_seg 10 prio class 0
> [ 1529.122166] I/O error, dev ublkb0, sector 161256 op 0x1:(WRITE) flags
> 0x8800 phys_seg 5 prio class 0
> [ 1529.123101] I/O error, dev ublkb0, sector 161048 op 0x0:(READ) flags 0x0
> phys_seg 11 prio class 0
> [ 1536.366869] blk_print_req_error: 181 callbacks suppressed
> [ 1536.366885] I/O error, dev ublkb0, sector 181496 op 0x1:(WRITE) flags
> 0x8800 phys_seg 9 prio class 0
> [ 1536.368449] I/O error, dev ublkb0, sector 181568 op 0x1:(WRITE) flags
> 0x8800 phys_seg 5 prio class 0
> [ 1536.369398] I/O error, dev ublkb0, sector 181608 op 0x1:(WRITE) flags
> 0x8800 phys_seg 10 prio class 0
> [ 1536.370351] I/O error, dev ublkb0, sector 180976 op 0x0:(READ) flags 0x0
> phys_seg 7 prio class 0
> [ 1536.371266] I/O error, dev ublkb0, sector 183696 op 0x1:(WRITE) flags
> 0x8800 phys_seg 9 prio class 0
> [ 1536.372217] I/O error, dev ublkb0, sector 175112 op 0x1:(WRITE) flags
> 0x8800 phys_seg 5 prio class 0
> [ 1536.373168] I/O error, dev ublkb0, sector 183768 op 0x1:(WRITE) flags
> 0x8800 phys_seg 8 prio class 0
> [ 1536.374120] I/O error, dev ublkb0, sector 175152 op 0x1:(WRITE) flags
> 0x8800 phys_seg 5 prio class 0
> [ 1536.375070] I/O error, dev ublkb0, sector 181032 op 0x0:(READ) flags 0x0
> phys_seg 8 prio class 0
> [ 1536.375977] I/O error, dev ublkb0, sector 181096 op 0x0:(READ) flags 0x0
> phys_seg 8 prio class 0
> [ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 1541.171734] #PF: supervisor write access in kernel mode
> [ 1541.172271] #PF: error_code(0x0002) - not-present page
> [ 1541.172798] PGD 0 P4D 0
> [ 1541.173065] Oops: Oops: 0002 [#1] PREEMPT SMP
> [ 1541.173515] CPU: 0 PID: 43707 Comm: ublk Not tainted
> 6.9.0-next-20240523-00004-g9bc7e95c7323 #454
> [ 1541.174417] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> [ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300
> [ 1541.175808] Code: ff 48 83 05 ef 7f 45 0c 01 e9 f7 fe ff ff 5b 48 83 05
> 31 80 45 0c 01 5d 41 5c 41 5d c3 48 83 05 0b 4d 45 0c 01 49 8b 44 24 20
> <3e> 48 83 0c
> [ 1541.177682] RSP: 0018:ffffc900036bbd00 EFLAGS: 00010202
> [ 1541.178221] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000100001
> [ 1541.178947] RDX: ffff888177180000 RSI: 0000000000000001 RDI:
> ffff88817727d000
> [ 1541.179669] RBP: ffff8881345da290 R08: 0000000000100001 R09:
> 0000000000000000
> [ 1541.180393] R10: 0000000000000000 R11: 0000000000000000 R12:
> ffff888173324800
> [ 1541.181120] R13: 0000000000000001 R14: ffff888177180000 R15:
> 0000000000000008
> [ 1541.181852] FS:  0000000000000000(0000) GS:ffff88842fc00000(0000)
> knlGS:0000000000000000
> [ 1541.182666] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1541.183256] CR2: 0000000000000000 CR3: 00000001718bc004 CR4:
> 0000000000770ef0
> [ 1541.183978] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 1541.184683] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 1541.185392] PKRU: 55555554
> [ 1541.185668] Call Trace:
> [ 1541.185927]  <TASK>
> [ 1541.186148]  ? show_regs+0x83/0x90
> [ 1541.186497]  ? __die_body+0x22/0x90
> [ 1541.186862]  ? __die+0x35/0x50
> [ 1541.187184]  ? page_fault_oops+0x1cc/0x630
> [ 1541.187604]  ? pick_eevdf+0x65/0x210
> [ 1541.187979]  ? check_preempt_wakeup_fair+0x283/0x300
> [ 1541.188483]  ? xfs_iext_lookup_extent+0xa4/0x2e0
> [ 1541.188967]  ? do_user_addr_fault+0x118/0xab0
> [ 1541.189418]  ? exc_page_fault+0xde/0x390
> [ 1541.189825]  ? asm_exc_page_fault+0x22/0x30
> [ 1541.190247]  ? io_fallback_tw+0x252/0x300
> [ 1541.190652]  tctx_task_work_run+0x181/0x1c0
> [ 1541.191090]  tctx_task_work+0x3f/0x80
> [ 1541.191473]  task_work_run+0x81/0xf0
> [ 1541.191841]  do_exit+0x53e/0x1360
> [ 1541.192179]  ? tctx_task_work+0x3f/0x80
> [ 1541.192578]  do_group_exit+0x34/0xc0
> [ 1541.192953]  get_signal+0xe10/0xe20
> [ 1541.193317]  ? blk_finish_plug+0x30/0x50
> [ 1541.193730]  ? io_submit_sqes+0x9e0/0xd70
> [ 1541.194133]  arch_do_signal_or_restart+0x32/0x400
> [ 1541.194607]  ? __do_sys_io_uring_enter+0x170/0x8d0
> [ 1541.195097]  syscall_exit_to_user_mode+0x2a1/0x430
> [ 1541.195591]  do_syscall_64+0xb9/0x240
> [ 1541.195973]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
> [ 1541.196489] RIP: 0033:0x7fcc6c8b34a5
> [ 1541.196863] Code: Unable to access opcode bytes at 0x7fcc6c8b347b.
> [ 1541.197490] RSP: 002b:00007fcc6c1ffd38 EFLAGS: 00000202 ORIG_RAX:
> 00000000000001aa
> [ 1541.198257] RAX: 000000000000003c RBX: 0000000000000000 RCX:
> 00007fcc6c8b34a5
> [ 1541.198984] RDX: 0000000000000001 RSI: 000000000000003c RDI:
> 0000000000000000
> [ 1541.199711] RBP: 00007fcc64002f68 R08: 00007fcc6c1ffd70 R09:
> 0000000000000018
> [ 1541.200438] R10: 0000000000000019 R11: 0000000000000202 R12:
> 00007fcc6c1ffd90
> [ 1541.201165] R13: 0000000000000000 R14: 0000000000000000 R15:
> 00000000000001aa
> [ 1541.202228]  </TASK>
> [ 1541.202465] Modules linked in:
> [ 1541.202798] CR2: 0000000000000000
> [ 1541.203143] ---[ end trace 0000000000000000 ]---
>
> --
> Thanks,
> Nan
>
Li Nan June 8, 2024, 6:34 a.m. UTC | #13
在 2024/6/6 17:52, Ming Lei 写道:
> On Thu, Jun 06, 2024 at 04:05:33PM +0800, Li Nan wrote:
>>
>>
>> 在 2024/6/6 12:48, Changhui Zhong 写道:
>>
>> [...]
>>
>>>>
>>>> Hi Changhui,
>>>>
>>>> The hang is actually expected because recovery fails.
>>>>
>>>> Please pull the latest ublksrv and check if the issue can still be
>>>> reproduced:
>>>>
>>>> https://github.com/ublk-org/ublksrv
>>>>
>>>> BTW, one ublksrv segfault and two test cleanup issues are fixed.
>>>>
>>>> Thanks,
>>>> Ming
>>>>
>>>
>>> Hi,Ming and Nan
>>>
>>> after applying the new patch and pulling the latest ublksrv,
>>> I ran the test for 4 hours and did not observe any task hang.
>>> the test results looks good!
>>>
>>> Thanks,
>>> Changhui
>>>
>>>
>>> .
>>
>> Thanks for you test!
>>
>> However, I got a NULL pointer dereference bug with ublksrv. It is not
> 
> BTW, your patch isn't related with generic/004 which won't touch
> recovery code path.
> 
>> introduced by this patch. It seems io was issued after deleting disk. And
>> it can be reproduced by:
>>
>>    while true; do make test T=generic/004; done
> 
> We didn't see that when running such test with linus tree, and usually
> Changhui run generic test for hours.
> 
>>
>> [ 1524.286485] running generic/004
>> [ 1529.110875] blk_print_req_error: 109 callbacks suppressed
> ...
>> [ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000
>> [ 1541.171734] #PF: supervisor write access in kernel mode
>> [ 1541.172271] #PF: error_code(0x0002) - not-present page
>> [ 1541.172798] PGD 0 P4D 0
>> [ 1541.173065] Oops: Oops: 0002 [#1] PREEMPT SMP
>> [ 1541.173515] CPU: 0 PID: 43707 Comm: ublk Not tainted
>> 6.9.0-next-20240523-00004-g9bc7e95c7323 #454
>> [ 1541.174417] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>> 1.16.1-2.fc37 04/01/2014
>> [ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300
> 
> This one looks one io_uring issue.
> 
> Care to provide which line of source code points to by 'io_fallback_tw+0x252'?
> 
> gdb> l *(io_fallback_tw+0x252)
> 
(gdb) list * io_fallback_tw+0x252
0xffffffff81d79dc2 is in io_fallback_tw 
(./arch/x86/include/asm/atomic64_64.h:25).
20              __WRITE_ONCE(v->counter, i);
21      }
22
23      static __always_inline void arch_atomic64_add(s64 i, atomic64_t *v)
24      {
25              asm volatile(LOCK_PREFIX "addq %1,%0"
26                           : "=m" (v->counter)
27                           : "er" (i), "m" (v->counter) : "memory");
28      }

The corresponding code is:
io_fallback_tw
   percpu_ref_get(&last_ctx->refs);

I have the vmcore of this issue. If you have any other needs, please let me
know.
Li Nan June 8, 2024, 6:44 a.m. UTC | #14
在 2024/6/6 21:43, Changhui Zhong 写道:
> On Thu, Jun 6, 2024 at 4:05 PM Li Nan <linan666@huaweicloud.com> wrote:
>>
>>
>>
>> 在 2024/6/6 12:48, Changhui Zhong 写道:
>>
>> [...]
>>
>>>>
>>>> Hi Changhui,
>>>>
>>>> The hang is actually expected because recovery fails.
>>>>
>>>> Please pull the latest ublksrv and check if the issue can still be
>>>> reproduced:
>>>>
>>>> https://github.com/ublk-org/ublksrv
>>>>
>>>> BTW, one ublksrv segfault and two test cleanup issues are fixed.
>>>>
>>>> Thanks,
>>>> Ming
>>>>
>>>
>>> Hi,Ming and Nan
>>>
>>> after applying the new patch and pulling the latest ublksrv,
>>> I ran the test for 4 hours and did not observe any task hang.
>>> the test results looks good!
>>>
>>> Thanks,
>>> Changhui
>>>
>>>
>>> .
>>
>> Thanks for you test!
>>
>> However, I got a NULL pointer dereference bug with ublksrv. It is not
>> introduced by this patch. It seems io was issued after deleting disk. And
>> it can be reproduced by:
>>
>>     while true; do make test T=generic/004; done
>>
> 
> Hi,Nan
> 
> I can not reproduce this issue with repo
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-next
> ,
> which kernel repo do you using for test?
> 
> Thanks,
> Changhui
> 

I tested with the latest block code, commit
2ab79514109578fc4b6df90633d500cf281eb689, this issue also can be
reproduced.

The space of the root path has been filled up by
ublksrv(tests/tmpublk_loop_data_xxx), which may the issue be related to this?
Ming Lei June 8, 2024, 2:16 p.m. UTC | #15
On Sat, Jun 08, 2024 at 02:34:47PM +0800, Li Nan wrote:
> 
> 
> 在 2024/6/6 17:52, Ming Lei 写道:
> > On Thu, Jun 06, 2024 at 04:05:33PM +0800, Li Nan wrote:
> > > 
> > > 
> > > 在 2024/6/6 12:48, Changhui Zhong 写道:
> > > 
> > > [...]
> > > 
> > > > > 
> > > > > Hi Changhui,
> > > > > 
> > > > > The hang is actually expected because recovery fails.
> > > > > 
> > > > > Please pull the latest ublksrv and check if the issue can still be
> > > > > reproduced:
> > > > > 
> > > > > https://github.com/ublk-org/ublksrv
> > > > > 
> > > > > BTW, one ublksrv segfault and two test cleanup issues are fixed.
> > > > > 
> > > > > Thanks,
> > > > > Ming
> > > > > 
> > > > 
> > > > Hi,Ming and Nan
> > > > 
> > > > after applying the new patch and pulling the latest ublksrv,
> > > > I ran the test for 4 hours and did not observe any task hang.
> > > > the test results looks good!
> > > > 
> > > > Thanks,
> > > > Changhui
> > > > 
> > > > 
> > > > .
> > > 
> > > Thanks for you test!
> > > 
> > > However, I got a NULL pointer dereference bug with ublksrv. It is not
> > 
> > BTW, your patch isn't related with generic/004 which won't touch
> > recovery code path.
> > 
> > > introduced by this patch. It seems io was issued after deleting disk. And
> > > it can be reproduced by:
> > > 
> > >    while true; do make test T=generic/004; done
> > 
> > We didn't see that when running such test with linus tree, and usually
> > Changhui run generic test for hours.
> > 
> > > 
> > > [ 1524.286485] running generic/004
> > > [ 1529.110875] blk_print_req_error: 109 callbacks suppressed
> > ...
> > > [ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > [ 1541.171734] #PF: supervisor write access in kernel mode
> > > [ 1541.172271] #PF: error_code(0x0002) - not-present page
> > > [ 1541.172798] PGD 0 P4D 0
> > > [ 1541.173065] Oops: Oops: 0002 [#1] PREEMPT SMP
> > > [ 1541.173515] CPU: 0 PID: 43707 Comm: ublk Not tainted
> > > 6.9.0-next-20240523-00004-g9bc7e95c7323 #454
> > > [ 1541.174417] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > 1.16.1-2.fc37 04/01/2014
> > > [ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300
> > 
> > This one looks one io_uring issue.
> > 
> > Care to provide which line of source code points to by 'io_fallback_tw+0x252'?
> > 
> > gdb> l *(io_fallback_tw+0x252)
> > 
> (gdb) list * io_fallback_tw+0x252
> 0xffffffff81d79dc2 is in io_fallback_tw
> (./arch/x86/include/asm/atomic64_64.h:25).
> 20              __WRITE_ONCE(v->counter, i);
> 21      }
> 22
> 23      static __always_inline void arch_atomic64_add(s64 i, atomic64_t *v)
> 24      {
> 25              asm volatile(LOCK_PREFIX "addq %1,%0"
> 26                           : "=m" (v->counter)
> 27                           : "er" (i), "m" (v->counter) : "memory");
> 28      }
> 
> The corresponding code is:
> io_fallback_tw
>   percpu_ref_get(&last_ctx->refs);

[ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300

So looks the 'struct io_ring_ctx' instance is freed and ctx->refs.data
becomes NULL when calling percpu_ref_get(&last_ctx->refs), not clear how
it can happen since request grabs one ctx reference.

Maybe you can try kasan and see if more useful info can be dumped.

> 
> I have the vmcore of this issue. If you have any other needs, please let me
> know.

Not try remote vmcore debug yet, will think further if any useful hint
is needed.

> The space of the root path has been filled up by
> ublksrv(tests/tmpublk_loop_data_xxx), which may the issue be related to this?

It isn't supposed to be related, since the backing loop image is created by dd
and it isn't sparse image, and the test runs fio over ublk block
device only.

Thanks,
Ming
diff mbox series

Patch

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 4e159948c912..99b621b2d40f 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -2630,7 +2630,8 @@  static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
 {
 	int i;
 
-	WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
+	if (WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq))))
+		return;
 
 	/* All old ioucmds have to be completed */
 	ubq->nr_io_ready = 0;