[V2,blktests,0/2] blktests: Add ublk testcases

Message ID 20230505032808.356768-1-ZiyangZhang@linux.alibaba.com

Message

Ziyang Zhang May 5, 2023, 3:28 a.m. UTC
Hi,

ublk passes I/O requests through to userspace daemons. It is very important
to test ublk crash handling since the userspace part is not reliable. In
particular, we should test removing the device, killing the ublk daemon, and
the user recovery feature.

The first patch adds user recovery support to miniublk.

The second patch adds six new tests for ublk to cover the above cases.
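
As an illustration of the daemon-crash flow these tests exercise, here is a
rough sketch of such a test case in the blktests style. The miniublk options
and subcommands used below ("-r" for enabling recovery, "recover") are
assumptions for illustration, not the actual interfaces added by these
patches; see patch 2 for the real tests/ublk/rc helpers:

    test() {
        echo "Running ${TEST_NAME}"

        # Start a null-backed ublk device; "-r" is assumed to request
        # UBLK_F_USER_RECOVERY so the device survives a daemon crash.
        src/miniublk add -t null -n 0 -r &
        sleep 1

        # Simulate a daemon crash.
        kill -9 $(pgrep miniublk)

        # Attach a new daemon. With user recovery the kernel quiesces the
        # device, and the new daemon reattaches via the
        # UBLK_CMD_START_USER_RECOVERY / UBLK_CMD_END_USER_RECOVERY
        # control commands.
        src/miniublk recover -n 0 &
        sleep 1

        # The device should serve I/O again.
        dd if=/dev/ublkb0 of=/dev/null bs=4k count=1 iflag=direct

        echo "Test complete"
    }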

V2:
- Check parameters during recovery
- Add a small delay before deleting the device
- Write informative test descriptions

Ziyang Zhang (2):
  src/miniublk: add user recovery
  tests: Add ublk tests

 common/ublk        |  10 +-
 src/miniublk.c     | 269 ++++++++++++++++++++++++++++++++++++++++++---
 tests/ublk/001     |  48 ++++++++
 tests/ublk/001.out |   2 +
 tests/ublk/002     |  63 +++++++++++
 tests/ublk/002.out |   2 +
 tests/ublk/003     |  48 ++++++++
 tests/ublk/003.out |   2 +
 tests/ublk/004     |  50 +++++++++
 tests/ublk/004.out |   2 +
 tests/ublk/005     |  79 +++++++++++++
 tests/ublk/005.out |   2 +
 tests/ublk/006     |  83 ++++++++++++++
 tests/ublk/006.out |   2 +
 tests/ublk/rc      |  15 +++
 15 files changed, 661 insertions(+), 16 deletions(-)
 create mode 100755 tests/ublk/001
 create mode 100644 tests/ublk/001.out
 create mode 100755 tests/ublk/002
 create mode 100644 tests/ublk/002.out
 create mode 100755 tests/ublk/003
 create mode 100644 tests/ublk/003.out
 create mode 100755 tests/ublk/004
 create mode 100644 tests/ublk/004.out
 create mode 100755 tests/ublk/005
 create mode 100644 tests/ublk/005.out
 create mode 100755 tests/ublk/006
 create mode 100644 tests/ublk/006.out
 create mode 100644 tests/ublk/rc

Comments

Shin'ichiro Kawasaki May 16, 2023, 8:55 a.m. UTC | #1
On May 05, 2023 / 11:28, Ziyang Zhang wrote:
> Hi,
> 
> ublk passes I/O requests through to userspace daemons. It is very important
> to test ublk crash handling since the userspace part is not reliable. In
> particular, we should test removing the device, killing the ublk daemon, and
> the user recovery feature.
> 
> The first patch adds user recovery support to miniublk.
> 
> The second patch adds six new tests for ublk to cover the above cases.
> 
> V2:
> - Check parameters during recovery
> - Add a small delay before deleting the device
> - Write informative test descriptions

Ziyang, thanks for the v2 patches, and sorry for the slow response. Please find
my comments inline.

FYI, I also ran the new test cases on kernel v6.4-rc2 and observed a failure of
ublk/001. The cause of the failure is the lockdep WARNING [1]. The test case has
already found an issue, which proves that it is valuable :)
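
For reference, the failure reproduces with the standard blktests invocation:

    ./check ublk/001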

[1]

[  204.288195] run blktests ublk/001 at 2023-05-16 17:52:14

[  206.755085] ======================================================
[  206.756063] WARNING: possible circular locking dependency detected
[  206.756595] 6.4.0-rc2 #6 Not tainted
[  206.756924] ------------------------------------------------------
[  206.757436] iou-wrk-1070/1071 is trying to acquire lock:
[  206.757891] ffff88811f1420a8 (&ctx->uring_lock){+.+.}-{3:3}, at: __io_req_complete_post+0x792/0xd50
[  206.758625] 
               but task is already holding lock:
[  206.759166] ffff88812c3f66c0 (&ub->mutex){+.+.}-{3:3}, at: ublk_stop_dev+0x2b/0x400 [ublk_drv]
[  206.759865] 
               which lock already depends on the new lock.

[  206.760623] 
               the existing dependency chain (in reverse order) is:
[  206.761282] 
               -> #1 (&ub->mutex){+.+.}-{3:3}:
[  206.761811]        __mutex_lock+0x185/0x18b0
[  206.762192]        ublk_ch_uring_cmd+0x511/0x1630 [ublk_drv]
[  206.762678]        io_uring_cmd+0x1ec/0x3d0
[  206.763081]        io_issue_sqe+0x461/0xb70
[  206.763477]        io_submit_sqes+0x794/0x1c50
[  206.763857]        __do_sys_io_uring_enter+0x736/0x1ce0
[  206.764368]        do_syscall_64+0x5c/0x90
[  206.764724]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  206.765244] 
               -> #0 (&ctx->uring_lock){+.+.}-{3:3}:
[  206.765813]        __lock_acquire+0x2f25/0x5f00
[  206.766272]        lock_acquire+0x1a9/0x4e0
[  206.766633]        __mutex_lock+0x185/0x18b0
[  206.767042]        __io_req_complete_post+0x792/0xd50
[  206.767500]        io_uring_cmd_done+0x27d/0x300
[  206.767918]        ublk_cancel_dev+0x1c6/0x410 [ublk_drv]
[  206.768416]        ublk_stop_dev+0x2ad/0x400 [ublk_drv]
[  206.768853]        ublk_ctrl_uring_cmd+0x14fd/0x3bf0 [ublk_drv]
[  206.769411]        io_uring_cmd+0x1ec/0x3d0
[  206.769772]        io_issue_sqe+0x461/0xb70
[  206.770175]        io_wq_submit_work+0x2b5/0x710
[  206.770600]        io_worker_handle_work+0x6b8/0x1620
[  206.771066]        io_wq_worker+0x4ef/0xb50
[  206.771461]        ret_from_fork+0x2c/0x50
[  206.771817] 
               other info that might help us debug this:

[  206.773807]  Possible unsafe locking scenario:

[  206.775596]        CPU0                    CPU1
[  206.776607]        ----                    ----
[  206.777604]   lock(&ub->mutex);
[  206.778496]                                lock(&ctx->uring_lock);
[  206.779601]                                lock(&ub->mutex);
[  206.780656]   lock(&ctx->uring_lock);
[  206.781561] 
                *** DEADLOCK ***

[  206.783778] 1 lock held by iou-wrk-1070/1071:
[  206.784697]  #0: ffff88812c3f66c0 (&ub->mutex){+.+.}-{3:3}, at: ublk_stop_dev+0x2b/0x400 [ublk_drv]
[  206.786005] 
               stack backtrace:
[  206.787493] CPU: 1 PID: 1071 Comm: iou-wrk-1070 Not tainted 6.4.0-rc2 #6
[  206.788576] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[  206.789819] Call Trace:
[  206.790617]  <TASK>
[  206.791395]  dump_stack_lvl+0x57/0x90
[  206.792284]  check_noncircular+0x27b/0x310
[  206.793168]  ? __pfx_mark_lock+0x10/0x10
[  206.794068]  ? __pfx_check_noncircular+0x10/0x10
[  206.795017]  ? lock_acquire+0x1a9/0x4e0
[  206.795871]  ? lockdep_lock+0xca/0x1c0
[  206.796750]  ? __pfx_lockdep_lock+0x10/0x10
[  206.797665]  __lock_acquire+0x2f25/0x5f00
[  206.798569]  ? __pfx___lock_acquire+0x10/0x10
[  206.799492]  ? try_to_wake_up+0x806/0x1a30
[  206.800395]  ? __pfx_lock_release+0x10/0x10
[  206.801306]  lock_acquire+0x1a9/0x4e0
[  206.802143]  ? __io_req_complete_post+0x792/0xd50
[  206.803092]  ? __pfx_lock_acquire+0x10/0x10
[  206.803998]  ? lock_is_held_type+0xce/0x120
[  206.804866]  ? find_held_lock+0x2d/0x110
[  206.805760]  ? __pfx___might_resched+0x10/0x10
[  206.806684]  ? lock_release+0x378/0x650
[  206.807568]  __mutex_lock+0x185/0x18b0
[  206.808440]  ? __io_req_complete_post+0x792/0xd50
[  206.809379]  ? mark_held_locks+0x96/0xe0
[  206.810359]  ? __io_req_complete_post+0x792/0xd50
[  206.811294]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
[  206.812208]  ? lockdep_hardirqs_on+0x7d/0x100
[  206.813078]  ? __pfx___mutex_lock+0x10/0x10
[  206.813936]  ? __wake_up_common_lock+0xe8/0x150
[  206.814817]  ? __pfx___wake_up_common_lock+0x10/0x10
[  206.815736]  ? percpu_counter_add_batch+0x9f/0x160
[  206.816643]  ? __io_req_complete_post+0x792/0xd50
[  206.817541]  __io_req_complete_post+0x792/0xd50
[  206.818429]  ? mark_held_locks+0x96/0xe0
[  206.819276]  io_uring_cmd_done+0x27d/0x300
[  206.820129]  ? kasan_quarantine_put+0xd6/0x1e0
[  206.821015]  ? __pfx_io_uring_cmd_done+0x10/0x10
[  206.821915]  ? per_cpu_remove_cache+0x80/0x80
[  206.822794]  ? slab_free_freelist_hook+0x9e/0x1c0
[  206.823697]  ublk_cancel_dev+0x1c6/0x410 [ublk_drv]
[  206.824665]  ? kobject_put+0x190/0x4a0
[  206.825503]  ublk_stop_dev+0x2ad/0x400 [ublk_drv]
[  206.826410]  ublk_ctrl_uring_cmd+0x14fd/0x3bf0 [ublk_drv]
[  206.827377]  ? __pfx_ublk_ctrl_uring_cmd+0x10/0x10 [ublk_drv]
[  206.828376]  ? selinux_uring_cmd+0x1cc/0x260
[  206.829268]  ? __pfx_selinux_uring_cmd+0x10/0x10
[  206.830169]  ? lock_acquire+0x1a9/0x4e0
[  206.831007]  io_uring_cmd+0x1ec/0x3d0
[  206.831833]  io_issue_sqe+0x461/0xb70
[  206.832651]  io_wq_submit_work+0x2b5/0x710
[  206.833488]  io_worker_handle_work+0x6b8/0x1620
[  206.834345]  io_wq_worker+0x4ef/0xb50
[  206.835143]  ? __pfx_io_wq_worker+0x10/0x10
[  206.835979]  ? lock_release+0x378/0x650
[  206.836784]  ? ret_from_fork+0x12/0x50
[  206.837586]  ? __pfx_lock_release+0x10/0x10
[  206.838419]  ? do_raw_spin_lock+0x12e/0x270
[  206.839250]  ? __pfx_do_raw_spin_lock+0x10/0x10
[  206.840111]  ? __pfx_io_wq_worker+0x10/0x10
[  206.840947]  ret_from_fork+0x2c/0x50
[  206.841738]  </TASK>