diff mbox series

null_blk: fix queue mode Oops for membacked

Message ID 20230412040827.8082-1-kch@nvidia.com (mailing list archive)
State New, archived
Headers show
Series null_blk: fix queue mode Oops for membacked | expand

Commit Message

Chaitanya Kulkarni April 12, 2023, 4:08 a.m. UTC
Make sure to check device queue mode in the null_validate_conf()
and return error for NULL_Q_RQ as we don't allow legacy I/O path,
without this patch we get OOPs when queue mode is set to 1 from
configfs, following are repro steps :-

modprobe null_blk nr_devices=0
mkdir config/nullb/nullb0
echo 1 > config/nullb/nullb0/memory_backed
echo 4096 > config/nullb/nullb0/blocksize
echo 20480 > config/nullb/nullb0/size
echo 1 > config/nullb/nullb0/queue_mode
echo 1 > config/nullb/nullb0/power

Entering kdb (current=0xffff88810acdd080, pid 2372) on processor 42 Oops: (null)
due to oops @ 0xffffffffc041c329
CPU: 42 PID: 2372 Comm: sh Tainted: G           O     N 6.3.0-rc5lblk+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:null_add_dev.part.0+0xd9/0x720 [null_blk]
Code: 01 00 00 85 d2 0f 85 a1 03 00 00 48 83 bb 08 01 00 00 00 0f 85 f7 03 00 00 80 bb 62 01 00 00 00 48 8b 75 20 0f 85 6d 02 00 00 <48> 89 6e 60 48 8b 75 20 bf 06 00 00 00 e8 f5 37 2c c1 48 8b 75 20
RSP: 0018:ffffc900052cbde0 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff88811084d800 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888100042e00
RBP: ffff8881053d8200 R08: ffffc900052cbd68 R09: ffff888105db2000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000002
R13: ffff888104765200 R14: ffff88810eec1748 R15: ffff88810eec1740
FS:  00007fd445fd1740(0000) GS:ffff8897dfc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 0000000166a00000 CR4: 0000000000350ee0
DR0: ffffffff8437a488 DR1: ffffffff8437a489 DR2: ffffffff8437a48a
DR3: ffffffff8437a48b DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 nullb_device_power_store+0xd1/0x120 [null_blk]
 configfs_write_iter+0xb4/0x120
 vfs_write+0x2ba/0x3c0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fd4460c57a7
Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007ffd3792a4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd4460c57a7
RDX: 0000000000000002 RSI: 000055b43c02e4c0 RDI: 0000000000000001
RBP: 000055b43c02e4c0 R08: 000000000000000a R09: 00007fd44615b4e0
R10: 00007fd44615b3e0 R11: 0000000000000246 R12: 0000000000000002
R13: 00007fd446198520 R14: 0000000000000002 R15: 00007fd446198700
 </TASK>

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/block/null_blk/main.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Damien Le Moal April 12, 2023, 7:57 a.m. UTC | #1
On 4/12/23 13:08, Chaitanya Kulkarni wrote:
> Make sure to check device queue mode in the null_validate_conf()
> and return error for NULL_Q_RQ as we don't allow legacy I/O path,
> without this patch we get OOPs when queue mode is set to 1 from
> configfs, following are repro steps :-

Looks OK but the patch title is a little odd as I do not see what memory backing
has to do with checking the correctness of queue_mode. So what about something like:

null_blk: Always check queue mode setting from configfs

for the patch title ?

With that fixed,

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Chaitanya Kulkarni April 12, 2023, 8:09 a.m. UTC | #2
On 4/12/23 00:57, Damien Le Moal wrote:
> On 4/12/23 13:08, Chaitanya Kulkarni wrote:
>> Make sure to check device queue mode in the null_validate_conf()
>> and return error for NULL_Q_RQ as we don't allow legacy I/O path,
>> without this patch we get OOPs when queue mode is set to 1 from
>> configfs, following are repro steps :-
> Looks OK but the patch title is a little odd as I do not see what memory backing
> has to do with checking the correctness of queue_mode. So what about something like:
>
> null_blk: Always check queue mode setting from configfs

agree..

> for the patch title ?
>
> With that fixed,
>
> Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
>
>
>

will send a v2 with that fix with your review tag, thanks for the
review.

-ck
Nitesh Shetty April 12, 2023, 9:17 a.m. UTC | #3
On 23/04/11 09:08PM, Chaitanya Kulkarni wrote:
>Make sure to check device queue mode in the null_validate_conf()
>and return error for NULL_Q_RQ as we don't allow legacy I/O path,

Can't we do away with NULL_Q_RQ defination itself ?
I mean, since I see in code we are not using NULL_Q_RQ anywhere,
if we can remove NULL_Q_RQ defination from enum in null_blk.h,
we can remove your suggested check, as well as check in null_init.

--Nitesh Shetty
Damien Le Moal April 12, 2023, 9:43 a.m. UTC | #4
On 4/12/23 18:17, Nitesh Shetty wrote:
> On 23/04/11 09:08PM, Chaitanya Kulkarni wrote:
>> Make sure to check device queue mode in the null_validate_conf()
>> and return error for NULL_Q_RQ as we don't allow legacy I/O path,
> 
> Can't we do away with NULL_Q_RQ defination itself ?
> I mean, since I see in code we are not using NULL_Q_RQ anywhere,
> if we can remove NULL_Q_RQ defination from enum in null_blk.h,
> we can remove your suggested check, as well as check in null_init.

I think it is being kept around to avoid reusing this value to not confuse old
scripts.

> 
> --Nitesh Shetty
> 
>
Nitesh Shetty April 12, 2023, 10:29 a.m. UTC | #5
On 23/04/12 06:43PM, Damien Le Moal wrote:
>On 4/12/23 18:17, Nitesh Shetty wrote:
>> On 23/04/11 09:08PM, Chaitanya Kulkarni wrote:
>>> Make sure to check device queue mode in the null_validate_conf()
>>> and return error for NULL_Q_RQ as we don't allow legacy I/O path,
>>
>> Can't we do away with NULL_Q_RQ defination itself ?
>> I mean, since I see in code we are not using NULL_Q_RQ anywhere,
>> if we can remove NULL_Q_RQ defination from enum in null_blk.h,
>> we can remove your suggested check, as well as check in null_init.
>
>I think it is being kept around to avoid reusing this value to not confuse old
>scripts.
>
got it.

Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Chaitanya Kulkarni April 13, 2023, 1:30 a.m. UTC | #6
On 4/12/23 02:17, Nitesh Shetty wrote:
> On 23/04/11 09:08PM, Chaitanya Kulkarni wrote:
>> Make sure to check device queue mode in the null_validate_conf()
>> and return error for NULL_Q_RQ as we don't allow legacy I/O path,
>
> Can't we do away with NULL_Q_RQ defination itself ?
> I mean, since I see in code we are not using NULL_Q_RQ anywhere,
> if we can remove NULL_Q_RQ defination from enum in null_blk.h,
> we can remove your suggested check, as well as check in null_init.
>
> --Nitesh Shetty
>

Please don't do that.

-ck
Ming Lei April 13, 2023, 3:24 a.m. UTC | #7
On Wed, Apr 12, 2023 at 12:11 PM Chaitanya Kulkarni <kch@nvidia.com> wrote:
>
> Make sure to check device queue mode in the null_validate_conf()
> and return error for NULL_Q_RQ as we don't allow legacy I/O path,
> without this patch we get OOPs when queue mode is set to 1 from
> configfs, following are repro steps :-
>
> modprobe null_blk nr_devices=0
> mkdir config/nullb/nullb0
> echo 1 > config/nullb/nullb0/memory_backed
> echo 4096 > config/nullb/nullb0/blocksize
> echo 20480 > config/nullb/nullb0/size
> echo 1 > config/nullb/nullb0/queue_mode
> echo 1 > config/nullb/nullb0/power
>
> Entering kdb (current=0xffff88810acdd080, pid 2372) on processor 42 Oops: (null)
> due to oops @ 0xffffffffc041c329
> CPU: 42 PID: 2372 Comm: sh Tainted: G           O     N 6.3.0-rc5lblk+ #5
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> RIP: 0010:null_add_dev.part.0+0xd9/0x720 [null_blk]
> Code: 01 00 00 85 d2 0f 85 a1 03 00 00 48 83 bb 08 01 00 00 00 0f 85 f7 03 00 00 80 bb 62 01 00 00 00 48 8b 75 20 0f 85 6d 02 00 00 <48> 89 6e 60 48 8b 75 20 bf 06 00 00 00 e8 f5 37 2c c1 48 8b 75 20
> RSP: 0018:ffffc900052cbde0 EFLAGS: 00010246
> RAX: 0000000000000001 RBX: ffff88811084d800 RCX: 0000000000000001
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888100042e00
> RBP: ffff8881053d8200 R08: ffffc900052cbd68 R09: ffff888105db2000
> R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000002
> R13: ffff888104765200 R14: ffff88810eec1748 R15: ffff88810eec1740
> FS:  00007fd445fd1740(0000) GS:ffff8897dfc80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000060 CR3: 0000000166a00000 CR4: 0000000000350ee0
> DR0: ffffffff8437a488 DR1: ffffffff8437a489 DR2: ffffffff8437a48a
> DR3: ffffffff8437a48b DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  nullb_device_power_store+0xd1/0x120 [null_blk]
>  configfs_write_iter+0xb4/0x120
>  vfs_write+0x2ba/0x3c0
>  ksys_write+0x5f/0xe0
>  do_syscall_64+0x3b/0x90
>  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> RIP: 0033:0x7fd4460c57a7
> Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> RSP: 002b:00007ffd3792a4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd4460c57a7
> RDX: 0000000000000002 RSI: 000055b43c02e4c0 RDI: 0000000000000001
> RBP: 000055b43c02e4c0 R08: 000000000000000a R09: 00007fd44615b4e0
> R10: 00007fd44615b3e0 R11: 0000000000000246 R12: 0000000000000002
> R13: 00007fd446198520 R14: 0000000000000002 R15: 00007fd446198700
>  </TASK>
>
> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>

Reviewed-by: Ming Lei <ming.lei@redhat.com>
diff mbox series

Patch

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index bc2c58724df3..2f017969b79f 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1949,6 +1949,11 @@  static int null_init_tag_set(struct nullb *nullb, struct blk_mq_tag_set *set)
 
 static int null_validate_conf(struct nullb_device *dev)
 {
+	if (dev->queue_mode == NULL_Q_RQ) {
+		pr_err("legacy IO path is no longer available\n");
+		return -EINVAL;
+	}
+
 	dev->blocksize = round_down(dev->blocksize, 512);
 	dev->blocksize = clamp_t(unsigned int, dev->blocksize, 512, 4096);