diff mbox series

[v3,1/2] nbd: fix use-after-freed crash for nbd->recv_workq

Message ID 20201103065156.342756-2-xiubli@redhat.com (mailing list archive)
State New, archived
Headers show
Series [v3,1/2] nbd: fix use-after-freed crash for nbd->recv_workq | expand

Commit Message

Xiubo Li Nov. 3, 2020, 6:51 a.m. UTC
From: Xiubo Li <xiubli@redhat.com>

The crash call trace:

<6>[ 1012.319386] block nbd1: NBD_DISCONNECT
<1>[ 1012.319437] BUG: kernel NULL pointer dereference, address: 0000000000000020
<1>[ 1012.319439] #PF: supervisor write access in kernel mode
<1>[ 1012.319441] #PF: error_code(0x0002) - not-present page
<6>[ 1012.319442] PGD 0 P4D 0
<4>[ 1012.319448] Oops: 0002 [#1] SMP NOPTI
<4>[ 1012.319454] CPU: 9 PID: 25111 Comm: rbd-nbd Tainted: G            E     5.9.0+ #6
   [...]
<4>[ 1012.319505] PKRU: 55555554
<4>[ 1012.319506] Call Trace:
<4>[ 1012.319560]  flush_workqueue+0x81/0x440
<4>[ 1012.319598]  nbd_disconnect_and_put+0x50/0x70 [nbd]
<4>[ 1012.319607]  nbd_genl_disconnect+0xc7/0x170 [nbd]
<4>[ 1012.319627]  genl_rcv_msg+0x1dd/0x2f9
<4>[ 1012.319642]  ? genl_start+0x140/0x140
<4>[ 1012.319644]  netlink_rcv_skb+0x49/0x110
<4>[ 1012.319649]  genl_rcv+0x24/0x40
<4>[ 1012.319651]  netlink_unicast+0x1a5/0x280
<4>[ 1012.319653]  netlink_sendmsg+0x23d/0x470
<4>[ 1012.319667]  sock_sendmsg+0x5b/0x60
<4>[ 1012.319676]  ____sys_sendmsg+0x1ef/0x260
<4>[ 1012.319679]  ? copy_msghdr_from_user+0x5c/0x90
<4>[ 1012.319680]  ? ____sys_recvmsg+0xa5/0x180
<4>[ 1012.319682]  ___sys_sendmsg+0x7c/0xc0
<4>[ 1012.319683]  ? copy_msghdr_from_user+0x5c/0x90
<4>[ 1012.319685]  ? ___sys_recvmsg+0x89/0xc0
<4>[ 1012.319692]  ? __wake_up_common_lock+0x87/0xc0
<4>[ 1012.319715]  ? __check_object_size+0x46/0x173
<4>[ 1012.319727]  ? _copy_to_user+0x22/0x30
<4>[ 1012.319729]  ? move_addr_to_user+0xc3/0x100
<4>[ 1012.319731]  __sys_sendmsg+0x57/0xa0
<4>[ 1012.319744]  do_syscall_64+0x33/0x40
<4>[ 1012.319760]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4>[ 1012.319780] RIP: 0033:0x7f5baa8e3ad5

In case the reference of nbd->config has reached zero and the
related resource has been released, if another process tries to
send the DISCONENCT cmd by using the netlink, it will potentially
crash like this.

Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 drivers/block/nbd.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

Comments

Josef Bacik Nov. 3, 2020, 5 p.m. UTC | #1
On 11/3/20 1:51 AM, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> The crash call trace:
> 
> <6>[ 1012.319386] block nbd1: NBD_DISCONNECT
> <1>[ 1012.319437] BUG: kernel NULL pointer dereference, address: 0000000000000020
> <1>[ 1012.319439] #PF: supervisor write access in kernel mode
> <1>[ 1012.319441] #PF: error_code(0x0002) - not-present page
> <6>[ 1012.319442] PGD 0 P4D 0
> <4>[ 1012.319448] Oops: 0002 [#1] SMP NOPTI
> <4>[ 1012.319454] CPU: 9 PID: 25111 Comm: rbd-nbd Tainted: G            E     5.9.0+ #6
>     [...]
> <4>[ 1012.319505] PKRU: 55555554
> <4>[ 1012.319506] Call Trace:
> <4>[ 1012.319560]  flush_workqueue+0x81/0x440
> <4>[ 1012.319598]  nbd_disconnect_and_put+0x50/0x70 [nbd]
> <4>[ 1012.319607]  nbd_genl_disconnect+0xc7/0x170 [nbd]
> <4>[ 1012.319627]  genl_rcv_msg+0x1dd/0x2f9
> <4>[ 1012.319642]  ? genl_start+0x140/0x140
> <4>[ 1012.319644]  netlink_rcv_skb+0x49/0x110
> <4>[ 1012.319649]  genl_rcv+0x24/0x40
> <4>[ 1012.319651]  netlink_unicast+0x1a5/0x280
> <4>[ 1012.319653]  netlink_sendmsg+0x23d/0x470
> <4>[ 1012.319667]  sock_sendmsg+0x5b/0x60
> <4>[ 1012.319676]  ____sys_sendmsg+0x1ef/0x260
> <4>[ 1012.319679]  ? copy_msghdr_from_user+0x5c/0x90
> <4>[ 1012.319680]  ? ____sys_recvmsg+0xa5/0x180
> <4>[ 1012.319682]  ___sys_sendmsg+0x7c/0xc0
> <4>[ 1012.319683]  ? copy_msghdr_from_user+0x5c/0x90
> <4>[ 1012.319685]  ? ___sys_recvmsg+0x89/0xc0
> <4>[ 1012.319692]  ? __wake_up_common_lock+0x87/0xc0
> <4>[ 1012.319715]  ? __check_object_size+0x46/0x173
> <4>[ 1012.319727]  ? _copy_to_user+0x22/0x30
> <4>[ 1012.319729]  ? move_addr_to_user+0xc3/0x100
> <4>[ 1012.319731]  __sys_sendmsg+0x57/0xa0
> <4>[ 1012.319744]  do_syscall_64+0x33/0x40
> <4>[ 1012.319760]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> <4>[ 1012.319780] RIP: 0033:0x7f5baa8e3ad5
> 
> In case the reference of nbd->config has reached zero and the
> related resource has been released, if another process tries to
> send the DISCONENCT cmd by using the netlink, it will potentially
> crash like this.
> 
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Xiubo Li <xiubli@redhat.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef
diff mbox series

Patch

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index f46e26c9d9b3..e4c37677f3df 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -2003,16 +2003,31 @@  static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 
 static void nbd_disconnect_and_put(struct nbd_device *nbd)
 {
+	bool flush = true;
+
 	mutex_lock(&nbd->config_lock);
 	nbd_disconnect(nbd);
 	nbd_clear_sock(nbd);
-	mutex_unlock(&nbd->config_lock);
 	/*
-	 * Make sure recv thread has finished, so it does not drop the last
-	 * config ref and try to destroy the workqueue from inside the work
-	 * queue.
+	 * In very rare case that the nbd->recv_workq may already have been
+	 * released by the recv_work().
 	 */
-	flush_workqueue(nbd->recv_workq);
+	if (nbd->recv_workq)
+		refcount_inc(&nbd->config_refs);
+	else
+		flush = false;
+	mutex_unlock(&nbd->config_lock);
+
+	if (flush) {
+		/*
+		 * Make sure recv thread has finished, so it does not drop
+		 * the last config ref and try to destroy the workqueue
+		 * from inside the work queue.
+		 */
+		flush_workqueue(nbd->recv_workq);
+		nbd_config_put(nbd);
+	}
+
 	if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
 			       &nbd->config->runtime_flags))
 		nbd_config_put(nbd);