Message ID | 20210628074834.161640-1-desmondcheongzx@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Bluetooth: fix inconsistent lock state in sco | expand |
This is automated email and please do not reply to this email! Dear submitter, Thank you for submitting the patches to the linux bluetooth mailing list. This is a CI test results with your patch series: PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=507835 ---Test result--- Test Summary: CheckPatch FAIL 0.77 seconds GitLint PASS 0.11 seconds BuildKernel PASS 467.91 seconds TestRunner: Setup PASS 300.86 seconds TestRunner: l2cap-tester PASS 2.45 seconds TestRunner: bnep-tester PASS 1.76 seconds TestRunner: mgmt-tester FAIL 29.44 seconds TestRunner: rfcomm-tester PASS 1.91 seconds TestRunner: sco-tester PASS 1.88 seconds TestRunner: smp-tester PASS 1.98 seconds TestRunner: userchan-tester PASS 1.83 seconds Details ############################## Test: CheckPatch - FAIL - 0.77 seconds Run checkpatch.pl script with rule in .checkpatch.conf Bluetooth: fix inconsistent lock state in sco WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line) #9: https://syzkaller.appspot.com/bug?id=9089d89de0502e120f234ca0fc8a703f7368b31e total: 0 errors, 1 warnings, 0 checks, 59 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. "[PATCH] Bluetooth: fix inconsistent lock state in sco" has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. ############################## Test: GitLint - PASS - 0.11 seconds Run gitlint with rule in .gitlint ############################## Test: BuildKernel - PASS - 467.91 seconds Build Kernel with minimal configuration supports Bluetooth ############################## Test: TestRunner: Setup - PASS - 300.86 seconds Setup environment for running Test Runner ############################## Test: TestRunner: l2cap-tester - PASS - 2.45 seconds Run test-runner with l2cap-tester Total: 40, Passed: 40 (100.0%), Failed: 0, Not Run: 0 ############################## Test: TestRunner: bnep-tester - PASS - 1.76 seconds Run test-runner with bnep-tester Total: 1, Passed: 1 (100.0%), Failed: 0, Not Run: 0 ############################## Test: TestRunner: mgmt-tester - FAIL - 29.44 seconds Run test-runner with mgmt-tester Total: 446, Passed: 436 (97.8%), Failed: 5, Not Run: 5 Failed Test Cases Read Ext Controller Info 1 Failed 0.019 seconds Read Ext Controller Info 2 Failed 0.021 seconds Read Ext Controller Info 3 Failed 0.019 seconds Read Ext Controller Info 4 Failed 0.011 seconds Read Ext Controller Info 5 Failed 0.017 seconds ############################## Test: TestRunner: rfcomm-tester - PASS - 1.91 seconds Run test-runner with rfcomm-tester Total: 9, Passed: 9 (100.0%), Failed: 0, Not Run: 0 ############################## Test: TestRunner: sco-tester - PASS - 1.88 seconds Run test-runner with sco-tester Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0 ############################## Test: TestRunner: smp-tester - PASS - 1.98 seconds Run test-runner with smp-tester Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0 ############################## Test: TestRunner: userchan-tester - PASS - 1.83 seconds Run test-runner with userchan-tester Total: 3, Passed: 3 (100.0%), Failed: 0, Not Run: 0 --- Regards, Linux Bluetooth
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index 3bd41563f118..d05629d7cc55 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -173,10 +173,22 @@ static void sco_conn_del(struct hci_conn *hcon, int err) if (sk) { sock_hold(sk); - bh_lock_sock(sk); + + spin_lock_bh(&sk->sk_lock.slock); sco_sock_clear_timer(sk); - sco_chan_del(sk, err); - bh_unlock_sock(sk); + sco_pi(sk)->conn = NULL; + if (conn->hcon) + hci_conn_drop(conn->hcon); + sk->sk_state = BT_CLOSED; + sk->sk_err = err; + sk->sk_state_change(sk); + sock_set_flag(sk, SOCK_ZAPPED); + spin_unlock_bh(&sk->sk_lock.slock); + + sco_conn_lock(conn); + conn->sk = NULL; + sco_conn_unlock(conn); + sco_sock_kill(sk); sock_put(sk); } @@ -1084,10 +1096,10 @@ static void sco_conn_ready(struct sco_conn *conn) if (sk) { sco_sock_clear_timer(sk); - bh_lock_sock(sk); + spin_lock_bh(&sk->sk_lock.slock); sk->sk_state = BT_CONNECTED; sk->sk_state_change(sk); - bh_unlock_sock(sk); + spin_unlock_bh(&sk->sk_lock.slock); } else { sco_conn_lock(conn); @@ -1102,12 +1114,12 @@ static void sco_conn_ready(struct sco_conn *conn) return; } - bh_lock_sock(parent); + spin_lock_bh(&parent->sk_lock.slock); sk = sco_sock_alloc(sock_net(parent), NULL, BTPROTO_SCO, GFP_ATOMIC, 0); if (!sk) { - bh_unlock_sock(parent); + spin_unlock_bh(&parent->sk_lock.slock); sco_conn_unlock(conn); return; } @@ -1128,7 +1140,7 @@ static void sco_conn_ready(struct sco_conn *conn) /* Wake up parent */ parent->sk_data_ready(parent); - bh_unlock_sock(parent); + spin_unlock_bh(&parent->sk_lock.slock); sco_conn_unlock(conn); }
Syzbot reported an inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} lock usage in sco_conn_del and sco_sock_timeout that could lead to deadlocks: https://syzkaller.appspot.com/bug?id=9089d89de0502e120f234ca0fc8a703f7368b31e This inconsistent lock state can also happen between sco_conn_ready and sco_sock_timeout. The issue is that these three functions take a spin lock on the socket, but sco_sock_timeout is called from an IRQ context. Since bh_lock_sock calls spin_lock but does not disable softirqs, this could lead to deadlocks: CPU0 ---- lock(slock-AF_BLUETOOTH-BTPROTO_SCO); <Interrupt> lock(slock-AF_BLUETOOTH-BTPROTO_SCO); *** DEADLOCK *** We fix this by replacing bh_lock_sock with spin_lock_bh in sco_conn_del and sco_conn_ready. Additionally, to avoid regressions, we pull the clean-up code out from sco_chan_del and use it directly in sco_conn_del. This is necessary because sco_chan_del makes a call to sco_conn_lock which takes an SOFTIRQ-unsafe lock. This means that calling sco_chan_del while holding the socket lock would result in a SOFTIRQ-safe -> SOFTIRQ-unsafe lock hierarchy between slock-AF_BLUETOOTH-BTPROTO_SCO and &conn->lock#2. This could lead to a deadlock as well: CPU0 CPU1 ---- ---- lock(&conn->lock#2); local_irq_disable(); lock(slock-AF_BLUETOOTH-BTPROTO_SCO); lock(&conn->lock#2); <Interrupt> lock(slock-AF_BLUETOOTH-BTPROTO_SCO); *** DEADLOCK *** Pulling out the code from sco_chan_del allows us to avoid this lock dependency by holding the two locks for only their required critical sections. Reported-and-tested-by: syzbot+2f6d7c28bb4bf7e82060@syzkaller.appspotmail.com Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> --- net/bluetooth/sco.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-)