From patchwork Sat Jun 3 12:49:32 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Nicholas A. Bellinger"
X-Patchwork-Id: 9764109
Message-ID: <1496494172.27407.325.camel@haakon3.risingtidesystems.com>
Subject: Re: iSCSI target driver regression
From: "Nicholas A. Bellinger"
To: Bart Van Assche
Cc: "target-devel@vger.kernel.org"
Date: Sat, 03 Jun 2017 05:49:32 -0700
In-Reply-To: <1496466062.27407.276.camel@haakon3.risingtidesystems.com>
References: <1496427274.1214.16.camel@sandisk.com>
 <1496466062.27407.276.camel@haakon3.risingtidesystems.com>
X-Mailing-List: target-devel@vger.kernel.org

On Fri, 2017-06-02 at 22:01 -0700, Nicholas A. Bellinger wrote:
> On Fri, 2017-06-02 at 18:14 +0000, Bart Van Assche wrote:
> > Hello Nic,
> >
> > When I reran the libiscsi test suite against your for-next branch a kernel oops
> > appeared in the system log that I hadn't seen before. There are no iSCSI patches
> > from me on that branch so this crash was likely introduced by one of the iSCSI
> > target driver patches that were added to your for-next branch after kernel v4.11
> > was released. The topmost commit in the kernel tree that triggered this oops is
> > commit acdd4716bc86 ("target: reject COMPARE_AND_WRITE if emulate_caw is not set").
> >
>
> Yep, nothing immediate comes to mind in the explicit logout path that
> has changed recently.
>
> > [ 321.546438] iscsi_target_mod:lio_release_cmd: Entering lio_release_cmd for se_cmd: ffff880063134890
> > [ 323.013563] 1 connection(s) still exist for iSCSI session to iqn.2007-10.com.github:sahlberg:libiscsi:iscsi-test-2
> > [ 323.014358] ------------[ cut here ]------------
> > [ 323.014864] kernel BUG at drivers/target/iscsi/iscsi_target.c:4346!

It turns out the only way to trigger this is to block tx thread context
from reaching iscsit_logout_post_handler_closesession() for longer than
SECONDS_FOR_LOGOUT_COMP (15 seconds), which causes sleep = 0 to be
processed while iscsit_close_connection() from rx thread context has
already cleared conn->tx_thread_active.

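For anyone following along, the ordering problem can be sketched in user
space. This is only an illustration under stated assumptions, not the
kernel code: struct fake_conn, fake_cmpxchg() and
logout_handler_proceeds() are made-up names, and C11 atomics stand in
for the kernel's cmpxchg(). It shows that cmpxchg() hands back the old
flag value, so a handler that arrives after the flag was already cleared
sees sleep == 0 and should back off.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the connection's tx_thread_active flag. */
struct fake_conn {
	atomic_bool tx_thread_active;
};

/*
 * User-space approximation of cmpxchg(ptr, old, newval): store newval
 * only if *ptr == old, and return the value that was there beforehand.
 */
static bool fake_cmpxchg(atomic_bool *ptr, bool old, bool newval)
{
	bool expected = old;

	/* On failure, 'expected' is overwritten with the current value. */
	atomic_compare_exchange_strong(ptr, &expected, newval);
	return expected;
}

/*
 * Models the decision made in the logout post handler: proceed only if
 * this caller was the one that flipped tx_thread_active to false.
 */
static bool logout_handler_proceeds(struct fake_conn *conn)
{
	bool sleep = fake_cmpxchg(&conn->tx_thread_active, true, false);

	if (!sleep)
		return false;	/* someone else already cleared the flag */
	return true;
}

/*
 * Runs one scenario. If rx_closed_first is true, the flag is cleared
 * before the handler runs, as happens when the handler is delayed past
 * the logout timeout.
 */
static bool simulate_logout(bool rx_closed_first)
{
	struct fake_conn conn;

	atomic_init(&conn.tx_thread_active, true);
	if (rx_closed_first)
		atomic_store(&conn.tx_thread_active, false);
	return logout_handler_proceeds(&conn);
}
```

Calling simulate_logout(false) models the normal case where the handler
gets there in time and proceeds; simulate_logout(true) models the
delayed case, where the handler returns early and leaves the cleanup to
the close-connection path.
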
As it was, I have no idea what you did in your VM to cause a 15+ second
delay to reach iscsit_logout_post_handler_closesession(), but whatever
it was it's certainly not a regression in upstream. ;)

In any event, here's how to simulate the issue, and the proper fix to
just let existing iscsit_close_connection() logic clean up the failed
logout as if iscsit_logout_post_handler_closesession() was never
reached.

Enjoy.

---
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 0d8f815..03a0224 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4414,6 +4414,11 @@ static void iscsit_logout_post_handler_closesession(
 {
 	struct iscsi_session *sess = conn->sess;
 	int sleep = 1;
+
+	printk("Simulating broken out-of-tree codebase.\n");
+	ssleep(SECONDS_FOR_LOGOUT_COMP + 2);
+	printk("Simulation complete\n");
+
 	/*
 	 * Traditional iscsi/tcp will invoke this logic from TX thread
 	 * context during session logout, so clear tx_thread_active and
@@ -4423,8 +4428,11 @@ static void iscsit_logout_post_handler_closesession(
 	 * always sleep waiting for RX/TX thread shutdown to complete
 	 * within iscsit_close_connection().
 	 */
-	if (!conn->conn_transport->rdma_shutdown)
+	if (!conn->conn_transport->rdma_shutdown) {
 		sleep = cmpxchg(&conn->tx_thread_active, true, false);
+		if (!sleep)
+			return;
+	}
 
 	atomic_set(&conn->conn_logout_remove, 0);
 	complete(&conn->conn_logout_comp);
@@ -4440,8 +4448,11 @@ static void iscsit_logout_post_handler_samecid(
 {
 	int sleep = 1;
 
-	if (!conn->conn_transport->rdma_shutdown)
+	if (!conn->conn_transport->rdma_shutdown) {
 		sleep = cmpxchg(&conn->tx_thread_active, true, false);
+		if (!sleep)
+			return;
+	}
 
 	atomic_set(&conn->conn_logout_remove, 0);
 	complete(&conn->conn_logout_comp);