From patchwork Sat Jun 3 04:19:37 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Nicholas A. Bellinger"
X-Patchwork-Id: 9763861
Message-ID: <1496463577.27407.269.camel@haakon3.risingtidesystems.com>
Subject: Re: Kernel crash with target-pending/for-next
From: "Nicholas A. Bellinger"
To: Bart Van Assche
Cc: "target-devel@vger.kernel.org"
Date: Fri, 02 Jun 2017 21:19:37 -0700
In-Reply-To: <1496421044.1214.5.camel@sandisk.com>
References: <1496341047.3075.8.camel@sandisk.com>
 <1496373556.27407.210.camel@haakon3.risingtidesystems.com>
 <1496421044.1214.5.camel@sandisk.com>
X-Mailing-List: target-devel@vger.kernel.org

On Fri, 2017-06-02 at 16:30 +0000, Bart Van Assche wrote:
> On Thu, 2017-06-01 at 20:19 -0700, Nicholas A. Bellinger wrote:
> > Here's the updated version to restore original behavior for se_node_acl
> > delete, but still avoid the endless loop with the iscsi-target specific
> > case where se_node_acl->queue_depth changes.
> >
> > Care to verify on ib_srpt, or just a report and never confirm..?
>
> Hello Nic,
>
> This is what I ran into with commit 4f61e1e687c4 ("target: Avoid
> target_shutdown_sessions loop during queue_depth change") merged with kernel
> v4.12-rc3. This is a crash I had never seen before. This crash disappears if
> I revert commit 4f61e1e687c4 so I think this indicates a bug introduced by
> that commit:
>

Well, commit 4f61e1e687c4 does not change the original behavior to drain
the list of active se_node_acl sessions:

That is, it's doing the same thing as before in
target_shutdown_sessions() walking se_node_acl->acl_sess_list, invoking
->close_session(), and immediately restarting the list walk after each
one.

How can this mean srpt..?

> ib_srpt:srpt_close_ch: ib_srpt 0x0000000000000000e41d2d03000a6d51-1114: queued zerolength write
> ib_srpt:srpt_release_channel_work: ib_srpt srpt_release_channel_work: 0x0000000000000000e41d2d03000a6d51-1114; release_done = (null)
> ------------[ cut here ]------------
> kernel BUG at drivers/infiniband/ulp/srpt/ib_srpt.c:2770!

Btw, looking at v4.12-rc3 there is not a BUG_ON() at line 2770.
Perhaps BUG_ON(ch->release_done) at line 2719, which could indicate
srpt_close_session() is being called twice...

But if it is, then why isn't the srpt_close_session() pr_debug shown
anywhere in your output..?

Can I have a look at the full debug with the missing
srpt_close_session() messages to see if it's being called twice for the
same se_session, and the code changes against v4.12-rc3 you're testing
with that account for the ~50 line offset..?

---
diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index 3691373..1b2b60e 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -336,14 +336,14 @@ struct se_node_acl *core_tpg_add_initiator_node_acl(
 	return acl;
 }
 
-static void target_shutdown_sessions(struct se_node_acl *acl)
+static void target_shutdown_sessions(struct se_node_acl *acl, bool do_restart)
 {
-	struct se_session *sess;
+	struct se_session *sess, *sess_tmp;
 	unsigned long flags;
 
 restart:
 	spin_lock_irqsave(&acl->nacl_sess_lock, flags);
-	list_for_each_entry(sess, &acl->acl_sess_list, sess_acl_list) {
+	list_for_each_entry_safe(sess, sess_tmp, &acl->acl_sess_list, sess_acl_list) {
 		if (sess->sess_tearing_down)
 			continue;
 
@@ -352,7 +352,11 @@ static void target_shutdown_sessions(struct se_node_acl *acl)
 
 		if (acl->se_tpg->se_tpg_tfo->close_session)
 			acl->se_tpg->se_tpg_tfo->close_session(sess);
-		goto restart;
+
+		if (do_restart)
+			goto restart;
+
+		spin_lock_irqsave(&acl->nacl_sess_lock, flags);
 	}
 	spin_unlock_irqrestore(&acl->nacl_sess_lock, flags);
 }
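
For reference, the drain behavior described in the reply above (walk
acl_sess_list, unlink one session, drop the lock, call ->close_session(),
then restart the walk) can be sketched in plain userspace C. This is only
an illustration under stated assumptions, not the kernel code: every
demo_* name is hypothetical, a boolean flag stands in for nacl_sess_lock so
the sketch stays single-threaded, and a hand-rolled circular list stands in
for the kernel's list_head helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct se_session; not the kernel definition. */
struct demo_session {
	struct demo_session *prev, *next;	/* links in the ACL's session list */
	bool tearing_down;			/* mirrors sess_tearing_down */
	int close_calls;			/* how often the close hook ran */
};

/* Hypothetical stand-in for struct se_node_acl. */
struct demo_acl {
	struct demo_session head;		/* circular list sentinel */
	bool locked;				/* plays the role of nacl_sess_lock */
};

/* Flag-based "lock" that only checks lock discipline via assert(). */
static void demo_lock(struct demo_acl *acl)
{
	assert(!acl->locked);
	acl->locked = true;
}

static void demo_unlock(struct demo_acl *acl)
{
	assert(acl->locked);
	acl->locked = false;
}

static void demo_acl_init(struct demo_acl *acl)
{
	acl->head.prev = acl->head.next = &acl->head;
	acl->locked = false;
}

/* Append a session at the tail of the ACL's list. */
static void demo_acl_add(struct demo_acl *acl, struct demo_session *sess)
{
	demo_lock(acl);
	sess->prev = acl->head.prev;
	sess->next = &acl->head;
	acl->head.prev->next = sess;
	acl->head.prev = sess;
	demo_unlock(acl);
}

/* Stand-in for the fabric ->close_session() callback; runs unlocked. */
static void demo_close_session(struct demo_acl *acl, struct demo_session *sess)
{
	assert(!acl->locked);
	sess->close_calls++;
}

/* Drain the list: unlink one entry, unlock, close it, rescan from the top. */
static int demo_shutdown_sessions(struct demo_acl *acl)
{
	struct demo_session *sess;
	int closed = 0;

restart:
	demo_lock(acl);
	for (sess = acl->head.next; sess != &acl->head; sess = sess->next) {
		if (sess->tearing_down)
			continue;

		/* Unlink and self-link the entry, similar to list_del_init(). */
		sess->prev->next = sess->next;
		sess->next->prev = sess->prev;
		sess->prev = sess->next = sess;
		demo_unlock(acl);

		demo_close_session(acl, sess);
		closed++;
		/* The list may have changed while unlocked, so start over. */
		goto restart;
	}
	demo_unlock(acl);
	return closed;
}
```

Because each closed entry is unlinked before the lock is dropped, every
restart sees a shorter list and the walk ends once only sessions already
marked as tearing down remain. Calling demo_shutdown_sessions() a second
time on the same ACL closes nothing further.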