From patchwork Thu Jan 18 15:10:43 2018
X-Patchwork-Submitter: Florian Westphal
X-Patchwork-Id: 10173583
Date: Thu, 18 Jan 2018 16:10:43 +0100
From: Florian Westphal
To: "Nicholas A. Bellinger"
Cc: Florian Westphal, Mike Christie, target-devel, Linux-netdev, David Miller
Subject: Re: iscsi target regression due to "tcp: remove prequeue support" patch
Message-ID: <20180118151043.GA21673@breakpoint.cc>
References: <5A32128D.4050207@redhat.com> <20180115104145.GB27085@breakpoint.cc> <1516264698.24576.240.camel@haakon3.daterainc.com>
In-Reply-To: <1516264698.24576.240.camel@haakon3.daterainc.com>

Nicholas A. Bellinger wrote:
> On Mon, 2018-01-15 at 11:41 +0100, Florian Westphal wrote:
> > Mike Christie wrote:
> > >
> > > Dec 13 17:55:01 rhel73n1 kernel: Got Login Command, Flags 0x81, ITT:
> > > 0x00000000, CmdSN: 0x00000000, ExpStatSN: 0xf86dc69b, CID: 0, Length: 65
> > >
> > > we have got a login command and we seem to then go into
> > > iscsit_do_rx_data -> sock_recvmsg
> > >
> > > We seem to get stuck in there though, because we stay blocked until:
> > >
> > > Dec 13 17:55:01 rhel73n1 kernel: Entering iscsi_target_sk_data_ready:
> > > conn: ffff88b35cbb3000
> > > Dec 13 17:55:01 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1, conn:
> > > ffff88b35cbb3000 >>>>
> > >
> > > where initiator side timeout fires 15 seconds later and it disconnects
> > > the tcp connection, and we eventually break out of the recvmsg call:
[..]
> > > Dec 13 17:55:16 rhel73n1 kernel: rx_loop: 68, total_rx: 68, data: 68
> > > Dec 13 17:55:16 rhel73n1 kernel: iscsi_target_do_login_rx after
> > > rx_login_io, ffff88b35cbb3000, kworker/2:2:1829
> > >
> > > Is the iscsi target doing something incorrect in its use of
> > > sk_data_ready and sock_recvmsg or is the tcp patch at fault?
> >
> > I have not received any bug reports except this one.
> >
> > I also have a hard time following iscsi code flow.
> >
> > > Dec 13 17:55:01 rhel73n1 kernel: Starting login_timer for kworker/2:2/1829
> > > Dec 13 17:55:01 rhel73n1 kernel: rx_loop: 48, total_rx: 48, data: 48
> > > Dec 13 17:55:01 rhel73n1 kernel: Got Login Command, Flags 0x81, ITT: 0x00000000, CmdSN: 0x00000000, ExpStatSN: 0xf86dc69b, CID: 0, Length: 65
> > > Dec 13 17:55:01 rhel73n1 kernel: Entering iscsi_target_sk_data_ready: conn: ffff88b35cbb3000
> >
> > Looks like things are fine up to this point.
> >
> > > Dec 13 17:55:01 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1, conn: ffff88b35cbb3000 >>>>
> >
> > This makes things return early from sk_data_ready callback.
>
> Correct.
>
> This is existing behavior for individual iscsi_conn login delayed_work
> contexts (conn->login_work) which have not yet returned from a previous
> sock_recvmsg(..., MSG_WAITALL) blocking call.
>
> This causes the next iscsi_target_sk_data_ready() callback to hit
> LOGIN_FLAGS_READ_ACTIVE=1, and return immediately without kicking
> conn->login_work to process iscsi_target_do_login_rx() ->
> sock_recvmsg(..., MSG_WAITALL).

Who is responsible for removing the worker/sk from the wait queue?

> > > Dec 13 17:55:16 rhel73n1 kernel: Entering iscsi_target_sk_state_change
> > > Dec 13 17:55:16 rhel73n1 kernel: __iscsi_target_sk_check_close: TCP_CLOSE_WAIT|TCP_CLOSE,returning FALSE
> > > Dec 13 17:55:16 rhel73n1 kernel: __iscsi_target_sk_close_change: state: 1
> > > Dec 13 17:55:16 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1 sk_state_change conn: ffff88b35cbb3000
> > > Dec 13 17:55:16 rhel73n1 kernel: rx_loop: 68, total_rx: 68, data: 68
> >
> > So it looks like all data is there, and probably has been there all the
> > past 15 seconds, but nothing noticed.
> >
> > Why is LOGIN_FLAGS_READ_ACTIVE set? Who sets this? Who is supposed to clear that?
> > Why does it exist in first place?
>
> The bit is set in iscsi_target_sk_data_ready() when conn->login_work is
> not already blocked by sock_recvmsg(..., MSG_WAITALL).
> Once it's set,
> conn->login_work is kicked to run iscsi_target_do_login_rx() ->
> sock_recvmsg(..., MSG_WAITALL) which blocks waiting for the next 48 byte
> login request PDU + payload.
>
> Once the active conn->login_work context in iscsi_target_do_login_rx()
> returns from sock_recvmsg(..., MSG_WAITALL) with full login request PDU
> + payload bytes, the bit is cleared.
>
> AFAICT, there was a wake_up removed by commit e7942d063 that results in
> multi iscsi login PDU authentication exchanges blocking on a incoming
> login request payload.

With you so far, BUT -- Mike has lowlatency=1 set -- so all the tcp_prequeue
code paths should never be hit in the first place.

I just tried a 4.13 kernel and no tcp prequeue path is hit when the
lowlatency sysctl is set, afaics.

> It would indicate users providing their own ->sk_data_ready() callback
> must be responsible for waking up a kthread context blocked on
> sock_recvmsg(..., MSG_WAITALL), when a second ->sk_data_ready() is
> received before the first sock_recvmsg(..., MSG_WAITALL) completes.

I agree, it looks like we need something like this? (not even build tested):

diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
index b686e2ce9c0e..3723f8f419aa 100644
--- a/drivers/target/iscsi/iscsi_target_nego.c
+++ b/drivers/target/iscsi/iscsi_target_nego.c
@@ -432,6 +432,9 @@ static void iscsi_target_sk_data_ready(struct sock *sk)
 	if (test_and_set_bit(LOGIN_FLAGS_READ_ACTIVE, &conn->login_flags)) {
 		write_unlock_bh(&sk->sk_callback_lock);
 		pr_debug("Got LOGIN_FLAGS_READ_ACTIVE=1, conn: %p >>>>\n", conn);
+		if (WARN_ON(iscsi_target_sk_data_ready == conn->orig_data_ready))
+			return;
+		conn->orig_data_ready(sk);
 		return;
 	}
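
For anyone who wants to poke at the flag handling outside the kernel, here is a rough user-space model of the idea in the diff above. It is only a sketch: struct model_conn, the model_* functions and the two counters are made up for illustration and are not kernel API, and a C11 atomic_exchange stands in for test_and_set_bit. It shows just the gating: the first data_ready kicks the login work, a second one arriving while the reader is still active gets forwarded to the saved original callback instead of being dropped.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few iscsi_conn fields that matter here. */
struct model_conn {
	atomic_bool read_active;                       /* models LOGIN_FLAGS_READ_ACTIVE */
	void (*orig_data_ready)(struct model_conn *);  /* saved original callback */
	int work_scheduled;                            /* times login work was kicked */
	int reader_wakeups;                            /* times the original callback ran */
};

/* Models the socket's default data_ready, which wakes a blocked reader. */
void model_orig_data_ready(struct model_conn *conn)
{
	conn->reader_wakeups++;
}

/* Models iscsi_target_sk_data_ready() with the proposed change applied. */
void model_sk_data_ready(struct model_conn *conn)
{
	/* atomic_exchange returns the previous value, like test_and_set_bit. */
	if (atomic_exchange(&conn->read_active, true)) {
		/* Guard against recursing into ourselves, as the WARN_ON does. */
		if (conn->orig_data_ready == model_sk_data_ready)
			return;
		/* Reader already active: pass the wakeup on instead of dropping it. */
		conn->orig_data_ready(conn);
		return;
	}
	/* No reader active yet: kick the login work. */
	conn->work_scheduled++;
}

/* Models the login work returning from its blocking receive. */
void model_login_rx_done(struct model_conn *conn)
{
	atomic_store(&conn->read_active, false);
}
```

Calling model_sk_data_ready() twice in a row on a fresh conn should schedule the work once and forward one wakeup; after model_login_rx_done() the next call schedules the work again.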