From patchwork Tue Dec 19 12:20:50 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Honggang LI X-Patchwork-Id: 10123147 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 64386602CB for ; Tue, 19 Dec 2017 12:20:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4C6E028B4D for ; Tue, 19 Dec 2017 12:20:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4120B28BD9; Tue, 19 Dec 2017 12:20:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C1B9928B4D for ; Tue, 19 Dec 2017 12:20:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965956AbdLSMUy (ORCPT ); Tue, 19 Dec 2017 07:20:54 -0500 Received: from mx1.redhat.com ([209.132.183.28]:49562 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965944AbdLSMUy (ORCPT ); Tue, 19 Dec 2017 07:20:54 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1B8985FD4E; Tue, 19 Dec 2017 12:20:54 +0000 (UTC) Received: from localhost (unknown [10.66.128.195]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 80CA160C20; Tue, 19 Dec 2017 12:20:53 +0000 (UTC) Date: Tue, 19 Dec 2017 20:20:50 +0800 From: Honggang LI To: Jason Gunthorpe Cc: Bart Van Assche , "linux-rdma@vger.kernel.org" Subject: Re: [PATCH rdma-core] srp_daemon: Install signal handler for ibsrpdm Message-ID: <20171219122050.GA19682@dhcp-13-42.nay.redhat.com> References: <20171214110241.4701-1-honli@redhat.com> <1513263572.2986.2.camel@wdc.com> <20171215013628.GA743@dhcp-13-42.nay.redhat.com> <20171215172800.GA12434@ziepe.ca> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20171215172800.GA12434@ziepe.ca> User-Agent: Mutt/1.5.21 (2010-09-15) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Tue, 19 Dec 2017 12:20:54 +0000 (UTC) Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Fri, Dec 15, 2017 at 10:28:00AM -0700, Jason Gunthorpe wrote: > On Fri, Dec 15, 2017 at 09:36:28AM +0800, Honggang LI wrote: > > On Thu, Dec 14, 2017 at 02:59:34PM +0000, Bart Van Assche wrote: > > > On Thu, 2017-12-14 at 19:02 +0800, Honggang LI wrote: > > > > fd3005f0cd34 moves the signal handler setup from ibsrpdm path. So, > > > > default signal handler will be used when it receives signal SIGINT. > > > > ibsrpdm will exit with non-zero exit code as default signal handler > > > > killed it. > > > > > > Can you explain why you think that ibsrpdm needs a signal handler? > > > > main > > ibsrpdm > > alloc_res > > pthread_create(&res->res.async_ev_thread [1] > > .... > > free_res > > if (res->async_ev_thread) { pthread_kill(res->async_ev_thread, SIGINT); [2] > > > > > > The ibsrpdm program create a thread in [1], and send a SIGINT in [2]. > > The default behavior of SIGINT is to terminate the process. > > Yuk, no, using signals like this is horrifyingly buggy. OK, I agree with you. The "signal_handler", which calls "wake_up_main_loop", is not the right solution, because wakeup_pipe never used by ibsrpdm. void wake_up_main_loop(char ch) { int res; assert(wakeup_pipe[1] >= 0); // this 'assert' will be failed > Here is a sketch on how to fix it properly. All the users of > pthread_kill should be eliminated. > > Though overall, there is really no reason to even cleanup the threads, > just call exit? No, if one pthread just calls 'exit', the entire process will be terminated immediately. So, we need to cleanup the threads. I think the source of current issue is the async_ev_thread pthread. We should *NOT* create such pthread for ibsrpdm. I checked the old srptools git repo. git://git.openfabrics.org/~bvanassche/srptools.git Commit ab57a5b92eb3b8c9221f77235a028814a462d2cb merges "ibsrpdm" into "srp_daemon". The old ibsrpdm program is a single thread program. srp_daemon is multi-thread program. As ibsrpdm always only run once, because "config->once" has been assigned to 1. I would suggest to apply this patch. Reviewed-by: Bart Van Assche --- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/srp_daemon/srp_daemon.c b/srp_daemon/srp_daemon.c index cec36db2..4012a7db 100644 --- a/srp_daemon/srp_daemon.c +++ b/srp_daemon/srp_daemon.c @@ -1945,12 +1945,12 @@ static struct resources *alloc_res(void) run_thread_get_trap_notices, &res->res); if (ret) goto err; - } - ret = pthread_create(&res->res.async_ev_thread, NULL, - run_thread_listen_to_events, &res->res); - if (ret) - goto err; + ret = pthread_create(&res->res.async_ev_thread, NULL, + run_thread_listen_to_events, &res->res); + if (ret) + goto err; + } if (config->retry_timeout && !config->once) { ret = pthread_create(&res->res.reconnect_thread, NULL,