From patchwork Thu Feb 27 21:17:52 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 11410573
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F27A138D
	for ; Thu, 27 Feb 2020 21:41:32 +0000 (UTC)
Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 864A224690
	for ; Thu, 27 Feb 2020 21:41:32 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 864A224690
Authentication-Results: mail.kernel.org;
	dmarc=none (p=none dis=none) header.from=infradead.org
Authentication-Results: mail.kernel.org;
	spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org
Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id DFC3834A9B8;
	Thu, 27 Feb 2020 13:33:44 -0800 (PST)
X-Original-To: lustre-devel@lists.lustre.org
Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com
Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 285143489F6
	for ; Thu, 27 Feb 2020 13:21:27 -0800 (PST)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 5F909A157;
	Thu, 27 Feb 2020 16:18:20 -0500 (EST)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 5E15B46C; Thu, 27 Feb 2020 16:18:20 -0500 (EST)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Thu, 27 Feb 2020 16:17:52 -0500
Message-Id: <1582838290-17243-605-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
References: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 604/622] lnet: fix small race in unloading
 klnd modules.
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Mr NeilBrown

Reference counting of klnd modules is handled by the module itself.
Currently, it is possible for a module to be completely unloaded
between the time when the module called module_put(), and when it
subsequently returns from the function that makes that call.
During this time there may be one or two instructions to execute, and
if the module is unmapped before they are executed, an exception will
result.

The module unload will call lnet_unregister_lnd() which takes
the_lnet.ln_lnd_mutex, so module unload cannot complete while that is
held. lnd_startup is called with this mutex held to avoid any races,
but lnd_shutdown is not. Adding that protection will close the race.

WC-bug-id: https://jira.whamcloud.com/browse/LU-12678
Lustre-commit: c087091cd901 ("LU-12678 lnet: fix small race in unloading klnd modules.")
Signed-off-by: Mr NeilBrown
Reviewed-on: https://review.whamcloud.com/36853
Reviewed-by: Serguei Smirnov
Reviewed-by: James Simmons
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 net/lnet/lnet/api-ni.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index 0ca8bef..5df39aa 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -1983,7 +1983,14 @@ static void lnet_push_target_fini(void)
 		islo = ni->ni_net->net_lnd->lnd_type == LOLND;
 
 		LASSERT(!in_interrupt());
+		/* Holding the mutex makes it safe for lnd_shutdown
+		 * to call module_put(). Module unload cannot finish
+		 * until lnet_unregister_lnd() completes, and that
+		 * requires the mutex.
+		 */
+		mutex_lock(&the_lnet.ln_lnd_mutex);
 		net->net_lnd->lnd_shutdown(ni);
+		mutex_unlock(&the_lnet.ln_lnd_mutex);
 
 		if (!islo)
 			CDEBUG(D_LNI, "Removed LNI %s\n",