From patchwork Wed Sep 26 02:48:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 10615141
X-Original-To: lustre-devel@lists.lustre.org
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Date: Tue, 25 Sep 2018 22:48:06 -0400
Message-Id: <1537930097-11624-15-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1537930097-11624-1-git-send-email-jsimmons@infradead.org>
References: <1537930097-11624-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 14/25] lustre: lnet: safe access to msg
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
Cc: Amir Shehata, Lustre Development List
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"
X-Virus-Scanned: ClamAV using ClamSMTP

From: Amir Shehata

When tx credits are returned, any pending messages need to be sent.
Messages can have different tx_cpts, so the correct one needs to be
locked. After lnet_post_send_locked(), if we locked a different CPT,
then we need to relock the correct one.

However, as part of lnet_post_send_locked(), lnet_finalize() can be
called, which can free the message. Therefore, the cpt of the message
being passed must be cached before the call in order to prevent access
to freed memory.

Signed-off-by: Amir Shehata
WC-bug-id: https://jira.whamcloud.com/browse/LU-9817
Reviewed-on: https://review.whamcloud.com/28308
Reviewed-by: Olaf Weber
Reviewed-by: Sonia Sharma
Reviewed-by: Dmitry Eremin
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 drivers/staging/lustre/lnet/lnet/lib-move.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 4d74421..e8c0216 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -847,6 +847,8 @@
 
 		txpeer->lpni_txcredits++;
 		if (txpeer->lpni_txcredits <= 0) {
+			int msg2_cpt;
+
 			msg2 = list_entry(txpeer->lpni_txq.next,
 					  struct lnet_msg, msg_list);
 			list_del(&msg2->msg_list);
@@ -855,13 +857,26 @@
 			LASSERT(msg2->msg_txpeer == txpeer);
 			LASSERT(msg2->msg_tx_delayed);
 
-			if (msg2->msg_tx_cpt != msg->msg_tx_cpt) {
+			msg2_cpt = msg2->msg_tx_cpt;
+
+			/*
+			 * The msg_cpt can be different from the msg2_cpt
+			 * so we need to make sure we lock the correct cpt
+			 * for msg2.
+			 * Once we call lnet_post_send_locked() it is no
+			 * longer safe to access msg2, since it could've
+			 * been freed by lnet_finalize(), but we still
+			 * need to relock the correct cpt, so we cache the
+			 * msg2_cpt for the purpose of the check that
+			 * follows the call to lnet_post_send_locked().
+			 */
+			if (msg2_cpt != msg->msg_tx_cpt) {
 				lnet_net_unlock(msg->msg_tx_cpt);
-				lnet_net_lock(msg2->msg_tx_cpt);
+				lnet_net_lock(msg2_cpt);
 			}
 			(void)lnet_post_send_locked(msg2, 1);
-			if (msg2->msg_tx_cpt != msg->msg_tx_cpt) {
-				lnet_net_unlock(msg2->msg_tx_cpt);
+			if (msg2_cpt != msg->msg_tx_cpt) {
+				lnet_net_unlock(msg2_cpt);
 				lnet_net_lock(msg->msg_tx_cpt);
 			}
 		} else {
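
For readers outside the LNet code, the pattern in this patch (cache a field
before calling a function that may free its owner, then use only the cached
copy to restore the locks) can be sketched in standalone user-space C. This
is an illustrative sketch, not the kernel code: struct demo_msg, cpt_lock(),
cpt_unlock(), demo_post_send(), demo_send_pending() and demo_run() are all
hypothetical names, the "locks" are plain flags rather than real LNet net
locks, and demo_post_send() is assumed to always free the message so that
the worst case is exercised.

```c
#include <assert.h>
#include <stdlib.h>

#define NCPTS 4

/* Hypothetical stand-in for per-CPT net locks: 1 = held, 0 = free. */
static int cpt_locked[NCPTS];

static void cpt_lock(int cpt)
{
	assert(!cpt_locked[cpt]);
	cpt_locked[cpt] = 1;
}

static void cpt_unlock(int cpt)
{
	assert(cpt_locked[cpt]);
	cpt_locked[cpt] = 0;
}

/* Hypothetical message type; only the field the sketch needs. */
struct demo_msg {
	int tx_cpt;
};

/*
 * Stand-in for a send routine that may finalize the message.
 * Here it always frees it, so the caller must not touch m afterwards.
 */
static void demo_post_send(struct demo_msg *m)
{
	assert(cpt_locked[m->tx_cpt]);
	free(m);
}

/* Called with cur_cpt locked; returns with cur_cpt locked. */
static void demo_send_pending(int cur_cpt, struct demo_msg *pending)
{
	/* Cache the CPT while pending is still known to be valid. */
	int pending_cpt = pending->tx_cpt;

	if (pending_cpt != cur_cpt) {
		cpt_unlock(cur_cpt);
		cpt_lock(pending_cpt);
	}

	demo_post_send(pending);	/* pending may be freed here */

	/* Use only the cached value; pending->tx_cpt would be a use-after-free. */
	if (pending_cpt != cur_cpt) {
		cpt_unlock(pending_cpt);
		cpt_lock(cur_cpt);
	}
}

/* Returns 1 if exactly cur_cpt is held after the send, 0 if not, -1 on OOM. */
int demo_run(int cur_cpt, int pending_cpt)
{
	struct demo_msg *m = malloc(sizeof(*m));
	int ok = 1;
	int i;

	if (!m)
		return -1;
	m->tx_cpt = pending_cpt;

	cpt_lock(cur_cpt);
	demo_send_pending(cur_cpt, m);

	for (i = 0; i < NCPTS; i++)
		if (cpt_locked[i] != (i == cur_cpt))
			ok = 0;

	cpt_unlock(cur_cpt);
	return ok;
}
```

Calling demo_run(0, 2) exercises the case where the pending message lives on
a different CPT, and demo_run(1, 1) the case where no lock switch is needed.
Reading pending->tx_cpt again in the second comparison would reintroduce the
access to freed memory that the patch removes.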