From patchwork Fri Dec 15 21:12:32 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10116015 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 9B2AB60231 for ; Fri, 15 Dec 2017 21:12:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8E1232A162 for ; Fri, 15 Dec 2017 21:12:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8270B2A172; Fri, 15 Dec 2017 21:12:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1C8B32A162 for ; Fri, 15 Dec 2017 21:12:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755976AbdLOVMe (ORCPT ); Fri, 15 Dec 2017 16:12:34 -0500 Received: from mx1.redhat.com ([209.132.183.28]:50172 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755691AbdLOVMd (ORCPT ); Fri, 15 Dec 2017 16:12:33 -0500 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 75A21624DA; Fri, 15 Dec 2017 21:12:33 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-120-247.rdu2.redhat.com [10.10.120.247]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4917E7F1F7; Fri, 15 Dec 2017 21:12:33 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id D74B620433; Fri, 15 Dec 2017 16:12:32 -0500 (EST) From: Scott Mayhew To: trond.myklebust@primarydata.com, anna.schumaker@netapp.com Cc: bcodding@redhat.com, dwysocha@redhat.com, fsorenso@redhat.com, linux-nfs@vger.kernel.org Subject: [PATCH v2] nfs/pnfs: fix nfs_direct_req ref leak when i/o falls back to the mds Date: Fri, 15 Dec 2017 16:12:32 -0500 Message-Id: <20171215211232.18942-1-smayhew@redhat.com> In-Reply-To: <20171213182220.lrfbvkaj65fg6abk@tonberry.usersys.redhat.com> References: <20171213182220.lrfbvkaj65fg6abk@tonberry.usersys.redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Fri, 15 Dec 2017 21:12:33 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently when falling back to doing I/O through the MDS (via pnfs_{read|write}_through_mds), the client frees the nfs_pgio_header without releasing the reference taken on the dreq via pnfs_generic_pg_{read|write}pages -> nfs_pgheader_init -> nfs_direct_pgio_init. It then takes another reference on the dreq via nfs_generic_pg_pgios -> nfs_pgheader_init -> nfs_direct_pgio_init and as a result the requester will become stuck in inode_dio_wait. Once that happens, other processes accessing the inode will become stuck as well. Ensure that pnfs_read_through_mds() and pnfs_write_through_mds() clean up correctly by calling hdr->completion_ops->completion() instead of calling hdr->release() directly. This can be reproduced (sometimes) by performing "storage failover takeover" commands on NetApp filer while doing direct I/O from a client. This can also be reproduced using SystemTap to simulate a failure while doing direct I/O from a client (from Dave Wysochanski ): stap -v -g -e 'probe module("nfs_layout_nfsv41_files").function("nfs4_fl_prepare_ds").return { $return=NULL; exit(); }' Suggested-by: Trond Myklebust Signed-off-by: Scott Mayhew --- fs/nfs/pnfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index d602fe9..eb098cc 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -2255,7 +2255,7 @@ pnfs_write_through_mds(struct nfs_pageio_descriptor *desc, nfs_pageio_reset_write_mds(desc); mirror->pg_recoalesce = 1; } - hdr->release(hdr); + hdr->completion_ops->completion(hdr); } static enum pnfs_try_status @@ -2378,7 +2378,7 @@ pnfs_read_through_mds(struct nfs_pageio_descriptor *desc, nfs_pageio_reset_read_mds(desc); mirror->pg_recoalesce = 1; } - hdr->release(hdr); + hdr->completion_ops->completion(hdr); } /*