From patchwork Mon Aug 1 22:35:17 2011
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 1026992
Subject: Re: [PATCH v4 00/27] add block layout driver to pnfs client
From: Trond Myklebust
To: Peng Tao
Cc: William Andros Adamson, Jim Rees, Christoph Hellwig,
 linux-nfs@vger.kernel.org, Peter Honeyman
Date: Mon, 01 Aug 2011 18:35:17 -0400
In-Reply-To: <1312233006.23392.17.camel@lade.trondhjem.org>
References: <1311874276-1386-1-git-send-email-rees@umich.edu>
 <20110729155136.GB28306@infradead.org>
 <20110729185415.GA23061@merit.edu>
 <20110729190133.GA10946@infradead.org>
 <20110729191341.GC23061@merit.edu>
 <1311988172.16078.15.camel@lade.trondhjem.org>
 <20110730032621.GB25188@merit.edu>
 <1312233006.23392.17.camel@lade.trondhjem.org>
Organization: NetApp Inc
Message-ID: <1312238117.23392.19.camel@lade.trondhjem.org>

On Mon, 2011-08-01 at 17:10 -0400, Trond Myklebust wrote:
> Looking at the callback code, I see that if tbl->highest_used_slotid !=
> 0, then we BUG() while holding the backchannel's tbl->slot_tbl_lock
> spinlock. That seems a likely candidate for the above hang.
>
> Andy, how are we guaranteed that tbl->highest_used_slotid won't take
> values other than 0, and why do we commit suicide when it does? As far
> as I can see, there is no guarantee that we call nfs4_cb_take_slot() in
> nfs4_callback_compound(); however, we appear to unconditionally call
> nfs4_cb_free_slot() provided there is a session.
>
> The other strangeness is that nothing enforces the
> NFS4_SESSION_DRAINING flag. If the session is draining, then the
> back-channel simply ignores that and goes ahead with processing the
> callback. Is this to avoid deadlocks with the server returning
> NFS4ERR_BACK_CHAN_BUSY when the client does a DESTROY_SESSION?
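For reference, these are the pre-patch back-channel slot helpers the quoted
discussion is about, reproduced from the lines the patch below removes from
fs/nfs/callback_xdr.c; only the comments on the failure mode are added here:

/* A single slot, so highest used slotid is either 0 or -1 */
void nfs4_cb_take_slot(struct nfs_client *clp)
{
	struct nfs4_slot_table *tbl = &clp->cl_session->bc_slot_table;

	spin_lock(&tbl->slot_tbl_lock);
	tbl->highest_used_slotid++;
	/* Fires once take/free become unbalanced: an unmatched free
	 * drives highest_used_slotid below -1, so this increment can
	 * no longer yield 0, and we BUG() while holding slot_tbl_lock. */
	BUG_ON(tbl->highest_used_slotid != 0);
	spin_unlock(&tbl->slot_tbl_lock);
}

static void nfs4_cb_free_slot(struct nfs_client *clp)
{
	/* Runs for every compound that located a session, whether or
	 * not nfs4_cb_take_slot() was ever called for it. */
	if (clp && clp->cl_session)
		nfs4_callback_free_slot(clp->cl_session);
}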
How about something like the following?

8<-------------------------------------------------------------------------------
From c0c499b0ca9d0af8cbdc29c40effba38475461d9 Mon Sep 17 00:00:00 2001
From: Trond Myklebust
Date: Mon, 1 Aug 2011 18:31:12 -0400
Subject: [PATCH] NFSv4.1: Fix the callback 'highest_used_slotid' behaviour

Currently, there is no guarantee that we will call nfs4_cb_take_slot() even
though nfs4_callback_compound() will consistently call nfs4_cb_free_slot()
provided the cb_process_state has set the 'clp' field. The result is that
we can trigger the BUG_ON() upon the next call to nfs4_cb_take_slot().

This patch fixes the above problem by using the slot id that was taken in
the CB_SEQUENCE operation as a flag for whether or not we need to call
nfs4_cb_free_slot().

It also fixes an atomicity problem: we need to set tbl->highest_used_slotid
atomically with the check for NFS4_SESSION_DRAINING, otherwise we end up
racing with the various tests in nfs4_begin_drain_session().

Signed-off-by: Trond Myklebust
---
 fs/nfs/callback.h      |  2 +-
 fs/nfs/callback_proc.c | 18 +++++++++++++-----
 fs/nfs/callback_xdr.c  | 24 +++++++-----------------
 3 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index b257383..07df5f1 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -38,6 +38,7 @@ enum nfs4_callback_opnum {
 struct cb_process_state {
 	__be32 drc_status;
 	struct nfs_client *clp;
+	int slotid;
 };
 
 struct cb_compound_hdr_arg {
@@ -166,7 +167,6 @@ extern unsigned nfs4_callback_layoutrecall(
 	void *dummy, struct cb_process_state *cps);
 
 extern void nfs4_check_drain_bc_complete(struct nfs4_session *ses);
-extern void nfs4_cb_take_slot(struct nfs_client *clp);
 
 struct cb_devicenotifyitem {
 	uint32_t cbd_notify_type;
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 74780f9..1bd2c81 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -348,7 +348,7 @@ validate_seqid(struct nfs4_slot_table *tbl, struct cb_sequenceargs * args)
 	/* Normal */
 	if (likely(args->csa_sequenceid == slot->seq_nr + 1)) {
 		slot->seq_nr++;
-		return htonl(NFS4_OK);
+		goto out_ok;
 	}
 
 	/* Replay */
@@ -367,11 +367,14 @@ validate_seqid(struct nfs4_slot_table *tbl, struct cb_sequenceargs * args)
 	/* Wraparound */
 	if (args->csa_sequenceid == 1 && (slot->seq_nr + 1) == 0) {
 		slot->seq_nr = 1;
-		return htonl(NFS4_OK);
+		goto out_ok;
 	}
 
 	/* Misordered request */
 	return htonl(NFS4ERR_SEQ_MISORDERED);
+out_ok:
+	tbl->highest_used_slotid = args->csa_slotid;
+	return htonl(NFS4_OK);
 }
 
 /*
@@ -433,26 +436,32 @@ __be32 nfs4_callback_sequence(struct cb_sequenceargs *args,
 			      struct cb_sequenceres *res,
 			      struct cb_process_state *cps)
 {
+	struct nfs4_slot_table *tbl;
 	struct nfs_client *clp;
 	int i;
 	__be32 status = htonl(NFS4ERR_BADSESSION);
 
-	cps->clp = NULL;
-
 	clp = nfs4_find_client_sessionid(args->csa_addr, &args->csa_sessionid);
 	if (clp == NULL)
 		goto out;
 
+	tbl = &clp->cl_session->bc_slot_table;
+
+	spin_lock(&tbl->slot_tbl_lock);
 	/* state manager is resetting the session */
 	if (test_bit(NFS4_SESSION_DRAINING, &clp->cl_session->session_state)) {
 		status = NFS4ERR_DELAY;
+		spin_unlock(&tbl->slot_tbl_lock);
 		goto out;
 	}
 
 	status = validate_seqid(&clp->cl_session->bc_slot_table, args);
+	spin_unlock(&tbl->slot_tbl_lock);
 	if (status)
 		goto out;
 
+	cps->slotid = args->csa_slotid;
+
 	/*
 	 * Check for pending referring calls.  If a match is found, a
 	 * related callback was received before the response to the original
@@ -469,7 +478,6 @@ __be32 nfs4_callback_sequence(struct cb_sequenceargs *args,
 	res->csr_slotid = args->csa_slotid;
 	res->csr_highestslotid = NFS41_BC_MAX_CALLBACKS - 1;
 	res->csr_target_highestslotid = NFS41_BC_MAX_CALLBACKS - 1;
-	nfs4_cb_take_slot(clp);
 
 out:
 	cps->clp = clp; /* put in nfs4_callback_compound */
diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index c6c86a7..918ad64 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -754,26 +754,15 @@ static void nfs4_callback_free_slot(struct nfs4_session *session)
 	 * Let the state manager know callback processing done.
 	 * A single slot, so highest used slotid is either 0 or -1
 	 */
-	tbl->highest_used_slotid--;
+	tbl->highest_used_slotid = -1;
 	nfs4_check_drain_bc_complete(session);
 	spin_unlock(&tbl->slot_tbl_lock);
 }
 
-static void nfs4_cb_free_slot(struct nfs_client *clp)
+static void nfs4_cb_free_slot(struct cb_process_state *cps)
 {
-	if (clp && clp->cl_session)
-		nfs4_callback_free_slot(clp->cl_session);
-}
-
-/* A single slot, so highest used slotid is either 0 or -1 */
-void nfs4_cb_take_slot(struct nfs_client *clp)
-{
-	struct nfs4_slot_table *tbl = &clp->cl_session->bc_slot_table;
-
-	spin_lock(&tbl->slot_tbl_lock);
-	tbl->highest_used_slotid++;
-	BUG_ON(tbl->highest_used_slotid != 0);
-	spin_unlock(&tbl->slot_tbl_lock);
+	if (cps->slotid != -1)
+		nfs4_callback_free_slot(cps->clp->cl_session);
 }
 
 #else /* CONFIG_NFS_V4_1 */
@@ -784,7 +773,7 @@ preprocess_nfs41_op(int nop, unsigned int op_nr, struct callback_op **op)
 	return htonl(NFS4ERR_MINOR_VERS_MISMATCH);
 }
 
-static void nfs4_cb_free_slot(struct nfs_client *clp)
+static void nfs4_cb_free_slot(struct cb_process_state *cps)
 {
 }
 #endif /* CONFIG_NFS_V4_1 */
@@ -866,6 +855,7 @@ static __be32 nfs4_callback_compound(struct svc_rqst *rqstp, void *argp, void *r
 	struct cb_process_state cps = {
 		.drc_status = 0,
 		.clp = NULL,
+		.slotid = -1,
 	};
 	unsigned int nops = 0;
 
@@ -906,7 +896,7 @@ static __be32 nfs4_callback_compound(struct svc_rqst *rqstp, void *argp, void *r
 
 	*hdr_res.status = status;
 	*hdr_res.nops = htonl(nops);
-	nfs4_cb_free_slot(cps.clp);
+	nfs4_cb_free_slot(&cps);
 	nfs_put_client(cps.clp);
 	dprintk("%s: done, status = %u\n", __func__, ntohl(status));
 	return rpc_success;
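The core of the fix is a sentinel: cps->slotid is initialized to -1 ("no slot
held"), CB_SEQUENCE overwrites it with the slot id it claimed, and the
unconditional cleanup in nfs4_callback_compound() frees a slot only when one
was actually taken. The following stand-alone sketch shows just that pattern;
it uses hypothetical names and user-space C, not the kernel code above:

#include <stdio.h>

/* Hypothetical stand-in for struct cb_process_state: slotid == -1
 * means "no back-channel slot is held by this compound". */
struct process_state {
	int slotid;
};

/* Models the CB_SEQUENCE step: the request owns a slot only if
 * validation succeeded, and records which one it took. */
static int sequence_step(struct process_state *ps, int valid, int slot)
{
	if (!valid)
		return -1;	/* failed early: ps->slotid stays -1 */
	ps->slotid = slot;	/* slot taken: remember it for cleanup */
	return 0;
}

/* Models nfs4_cb_free_slot(): called unconditionally at the end of
 * every compound, but frees only a slot that was actually taken. */
static void cleanup(struct process_state *ps)
{
	if (ps->slotid != -1)
		printf("freeing slot %d\n", ps->slotid);
	else
		printf("no slot taken, nothing to free\n");
}

int main(void)
{
	struct process_state ok = { .slotid = -1 };
	struct process_state failed = { .slotid = -1 };

	sequence_step(&ok, 1, 0);	/* CB_SEQUENCE succeeded, took slot 0 */
	sequence_step(&failed, 0, 0);	/* CB_SEQUENCE bailed out early */

	cleanup(&ok);		/* prints: freeing slot 0 */
	cleanup(&failed);	/* prints: no slot taken, nothing to free */
	return 0;
}

Because the sentinel lives in cps, which is initialized once in the compound
handler, every early exit path shares the same cleanup without needing a
separate "did we take the slot?" flag.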