From patchwork Tue Jan 14 14:45:27 2014
X-Patchwork-Submitter: Boaz Harrosh
X-Patchwork-Id: 3486531
Message-ID: <52D54D87.7010100@panasas.com>
Date: Tue, 14 Jan 2014 16:45:27 +0200
From: Boaz Harrosh
To: Trond Myklebust, NFS list, open-osd
Subject: [PATCH] pnfs-obj: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done
X-Mailing-List: linux-nfs@vger.kernel.org

An NFS4ERR_RECALLCONFLICT is returned by the server from a LAYOUTGET
only when the server sent a RECALL due to that LAYOUTGET, or when the
RECALL and the LAYOUTGET crossed on the wire. Either way, this means we
want to wait at most until in-flight IO is finished and the RECALL can
be satisfied.

So a proper wait here is more like 1/10 of a second, not 15 seconds
like we have now. (We use NFS4_POLL_RETRY_MIN here)

The current code badly hurts performance on very large files with most
pnfs-objects layouts, because the layout map changes when the file
grows and spills into the next raid group.

CC: Stable Tree
Signed-off-by: Boaz Harrosh
---
 fs/nfs/nfs4proc.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index d53d678..3264fca 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7058,7 +7058,7 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
 	struct nfs4_state *state = NULL;
 	unsigned long timeo, giveup;
 
-	dprintk("--> %s\n", __func__);
+	dprintk("--> %s tk_status => %d\n", __func__, task->tk_status);
 
 	if (!nfs41_sequence_done(task, &lgp->res.seq_res))
 		goto out;
@@ -7067,11 +7067,27 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
 	case 0:
 		goto out;
 	case -NFS4ERR_LAYOUTTRYLATER:
+	/* NFS4ERR_RECALLCONFLICT is always a minimal delay (conflict with
+	 * self)
+	 * TODO: NFS4ERR_LAYOUTTRYLATER is a conflict with another client
+	 * (or clients). What we should do is randomize a short delay like on a
+	 * network broadcast burst, and raise the random max every failure.
+	 * For now leave it stateless and do this polling.
+	 */
 	case -NFS4ERR_RECALLCONFLICT:
 		timeo = rpc_get_timeout(task->tk_client);
 		giveup = lgp->args.timestamp + timeo;
-		if (time_after(giveup, jiffies))
-			task->tk_status = -NFS4ERR_DELAY;
+		if (time_after(giveup, jiffies)) {
+			/* Do a minimum delay, We are actually waiting for our
+			 * own IO to finish (In most cases)
+			 */
+			dprintk("%s: NFS4ERR_RECALLCONFLICT waiting\n",
+				__func__);
+			rpc_delay(task, NFS4_POLL_RETRY_MIN);
+			task->tk_status = 0;
+			rpc_restart_call_prepare(task);
+			goto out; /* Do not call nfs4_async_handle_error() */
+		}
 		break;
 	case -NFS4ERR_EXPIRED:
 	case -NFS4ERR_BAD_STATEID:
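
For readers following the timing logic, here is a minimal user-space
sketch of the deadline check in the second hunk. It is not the kernel
code. HZ is fixed at 1000 as an assumption, POLL_RETRY_MIN stands in
for NFS4_POLL_RETRY_MIN (assumed here to be HZ/10, matching the 1/10 of
a second in the description), and time_after_ul() and
recallconflict_retry_delay() are hypothetical names made up for this
illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: 1000 ticks per second; the real HZ depends on kernel config. */
#define HZ 1000UL
/* Assumption: stand-in for NFS4_POLL_RETRY_MIN, taken to be HZ/10 (0.1 s). */
#define POLL_RETRY_MIN (HZ / 10)

/*
 * Wraparound-safe "a is after b" on a free-running tick counter, the same
 * idea as the kernel's time_after(). Relies on the usual two's-complement
 * conversion from unsigned long to long.
 */
static int time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Hypothetical helper modelling the patched branch.
 * timestamp: tick at which the LAYOUTGET was first issued
 * timeo:     RPC timeout in ticks
 * now:       current tick
 * Returns the number of ticks to sleep before retrying, or 0 when the
 * deadline has passed and the caller should stop retrying.
 */
unsigned long recallconflict_retry_delay(unsigned long timestamp,
					 unsigned long timeo,
					 unsigned long now)
{
	unsigned long giveup = timestamp + timeo; /* may wrap, by design */

	if (time_after_ul(giveup, now))
		return POLL_RETRY_MIN; /* short poll: wait for our own IO */
	return 0;
}

/* Small self-check covering the normal case and counter wraparound. */
void recallconflict_self_check(void)
{
	assert(recallconflict_retry_delay(1000, 60 * HZ, 1500) == POLL_RETRY_MIN);
	assert(recallconflict_retry_delay(1000, 60 * HZ, 1000 + 60 * HZ) == 0);
	assert(recallconflict_retry_delay(ULONG_MAX - 10, 100, 5) == POLL_RETRY_MIN);
}
```

To try it, add a small main() that calls recallconflict_self_check().
With a request stamped at tick 1000 and a 60 second timeout, the helper
returns POLL_RETRY_MIN ticks while the deadline is still ahead, and 0
once now reaches timestamp + timeo, at which point the patched code
takes the "break" path instead of restarting the call.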