From patchwork Sun Nov 29 13:46:14 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 7718711 Return-Path: X-Original-To: patchwork-linux-nfs@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 34A039F30B for ; Sun, 29 Nov 2015 13:46:24 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id DD46B204DE for ; Sun, 29 Nov 2015 13:46:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1DC362049E for ; Sun, 29 Nov 2015 13:46:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752774AbbK2NqU (ORCPT ); Sun, 29 Nov 2015 08:46:20 -0500 Received: from mail-qg0-f41.google.com ([209.85.192.41]:36215 "EHLO mail-qg0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750893AbbK2NqS (ORCPT ); Sun, 29 Nov 2015 08:46:18 -0500 Received: by qgcc31 with SMTP id c31so100469801qgc.3 for ; Sun, 29 Nov 2015 05:46:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=poochiereds-net.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; bh=rlojD7xxrjLohXkvhzcg525ZfndtobBUrT5A/j5cdRQ=; b=FhU13yGLJ3zHHkIVc7KBvyqvlbypFDuETVjTxQwOBkUO0eVD6VJb+zSoD66vYC0pql g5GyHtr+yxZkPq3nNE+rp09jxrbh9pa3xUnCZG+8ezPBSdhFuzarRCuokqGzcKCOi+4o XMtN+4cyS+ZZUpQ8Rg1GqlGhqd0+ssL5npUy5MvjSHVaN5xcddapKvoEbgnfMl3jOMcj TEY2EyQYgpnA8fsoJ4H8JaaBdIrHuM6nBI85kDgRJyifuwJQPtvYVRlBolj+EjUe2GwB exj5LiyPpNd3+TkHxzPOPlnaLUkZvqD5yHLQPTb5ypOT0EmelXEvrMPlTqs615rIG9+d GA4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-type:content-transfer-encoding; bh=rlojD7xxrjLohXkvhzcg525ZfndtobBUrT5A/j5cdRQ=; b=MpCFp8JqKwdhGAMObvPaBGhfPZ00UHh+vM670Tfe77ZGMi/pOXFnRlf7uQsleVeH4z /GHHEJCZxrIFG3dc0F8wq5vPBYE+wBE1yra5hTfhkhjrfMjnrmw9m3Ena2CXO2kfUloR 5/OXxW7WucJb8027Fms9VmIn/+Khf/Sq2uh8hIQtHTjWRCOZ9bHxIgdjrxlu+L0kTpMl qqBKg9i4JAraqqeKy8rKuqOf82dRIcgfsRMQLt0E5qskRUGyNXcJFas5nRv4Pmk7lsH5 lJquI6zxIOI9BTllF9jrEE2ev9ZBm85BD/fDmOdQNYPREZ+azOcuDBdr0tMWvSGimRZ1 rmGw== X-Gm-Message-State: ALoCoQmhB0FVW9B07aa2ZJpndtge0SSEH+5C4xDgNyvFQvukbHPvAIPt4UpphzHcSEh+W1bhE3PM X-Received: by 10.140.89.201 with SMTP id v67mr67121068qgd.38.1448804777555; Sun, 29 Nov 2015 05:46:17 -0800 (PST) Received: from tlielax.poochiereds.net ([2606:a000:1125:4075::d5a]) by smtp.gmail.com with ESMTPSA id 138sm11348454qhi.37.2015.11.29.05.46.16 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 29 Nov 2015 05:46:17 -0800 (PST) Date: Sun, 29 Nov 2015 08:46:14 -0500 From: Jeff Layton To: Kinglong Mee Cc: bfields@fieldses.org, Christoph Hellwig , linux-nfs@vger.kernel.org Subject: Re: [PATCH RFC] nfsd: serialize layout stateid morphing operations Message-ID: <20151129084614.42fb1272@tlielax.poochiereds.net> In-Reply-To: <565A7A14.4010902@gmail.com> References: <1442491104-30080-1-git-send-email-jeff.layton@primarydata.com> <565A7A14.4010902@gmail.com> X-Mailer: Claws Mail 3.12.0 (GTK+ 2.24.28; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID,T_RP_MATCHES_RCVD,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Sun, 29 Nov 2015 12:07:48 +0800 Kinglong Mee wrote: > I meet two problems with this patch, > Ok, I was less sure of this patch than the other one. I think we will need to serialize these operations, but this may not be the best way to do it. Bruce if you want to back this one for now, then I'm fine with that. It may be a bit before I can get to it. > 1. a dump_stack messages printed in process_one_work() at kernel/workqueue.c. > > BUG: workqueue leaked lock or atomic: kworker/u2:4/0x00000000/98#012 last function: nfsd4_run_cb_work [nfsd] > 1 lock held by kworker/u2:4/98: > #0: (&ls->ls_mutex){......}, at: [] nfsd4_cb_layout_prepare+0x24/0x40 [nfsd] > CPU: 0 PID: 98 Comm: kworker/u2:4 Not tainted 4.4.0-rc2+ #333 > Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 > Workqueue: nfsd4_callbacks nfsd4_run_cb_work [nfsd] > ffff8800362b9e40 000000007fe9394f ffff880036353d58 ffffffff8136dc64 > ffff880036353dd8 ffffffff810a3f12 ffffffff810a3cbd 000000000000000a > ffffffffa0261d78 ffffffff82902e20 0000000000000000 ffffffffa0259241 > Call Trace: > [] dump_stack+0x19/0x25 > [] process_one_work+0x3c2/0x4c0 > [] ? process_one_work+0x16d/0x4c0 > [] worker_thread+0x4a/0x440 > [] ? process_one_work+0x4c0/0x4c0 > [] ? process_one_work+0x4c0/0x4c0 > [] kthread+0xf5/0x110 > [] ? kthread_create_on_node+0x240/0x240 > [] ret_from_fork+0x3f/0x70 > [] ? kthread_create_on_node+0x240/0x240 > I guess this is just a general "hey you left a mutex locked" warning when finishing a workqueue job? This patch actually does want to do that, but I trying to tell that to lockdep may be tricky... > 2. a mutex race between layoutrecall and layoutcommit, > > callback thread, > nfsd4_cb_layout_prepare > --->mutex_lock(&ls->ls_mutex); > > layoutcommit thread, > nfsd4_layoutcommit > ---> nfsd4_preprocess_layout_stateid > --> mutex_lock(&ls->ls_mutex); <---------------- hang > > [ 600.645142] INFO: task nfsd:11623 blocked for more than 120 seconds. > [ 600.646337] Not tainted 4.4.0-rc2+ #332 > [ 600.647404] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 600.648546] nfsd D ffff880064277b80 0 11623 2 0x00000000 > [ 600.649803] ffff880064277b80 ffff880064278000 00000000ffffffff ffff88005dd241a8 > [ 600.651021] ffffffffa025e77c 0000000000000246 ffff880064277b98 ffffffff81733103 > [ 600.652255] ffff880063d7e100 ffff880064277ba8 ffffffff8173330e ffff880064277c28 > [ 600.653512] Call Trace: > [ 600.654765] [] ? nfsd4_preprocess_layout_stateid+0x37c/0x400 [nfsd] > [ 600.656084] [] schedule+0x33/0x80 > [ 600.657405] [] schedule_preempt_disabled+0xe/0x10 > [ 600.658741] [] mutex_lock_nested+0x145/0x330 > [ 600.660094] [] ? nfsd4_preprocess_layout_stateid+0x37c/0x400 [nfsd] > [ 600.661696] [] nfsd4_preprocess_layout_stateid+0x37c/0x400 [nfsd] > [ 600.663129] [] ? nfsd4_preprocess_layout_stateid+0x5/0x400 [nfsd] > [ 600.664558] [] ? printk+0x56/0x72 > [ 600.665990] [] nfsd4_layoutcommit+0x13c/0x200 [nfsd] > [ 600.667365] [] nfsd4_proc_compound+0x388/0x660 [nfsd] > [ 600.668835] [] nfsd_dispatch+0xb8/0x200 [nfsd] > [ 600.670323] [] svc_process_common+0x409/0x650 [sunrpc] > [ 600.671836] [] svc_process+0xf4/0x190 [sunrpc] > [ 600.673328] [] nfsd+0x135/0x1a0 [nfsd] > [ 600.674825] [] ? nfsd+0x5/0x1a0 [nfsd] > [ 600.676388] [] ? nfsd_destroy+0xb0/0xb0 [nfsd] > [ 600.677884] [] kthread+0xf5/0x110 > [ 600.679373] [] ? kthread_create_on_node+0x240/0x240 > [ 600.680874] [] ret_from_fork+0x3f/0x70 > [ 600.682398] [] ? kthread_create_on_node+0x240/0x240 > [ 600.683893] 1 lock held by nfsd/11623: > [ 600.685449] #0: (&ls->ls_mutex){......}, at: [] nfsd4_preprocess_layout_stateid+0x37c/0x400 [nfsd] > [ 600.688778] Sending NMI to all CPUs: > [ 600.690854] NMI backtrace for cpu 0 > [ 600.691909] CPU: 0 PID: 11 Comm: khungtaskd Not tainted 4.4.0-rc2+ #332 > [ 600.692523] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 > [ 600.693821] task: ffff88007b900000 ti: ffff88007b8fc000 task.ti: ffff88007b8fc000 > [ 600.694496] RIP: 0010:[] [] native_write_msr_safe+0xa/0x10 > [ 600.695185] RSP: 0018:ffff88007b8ffd70 EFLAGS: 00000046 > [ 600.695861] RAX: 0000000000000400 RBX: 0000000000000286 RCX: 0000000000000830 > [ 600.696539] RDX: 0000000000000000 RSI: 0000000000000400 RDI: 0000000000000830 > [ 600.697204] RBP: ffff88007b8ffd70 R08: 0000000000000400 R09: 0000000000000000 > [ 600.697862] R10: 00000000000000e4 R11: 0000000000000001 R12: ffff880063d7e100 > [ 600.698513] R13: 00000000003fff3c R14: ffff880063d7e308 R15: 0000000000000004 > [ 600.699156] FS: 0000000000000000(0000) GS:ffffffff81c27000(0000) knlGS:0000000000000000 > [ 600.699823] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 600.700459] CR2: 00007f366afd4000 CR3: 000000005dd56000 CR4: 00000000001406f0 > [ 600.701106] Stack: > [ 600.701745] ffff88007b8ffd88 ffffffff8104a860 ffffffff81047340 ffff88007b8ffd98 > [ 600.702404] ffffffff8104a885 ffff88007b8ffda8 ffffffff8104735b ffff88007b8ffdd8 > [ 600.703058] ffffffff813723ad 0000000000000078 ffff880063d7e100 00000000003fff3c > [ 600.703712] Call Trace: > [ 600.704355] [] __x2apic_send_IPI_mask.isra.2+0x60/0x70 > [ 600.705017] [] ? setup_vector_irq+0x130/0x130 > [ 600.705676] [] x2apic_send_IPI_mask+0x15/0x20 > [ 600.706335] [] nmi_raise_cpu_backtrace+0x1b/0x20 > [ 600.706989] [] nmi_trigger_all_cpu_backtrace+0x14d/0x1c0 > [ 600.707693] [] arch_trigger_all_cpu_backtrace+0x19/0x20 > [ 600.708362] [] watchdog+0x32f/0x370 > [ 600.709031] [] ? watchdog+0x81/0x370 > [ 600.709725] [] ? reset_hung_task_detector+0x20/0x20 > [ 600.710398] [] kthread+0xf5/0x110 > [ 600.711067] [] ? kthread_create_on_node+0x240/0x240 > [ 600.711739] [] ret_from_fork+0x3f/0x70 > [ 600.712405] [] ? kthread_create_on_node+0x240/0x240 > [ 600.713073] Code: 00 55 89 f9 48 89 e5 0f 32 45 31 c0 48 c1 e2 20 44 89 06 48 09 d0 5d c3 66 0f 1f 84 00 00 00 00 00 55 89 f0 89 f9 48 89 e5 0f 30 <31> c0 5d c3 66 90 55 89 f9 48 89 e5 0f 33 48 c1 e2 20 48 09 d0 > [ 600.715196] Kernel panic - not syncing: hung_task: blocked tasks > [ 600.715889] CPU: 0 PID: 11 Comm: khungtaskd Not tainted 4.4.0-rc2+ #332 > [ 600.716540] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 > [ 600.717910] ffffffff81a4d19c 00000000480750be ffff88007b8ffd60 ffffffff8136dbf4 > [ 600.718610] ffff88007b8ffde8 ffffffff81173559 ffff880000000008 ffff88007b8ffdf8 > [ 600.719302] ffff88007b8ffd90 00000000480750be 0000000000000001 0000000000000001 > [ 600.719984] Call Trace: > [ 600.720646] [] dump_stack+0x19/0x25 > [ 600.721330] [] panic+0xd3/0x212 > [ 600.722009] [] watchdog+0x33b/0x370 > [ 600.722686] [] ? watchdog+0x81/0x370 > [ 600.723213] [] ? reset_hung_task_detector+0x20/0x20 > [ 600.723674] [] kthread+0xf5/0x110 > [ 600.724107] [] ? kthread_create_on_node+0x240/0x240 > [ 600.724509] [] ret_from_fork+0x3f/0x70 > [ 600.724903] [] ? kthread_create_on_node+0x240/0x240 > > thanks, > Kinglong Mee > This is the bigger problem, I think. The question of course is why the client didn't respond to the cb request? Still, holding a mutex across the callback RPC is pretty ugly and now that I think about it, I don't think it's really necessary anyway. Once we've copied the stateid, we aren't really changing anything else so we can let other layout ops proceed. Would a patch like this fix it? -----------------------8<-------------------------- [PATCH] nfsd: don't hold ls_mutex across a layout recall We do need to serialize layout stateid morphing operations, but we currently hold the ls_mutex across a layout recall which is pretty ugly. It's also unnecessary -- once we've bumped the seqid and copied it, we don't need to serialize the rest of the CB_LAYOUTRECALL vs. anything else. Just drop the mutex once the copy is done. Fixes: cc8a55320b5f "nfsd: serialize layout stateid morphing operations" Cc: stable@vger.kernel.org Reported-by: Kinglong Mee Signed-off-by: Jeff Layton Tested-by: Kinglong Mee --- fs/nfsd/nfs4layouts.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c index 9ffef06b30d5..c9d6c715c0fb 100644 --- a/fs/nfsd/nfs4layouts.c +++ b/fs/nfsd/nfs4layouts.c @@ -616,6 +616,7 @@ nfsd4_cb_layout_prepare(struct nfsd4_callback *cb) mutex_lock(&ls->ls_mutex); nfs4_inc_and_copy_stateid(&ls->ls_recall_sid, &ls->ls_stid); + mutex_unlock(&ls->ls_mutex); } static int @@ -659,7 +660,6 @@ nfsd4_cb_layout_release(struct nfsd4_callback *cb) trace_layout_recall_release(&ls->ls_stid.sc_stateid); - mutex_unlock(&ls->ls_mutex); nfsd4_return_all_layouts(ls, &reaplist); nfsd4_free_layouts(&reaplist); nfs4_put_stid(&ls->ls_stid);