From patchwork Tue Mar 21 18:16:19 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kees Cook X-Patchwork-Id: 9637261 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 5C44B60216 for ; Tue, 21 Mar 2017 18:16:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5043E205F8 for ; Tue, 21 Mar 2017 18:16:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 42B4127E22; Tue, 21 Mar 2017 18:16:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 992C6205F8 for ; Tue, 21 Mar 2017 18:16:23 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8785489B29; Tue, 21 Mar 2017 18:16:22 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-it0-x231.google.com (mail-it0-x231.google.com [IPv6:2607:f8b0:4001:c0b::231]) by gabe.freedesktop.org (Postfix) with ESMTPS id D018B89B27 for ; Tue, 21 Mar 2017 18:16:20 +0000 (UTC) Received: by mail-it0-x231.google.com with SMTP id y18so13379658itc.1 for ; Tue, 21 Mar 2017 11:16:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=G05EUNlAFytVCrmkue5M3LQVbgkQe2bKrdKDXpeyXjo=; b=aKL61mqdbBt89pJiogxKXUV/eMiOMd/1+7h/vUITU25UKoE2iyz5qPKcmz6brvAsat RBhzUbXGSNJixrdUCxamN3tKtD0NeTaSYKoBY4SLuNywjcxhT9SAYWaopMOphmQa1sp3 zyCsCjTxolp44eqXCQJpDv5zQpvhRfjgukc/r7w//wkGqYl1VzNzQgdxb54T024IdYNh CL68/RGafWfSbxl/Vbkjb3xlAVxX5Y89L7oFvcZgdpJQqgJnP9NJp+bCqd3E0KNK8WHR 9lk99iogRDasdzRo9fyr/DNezw1iMtbzCH5HeYSd3iZ5uApTSqaO2uX1CIBTY3hOIuwg GgqA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=G05EUNlAFytVCrmkue5M3LQVbgkQe2bKrdKDXpeyXjo=; b=n1cIvcOmeLYkWoOYLzEeMZBc0RR9Z9+/Ykou/9C5tNYL59+HC1ZBGO+jhPlWoJ6vI8 57bYJX9UfylA6OUfvCNQ5NnSZjBA4iipxWS0QYELFI1YQkMCIf2ROumolDKR69wUGQFt 8j68nOvKiO0QzVsC6k8InrAAr1iRshKTNnQMg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=G05EUNlAFytVCrmkue5M3LQVbgkQe2bKrdKDXpeyXjo=; b=NH2Aq00pPQnZ41YEGK1ycvIfIzTEIuRCZKv5Ixf1DVWeSCWAEI4fjwAq/PGlthEwp/ KTxSBGgqHxZarVa95Ng9PNfD7McVQETMSdjVzQCNk7D0YyWgq1ybz19RvMslYctubwud zjuvuTKyJR/PE/dHtZmBNjKx/qwsHlCZCZu7BevoxR6PM3wRv0lqk5gdsfaDx1n24IUh K43V0+DJygwT2wUhG3khbWmEs+4mlJTxMl3GDTkMihZWhqG0weUO1iSv+J40lJFJe2LH JgSTOKJZTUM4Nyj63Rmmkhl5JPBO6ZcNAR5T/7KsmZ78i47oem9lj2t+aDsEy7QJuSh3 8ugA== X-Gm-Message-State: AFeK/H2ddRRD0EB5H95cjZGw3+VkGGoZQ/BoRgNMRpQS2EmMznFPh+XufSmwrET/0qtXDZG50n4JtbVQ2ayaZZik X-Received: by 10.36.152.196 with SMTP id n187mr4228052itd.28.1490120179900; Tue, 21 Mar 2017 11:16:19 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.131.155 with HTTP; Tue, 21 Mar 2017 11:16:19 -0700 (PDT) In-Reply-To: <20170321055848.GA15831@danjae.aot.lge.com> References: <20170317095223.15080-1-chris@chris-wilson.co.uk> <20170321055848.GA15831@danjae.aot.lge.com> From: Kees Cook Date: Tue, 21 Mar 2017 11:16:19 -0700 X-Google-Sender-Auth: i7JS1JVUH6AZX7xmPapVbBWB9jc Message-ID: To: Namhyung Kim Cc: Tony Luck , Tomi Sarvela , intel-gfx@lists.freedesktop.org, Anton Vorontsov , LKML , "# v4 . 10+" , kernel-team@lge.com, Stefan Hajnoczi , Colin Cross Subject: Re: [Intel-gfx] [PATCH] fs/pstore: Perform erase from a worker X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP On Mon, Mar 20, 2017 at 10:58 PM, Namhyung Kim wrote: > Hello, > > On Mon, Mar 20, 2017 at 10:49:16AM -0700, Kees Cook wrote: >> On Fri, Mar 17, 2017 at 2:52 AM, Chris Wilson wrote: >> > In order to prevent a cyclic recursion between psi->read_mutex and the >> > inode_lock, we need to move the pse->erase to a worker. >> > >> > [ 605.374955] ====================================================== >> > [ 605.381281] [ INFO: possible circular locking dependency detected ] >> > [ 605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted >> > [ 605.392826] ------------------------------------------------------- >> > [ 605.399196] rm/7298 is trying to acquire lock: >> > [ 605.403720] (&psinfo->read_mutex){+.+.+.}, at: [] pstore_unlink+0x3f/0xa0 >> > [ 605.412300] >> > [ 605.412300] but task is already holding lock: >> > [ 605.418237] (&sb->s_type->i_mutex_key#14){++++++}, at: [] vfs_unlink+0x4c/0x19 >> > 0 >> > [ 605.427397] >> > [ 605.427397] which lock already depends on the new lock. >> > [ 605.427397] >> > [ 605.435770] >> > [ 605.435770] the existing dependency chain (in reverse order) is: >> > [ 605.443396] >> > [ 605.443396] -> #1 (&sb->s_type->i_mutex_key#14){++++++}: >> > [ 605.450347] lock_acquire+0xc9/0x220 >> > [ 605.454551] down_write+0x3f/0x70 >> > [ 605.458484] pstore_mkfile+0x1f4/0x460 >> > [ 605.462835] pstore_get_records+0x17a/0x320 >> > [ 605.467664] pstore_fill_super+0xa4/0xc0 >> > [ 605.472205] mount_single+0x89/0xb0 >> > [ 605.476314] pstore_mount+0x13/0x20 >> > [ 605.480411] mount_fs+0xf/0x90 >> > [ 605.484122] vfs_kern_mount+0x66/0x170 >> > [ 605.488464] do_mount+0x190/0xd50 >> > [ 605.492397] SyS_mount+0x90/0xd0 >> > [ 605.496212] entry_SYSCALL_64_fastpath+0x1c/0xb1 >> > [ 605.501496] >> > [ 605.501496] -> #0 (&psinfo->read_mutex){+.+.+.}: >> > [ 605.507747] __lock_acquire+0x1ac0/0x1bb0 >> > [ 605.512401] lock_acquire+0xc9/0x220 >> > [ 605.516594] __mutex_lock+0x6e/0x990 >> > [ 605.520755] mutex_lock_nested+0x16/0x20 >> > [ 605.525279] pstore_unlink+0x3f/0xa0 >> > [ 605.529465] vfs_unlink+0xb5/0x190 >> > [ 605.533477] do_unlinkat+0x24c/0x2a0 >> > [ 605.537672] SyS_unlinkat+0x16/0x30 >> > [ 605.541781] entry_SYSCALL_64_fastpath+0x1c/0xb1 >> >> If I'm reading this right it's a race between mount and unlink... >> that's quite a corner case. :) >> >> > [ 605.547067] >> > [ 605.547067] other info that might help us debug this: >> > [ 605.547067] >> > [ 605.555221] Possible unsafe locking scenario: >> > [ 605.555221] >> > [ 605.561280] CPU0 CPU1 >> > [ 605.565883] ---- ---- >> > [ 605.570502] lock(&sb->s_type->i_mutex_key#14); >> > [ 605.575217] lock(&psinfo->read_mutex); >> > [ 605.581803] lock(&sb->s_type->i_mutex_key#14); >> > [ 605.589159] lock(&psinfo->read_mutex); >> >> I haven't had time to dig much yet, but I wonder if the locking order >> on unlink could just be reversed, and the deadlock would go away? > > IIUC, the unlink path locks a file in the root directory, while the > mount path locks the root directory. Maybe we can use a subclass? > (not tested) > > Thanks, > Namhyung > > > diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c > index 06504b69575b..6eea6bcf90c8 100644 > --- a/fs/pstore/inode.c > +++ b/fs/pstore/inode.c > @@ -394,7 +394,7 @@ int pstore_mkfile(struct pstore_record *record) > break; > } > > - inode_lock(d_inode(root)); > + inode_lock_nested(d_inode(root), I_MUTEX_PARENT); > > dentry = d_alloc_name(root, name); > if (!dentry) No luck, unfortunately... In looking at other examples, I don't think the inode_lock is needed at all? I see other uses of d_alloc_name() and d_add() without an inode lock (proc, libfs, etc), and the locking documentation doesn't seem to imply it either? This solves the lockdep, though it's unclear to me if it is somehow unsafe (apologies for whitespace damage...): -Kees diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c index 06504b69575b..3d83b13d2338 100644 --- a/fs/pstore/inode.c +++ b/fs/pstore/inode.c @@ -394,11 +394,9 @@ int pstore_mkfile(struct pstore_record *record) break; } - inode_lock(d_inode(root)); - dentry = d_alloc_name(root, name); if (!dentry) - goto fail_lockedalloc; + goto fail_private; inode->i_size = private->total_size = size; @@ -413,16 +411,12 @@ int pstore_mkfile(struct pstore_record *record) list_add(&private->list, &allpstore); spin_unlock_irqrestore(&allpstore_lock, flags); - inode_unlock(d_inode(root)); - return 0; -fail_lockedalloc: - inode_unlock(d_inode(root)); +fail_private: free_pstore_private(private); fail_alloc: iput(inode); - fail: return rc; }