From patchwork Tue Mar 21 01:53:09 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Satoru Takeuchi
X-Patchwork-Id: 9635853
Date: Tue, 21 Mar 2017 10:53:09 +0900
Message-ID: <871strcs56.wl-satoru.takeuchi@gmail.com>
From: Satoru Takeuchi
To: linux-btrfs
CC: Mark Fasheh
CC: David Sterba
Subject: [PATCH] fix inconsistent device between /proc/pid/maps and stat
User-Agent: Wanderlust/2.15.9 (Almost Unreal) Emacs/25.1 Mule/6.0 (HANACHIRUSATO)
Sender: linux-btrfs-owner@vger.kernel.org
X-Mailing-List: linux-btrfs@vger.kernel.org

There have been some discussions about the inconsistent device reported by
/proc/pid/maps and stat(2).

http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps
https://www.spinics.net/lists/linux-btrfs/msg09044.html
https://patchwork.kernel.org/patch/2825842/
https://patchwork.kernel.org/patch/2831525/

This is important because the mismatch breaks user programs like lsof(8).
There was a patch by Mark to fix this problem. Unfortunately, that patch
has not been merged so far.

NOTE:
1) This patch is inspired by Mark's patch, which I tweaked as follows.

   - move the flag for this fix from sb->s_flags to the kernel-internal
     sb->s_iflags
   - rename that flag from MS_STAT_FOR_DEV to SB_I_LOGICALVOL to make
     its meaning clearer

2) This patch is for Chris's for-linus-4.11 (commit
   e1699d2d7bf6e6cce3e1baff19f9dd4595a58664 ("")). It must be modified
   to apply to 4.11-rcX because the statx patches change the
   struct inode_operations->getattr() interface.

For more information about this problem, please refer to the patch at the
end of this mail.

---
Subject: [PATCH] fix inconsistent device between /proc/pid/maps and stat

/proc/pid/maps returns each device as follows.

===
dev = inode->i_sb->s_dev;
===

However, stat(2) returns it as follows.

===
stat->dev = BTRFS_I(inode)->root->anon_dev;
===

This results in a device mismatch, and the inconsistency breaks user
programs like lsof(8) as follows.

Without this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs (rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof 19292 root txt REG 0,44 163136 257 /home/sat/mnt/lsof
lsof 19292 root mem REG 0,42 257 /home/sat/mnt/lsof (path dev=0,44)
lsof 19294 root txt REG 0,44 163136 257 /home/sat/mnt/lsof
lsof 19294 root mem REG 0,42 257 /home/sat/mnt/lsof (path dev=0,44)
===

With this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs (rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof 752 root txt REG 0,44 163224 263 /home/vagrant/mnt/lsof
lsof 754 root txt REG 0,44 163224 263 /home/vagrant/mnt/lsof
===

Signed-off-by: Satoru Takeuchi
CC:
Signed-off-by: Mark Fasheh
---
 fs/btrfs/super.c     |  1 +
 fs/proc/generic.c    | 13 +++++++++++++
 fs/proc/internal.h   |  1 +
 fs/proc/nommu.c      |  2 +-
 fs/proc/task_mmu.c   |  2 +-
 fs/proc/task_nommu.c |  2 +-
 include/linux/fs.h   |  1 +
 7 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index da687dc79cce..2ccfdb107ba0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1133,6 +1133,7 @@ static int btrfs_fill_super(struct super_block *sb,
 #endif
 	sb->s_flags |= MS_I_VERSION;
 	sb->s_iflags |= SB_I_CGROUPWB;
+	sb->s_iflags |= SB_I_LOGICALVOL;
 	err = open_ctree(sb, fs_devices, (char *)data);
 	if (err) {
 		btrfs_err(fs_info, "open_ctree failed");
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index f6a01f09f79d..d38cd77b297c 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -646,3 +646,16 @@ void *PDE_DATA(const struct inode *inode)
 	return __PDE_DATA(inode);
 }
 EXPORT_SYMBOL(PDE_DATA);
+
+dev_t proc_get_map_dev(struct dentry *dentry)
+{
+	struct inode *inode = dentry->d_inode;
+	struct kstat kstat;
+
+	if (inode->i_sb->s_iflags & SB_I_LOGICALVOL &&
+	    inode->i_op->getattr &&
+	    inode->i_op->getattr(NULL, dentry, &kstat) == 0)
+		return kstat.dev;
+
+	return inode->i_sb->s_dev;
+}
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 2de5194ba378..bf0f11fc209b 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -190,6 +190,7 @@ static inline struct proc_dir_entry *pde_get(struct proc_dir_entry *pde)
 	return pde;
 }
 extern void pde_put(struct proc_dir_entry *);
+dev_t proc_get_map_dev(struct dentry *);
 
 static inline bool is_empty_pde(const struct proc_dir_entry *pde)
 {
diff --git a/fs/proc/nommu.c b/fs/proc/nommu.c
index 75634379f82e..e9c29864a50e 100644
--- a/fs/proc/nommu.c
+++ b/fs/proc/nommu.c
@@ -46,7 +46,7 @@ static int nommu_region_show(struct seq_file *m, struct vm_region *region)
 
 	if (file) {
 		struct inode *inode = file_inode(region->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(region->vm_file->f_path.dentry);
 		ino = inode->i_ino;
 	}
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8f96a49178d0..3529adf0faa8 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -292,7 +292,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
 
 	if (file) {
 		struct inode *inode = file_inode(vma->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
 		ino = inode->i_ino;
 		pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
 	}
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 37175621e890..4f17f91225c3 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -156,7 +156,7 @@ static int nommu_vma_show(struct seq_file *m, struct vm_area_struct *vma,
 
 	if (file) {
 		struct inode *inode = file_inode(vma->vm_file);
-		dev = inode->i_sb->s_dev;
+		dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
 		ino = inode->i_ino;
 		pgoff = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
 	}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2ba074328894..eec62a5bfd4d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1262,6 +1262,7 @@ struct mm_struct;
 #define SB_I_CGROUPWB	0x00000001	/* cgroup-aware writeback enabled */
 #define SB_I_NOEXEC	0x00000002	/* Ignore executables on this fs */
 #define SB_I_NODEV	0x00000004	/* Ignore devices on this fs */
+#define SB_I_LOGICALVOL	0x00000008	/* this fs supports logical volumes */
 
 /* sb->s_iflags to limit user namespace mounts */
 #define SB_I_USERNS_VISIBLE		0x00000010 /* fstype already mounted */
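
For anyone who wants to check the mismatch from user space without lsof(8),
something like the following rough sketch may help. It is not part of the
patch. The names maps_ident, parse_maps_line(), stat_dev_matches() and the
MAPS_DEV_DEMO_MAIN switch are made up for this illustration. It assumes the
usual /proc/<pid>/maps line layout "start-end perms offset major:minor inode
path", where major and minor are printed in hexadecimal, and it takes the
path to start at the first '/' on the line.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Device and inode as printed in one /proc/<pid>/maps line (hypothetical helper type). */
struct maps_ident {
	unsigned int major;
	unsigned int minor;
	unsigned long long ino;
};

/*
 * Parse "start-end perms offset major:minor inode [path]".
 * major and minor are hexadecimal in the maps file.
 * Returns 0 on success, -1 if the line does not have that shape.
 */
int parse_maps_line(const char *line, struct maps_ident *out)
{
	unsigned long long start, end, off;
	char perms[5];

	if (sscanf(line, "%llx-%llx %4s %llx %x:%x %llu",
		   &start, &end, perms, &off,
		   &out->major, &out->minor, &out->ino) != 7)
		return -1;
	return 0;
}

/* Returns 1 if stat(2) reports the same device for path, 0 if not, -1 on error. */
int stat_dev_matches(const char *path, const struct maps_ident *id)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return major(st.st_dev) == id->major && minor(st.st_dev) == id->minor;
}

#ifdef MAPS_DEV_DEMO_MAIN
/* Walk our own mappings and report files whose maps device differs from stat(2). */
int main(void)
{
	char line[4096];
	FILE *f = fopen("/proc/self/maps", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		struct maps_ident id;
		char *path = strchr(line, '/');

		if (!path || parse_maps_line(line, &id) != 0)
			continue;
		path[strcspn(path, "\n")] = '\0';
		if (stat_dev_matches(path, &id) == 0)
			printf("mismatch: %s (maps dev=%u,%u)\n",
			       path, id.major, id.minor);
	}
	fclose(f);
	return 0;
}
#endif
```

Build it with -DMAPS_DEV_DEMO_MAIN and run the binary from a btrfs mount.
On a kernel without this patch it would be expected to print a mismatch line
for its own executable; with the patch applied it should print nothing for
files on btrfs.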