diff mbox

Btrfs: fix crash/invalid memory access on fsync when using overlayfs

Message ID 1458581924-18589-1-git-send-email-fdmanana@kernel.org (mailing list archive)
State Accepted
Headers show

Commit Message

Filipe Manana March 21, 2016, 5:38 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

If the lower or upper directory of an overlayfs mount belong to a btrfs
file system and we fsync the file through the overlayfs' merged directory
we ended up accessing an inode that didn't belong to btrfs as if it were
a btrfs inode at btrfs_sync_file() resulting in a crash like the following:

[ 7782.588845] BUG: unable to handle kernel NULL pointer dereference at 0000000000000544
[ 7782.590624] IP: [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
[ 7782.591931] PGD 4d954067 PUD 1e878067 PMD 0
[ 7782.592016] Oops: 0002 [#6] PREEMPT SMP DEBUG_PAGEALLOC
[ 7782.592016] Modules linked in: btrfs overlay ppdev crc32c_generic evdev xor raid6_pq psmouse pcspkr sg serio_raw acpi_cpufreq parport_pc parport tpm_tis i2c_piix4 tpm i2c_core processor button loop autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix virtio_pci libata virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
[ 7782.592016] CPU: 10 PID: 16437 Comm: xfs_io Tainted: G      D         4.5.0-rc6-btrfs-next-26+ #1
[ 7782.592016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[ 7782.592016] task: ffff88001b8d40c0 ti: ffff880137488000 task.ti: ffff880137488000
[ 7782.592016] RIP: 0010:[<ffffffffa030b7ab>]  [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
[ 7782.592016] RSP: 0018:ffff88013748be40  EFLAGS: 00010286
[ 7782.592016] RAX: 0000000080000000 RBX: ffff880133b30c88 RCX: 0000000000000001
[ 7782.592016] RDX: 0000000000000001 RSI: ffffffff8148fec0 RDI: 00000000ffffffff
[ 7782.592016] RBP: ffff88013748bec0 R08: 0000000000000001 R09: 0000000000000000
[ 7782.624248] R10: ffff88013748be40 R11: 0000000000000246 R12: 0000000000000000
[ 7782.624248] R13: 0000000000000000 R14: 00000000009305a0 R15: ffff880015e3be40
[ 7782.624248] FS:  00007fa83b9cb700(0000) GS:ffff88023ed40000(0000) knlGS:0000000000000000
[ 7782.624248] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7782.624248] CR2: 0000000000000544 CR3: 00000001fa652000 CR4: 00000000000006e0
[ 7782.624248] Stack:
[ 7782.624248]  ffffffff8108b5cc ffff88013748bec0 0000000000000246 ffff8800b005ded0
[ 7782.624248]  ffff880133b30d60 8000000000000000 7fffffffffffffff 0000000000000246
[ 7782.624248]  0000000000000246 ffffffff81074f9b ffffffff8104357c ffff880015e3be40
[ 7782.624248] Call Trace:
[ 7782.624248]  [<ffffffff8108b5cc>] ? arch_local_irq_save+0x9/0xc
[ 7782.624248]  [<ffffffff81074f9b>] ? ___might_sleep+0xce/0x217
[ 7782.624248]  [<ffffffff8104357c>] ? __do_page_fault+0x3c0/0x43a
[ 7782.624248]  [<ffffffff811a2351>] vfs_fsync_range+0x8c/0x9e
[ 7782.624248]  [<ffffffff811a237f>] vfs_fsync+0x1c/0x1e
[ 7782.624248]  [<ffffffff811a24d6>] do_fsync+0x31/0x4a
[ 7782.624248]  [<ffffffff811a2700>] SyS_fsync+0x10/0x14
[ 7782.624248]  [<ffffffff81493617>] entry_SYSCALL_64_fastpath+0x12/0x6b
[ 7782.624248] Code: 85 c0 0f 85 e2 02 00 00 48 8b 45 b0 31 f6 4c 29 e8 48 ff c0 48 89 45 a8 48 8d 83 d8 00 00 00 48 89 c7 48 89 45 a0 e8 fc 43 18 e1 <f0> 41 ff 84 24 44 05 00 00 48 8b 83 58 ff ff ff 48 c1 e8 07 83
[ 7782.624248] RIP  [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
[ 7782.624248]  RSP <ffff88013748be40>
[ 7782.624248] CR2: 0000000000000544
[ 7782.661994] ---[ end trace 721e14960eb939bc ]---

This started happening since commit 4bacc9c9234 (overlayfs: Make f_path
always point to the overlay and f_inode to the underlay) and even though
after this change we could still access the btrfs inode through
struct file->f_mapping->host or struct file->f_inode, we would end up
resulting in more similar issues later on at check_parent_dirs_for_sync()
because the dentry we got (from struct file->f_path.dentry) was from
overlayfs and not from btrfs, that is, we had no way of getting the dentry
that belonged to btrfs (we always got the dentry that belonged to
overlayfs).

The new patch from Miklos Szeredi, titled "vfs: add file_dentry()" and
recently submitted to linux-fsdevel, adds a file_dentry() API that allows
us to get the btrfs dentry from the input file and therefore being able
to fsync when the upper and lower directories belong to btrfs filesystems.

This issue has been reported several times by users in the mailing list
and bugzilla. A test case for xfstests is being submitted as well.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101951
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109791
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Chris Mason March 21, 2016, 5:51 p.m. UTC | #1
On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If the lower or upper directory of an overlayfs mount belong to a btrfs
> file system and we fsync the file through the overlayfs' merged directory
> we ended up accessing an inode that didn't belong to btrfs as if it were
> a btrfs inode at btrfs_sync_file() resulting in a crash like the following:

Thanks Filipe, I'll put this in a second pull this week, even if its the
only patch in it ;)

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Filipe Manana March 21, 2016, 5:52 p.m. UTC | #2
On Mon, Mar 21, 2016 at 5:51 PM, Chris Mason <clm@fb.com> wrote:
> On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
>> From: Filipe Manana <fdmanana@suse.com>
>>
>> If the lower or upper directory of an overlayfs mount belong to a btrfs
>> file system and we fsync the file through the overlayfs' merged directory
>> we ended up accessing an inode that didn't belong to btrfs as if it were
>> a btrfs inode at btrfs_sync_file() resulting in a crash like the following:
>
> Thanks Filipe, I'll put this in a second pull this week, even if its the
> only patch in it ;)

Well we need to wait for Miklos patch to be merged first.

>
> -chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba March 21, 2016, 5:58 p.m. UTC | #3
On Mon, Mar 21, 2016 at 01:51:07PM -0400, Chris Mason wrote:
> On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> > 
> > If the lower or upper directory of an overlayfs mount belong to a btrfs
> > file system and we fsync the file through the overlayfs' merged directory
> > we ended up accessing an inode that didn't belong to btrfs as if it were
> > a btrfs inode at btrfs_sync_file() resulting in a crash like the following:
> 
> Thanks Filipe, I'll put this in a second pull this week, even if its the
> only patch in it ;)

I can send you more that I consider relevant for 4.6, low-risk subset of
what's in the misc-4.6 branch.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Mason March 21, 2016, 6:08 p.m. UTC | #4
On Mon, Mar 21, 2016 at 06:58:08PM +0100, David Sterba wrote:
> On Mon, Mar 21, 2016 at 01:51:07PM -0400, Chris Mason wrote:
> > On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
> > > From: Filipe Manana <fdmanana@suse.com>
> > > 
> > > If the lower or upper directory of an overlayfs mount belong to a btrfs
> > > file system and we fsync the file through the overlayfs' merged directory
> > > we ended up accessing an inode that didn't belong to btrfs as if it were
> > > a btrfs inode at btrfs_sync_file() resulting in a crash like the following:
> > 
> > Thanks Filipe, I'll put this in a second pull this week, even if its the
> > only patch in it ;)
> 
> I can send you more that I consider relevant for 4.6, low-risk subset of
> what's in the misc-4.6 branch.

You had sent the list last week, I was going to start with that and look
on the list for anything else urgent.  If you have them in a git branch
already I won't say no ;)

-chris

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba March 21, 2016, 6:44 p.m. UTC | #5
On Mon, Mar 21, 2016 at 02:08:00PM -0400, Chris Mason wrote:
> On Mon, Mar 21, 2016 at 06:58:08PM +0100, David Sterba wrote:
> > On Mon, Mar 21, 2016 at 01:51:07PM -0400, Chris Mason wrote:
> > > On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
> > > > From: Filipe Manana <fdmanana@suse.com>
> > > > 
> > > > If the lower or upper directory of an overlayfs mount belong to a btrfs
> > > > file system and we fsync the file through the overlayfs' merged directory
> > > > we ended up accessing an inode that didn't belong to btrfs as if it were
> > > > a btrfs inode at btrfs_sync_file() resulting in a crash like the following:
> > > 
> > > Thanks Filipe, I'll put this in a second pull this week, even if its the
> > > only patch in it ;)
> > 
> > I can send you more that I consider relevant for 4.6, low-risk subset of
> > what's in the misc-4.6 branch.
> 
> You had sent the list last week, I was going to start with that and look
> on the list for anything else urgent.  If you have them in a git branch
> already I won't say no ;)

Meanwhile I had a chance to review and let them through tests so I think
it's time for a pull request.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Mason March 25, 2016, 6:49 p.m. UTC | #6
On Mon, Mar 21, 2016 at 05:52:44PM +0000, Filipe Manana wrote:
> On Mon, Mar 21, 2016 at 5:51 PM, Chris Mason <clm@fb.com> wrote:
> > On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
> >> From: Filipe Manana <fdmanana@suse.com>
> >>
> >> If the lower or upper directory of an overlayfs mount belong to a btrfs
> >> file system and we fsync the file through the overlayfs' merged directory
> >> we ended up accessing an inode that didn't belong to btrfs as if it were
> >> a btrfs inode at btrfs_sync_file() resulting in a crash like the following:
> >
> > Thanks Filipe, I'll put this in a second pull this week, even if its the
> > only patch in it ;)
> 
> Well we need to wait for Miklos patch to be merged first.

I'll cut this one on top of rc1 in a dedicated branch instead.  That way
people can still use the rest of our btrfs pull against v4.5

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Filipe Manana March 25, 2016, 7:46 p.m. UTC | #7
On Fri, Mar 25, 2016 at 6:49 PM, Chris Mason <clm@fb.com> wrote:
> On Mon, Mar 21, 2016 at 05:52:44PM +0000, Filipe Manana wrote:
>> On Mon, Mar 21, 2016 at 5:51 PM, Chris Mason <clm@fb.com> wrote:
>> > On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
>> >> From: Filipe Manana <fdmanana@suse.com>
>> >>
>> >> If the lower or upper directory of an overlayfs mount belong to a btrfs
>> >> file system and we fsync the file through the overlayfs' merged directory
>> >> we ended up accessing an inode that didn't belong to btrfs as if it were
>> >> a btrfs inode at btrfs_sync_file() resulting in a crash like the following:
>> >
>> > Thanks Filipe, I'll put this in a second pull this week, even if its the
>> > only patch in it ;)
>>
>> Well we need to wait for Miklos patch to be merged first.
>
> I'll cut this one on top of rc1 in a dedicated branch instead.  That way
> people can still use the rest of our btrfs pull against v4.5

At the moment, the patch it depends on it's not yet on Linus' tree.
But it's in Miklos vfs tree:
https://git.kernel.org/cgit/linux/kernel/git/mszeredi/vfs.git/log/?h=overlayfs-next

>
> -chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Mason March 29, 2016, 8:06 p.m. UTC | #8
On Mon, Mar 21, 2016 at 05:38:44PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If the lower or upper directory of an overlayfs mount belong to a btrfs
> file system and we fsync the file through the overlayfs' merged directory
> we ended up accessing an inode that didn't belong to btrfs as if it were
> a btrfs inode at btrfs_sync_file() resulting in a crash like the following:

Thanks Filipe, adding a sob so this can go upstream with the related
patches.

> 
> [ 7782.588845] BUG: unable to handle kernel NULL pointer dereference at 0000000000000544
> [ 7782.590624] IP: [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
> [ 7782.591931] PGD 4d954067 PUD 1e878067 PMD 0
> [ 7782.592016] Oops: 0002 [#6] PREEMPT SMP DEBUG_PAGEALLOC
> [ 7782.592016] Modules linked in: btrfs overlay ppdev crc32c_generic evdev xor raid6_pq psmouse pcspkr sg serio_raw acpi_cpufreq parport_pc parport tpm_tis i2c_piix4 tpm i2c_core processor button loop autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix virtio_pci libata virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
> [ 7782.592016] CPU: 10 PID: 16437 Comm: xfs_io Tainted: G      D         4.5.0-rc6-btrfs-next-26+ #1
> [ 7782.592016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
> [ 7782.592016] task: ffff88001b8d40c0 ti: ffff880137488000 task.ti: ffff880137488000
> [ 7782.592016] RIP: 0010:[<ffffffffa030b7ab>]  [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
> [ 7782.592016] RSP: 0018:ffff88013748be40  EFLAGS: 00010286
> [ 7782.592016] RAX: 0000000080000000 RBX: ffff880133b30c88 RCX: 0000000000000001
> [ 7782.592016] RDX: 0000000000000001 RSI: ffffffff8148fec0 RDI: 00000000ffffffff
> [ 7782.592016] RBP: ffff88013748bec0 R08: 0000000000000001 R09: 0000000000000000
> [ 7782.624248] R10: ffff88013748be40 R11: 0000000000000246 R12: 0000000000000000
> [ 7782.624248] R13: 0000000000000000 R14: 00000000009305a0 R15: ffff880015e3be40
> [ 7782.624248] FS:  00007fa83b9cb700(0000) GS:ffff88023ed40000(0000) knlGS:0000000000000000
> [ 7782.624248] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 7782.624248] CR2: 0000000000000544 CR3: 00000001fa652000 CR4: 00000000000006e0
> [ 7782.624248] Stack:
> [ 7782.624248]  ffffffff8108b5cc ffff88013748bec0 0000000000000246 ffff8800b005ded0
> [ 7782.624248]  ffff880133b30d60 8000000000000000 7fffffffffffffff 0000000000000246
> [ 7782.624248]  0000000000000246 ffffffff81074f9b ffffffff8104357c ffff880015e3be40
> [ 7782.624248] Call Trace:
> [ 7782.624248]  [<ffffffff8108b5cc>] ? arch_local_irq_save+0x9/0xc
> [ 7782.624248]  [<ffffffff81074f9b>] ? ___might_sleep+0xce/0x217
> [ 7782.624248]  [<ffffffff8104357c>] ? __do_page_fault+0x3c0/0x43a
> [ 7782.624248]  [<ffffffff811a2351>] vfs_fsync_range+0x8c/0x9e
> [ 7782.624248]  [<ffffffff811a237f>] vfs_fsync+0x1c/0x1e
> [ 7782.624248]  [<ffffffff811a24d6>] do_fsync+0x31/0x4a
> [ 7782.624248]  [<ffffffff811a2700>] SyS_fsync+0x10/0x14
> [ 7782.624248]  [<ffffffff81493617>] entry_SYSCALL_64_fastpath+0x12/0x6b
> [ 7782.624248] Code: 85 c0 0f 85 e2 02 00 00 48 8b 45 b0 31 f6 4c 29 e8 48 ff c0 48 89 45 a8 48 8d 83 d8 00 00 00 48 89 c7 48 89 45 a0 e8 fc 43 18 e1 <f0> 41 ff 84 24 44 05 00 00 48 8b 83 58 ff ff ff 48 c1 e8 07 83
> [ 7782.624248] RIP  [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
> [ 7782.624248]  RSP <ffff88013748be40>
> [ 7782.624248] CR2: 0000000000000544
> [ 7782.661994] ---[ end trace 721e14960eb939bc ]---
> 
> This started happening since commit 4bacc9c9234 (overlayfs: Make f_path
> always point to the overlay and f_inode to the underlay) and even though
> after this change we could still access the btrfs inode through
> struct file->f_mapping->host or struct file->f_inode, we would end up
> resulting in more similar issues later on at check_parent_dirs_for_sync()
> because the dentry we got (from struct file->f_path.dentry) was from
> overlayfs and not from btrfs, that is, we had no way of getting the dentry
> that belonged to btrfs (we always got the dentry that belonged to
> overlayfs).
> 
> The new patch from Miklos Szeredi, titled "vfs: add file_dentry()" and
> recently submitted to linux-fsdevel, adds a file_dentry() API that allows
> us to get the btrfs dentry from the input file and therefore being able
> to fsync when the upper and lower directories belong to btrfs filesystems.
> 
> This issue has been reported several times by users in the mailing list
> and bugzilla. A test case for xfstests is being submitted as well.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101951
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109791
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Signed-off-by: Chris Mason <clm@fb.com>


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 15a09cb..2f40482 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1905,7 +1905,7 @@  static int start_ordered_ops(struct inode *inode, loff_t start, loff_t end)
  */
 int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 {
-	struct dentry *dentry = file->f_path.dentry;
+	struct dentry *dentry = file_dentry(file);
 	struct inode *inode = d_inode(dentry);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_trans_handle *trans;