Message ID | Yhks88tO3Em/G370@mit.edu (mailing list archive) |
---|---|
State | Superseded, archived |
Series | [-v2] ext4: don't BUG if kernel subsystems dirty pages without asking ext4 first |
On Fri, Feb 25, 2022 at 02:24:35PM -0500, Theodore Ts'o wrote:
> [un]pin_user_pages_remote is dirtying pages without properly warning
> the file system in advance (or faulting in the file data if the page
> is not yet in the page cache). This was noted by Jan Kara in 2018[1]
> and more recently has resulted in bug reports by Syzbot in various
> Android kernels[2].
>
> This is technically a bug in the mm/gup.c codepath, but arguably ext4
> is fragile in that a buggy get_user_pages() implementation causes ext4
> to crash, where as other file systems are not crashing (although in
> some cases the user data will be lost since gup code is not properly
> informing the file system to potentially allocate blocks or reserve
> space when writing into a sparse portion of file). I suspect in real
> life it is rare that people are using RDMA into file-backed memory,
> which is why no one has complained to ext4 developers except fuzzing
> programs.
>
> So instead of crashing with a BUG, issue a warning (since there may be
> potential data loss) and just mark the page as clean to avoid
> unprivileged denial of service attacks until the problem can be
> properly fixed. More discussion and background can be found in the
> thread starting at [2].
>
> [1] https://www.spinics.net/lists/linux-mm/msg142700.html

Can you use a lore link
(https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz/T/#u)?

> +			/*
> +			 * Should never happen but for buggy code in
> +			 * other subsystemsa that call

subsystemsa => subsystems

> +			 * set_page_dirty() without properly warning
> +			 * the file system first. See [1] for more
> +			 * information.
> +			 *
> +			 * [1] https://www.spinics.net/lists/linux-mm/msg142700.html

Likewise, lore link here.

- Eric

On Fri, Feb 25, 2022 at 12:51:28PM -0800, Eric Biggers wrote:
> >
> > [1] https://www.spinics.net/lists/linux-mm/msg142700.html
>
> Can you use a lore link
> (https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz/T/#u)?

Sure, thanks for finding the lore link! I did try searching for it
last night, but for some reason I wasn't able to surface it.

> > +			/*
> > +			 * Should never happen but for buggy code in
> > +			 * other subsystemsa that call
>
> subsystemsa => subsystems

Already fixed in my local version of the patch.

> > +			 * [1] https://www.spinics.net/lists/linux-mm/msg142700.html
>
> Likewise, lore link here.

Ack.

- Ted

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 01c9e4f743ba..f8fefbf67306 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1993,6 +1993,15 @@ static int ext4_writepage(struct page *page,
 	else
 		len = PAGE_SIZE;
 
+	/* Should never happen but for buggy gup code */
+	if (!page_has_buffers(page)) {
+		ext4_warning_inode(inode,
+		   "page %lu does not have buffers attached", page->index);
+		ClearPageDirty(page);
+		unlock_page(page);
+		return 0;
+	}
+
 	page_bufs = page_buffers(page);
 	/*
 	 * We cannot do block allocation or other extent handling in this
@@ -2588,12 +2597,28 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
 			     (mpd->wbc->sync_mode == WB_SYNC_NONE)) ||
 			    unlikely(page->mapping != mapping)) {
 				unlock_page(page);
-				continue;
+				goto out;
 			}
 
 			wait_on_page_writeback(page);
 			BUG_ON(PageWriteback(page));
 
+			/*
+			 * Should never happen but for buggy code in
+			 * other subsystemsa that call
+			 * set_page_dirty() without properly warning
+			 * the file system first. See [1] for more
+			 * information.
+			 *
+			 * [1] https://www.spinics.net/lists/linux-mm/msg142700.html
+			 */
+			if (!page_has_buffers(page)) {
+				ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
+				ClearPageDirty(page);
+				unlock_page(page);
+				continue;
+			}
+
 			if (mpd->map.m_len == 0)
 				mpd->first_page = page->index;
 			mpd->next_page = page->index + 1;

[un]pin_user_pages_remote is dirtying pages without properly warning
the file system in advance (or faulting in the file data if the page
is not yet in the page cache). This was noted by Jan Kara in 2018[1]
and more recently has resulted in bug reports by Syzbot in various
Android kernels[2].

This is technically a bug in the mm/gup.c codepath, but arguably ext4
is fragile in that a buggy get_user_pages() implementation causes ext4
to crash, whereas other file systems are not crashing (although in
some cases the user data will be lost since gup code is not properly
informing the file system to potentially allocate blocks or reserve
space when writing into a sparse portion of a file). I suspect in real
life it is rare that people are using RDMA into file-backed memory,
which is why no one has complained to ext4 developers except fuzzing
programs.

So instead of crashing with a BUG, issue a warning (since there may be
potential data loss) and just mark the page as clean to avoid
unprivileged denial of service attacks until the problem can be
properly fixed. More discussion and background can be found in the
thread starting at [2].

[1] https://www.spinics.net/lists/linux-mm/msg142700.html
[2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com

Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db@syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/inode.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)
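
For readers who want to see the warn-and-clean behavior outside the kernel, here is a rough userspace sketch of the guard this patch adds to ext4_writepage(). It is only a model: struct mock_page, mock_writepage() and mock_warnings are hypothetical names invented for illustration, the boolean fields stand in for PageDirty(), PageLocked() and page_has_buffers(), and the fprintf() stands in for ext4_warning_inode(). None of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct mock_page {
	unsigned long index;	/* offset of the page within the file */
	bool dirty;		/* models PageDirty() */
	bool locked;		/* models PageLocked() */
	bool has_buffers;	/* models page_has_buffers() */
};

/* Counts warnings; stands in for the log lines from ext4_warning_inode(). */
static int mock_warnings;

/*
 * Models the guard added to ext4_writepage(). The caller hands in a
 * locked, dirty page. A page without buffers is warned about, marked
 * clean and unlocked, and 0 is returned, as in the patch.
 */
static int mock_writepage(struct mock_page *page)
{
	assert(page != NULL && page->locked);

	if (!page->has_buffers) {
		/* The state the commit message says used to end in a BUG. */
		fprintf(stderr,
			"warning: page %lu does not have buffers attached\n",
			page->index);
		mock_warnings++;
		page->dirty = false;	/* ClearPageDirty(page) */
		page->locked = false;	/* unlock_page(page) */
		return 0;
	}

	/* Normal path: pretend the attached buffers were written out. */
	page->dirty = false;
	page->locked = false;
	return 0;
}
```

Calling mock_writepage() on a page with has_buffers set to false leaves the page clean and unlocked and bumps mock_warnings by one. The dirtied data is dropped rather than written, which is the potential data loss the commit message warns about.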