Message ID | 20181220122324.17082-1-david@fromorbit.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()" | expand |
On Thu, Dec 20, 2018 at 11:23:24PM +1100, Dave Chinner wrote: > From: Dave Chinner <dchinner@redhat.com> > > This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3. > > The reverted commit added page reference counting to iomap page > structures that are used to track block size < page size state. This > was supposed to align the code with page migration page accounting > assumptions, but what it has done instead is break XFS filesystems. > Every fstests run I've done on sub-page block size XFS filesystems > has since picking up this commit 2 days ago has failed with bad page > state errors such as: > > # ./run_check.sh "-m rmapbt=1,reflink=1 -i sparse=1 -b size=1k" "generic/038" > .... > SECTION -- xfs > FSTYP -- xfs (debug) > PLATFORM -- Linux/x86_64 test1 4.20.0-rc6-dgc+ > MKFS_OPTIONS -- -f -m rmapbt=1,reflink=1 -i sparse=1 -b size=1k /dev/sdc > MOUNT_OPTIONS -- /dev/sdc /mnt/scratch > > generic/038 454s ... > run fstests generic/038 at 2018-12-20 18:43:05 > XFS (sdc): Unmounting Filesystem > XFS (sdc): Mounting V5 Filesystem > XFS (sdc): Ending clean mount > BUG: Bad page state in process kswapd0 pfn:3a7fa > page:ffffea0000ccbeb0 count:0 mapcount:0 mapping:ffff88800d9b6360 index:0x1 > flags: 0xfffffc0000000() > raw: 000fffffc0000000 dead000000000100 dead000000000200 ffff88800d9b6360 > raw: 0000000000000001 0000000000000000 00000000ffffffff > page dumped because: non-NULL mapping > CPU: 0 PID: 676 Comm: kswapd0 Not tainted 4.20.0-rc6-dgc+ #915 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 > Call Trace: > dump_stack+0x67/0x90 > bad_page.cold.116+0x8a/0xbd > free_pcppages_bulk+0x4bf/0x6a0 > free_unref_page_list+0x10f/0x1f0 > shrink_page_list+0x49d/0xf50 > shrink_inactive_list+0x19d/0x3b0 > shrink_node_memcg.constprop.77+0x398/0x690 > ? shrink_slab.constprop.81+0x278/0x3f0 > shrink_node+0x7a/0x2f0 > kswapd+0x34b/0x6d0 > ? node_reclaim+0x240/0x240 > kthread+0x11f/0x140 > ? __kthread_bind_mask+0x60/0x60 > ret_from_fork+0x24/0x30 > Disabling lock debugging due to kernel taint > .... > > The failures are from anyway that frees pages and empties the > per-cpu page magazines, so it's not a predictable failure or an easy > to debug failure. > > generic/038 is a reliable reproducer of this problem - it has a 9 in > 10 failure rate on one of my test machines. Failure on other > machines have been at random points in fstests runs but every run > has ended up tripping this problem. Hence generic/038 was used to > bisect the failure because it was the most reliable failure. > > It is too close to the 4.20 release (not to mention holidays) to > try to diagnose, fix and test the underlying cause of the problem, > so reverting the commit is the only option we have right now. The > revert has been tested against a current tot 4.20-rc7+ kernel across > multiple machines running sub-page block size XFs filesystems and > none of the bad page state failures have been seen. > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > --- > fs/iomap.c | 7 ------- > 1 file changed, 7 deletions(-) <formletter> This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly. </formletter>
Lets fix the problem instead. I'll dig into it.
On Thu, Dec 20, 2018 at 06:41:30PM +0100, Christoph Hellwig wrote:
> Lets fix the problem instead. I'll dig into it.
I'd say we still need to revert this for 4.20 which is due for
release in a couple of days. Most of us are going to be on holidays
and away from computers for the next week, so I don't think we
have time to adequately review and test a new fix (which doesn't
exist yet) before the release occurs.
Cheers,
Dave.
On Thu, Dec 20, 2018 at 01:28:25PM +0100, Greg KH wrote: > On Thu, Dec 20, 2018 at 11:23:24PM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3. <snip> > <formletter> > > This is not the correct way to submit patches for inclusion in the > stable kernel tree. Please read: > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html > for how to do this properly. > > </formletter> This wasn't a stable kernel patch submission. This was to let the stable kernel maintainers know that a commit that was submitted for stable kernel inclusion (via a cc stable in the commit message) is buggy and should not be backported. i.e. if you've already queued/ backported 61c6de6672 then you need to drop it from your trees. Cheers, Dave.
On Thu, Dec 20, 2018 at 12:53 PM Dave Chinner <david@fromorbit.com> wrote: > > I'd say we still need to revert this for 4.20 which is due for > release in a couple of days. Most of us are going to be on holidays > and away from computers for the next week, so I don't think we > have time to adequately review and test a new fix (which doesn't > exist yet) before the release occurs. Yeah. I already applied the revert. If somebody finds a "duh" moment, and an alternate fix gets posted and tested we can revert the revert and fix it properly, but for 4.20 (and for xmas) I do think that just going back to the previous state is otherwise the right choice. Linus
On Fri, Dec 21, 2018 at 07:57:21AM +1100, Dave Chinner wrote: > On Thu, Dec 20, 2018 at 01:28:25PM +0100, Greg KH wrote: > > On Thu, Dec 20, 2018 at 11:23:24PM +1100, Dave Chinner wrote: > > > From: Dave Chinner <dchinner@redhat.com> > > > > > > This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3. > > <snip> > > > <formletter> > > > > This is not the correct way to submit patches for inclusion in the > > stable kernel tree. Please read: > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html > > for how to do this properly. > > > > </formletter> > > This wasn't a stable kernel patch submission. This was to let the > stable kernel maintainers know that a commit that was submitted for > stable kernel inclusion (via a cc stable in the commit message) is > buggy and should not be backported. i.e. if you've already queued/ > backported 61c6de6672 then you need to drop it from your trees. As the commit 61c6de667263 ("fs/iomap.c: get/put the page in iomap_page_create/release()") was tagged for a stable release (and is in 4.19.11) it would be nice to also add the cc: stable in the revert. When you just blindly send it to the stable mailing list, we have no idea what to do with it, hence the automated form letter that was sent. Now that this is in Linus's tree, I'll dig out the revert and remember to queue it up for the next round of stable updates, thanks for letting us know. greg k-h
On Thu, Dec 20, 2018 at 02:42:15PM -0800, Linus Torvalds wrote: > Yeah. I already applied the revert. If somebody finds a "duh" moment, > and an alternate fix gets posted and tested we can revert the revert > and fix it properly, but for 4.20 (and for xmas) I do think that just > going back to the previous state is otherwise the right choice. As far as I can tell the commit simply missed updating the recounts in the actual migrate callback for which it was added. The fixed version looks something like this (still running xfstests): diff --git a/fs/iomap.c b/fs/iomap.c index d6bc98ae8d35..20c9c1cadd4e 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -116,6 +116,12 @@ iomap_page_create(struct inode *inode, struct page *page) atomic_set(&iop->read_count, 0); atomic_set(&iop->write_count, 0); bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE); + + /* + * migrate_page_move_mapping() assumes that pages with private data have + * their count elevated by 1. + */ + get_page(page); set_page_private(page, (unsigned long)iop); SetPagePrivate(page); return iop; @@ -132,6 +138,7 @@ iomap_page_release(struct page *page) WARN_ON_ONCE(atomic_read(&iop->write_count)); ClearPagePrivate(page); set_page_private(page, 0); + put_page(page); kfree(iop); } @@ -556,8 +563,10 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage, if (page_has_private(page)) { ClearPagePrivate(page); + get_page(newpage); set_page_private(newpage, page_private(page)); set_page_private(page, 0); + put_page(page); SetPagePrivate(newpage); }
Dave, any chance you can validate Christoph's patch and we could re-apply the fix? Linus On Fri, Dec 21, 2018 at 1:39 AM Christoph Hellwig <hch@lst.de> wrote: > > As far as I can tell the commit simply missed updating the recounts > in the actual migrate callback for which it was added. The fixed > version looks something like this (still running xfstests):
On Fri, Dec 21, 2018 at 11:18:35AM -0800, Linus Torvalds wrote: > Dave, > any chance you can validate Christoph's patch and we could re-apply the fix? I think Dave being in Australia is probably off to his well earned weekend, like I should here in Europe now :) Given that it took so long to find the original issue I think we should not worry about 4.20 and just get the fix into 4.21-rc early and add it to stable.
On Fri, Dec 21, 2018 at 11:20 AM Christoph Hellwig <hch@lst.de> wrote: > > Given that it took so long to find the original issue I think we should > not worry about 4.20 and just get the fix into 4.21-rc early and add it > to stable. Fair enough, I'll stop worrying ;) Linus
On Fri, Dec 21, 2018 at 11:18:35AM -0800, Linus Torvalds wrote: > Dave, > any chance you can validate Christoph's patch and we could re-apply the fix? Not for a few days - I'm out of the office until after xmas now. I did notice the same bug in the original patch yesterday (and mentioned it to Darrick on #xfs) when doing post-mortem analysis, but I couldn't tell if that was the only problem all my test machines were tied up testing that the revert didn't introduce/uncover any new problems... I'll test it as soon as I'm back around the office so we can get it in -rc1 if there's no problems. Cheers, Dave.
On 12/21/18 1:40 AM, Greg KH wrote: > On Fri, Dec 21, 2018 at 07:57:21AM +1100, Dave Chinner wrote: >> On Thu, Dec 20, 2018 at 01:28:25PM +0100, Greg KH wrote: >>> On Thu, Dec 20, 2018 at 11:23:24PM +1100, Dave Chinner wrote: >>>> From: Dave Chinner <dchinner@redhat.com> >>>> >>>> This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3. >> >> <snip> >> >>> <formletter> >>> >>> This is not the correct way to submit patches for inclusion in the >>> stable kernel tree. Please read: >>> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html >>> for how to do this properly. >>> >>> </formletter> >> >> This wasn't a stable kernel patch submission. This was to let the >> stable kernel maintainers know that a commit that was submitted for >> stable kernel inclusion (via a cc stable in the commit message) is >> buggy and should not be backported. i.e. if you've already queued/ >> backported 61c6de6672 then you need to drop it from your trees. > > As the commit 61c6de667263 ("fs/iomap.c: get/put the page in > iomap_page_create/release()") was tagged for a stable release (and is in > 4.19.11) it would be nice to also add the cc: stable in the revert. > When you just blindly send it to the stable mailing list, we have no > idea what to do with it, hence the automated form letter that was sent. > > Now that this is in Linus's tree, I'll dig out the revert and remember > to queue it up for the next round of stable updates, thanks for letting > us know. As a side note from an interested observer, I had no idea that patches from not-yet-released Linus kernels could make it into a "stable" kernel within 5 days of Linus' commit to an -rc. That seems shockingly fast, with very little margin for error. -Eric
On Fri, Dec 21, 2018 at 08:40:42AM +0100, Greg KH wrote: >On Fri, Dec 21, 2018 at 07:57:21AM +1100, Dave Chinner wrote: >> On Thu, Dec 20, 2018 at 01:28:25PM +0100, Greg KH wrote: >> > On Thu, Dec 20, 2018 at 11:23:24PM +1100, Dave Chinner wrote: >> > > From: Dave Chinner <dchinner@redhat.com> >> > > >> > > This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3. >> >> <snip> >> >> > <formletter> >> > >> > This is not the correct way to submit patches for inclusion in the >> > stable kernel tree. Please read: >> > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html >> > for how to do this properly. >> > >> > </formletter> >> >> This wasn't a stable kernel patch submission. This was to let the >> stable kernel maintainers know that a commit that was submitted for >> stable kernel inclusion (via a cc stable in the commit message) is >> buggy and should not be backported. i.e. if you've already queued/ >> backported 61c6de6672 then you need to drop it from your trees. > >As the commit 61c6de667263 ("fs/iomap.c: get/put the page in >iomap_page_create/release()") was tagged for a stable release (and is in >4.19.11) it would be nice to also add the cc: stable in the revert. >When you just blindly send it to the stable mailing list, we have no >idea what to do with it, hence the automated form letter that was sent. > >Now that this is in Linus's tree, I'll dig out the revert and remember >to queue it up for the next round of stable updates, thanks for letting >us know. I've queued the revert for 4.19. -- Thanks, Sasha
diff --git a/fs/iomap.c b/fs/iomap.c index 5bc172f3dfe8..d6bc98ae8d35 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -116,12 +116,6 @@ iomap_page_create(struct inode *inode, struct page *page) atomic_set(&iop->read_count, 0); atomic_set(&iop->write_count, 0); bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE); - - /* - * migrate_page_move_mapping() assumes that pages with private data have - * their count elevated by 1. - */ - get_page(page); set_page_private(page, (unsigned long)iop); SetPagePrivate(page); return iop; @@ -138,7 +132,6 @@ iomap_page_release(struct page *page) WARN_ON_ONCE(atomic_read(&iop->write_count)); ClearPagePrivate(page); set_page_private(page, 0); - put_page(page); kfree(iop); }