diff mbox

lib/scatterlist.c: don't flush_kernel_dcache_page on slab page

Message ID 1382977205-26268-1-git-send-email-ming.lei@canonical.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ming Lei Oct. 28, 2013, 4:20 p.m. UTC
Commit b1adaf65ba03([SCSI] block: add sg buffer copy helper functions)
introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page()
on pages in SG list after these pages are written to.

Unfortunately, the commit may introduce a potential bug:

	- Before sending some SCSI commands, kmalloc() buffer may be
	passed to block layper, so flush_kernel_dcache_page() can
	see a slab page finally

	- According to cachetlb.txt, flush_kernel_dcache_page() is
	only called on "a user page", which surely can't be a slab page.

	- ARCH's implementation of flush_kernel_dcache_page() may
	use page mapping information to do optimization so page_mapping()
	will see the slab page, then VM_BUG_ON() is triggered.

Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
before calling flush_kernel_dcache_page().

Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Simon Baatz <gmbnomis@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 lib/scatterlist.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Catalin Marinas Oct. 29, 2013, 10:51 a.m. UTC | #1
On Mon, Oct 28, 2013 at 04:20:05PM +0000, Ming Lei wrote:
> Commit b1adaf65ba03([SCSI] block: add sg buffer copy helper functions)
> introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page()
> on pages in SG list after these pages are written to.
> 
> Unfortunately, the commit may introduce a potential bug:
> 
> 	- Before sending some SCSI commands, kmalloc() buffer may be
> 	passed to block layper, so flush_kernel_dcache_page() can
> 	see a slab page finally
> 
> 	- According to cachetlb.txt, flush_kernel_dcache_page() is
> 	only called on "a user page", which surely can't be a slab page.
> 
> 	- ARCH's implementation of flush_kernel_dcache_page() may
> 	use page mapping information to do optimization so page_mapping()
> 	will see the slab page, then VM_BUG_ON() is triggered.
> 
> Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
> and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
> before calling flush_kernel_dcache_page().
> 
> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: Simon Baatz <gmbnomis@gmail.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Simon Baatz Oct. 29, 2013, 12:57 p.m. UTC | #2
On Tue, Oct 29, 2013 at 12:20:05AM +0800, Ming Lei wrote:
> Commit b1adaf65ba03([SCSI] block: add sg buffer copy helper functions)
> introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page()
> on pages in SG list after these pages are written to.
> 
> Unfortunately, the commit may introduce a potential bug:
> 
> 	- Before sending some SCSI commands, kmalloc() buffer may be
> 	passed to block layper, so flush_kernel_dcache_page() can
> 	see a slab page finally
> 
> 	- According to cachetlb.txt, flush_kernel_dcache_page() is
> 	only called on "a user page", which surely can't be a slab page.
> 
> 	- ARCH's implementation of flush_kernel_dcache_page() may
> 	use page mapping information to do optimization so page_mapping()
> 	will see the slab page, then VM_BUG_ON() is triggered.
> 
> Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
> and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
> before calling flush_kernel_dcache_page().
> 
> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: Simon Baatz <gmbnomis@gmail.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> ---
>  lib/scatterlist.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index a685c8a..d16fa29 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -577,7 +577,8 @@ void sg_miter_stop(struct sg_mapping_iter *miter)
>  		miter->__offset += miter->consumed;
>  		miter->__remaining -= miter->consumed;
>  
> -		if (miter->__flags & SG_MITER_TO_SG)
> +		if ((miter->__flags & SG_MITER_TO_SG) &&
> +		    !PageSlab(miter->page))
>  			flush_kernel_dcache_page(miter->page);
>  
>  		if (miter->__flags & SG_MITER_ATOMIC) {
> -- 
> 1.7.9.5
> 

Tested-by: Simon Baatz <gmbnomis@gmail.com>

on top of 3.12-rc7 on Kirkwood with DEBUG_VM enabled.

- Simon
Andrew Morton Oct. 31, 2013, 10:27 p.m. UTC | #3
On Tue, 29 Oct 2013 00:20:05 +0800 Ming Lei <ming.lei@canonical.com> wrote:

> Commit b1adaf65ba03([SCSI] block: add sg buffer copy helper functions)
> introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page()
> on pages in SG list after these pages are written to.
> 
> Unfortunately, the commit may introduce a potential bug:
> 
> 	- Before sending some SCSI commands, kmalloc() buffer may be
> 	passed to block layper, so flush_kernel_dcache_page() can
> 	see a slab page finally
> 
> 	- According to cachetlb.txt, flush_kernel_dcache_page() is
> 	only called on "a user page", which surely can't be a slab page.
> 
> 	- ARCH's implementation of flush_kernel_dcache_page() may
> 	use page mapping information to do optimization so page_mapping()
> 	will see the slab page, then VM_BUG_ON() is triggered.
> 
> Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
> and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
> before calling flush_kernel_dcache_page().

We should work out which kernel(s) need this patch.  b1adaf65ba03 was
merged in 2008, so presumably some more recent patch has exposed the
problem, but I don't know what one that was.

Help me out here?
Russell King - ARM Linux Oct. 31, 2013, 10:44 p.m. UTC | #4
On Thu, Oct 31, 2013 at 03:27:45PM -0700, Andrew Morton wrote:
> On Tue, 29 Oct 2013 00:20:05 +0800 Ming Lei <ming.lei@canonical.com> wrote:
> 
> > Commit b1adaf65ba03([SCSI] block: add sg buffer copy helper functions)
> > introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page()
> > on pages in SG list after these pages are written to.
> > 
> > Unfortunately, the commit may introduce a potential bug:
> > 
> > 	- Before sending some SCSI commands, kmalloc() buffer may be
> > 	passed to block layper, so flush_kernel_dcache_page() can
> > 	see a slab page finally
> > 
> > 	- According to cachetlb.txt, flush_kernel_dcache_page() is
> > 	only called on "a user page", which surely can't be a slab page.
> > 
> > 	- ARCH's implementation of flush_kernel_dcache_page() may
> > 	use page mapping information to do optimization so page_mapping()
> > 	will see the slab page, then VM_BUG_ON() is triggered.
> > 
> > Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
> > and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
> > before calling flush_kernel_dcache_page().
> 
> We should work out which kernel(s) need this patch.  b1adaf65ba03 was
> merged in 2008, so presumably some more recent patch has exposed the
> problem, but I don't know what one that was.
> 
> Help me out here?

1bc39742aab0 (ARM: 7755/1: handle user space mapped pages in
flush_kernel_dcache_page) in June 2013, which according to git describe
is:

v3.10-rc4-9-g1bc39742aab0

Presumably that means v3.10+ need it.
Aaro Koskinen Oct. 31, 2013, 10:49 p.m. UTC | #5
Hi,

On Thu, Oct 31, 2013 at 03:27:45PM -0700, Andrew Morton wrote:
> On Tue, 29 Oct 2013 00:20:05 +0800 Ming Lei <ming.lei@canonical.com> wrote:
> 
> > Commit b1adaf65ba03([SCSI] block: add sg buffer copy helper functions)
> > introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page()
> > on pages in SG list after these pages are written to.
> > 
> > Unfortunately, the commit may introduce a potential bug:
> > 
> > 	- Before sending some SCSI commands, kmalloc() buffer may be
> > 	passed to block layper, so flush_kernel_dcache_page() can
> > 	see a slab page finally
> > 
> > 	- According to cachetlb.txt, flush_kernel_dcache_page() is
> > 	only called on "a user page", which surely can't be a slab page.
> > 
> > 	- ARCH's implementation of flush_kernel_dcache_page() may
> > 	use page mapping information to do optimization so page_mapping()
> > 	will see the slab page, then VM_BUG_ON() is triggered.
> > 
> > Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
> > and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
> > before calling flush_kernel_dcache_page().
> 
> We should work out which kernel(s) need this patch.  b1adaf65ba03 was
> merged in 2008, so presumably some more recent patch has exposed the
> problem, but I don't know what one that was.
> 
> Help me out here?

On ARM, this problem started to appear after
1bc39742aab09248169ef9d3727c9def3528b3f3 (ARM: 7755/1: handle user space
mapped pages in flush_kernel_dcache_page) and that was tagged for 3.2+
stable kernels.

A.
Andrew Morton Oct. 31, 2013, 10:54 p.m. UTC | #6
On Fri, 1 Nov 2013 00:49:27 +0200 Aaro Koskinen <aaro.koskinen@iki.fi> wrote:

> Hi,
> 
> On Thu, Oct 31, 2013 at 03:27:45PM -0700, Andrew Morton wrote:
> > On Tue, 29 Oct 2013 00:20:05 +0800 Ming Lei <ming.lei@canonical.com> wrote:
> > 
> > > Commit b1adaf65ba03([SCSI] block: add sg buffer copy helper functions)
> > > introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page()
> > > on pages in SG list after these pages are written to.
> > > 
> > > Unfortunately, the commit may introduce a potential bug:
> > > 
> > > 	- Before sending some SCSI commands, kmalloc() buffer may be
> > > 	passed to block layper, so flush_kernel_dcache_page() can
> > > 	see a slab page finally
> > > 
> > > 	- According to cachetlb.txt, flush_kernel_dcache_page() is
> > > 	only called on "a user page", which surely can't be a slab page.
> > > 
> > > 	- ARCH's implementation of flush_kernel_dcache_page() may
> > > 	use page mapping information to do optimization so page_mapping()
> > > 	will see the slab page, then VM_BUG_ON() is triggered.
> > > 
> > > Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
> > > and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
> > > before calling flush_kernel_dcache_page().
> > 
> > We should work out which kernel(s) need this patch.  b1adaf65ba03 was
> > merged in 2008, so presumably some more recent patch has exposed the
> > problem, but I don't know what one that was.
> > 
> > Help me out here?
> 
> On ARM, this problem started to appear after
> 1bc39742aab09248169ef9d3727c9def3528b3f3 (ARM: 7755/1: handle user space
> mapped pages in flush_kernel_dcache_page) and that was tagged for 3.2+
> stable kernels.
> 

OK, thanks.  I added

Cc: <stable@vger.kernel.org>    [3.2+]

to the changelog.

I guess I should get this and a few other things into Linus today, as
3.12 is nigh.
diff mbox

Patch

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index a685c8a..d16fa29 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -577,7 +577,8 @@  void sg_miter_stop(struct sg_mapping_iter *miter)
 		miter->__offset += miter->consumed;
 		miter->__remaining -= miter->consumed;
 
-		if (miter->__flags & SG_MITER_TO_SG)
+		if ((miter->__flags & SG_MITER_TO_SG) &&
+		    !PageSlab(miter->page))
 			flush_kernel_dcache_page(miter->page);
 
 		if (miter->__flags & SG_MITER_ATOMIC) {