Message ID | 1358332355.2384.11.camel@dabdike.int.hansenpartnership.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 1/16/2013 4:02 PM, James Bottomley wrote: > On Wed, 2013-01-16 at 12:07 +0530, Subhash Jadavani wrote: > >> Now consider this call stack from MMC block driver (this is on the ARmv7 >> based board): >> [ 98.918174] [<c001b50c>] (v7_dma_inv_range+0x30/0x48) from >> [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) >> [ 98.927819] [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) from >> [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) >> [ 98.937982] [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) from >> [<c0017ff8>] (dma_map_sg+0x3c/0x114) > OK, so this is showing that ARM itself is making the assumption that the > pages are contiguous in the page offset map. > > Fix this by doing the increment via the pfn, which will do the right > thing whatever the memory model. > > Signed-off-by: James Bottomley <JBottomley@Parallels.com> Thanks James. Yes, it make sense to fix the ARM code itself if it is the only one giving this trouble. I have tried your change below and it also fixes this issue (without having my blk-merge patch). I will forward your change to Russel King to see what he thinks about it. Regards, Subhash > > --- > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 6b2fb87..ab88c5b 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, > op(vaddr, len, dir); > } > offset = 0; > - page++; > + page = pfn_to_page(page_to_pfn(page) + 1); > left -= len; > } while (left); > } > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Russell, Is it possible to pick up James patch below? Thread here: http://comments.gmane.org/gmane.linux.kernel.mmc/18670, have the details on the motivation behind this fix. Regards, Subhash -------- Original Message -------- Subject: Re: [PATCH v2 1/1] block: blk-merge: don't merge the pages with non-contiguous descriptors Date: Wed, 16 Jan 2013 18:09:14 +0530 From: Subhash Jadavani <subhashj@codeaurora.org> To: James Bottomley <James.Bottomley@HansenPartnership.com> CC: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, linux-mmc@vger.kernel.org, linux-arm-msm@vger.kernel.org, martin.petersen@oracle.com, asias@redhat.com, tj@kernel.org, linux-arm-kernel@lists.infradead.org, Russell King <linux@arm.linux.org.uk> On 1/16/2013 4:02 PM, James Bottomley wrote: > On Wed, 2013-01-16 at 12:07 +0530, Subhash Jadavani wrote: > >> Now consider this call stack from MMC block driver (this is on the ARmv7 >> based board): >> [ 98.918174] [<c001b50c>] (v7_dma_inv_range+0x30/0x48) from >> [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) >> [ 98.927819] [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) from >> [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) >> [ 98.937982] [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) from >> [<c0017ff8>] (dma_map_sg+0x3c/0x114) > OK, so this is showing that ARM itself is making the assumption that the > pages are contiguous in the page offset map. > > Fix this by doing the increment via the pfn, which will do the right > thing whatever the memory model. > > Signed-off-by: James Bottomley <JBottomley@Parallels.com> Thanks James. Yes, it make sense to fix the ARM code itself if it is the only one giving this trouble. I have tried your change below and it also fixes this issue (without having my blk-merge patch). I will forward your change to Russel King to see what he thinks about it. Regards, Subhash > > --- > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 6b2fb87..ab88c5b 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, > op(vaddr, len, dir); > } > offset = 0; > - page++; > + page = pfn_to_page(page_to_pfn(page) + 1); > left -= len; > } while (left); > } > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-01-16 at 18:17 +0530, Subhash Jadavani wrote: > Is it possible to pick up James patch below? Thread here: > http://comments.gmane.org/gmane.linux.kernel.mmc/18670, have the details > on the motivation behind this fix. Someone should also audit the arm kernel code for more of these linear page array assumptions. I'm guessing that when sparsemem was added to arm over a year ago, whoever did it either didn't audit or missed a few. James
On Wed, Jan 16, 2013 at 12:51:55PM +0000, James Bottomley wrote: > On Wed, 2013-01-16 at 18:17 +0530, Subhash Jadavani wrote: > > Is it possible to pick up James patch below? Thread here: > > http://comments.gmane.org/gmane.linux.kernel.mmc/18670, have the details > > on the motivation behind this fix. > > Someone should also audit the arm kernel code for more of these linear > page array assumptions. I'm guessing that when sparsemem was added to > arm over a year ago, whoever did it either didn't audit or missed a few. No, that's a bad assumption. We've had discontigmem for years - maybe something like 12 years. I switched everything over to sparsemem, and sparsemem has been used on ARM for years too: commit 05944d74bc28fffbcce159cb915d0acff82f30a1 Author: Russell King <rmk@dyn-67.arm.linux.org.uk> Date: Thu Nov 30 20:43:51 2006 +0000 [ARM] Add initial sparsemem support Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> However, there's a big problem with this: very few of the lead people have machines which suffer from this disability, so there's very little testing of it - and there's very little testing of new code with it. The patch which originally introduced this code which your patch touches was part of adding highmem support to ARM: commit 43377453af83b8ff8c1c731da1508bd6b84ebfea Author: Nicolas Pitre <nico@cam.org> Date: Thu Mar 12 22:52:09 2009 -0400 [ARM] introduce dma_cache_maint_page() This is a helper to be used by the DMA mapping API to handle cache maintenance for memory identified by a page structure instead of a virtual address. Those pages may or may not be highmem pages, and when they're highmem pages, they may or may not be virtually mapped. When they're not mapped then there is no L1 cache to worry about. But even in that case the L2 cache must be processed since unmapped highmem pages can still be L2 cached. Signed-off-by: Nicolas Pitre <nico@marvell.com> some three years later, and has been through a number of revisions since. I'd really like to get rid of sparsemem so it's one less failure case, but alas there's a relatively small bunch of folk who rely upon it. That means it's always going to be more buggy.
On Wed, 2013-01-16 at 13:08 +0000, Russell King - ARM Linux wrote: > On Wed, Jan 16, 2013 at 12:51:55PM +0000, James Bottomley wrote: > > On Wed, 2013-01-16 at 18:17 +0530, Subhash Jadavani wrote: > > > Is it possible to pick up James patch below? Thread here: > > > http://comments.gmane.org/gmane.linux.kernel.mmc/18670, have the details > > > on the motivation behind this fix. > > > > Someone should also audit the arm kernel code for more of these linear > > page array assumptions. I'm guessing that when sparsemem was added to > > arm over a year ago, whoever did it either didn't audit or missed a few. > > No, that's a bad assumption. We've had discontigmem for years - maybe > something like 12 years. Discontigmem doesn't suffer from this particular problem: The breaks in the page arrays in a discontigmem environment represent real breaks in the physical memory map, so there can never arise a situation where page++ wouldn't get you the next valid page, provided there's still a real contiguous page of physical memory present in the system, of course. Sparsemem tried to be cleverer and it broke that assumption. I'm starting to wonder if there are other places in the kernel which make it and get broken by sparsemem. I have patches for parisc to switch it over to sparsemem from discontigmem ... I haven't pushed them yet, fortunately, and I certainly didn't check our underlying assumptions for this problem. > I switched everything over to sparsemem, and > sparsemem has been used on ARM for years too: > > commit 05944d74bc28fffbcce159cb915d0acff82f30a1 > Author: Russell King <rmk@dyn-67.arm.linux.org.uk> > Date: Thu Nov 30 20:43:51 2006 +0000 > > [ARM] Add initial sparsemem support > > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> > > However, there's a big problem with this: very few of the lead people > have machines which suffer from this disability, so there's very little > testing of it - and there's very little testing of new code with it. Anyone can test this just by enabling sparsemem with an artificially small SECTION_SIZE_BITS (that makes it much more likely you'll hit a boundary). > The patch which originally introduced this code which your patch > touches was part of adding highmem support to ARM: > > commit 43377453af83b8ff8c1c731da1508bd6b84ebfea > Author: Nicolas Pitre <nico@cam.org> > Date: Thu Mar 12 22:52:09 2009 -0400 > > [ARM] introduce dma_cache_maint_page() > > This is a helper to be used by the DMA mapping API to handle cache > maintenance for memory identified by a page structure instead of a > virtual address. Those pages may or may not be highmem pages, and > when they're highmem pages, they may or may not be virtually mapped. > When they're not mapped then there is no L1 cache to worry about. But > even in that case the L2 cache must be processed since unmapped highmem > pages can still be L2 cached. > > Signed-off-by: Nicolas Pitre <nico@marvell.com> > > some three years later, and has been through a number of revisions since. > > I'd really like to get rid of sparsemem so it's one less failure case, but > alas there's a relatively small bunch of folk who rely upon it. That > means it's always going to be more buggy. Heh, I'd like to do that too on parisc; the problem is that the memory holes in our platform are just to big to use a linear array on. And now that the slub people want DISCONTIGMEM deprecated, there's not much choice. James
On Wed, Jan 16, 2013 at 10:32:35AM +0000, James Bottomley wrote: > On Wed, 2013-01-16 at 12:07 +0530, Subhash Jadavani wrote: > > > Now consider this call stack from MMC block driver (this is on the ARmv7 > > based board): > > [ 98.918174] [<c001b50c>] (v7_dma_inv_range+0x30/0x48) from > > [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) > > [ 98.927819] [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) from > > [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) > > [ 98.937982] [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) from > > [<c0017ff8>] (dma_map_sg+0x3c/0x114) > > OK, so this is showing that ARM itself is making the assumption that the > pages are contiguous in the page offset map. > > Fix this by doing the increment via the pfn, which will do the right > thing whatever the memory model. > > Signed-off-by: James Bottomley <JBottomley@Parallels.com> Ok. What would you like the patch summary line for this to be - the existing one seems to be a little wrong given the content of this patch... > > --- > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 6b2fb87..ab88c5b 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, > op(vaddr, len, dir); > } > offset = 0; > - page++; > + page = pfn_to_page(page_to_pfn(page) + 1); > left -= len; > } while (left); > } > > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Wed, Jan 16, 2013 at 10:32:35AM +0000, James Bottomley wrote: > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > index 6b2fb87..ab88c5b 100644 > --- a/arch/arm/mm/dma-mapping.c > +++ b/arch/arm/mm/dma-mapping.c > @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, > op(vaddr, len, dir); > } > offset = 0; > - page++; > + page = pfn_to_page(page_to_pfn(page) + 1); Probably page = nth_page(page, 1) is the better form. Thanks.
On Wed, 2013-01-16 at 23:14 +0000, Russell King - ARM Linux wrote: > On Wed, Jan 16, 2013 at 10:32:35AM +0000, James Bottomley wrote: > > On Wed, 2013-01-16 at 12:07 +0530, Subhash Jadavani wrote: > > > > > Now consider this call stack from MMC block driver (this is on the ARmv7 > > > based board): > > > [ 98.918174] [<c001b50c>] (v7_dma_inv_range+0x30/0x48) from > > > [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) > > > [ 98.927819] [<c0017b8c>] (dma_cache_maint_page+0x1c4/0x24c) from > > > [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) > > > [ 98.937982] [<c0017c28>] (___dma_page_cpu_to_dev+0x14/0x1c) from > > > [<c0017ff8>] (dma_map_sg+0x3c/0x114) > > > > OK, so this is showing that ARM itself is making the assumption that the > > pages are contiguous in the page offset map. > > > > Fix this by doing the increment via the pfn, which will do the right > > thing whatever the memory model. > > > > Signed-off-by: James Bottomley <JBottomley@Parallels.com> > > Ok. What would you like the patch summary line for this to be - > the existing one seems to be a little wrong given the content of > this patch... how about arm: fix struct page iterator in dma_cache_maint() to work with sparsemem ? James > > > > --- > > > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > > index 6b2fb87..ab88c5b 100644 > > --- a/arch/arm/mm/dma-mapping.c > > +++ b/arch/arm/mm/dma-mapping.c > > @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, > > op(vaddr, len, dir); > > } > > offset = 0; > > - page++; > > + page = pfn_to_page(page_to_pfn(page) + 1); > > left -= len; > > } while (left); > > } > > > > > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-01-16 at 15:18 -0800, Tejun Heo wrote: > On Wed, Jan 16, 2013 at 10:32:35AM +0000, James Bottomley wrote: > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > > index 6b2fb87..ab88c5b 100644 > > --- a/arch/arm/mm/dma-mapping.c > > +++ b/arch/arm/mm/dma-mapping.c > > @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, > > op(vaddr, len, dir); > > } > > offset = 0; > > - page++; > > + page = pfn_to_page(page_to_pfn(page) + 1); > > Probably page = nth_page(page, 1) is the better form. It's the same thing. I'd actually prefer page = pfn_to_page(page_to_pfn(page) + 1); because it makes the code look like the hack it is. The preferred form for all iterators like this should be to iterate over the pfn instead of a pointer into the page arrays, because that will always work correctly no matter how many weird and wonderful memory schemes we come up with. James
On Thu, Jan 17, 2013 at 09:11:20AM +0000, James Bottomley wrote: > On Wed, 2013-01-16 at 15:18 -0800, Tejun Heo wrote: > > On Wed, Jan 16, 2013 at 10:32:35AM +0000, James Bottomley wrote: > > > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c > > > index 6b2fb87..ab88c5b 100644 > > > --- a/arch/arm/mm/dma-mapping.c > > > +++ b/arch/arm/mm/dma-mapping.c > > > @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, > > > op(vaddr, len, dir); > > > } > > > offset = 0; > > > - page++; > > > + page = pfn_to_page(page_to_pfn(page) + 1); > > > > Probably page = nth_page(page, 1) is the better form. > > It's the same thing. > > I'd actually prefer page = pfn_to_page(page_to_pfn(page) + 1); because > it makes the code look like the hack it is. The preferred form for all > iterators like this should be to iterate over the pfn instead of a > pointer into the page arrays, because that will always work correctly no > matter how many weird and wonderful memory schemes we come up with. So, why don't we update the code to do that then?
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 6b2fb87..ab88c5b 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -809,7 +809,7 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset, op(vaddr, len, dir); } offset = 0; - page++; + page = pfn_to_page(page_to_pfn(page) + 1); left -= len; } while (left); }