Message ID: 1457654203-20856-7-git-send-email-fcooper@ti.com (mailing list archive)
State: New, archived
Hi Franklin,

On Thu, 10 Mar 2016 17:56:42 -0600
Franklin S Cooper Jr <fcooper@ti.com> wrote:

> Based on DMA documentation and testing using high memory buffer when
> doing dma transfers can lead to various issues including kernel
> panics.

I guess it all comes from the vmalloced buffer case, which is not
guaranteed to be physically contiguous (one of the DMA requirements,
unless you have an iommu).

> To workaround this simply use cpu copy. The amount of high memory
> buffers used are very uncommon so no noticeable performance hit should
> be seen.

Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
using vmalloc (vmalloced buffers fall in the high_memory region), and
those are likely to be dis-contiguous if you have NANDs with pages > 4k.

I recently posted patches to ease sg_table creation from any kind of
virtual address [1][2]. Can you try them and let me know if they fix
your problem?

Thanks,

Boris

[1] https://lkml.org/lkml/2016/3/8/276
[2] https://lkml.org/lkml/2016/3/8/277
On 03/21/2016 10:04 AM, Boris Brezillon wrote:
> Hi Franklin,
>
> On Thu, 10 Mar 2016 17:56:42 -0600
> Franklin S Cooper Jr <fcooper@ti.com> wrote:
>
>> Based on DMA documentation and testing using high memory buffer when
>> doing dma transfers can lead to various issues including kernel
>> panics.
>
> I guess it all comes from the vmalloced buffer case, which are not
> guaranteed to be physically contiguous (one of the DMA requirement,
> unless you have an iommu).
>
>> To workaround this simply use cpu copy. The amount of high memory
>> buffers used are very uncommon so no noticeable performance hit should
>> be seen.
>
> Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
> using vmalloc (vmalloced buffers fall in the high_memory region), and
> those are likely to be dis-contiguous if you have NANDs with pages > 4k.
>
> I recently posted patches to ease sg_table creation from any kind of
> virtual address [1][2]. Can you try them and let me know if it fixes
> your problem?

It looks like you won't be going forward with your patchset based on
this thread [1]. I can probably reword the patch description to avoid
implying that it is uncommon to run into high mem buffers. Also, DMA
with NAND prefetch suffers a performance reduction compared to CPU
polling with prefetch. This is largely due to the significant overhead
required to read such a small amount of data at a time. The
optimizations I've worked on all revolved around reducing the cycles
spent before executing the DMA request. Making a high memory buffer
usable by the DMA adds a significant number of cycles, and you're
better off just using the CPU for performance reasons.
[1] https://lkml.org/lkml/2016/4/4/346

> Thanks,
>
> Boris
>
> [1] https://lkml.org/lkml/2016/3/8/276
> [2] https://lkml.org/lkml/2016/3/8/277

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Franklin,

On Wed, 13 Apr 2016 15:08:12 -0500
"Franklin S Cooper Jr." <fcooper@ti.com> wrote:

> On 03/21/2016 10:04 AM, Boris Brezillon wrote:
> > Hi Franklin,
> >
> > On Thu, 10 Mar 2016 17:56:42 -0600
> > Franklin S Cooper Jr <fcooper@ti.com> wrote:
> >
> >> Based on DMA documentation and testing using high memory buffer when
> >> doing dma transfers can lead to various issues including kernel
> >> panics.
> >
> > I guess it all comes from the vmalloced buffer case, which are not
> > guaranteed to be physically contiguous (one of the DMA requirement,
> > unless you have an iommu).
> >
> >> To workaround this simply use cpu copy. The amount of high memory
> >> buffers used are very uncommon so no noticeable performance hit should
> >> be seen.
> >
> > Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
> > using vmalloc (vmalloced buffers fall in the high_memory region), and
> > those are likely to be dis-contiguous if you have NANDs with pages > 4k.
> >
> > I recently posted patches to ease sg_table creation from any kind of
> > virtual address [1][2]. Can you try them and let me know if it fixes
> > your problem?
>
> It looks like you won't be going forward with your patchset based on
> this thread [1].

Nope. According to Russell it's unsafe to do that.

> I can probably reword the patch description to avoid
> implying that it is uncommon to run into high mem buffers. Also DMA with
> NAND prefetch suffers from a reduction of performance compared to CPU
> polling with prefetch. This is largely due to the significant over head
> required to read such a small amount of data at a time. The
> optimizations I've worked on all revolved around reducing the cycles
> spent before executing the DMA request. Trying to make a high memory
> buffer able to be used by the DMA adds significant amount of cycles and
> your better off just using the cpu for performance reasons.

Okay.
One comment though: why not use virt_addr_valid() instead of
addr >= high_memory here?

Best Regards,

Boris
On 04/13/2016 03:24 PM, Boris Brezillon wrote:
> Hi Franklin,
>
> On Wed, 13 Apr 2016 15:08:12 -0500
> "Franklin S Cooper Jr." <fcooper@ti.com> wrote:
>
>> On 03/21/2016 10:04 AM, Boris Brezillon wrote:
>>> Hi Franklin,
>>>
>>> On Thu, 10 Mar 2016 17:56:42 -0600
>>> Franklin S Cooper Jr <fcooper@ti.com> wrote:
>>>
>>>> Based on DMA documentation and testing using high memory buffer when
>>>> doing dma transfers can lead to various issues including kernel
>>>> panics.
>>>
>>> I guess it all comes from the vmalloced buffer case, which are not
>>> guaranteed to be physically contiguous (one of the DMA requirement,
>>> unless you have an iommu).
>>>
>>>> To workaround this simply use cpu copy. The amount of high memory
>>>> buffers used are very uncommon so no noticeable performance hit should
>>>> be seen.
>>>
>>> Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
>>> using vmalloc (vmalloced buffers fall in the high_memory region), and
>>> those are likely to be dis-contiguous if you have NANDs with pages > 4k.
>>>
>>> I recently posted patches to ease sg_table creation from any kind of
>>> virtual address [1][2]. Can you try them and let me know if it fixes
>>> your problem?
>>
>> It looks like you won't be going forward with your patchset based on
>> this thread [1].
>
> Nope. According to Russell it's unsafe to do that.
>
>> I can probably reword the patch description to avoid
>> implying that it is uncommon to run into high mem buffers. Also DMA with
>> NAND prefetch suffers from a reduction of performance compared to CPU
>> polling with prefetch. This is largely due to the significant over head
>> required to read such a small amount of data at a time. The
>> optimizations I've worked on all revolved around reducing the cycles
>> spent before executing the DMA request. Trying to make a high memory
>> buffer able to be used by the DMA adds significant amount of cycles and
>> your better off just using the cpu for performance reasons.
>
> Okay.
>
> One comment though: why not use virt_addr_valid() instead of
> addr >= high_memory here?

I had no reason other than simply using the approach already used in
the driver. virt_addr_valid() looks like it will work, so I'll make
the switch after testing it.

> Best Regards,
>
> Boris
diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
index 0863a83..22b0112 100644
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -467,17 +467,8 @@ static inline int omap_nand_dma_transfer(struct mtd_info *mtd, void *addr,
 	int ret;
 	u32 val;
 
-	if (addr >= high_memory) {
-		struct page *p1;
-
-		if (((size_t)addr & PAGE_MASK) !=
-		    ((size_t)(addr + len - 1) & PAGE_MASK))
-			goto out_copy;
-		p1 = vmalloc_to_page(addr);
-		if (!p1)
-			goto out_copy;
-		addr = page_address(p1) + ((size_t)addr & ~PAGE_MASK);
-	}
+	if (addr >= high_memory)
+		goto out_copy;
 
 	sg_init_one(&sg, addr, len);
 	n = dma_map_sg(info->dma->device->dev, &sg, 1, dir);
@@ -534,6 +525,7 @@ out_copy:
 	else
 		is_write == 0 ? omap_read_buf8(mtd, (u_char *) addr, len)
 			      : omap_write_buf8(mtd, (u_char *) addr, len);
+	return 0;
 }
Based on DMA documentation and testing, using high memory buffers when
doing DMA transfers can lead to various issues, including kernel
panics.

To work around this, simply use a CPU copy. High memory buffers are
very uncommon, so no noticeable performance hit should be seen.

Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
---
 drivers/mtd/nand/omap2.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)