Message ID | 20110601134338.GB6700@n2100.arm.linux.org.uk (mailing list archive) |
---|---|
State | New, archived |
Hi Russell,

On Wednesday 01 June 2011 15:43:38 Russell King - ARM Linux wrote:
> On Wed, Jun 01, 2011 at 03:30:11PM +0200, Laurent Pinchart wrote:
> > sg_alloc_table can only allocate multi-page scatter-gather list tables
> > if the architecture supports scatter-gather lists chaining. ARM doesn't
> > fit in that category.
>
> Let's fix this properly, as I've said countless times and so far no one
> has bothered to sort this out:
>
> 8<----
> From: Russell King <rmk+kernel@arm.linux.org.uk>
> Subject: ARM: Allow SoCs to enable scatterlist chaining
>
> Allow SoCs to enable the scatterlist chaining support, which allows
> scatterlist tables to be broken up into smaller allocations.
>
> As support for this feature depends on the implementation details of
> the users of the scatterlists, we can't enable this globally without
> auditing all the users, which is a very big task. Instead, let SoCs
> progressively switch over to using this.
>
> SoC drivers using scatterlists and SoC DMA implementations need
> auditing before this option can be enabled for the SoC.

In the specific iovmm case, the driver uses the sglist API to build a list of
page-size sg entries, and then processes it in software. Is that considered
an abuse of the sglist API, or valid usage?

Anyway, sglist chaining is not needed by iovmm. As iovmm just walks the sglist
manually, it's easier to allocate it in one go rather than using sglist
chaining. This of course doesn't make your patch unneeded or wrong.

> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  arch/arm/Kconfig                   |    3 +++
>  arch/arm/include/asm/scatterlist.h |    4 ++++
>  2 files changed, 7 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 9adc278..cc0dcbf 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -37,6 +37,9 @@ config ARM
>  	  Europe.  There is an ARM Linux project with a web page at
>  	  <http://www.arm.linux.org.uk/>.
>
> +config ARM_HAS_SG_CHAIN
> +	bool
> +
>  config HAVE_PWM
>  	bool
>
> diff --git a/arch/arm/include/asm/scatterlist.h b/arch/arm/include/asm/scatterlist.h
> index 2f87870..cefdb8f 100644
> --- a/arch/arm/include/asm/scatterlist.h
> +++ b/arch/arm/include/asm/scatterlist.h
> @@ -1,6 +1,10 @@
>  #ifndef _ASMARM_SCATTERLIST_H
>  #define _ASMARM_SCATTERLIST_H
>
> +#ifdef CONFIG_ARM_HAS_SG_CHAIN
> +#define ARCH_HAS_SG_CHAIN
> +#endif
> +
>  #include <asm/memory.h>
>  #include <asm/types.h>
>  #include <asm-generic/scatterlist.h>
On Wed, Jun 01, 2011 at 03:50:50PM +0200, Laurent Pinchart wrote:
> In the specific iovmm case, the driver uses the sglist API to build a list of
> page-size sg entries, and then processes it in software. Is that considered
> an abuse of the sglist API, or valid usage?
>
> Anyway, sglist chaining is not needed by iovmm. As iovmm just walks the sglist
> manually, it's easier to allocate it in one go rather than using sglist
> chaining. This of course doesn't make your patch unneeded or wrong.

Well, there are two issues here:

1. Should iovmm use sg_phys(sg) with sg_dma_len(sg)?

   Probably not, because a scatterlist before DMA API mapping is defined
   by sg_page(sg), sg->offset, sg->length and has N entries. After DMA
   API mapping (n = dma_map_sg(dev, sg, N, dir)), it has n entries where
   n <= N, and the DMA address/lengths are sg_dma_address(sg) and
   sg_dma_len(sg). Both of these are undefined for unmapped scatterlists.

   Getting this wrong means breakage when CONFIG_NEED_SG_DMA_LENGTH is
   enabled.

2. What would be the effect of enabling SG list chaining on iovmm?

   The code uses the correct SG list walking helpers (for_each_sg) so
   it should be able to cope with chained SG lists.

So, I think there's no problem here with chained SG lists, but there is
an issue with using sg_dma_len(). I'd suggest converting stuff to use
sg->length with sg_page(sg) rather than sg_dma_len(sg).

As for whether SG chaining is required or not, if you're running up against
the maximum SG table size, then you do have a requirement for SG chaining.
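The before/after distinction described above can be illustrated with a small userspace model. This is only a sketch and not kernel code: `struct model_sg`, `model_map_sg()` and the merge rule (adjacent regions collapse into one DMA segment) are assumptions made up for illustration, standing in for `struct scatterlist` and `dma_map_sg()`. It shows why the CPU-side fields and the DMA-side fields have to be read as separate pairs, and how n can end up smaller than N.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; NOT the kernel's struct scatterlist. */
struct model_sg {
	uintptr_t page_addr;     /* stands in for sg_page(sg) */
	unsigned int offset;     /* stands in for sg->offset */
	unsigned int length;     /* stands in for sg->length (CPU view) */
	uint64_t dma_address;    /* stands in for sg_dma_address(sg) */
	unsigned int dma_length; /* stands in for sg_dma_len(sg) */
};

/*
 * Hypothetical stand-in for dma_map_sg(): identity-maps each region and
 * merges regions that are adjacent (an assumed rule, for illustration).
 * Fills only the DMA-side fields and returns n, with n <= nents.
 */
static size_t model_map_sg(struct model_sg *sg, size_t nents)
{
	size_t n = 0;

	assert(sg != NULL);
	for (size_t i = 0; i < nents; i++) {
		uint64_t start = (uint64_t)sg[i].page_addr + sg[i].offset;

		if (n > 0 &&
		    sg[n - 1].dma_address + sg[n - 1].dma_length == start) {
			/* Adjacent to the previous DMA segment: extend it. */
			sg[n - 1].dma_length += sg[i].length;
		} else {
			sg[n].dma_address = start;
			sg[n].dma_length = sg[i].length;
			n++;
		}
	}
	return n;
}

/* Total bytes in the CPU view: walk all N entries, use ->length. */
static size_t model_cpu_bytes(const struct model_sg *sg, size_t nents)
{
	size_t total = 0;

	for (size_t i = 0; i < nents; i++)
		total += sg[i].length;
	return total;
}

/* Total bytes in the device view: walk only n entries, use dma_length. */
static size_t model_dma_bytes(const struct model_sg *sg, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += sg[i].dma_length;
	return total;
}
```

In this model, three page-sized entries of which the first two are adjacent map to two DMA segments. Reading `dma_length` on an entry that was never mapped, or pairing it with the CPU-side address, gives meaningless results, which is the mistake being pointed out for iovmm.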
Hi Russell,

On Wednesday 01 June 2011 16:03:06 Russell King - ARM Linux wrote:
> On Wed, Jun 01, 2011 at 03:50:50PM +0200, Laurent Pinchart wrote:
> > In the specific iovmm case, the driver uses the sglist API to build a
> > list of page-size sg entries, and then processes it in software. Is that
> > considered an abuse of the sglist API, or valid usage?
> >
> > Anyway, sglist chaining is not needed by iovmm. As iovmm just walks the
> > sglist manually, it's easier to allocate it in one go rather than using
> > sglist chaining. This of course doesn't make your patch unneeded or
> > wrong.
>
> Well, there are two issues here:
> 1. Should iovmm use sg_phys(sg) with sg_dma_len(sg)?
>    Probably not, because a scatterlist before DMA API mapping is defined
>    by sg_page(sg), sg->offset, sg->length and has N entries. After DMA
>    API mapping (n = dma_map_sg(dev, sg, N, dir)), it has n entries where
>    n <= N, and the DMA address/lengths are sg_dma_address(sg) and
>    sg_dma_len(sg). Both of these are undefined for unmapped scatterlists.
>
>    Getting this wrong means breakage when CONFIG_NEED_SG_DMA_LENGTH is
>    enabled.

iovmm abuses the sglist API, there's no doubt about that. It will break when
CONFIG_NEED_SG_DMA_LENGTH is enabled. iovmm should probably not use the sglist
API, and it should probably not even exist in the first place. I know that TI
is working on moving the OMAP-specific iommu/iovmm implementation to the
generic IOMMU API, but that will take time. In the meantime, I'd like to fix
iovmm to avoid the userspace-triggerable BUG_ON().

> 2. What would be the effect of enabling SG list chaining on iovmm?
>    The code uses the correct SG list walking helpers (for_each_sg) so
>    it should be able to cope with chained SG lists.

Yes it should. It might be slightly less efficient, but I don't think we will
notice.

> So, I think there's no problem here with chained SG lists, but there is
> an issue with using sg_dma_len(). I'd suggest converting stuff to use
> sg->length with sg_page(sg) rather than sg_dma_len(sg).

With sg_page(sg)? I'm not sure I follow you there.

> As for whether SG chaining is required or not, if you're running up against
> the maximum SG table size, then you do have a requirement for SG chaining.

The SG table size limit makes sure that the SG list fits in a page, so that it
can be passed to the hardware. This isn't needed by iovmm, as it processes the
sglist in software. iovmm could use SG chaining, but we would then need to
enable it for the SoCs on which iovmm is used. I don't know if they properly
support that.
On Fri, Jun 03, 2011 at 02:12:47AM +0200, Laurent Pinchart wrote:
> Hi Russell,
>
> On Wednesday 01 June 2011 16:03:06 Russell King - ARM Linux wrote:
> > On Wed, Jun 01, 2011 at 03:50:50PM +0200, Laurent Pinchart wrote:
> > > In the specific iovmm case, the driver uses the sglist API to build a
> > > list of page-size sg entries, and then processes it in software. Is that
> > > considered an abuse of the sglist API, or valid usage?
> > >
> > > Anyway, sglist chaining is not needed by iovmm. As iovmm just walks the
> > > sglist manually, it's easier to allocate it in one go rather than using
> > > sglist chaining. This of course doesn't make your patch unneeded or
> > > wrong.
> >
> > Well, there are two issues here:
> > 1. Should iovmm use sg_phys(sg) with sg_dma_len(sg)?
> >    Probably not, because a scatterlist before DMA API mapping is defined
> >    by sg_page(sg), sg->offset, sg->length and has N entries. After DMA
> >    API mapping (n = dma_map_sg(dev, sg, N, dir)), it has n entries where
> >    n <= N, and the DMA address/lengths are sg_dma_address(sg) and
> >    sg_dma_len(sg). Both of these are undefined for unmapped scatterlists.
> >
> >    Getting this wrong means breakage when CONFIG_NEED_SG_DMA_LENGTH is
> >    enabled.
>
> iovmm abuses the sglist API, there's no doubt about that. It will break when
> CONFIG_NEED_SG_DMA_LENGTH is enabled. iovmm should probably not use the sglist
> API, and it should probably not even exist in the first place. I know that TI
> is working on moving the OMAP-specific iommu/iovmm implementation to the
> generic IOMMU API, but that will take time. In the meantime, I'd like to fix
> iovmm to avoid the userspace-triggerable BUG_ON().
>
> > 2. What would be the effect of enabling SG list chaining on iovmm?
> >    The code uses the correct SG list walking helpers (for_each_sg) so
> >    it should be able to cope with chained SG lists.
>
> Yes it should. It might be slightly less efficient, but I don't think we will
> notice.
>
> > So, I think there's no problem here with chained SG lists, but there is
> > an issue with using sg_dma_len(). I'd suggest converting stuff to use
> > sg->length with sg_page(sg) rather than sg_dma_len(sg).
>
> With sg_page(sg)? I'm not sure I follow you there.

sg->length and sg_page(sg) are paired (and sg->length is paired with
other stuff). They describe the scatterlist _before_ DMA API mapping.
After DMA API mapping, the scatterlist describes a list of regions
defined by sg_dma_address(sg) and sg_dma_len(sg) - sg_dma_len(sg) is
_only_ paired with sg_dma_address(sg).

> > As for whether SG chaining is required or not, if you're running up against
> > the maximum SG table size, then you do have a requirement for SG chaining.
>
> The SG table size limit makes sure that the SG list fits in a page, so that it
> can be passed to the hardware. This isn't needed by iovmm, as it processes the
> sglist in software. iovmm could use SG chaining, but we would then need to
> enable it for the SoCs on which iovmm is used. I don't know if they properly
> support that.

Err, no. scatterlists are _never_ passed to hardware. They're a kernel
internal description of a list of regions in memory, which initially
start off as describing the kernel's view of those regions. After DMA
mapping, they describe it in terms of the device's view of those
regions.

At that point, scatterlists get converted to whatever form is required
by the hardware doing DMA, which most certainly won't be the layout which
struct scatterlist describes.

SG chaining has _nothing_ to do with hardware. It's all to do with software
and hitting the SG table limit.
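To make the "it's all software" point concrete, here is a rough userspace sketch of how a chained list can be walked. Everything in it is an assumption for illustration: `struct model_sg`, the `MODEL_SG_*` flag values and the helper names are invented here, and the only thing borrowed from the thread is the idea of a pointer-sized field whose two low bits act as flags. A walker that always goes through the "next" helper never needs to know where one allocation ends and the next begins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bits kept in the two low bits of page_link (model only). */
#define MODEL_SG_CHAIN ((uintptr_t)0x01) /* entry is a link to another chunk */
#define MODEL_SG_END   ((uintptr_t)0x02) /* entry is the last data entry */

/* Userspace model only; NOT the kernel's struct scatterlist. */
struct model_sg {
	uintptr_t page_link; /* pointer bits plus the two flag bits */
	unsigned int length; /* bytes described by this entry */
};

/* Turn the last slot of 'chunk' into a link pointing at 'next'. */
static void model_sg_chain(struct model_sg *chunk, size_t chunk_ents,
			   struct model_sg *next)
{
	assert(chunk_ents >= 1);
	chunk[chunk_ents - 1].page_link = (uintptr_t)next | MODEL_SG_CHAIN;
	chunk[chunk_ents - 1].length = 0;
}

/* Mark 'sg' as the final data entry of the whole list. */
static void model_sg_mark_end(struct model_sg *sg)
{
	sg->page_link |= MODEL_SG_END;
}

/* Step to the next data entry, following a chain link if one is found. */
static struct model_sg *model_sg_next(struct model_sg *sg)
{
	if (sg->page_link & MODEL_SG_END)
		return NULL;
	sg++;
	if (sg->page_link & MODEL_SG_CHAIN)
		sg = (struct model_sg *)(sg->page_link & ~(uintptr_t)0x03);
	return sg;
}

/* Sum the lengths of all data entries without caring about chunking. */
static size_t model_total_length(struct model_sg *first)
{
	size_t total = 0;

	for (struct model_sg *sg = first; sg != NULL; sg = model_sg_next(sg))
		total += sg->length;
	return total;
}
```

Code that indexes the list as a flat array (`sg[i]`) would walk straight into the link slot, which is presumably why users have to be audited before chaining is switched on for a SoC.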
On Fri, Jun 3, 2011 at 3:12 AM, Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
> Hi Russell,
>
> On Wednesday 01 June 2011 16:03:06 Russell King - ARM Linux wrote:
>> On Wed, Jun 01, 2011 at 03:50:50PM +0200, Laurent Pinchart wrote:
>> > In the specific iovmm case, the driver uses the sglist API to build a
>> > list of page-size sg entries, and then processes it in software. Is that
>> > considered an abuse of the sglist API, or valid usage?
>> >
>> > Anyway, sglist chaining is not needed by iovmm. As iovmm just walks the
>> > sglist manually, it's easier to allocate it in one go rather than using
>> > sglist chaining. This of course doesn't make your patch unneeded or
>> > wrong.
>>
>> Well, there are two issues here:
>> 1. Should iovmm use sg_phys(sg) with sg_dma_len(sg)?
>>    Probably not, because a scatterlist before DMA API mapping is defined
>>    by sg_page(sg), sg->offset, sg->length and has N entries. After DMA
>>    API mapping (n = dma_map_sg(dev, sg, N, dir)), it has n entries where
>>    n <= N, and the DMA address/lengths are sg_dma_address(sg) and
>>    sg_dma_len(sg). Both of these are undefined for unmapped scatterlists.
>>
>>    Getting this wrong means breakage when CONFIG_NEED_SG_DMA_LENGTH is
>>    enabled.
>
> iovmm abuses the sglist API, there's no doubt about that. It will break when
> CONFIG_NEED_SG_DMA_LENGTH is enabled. iovmm should probably not use the sglist
> API, and it should probably not even exist in the first place. I know that TI
> is working on moving the OMAP-specific iommu/iovmm implementation to the
> generic IOMMU API, but that will take time. In the meantime, I'd like to fix
> iovmm to avoid the userspace-triggerable BUG_ON().

This would also allow the tidspbridge driver to use iommu.
Hi Russell,

On Friday 03 June 2011 08:32:12 Russell King - ARM Linux wrote:
> On Fri, Jun 03, 2011 at 02:12:47AM +0200, Laurent Pinchart wrote:
> > On Wednesday 01 June 2011 16:03:06 Russell King - ARM Linux wrote:
> > > On Wed, Jun 01, 2011 at 03:50:50PM +0200, Laurent Pinchart wrote:
> > > > In the specific iovmm case, the driver uses the sglist API to build a
> > > > list of page-size sg entries, and then processes it in software. Is
> > > > that considered an abuse of the sglist API, or valid usage?
> > > >
> > > > Anyway, sglist chaining is not needed by iovmm. As iovmm just walks
> > > > the sglist manually, it's easier to allocate it in one go rather
> > > > than using sglist chaining. This of course doesn't make your patch
> > > > unneeded or wrong.
> > >
> > > Well, there are two issues here:
> > > 1. Should iovmm use sg_phys(sg) with sg_dma_len(sg)?
> > >
> > >    Probably not, because a scatterlist before DMA API mapping is
> > >    defined by sg_page(sg), sg->offset, sg->length and has N entries.
> > >    After DMA API mapping (n = dma_map_sg(dev, sg, N, dir)), it has n
> > >    entries where n <= N, and the DMA address/lengths are
> > >    sg_dma_address(sg) and sg_dma_len(sg). Both of these are undefined
> > >    for unmapped scatterlists.
> > >
> > >    Getting this wrong means breakage when CONFIG_NEED_SG_DMA_LENGTH is
> > >    enabled.
> >
> > iovmm abuses the sglist API, there's no doubt about that. It will break
> > when CONFIG_NEED_SG_DMA_LENGTH is enabled. iovmm should probably not use
> > the sglist API, and it should probably not even exist in the first place.
> > I know that TI is working on moving the OMAP-specific iommu/iovmm
> > implementation to the generic IOMMU API, but that will take time. In the
> > meantime, I'd like to fix iovmm to avoid the userspace-triggerable
> > BUG_ON().
> >
> > > 2. What would be the effect of enabling SG list chaining on iovmm?
> > >
> > >    The code uses the correct SG list walking helpers (for_each_sg) so
> > >    it should be able to cope with chained SG lists.
> >
> > Yes it should. It might be slightly less efficient, but I don't think we
> > will notice.
> >
> > > So, I think there's no problem here with chained SG lists, but there is
> > > an issue with using sg_dma_len(). I'd suggest converting stuff to use
> > > sg->length with sg_page(sg) rather than sg_dma_len(sg).
> >
> > With sg_page(sg)? I'm not sure I follow you there.
>
> sg->length and sg_page(sg) are paired (and sg->length is paired with
> other stuff). They describe the scatterlist _before_ DMA API mapping.
> After DMA API mapping, the scatterlist describes a list of regions
> defined by sg_dma_address(sg) and sg_dma_len(sg) - sg_dma_len(sg) is
> _only_ paired with sg_dma_address(sg).

OK. The driver is already using sg_phys(sg), which is a wrapper around
sg_page(sg). I still need to replace sg_dma_len(sg) with sg->length.

> > > As for whether SG chaining is required or not, if you're running up
> > > against the maximum SG table size, then you do have a requirement for
> > > SG chaining.
> >
> > The SG table size limit makes sure that the SG list fits in a page, so
> > that it can be passed to the hardware. This isn't needed by iovmm, as it
> > processes the sglist in software. iovmm could use SG chaining, but we
> > would then need to enable it for the SoCs on which iovmm is used. I
> > don't know if they properly support that.
>
> Err, no. scatterlists are _never_ passed to hardware. They're a kernel
> internal description of a list of regions in memory, which initially
> start off as describing the kernel's view of those regions. After DMA
> mapping, they describe it in terms of the device's view of those
> regions.
>
> At that point, scatterlists get converted to whatever form is required
> by the hardware doing DMA, which most certainly won't be the layout which
> struct scatterlist describes.
>
> SG chaining has _nothing_ to do with hardware. It's all to do with software
> and hitting the SG table limit.

What's the reason for limiting the SG table size to one page then?
On Mon, Jun 06, 2011 at 06:23:18PM +0200, Laurent Pinchart wrote:
> Hi Russell,
>
> On Friday 03 June 2011 08:32:12 Russell King - ARM Linux wrote:
> > SG chaining has _nothing_ to do with hardware. It's all to do with software
> > and hitting the SG table limit.
>
> What's the reason for limiting the SG table size to one page then?

As I say, it's got nothing to do with them ending up being passed to
hardware. Take a look at their definition:

struct scatterlist {
#ifdef CONFIG_DEBUG_SG
	unsigned long	sg_magic;
#endif
	unsigned long	page_link;
	unsigned int	offset;
	unsigned int	length;
	dma_addr_t	dma_address;
#ifdef CONFIG_NEED_SG_DMA_LENGTH
	unsigned int	dma_length;
#endif
};

That clearly isn't hardware specific - hardware won't cope with
CONFIG_DEBUG_SG being enabled or disabled, or whether the architecture
supports the dma_length field, or that this structure has developed from
being:

	void *addr;
	unsigned int length;
	unsigned long dma_address;

to the above over the evolution of the kernel. Or that we use the bottom
two bits of page_link as our own flag bits?

So no, this struct goes nowhere near hardware of any kind. It's merely
used as a container to pass a list of scatter-gather locations in memory
internally around within the kernel, especially to dma_map_sg()/
dma_unmap_sg().

If you look at IDE or ATA code, or even SCSI code, you'll find the same
pattern. They're passed a scatterlist. They map it for DMA using
dma_map_sg(). They then walk the scatterlist and extract the DMA
address and length using sg_dma_address() and sg_dma_len() and create
the _hardware_ table from that information - and the hardware table very
much depends on the hardware itself. Once DMA is complete, they unmap
the DMA region using dma_unmap_sg().

One very good reason that it's limited to one page is because allocations
larger than one page are prone to failure. Would you want your company
server failing to read/write data to its storage just because it couldn't
get a contiguous 8K page for a 5K long scatterlist? I think if Linux
did that, it wouldn't have a future in the enterprise marketplace.
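The "map, walk, build the hardware table" pattern described above can be sketched in plain C. This is a userspace illustration under loud assumptions: the descriptor layout (`struct hyp_hw_desc`, a 64 KiB per-descriptor limit, a "last" flag in bit 31) belongs to a purely hypothetical DMA engine, and `struct model_mapped_sg` merely stands in for the values a real driver would read with sg_dma_address() and sg_dma_len() after dma_map_sg().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the DMA view of one mapped scatterlist entry (model only). */
struct model_mapped_sg {
	uint32_t dma_address; /* what sg_dma_address(sg) would return */
	uint32_t dma_length;  /* what sg_dma_len(sg) would return */
};

/* Hypothetical hardware descriptor: format invented for this example. */
#define HYP_DESC_MAX_LEN 0x10000u    /* assumed per-descriptor byte limit */
#define HYP_DESC_LAST    0x80000000u /* assumed "end of table" flag */

struct hyp_hw_desc {
	uint32_t addr; /* bus address of the segment */
	uint32_t ctrl; /* byte count in the low bits, flags in the high bits */
};

/*
 * Convert n mapped entries into hardware descriptors, splitting any
 * segment longer than the hardware limit. Returns the number of
 * descriptors written, or 0 if the table is too small.
 */
static size_t hyp_build_table(const struct model_mapped_sg *sg, size_t n,
			      struct hyp_hw_desc *tbl, size_t tbl_ents)
{
	size_t used = 0;

	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++) {
		uint32_t addr = sg[i].dma_address;
		uint32_t left = sg[i].dma_length;

		while (left > 0) {
			uint32_t chunk = left > HYP_DESC_MAX_LEN ?
					 HYP_DESC_MAX_LEN : left;

			if (used == tbl_ents)
				return 0; /* out of descriptor slots */
			tbl[used].addr = addr;
			tbl[used].ctrl = chunk;
			used++;
			addr += chunk;
			left -= chunk;
		}
	}
	if (used > 0)
		tbl[used - 1].ctrl |= HYP_DESC_LAST;
	return used;
}
```

Note that the descriptor table, not the scatterlist, is what has to satisfy the device's layout and contiguity constraints. In this model a 96 KiB segment becomes two descriptors, so the hardware table does not even have the same number of entries as the list it was built from.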
Hi Russell,

On Monday 06 June 2011 18:44:00 Russell King - ARM Linux wrote:
> On Mon, Jun 06, 2011 at 06:23:18PM +0200, Laurent Pinchart wrote:
> > Hi Russell,
> >
> > On Friday 03 June 2011 08:32:12 Russell King - ARM Linux wrote:
> > > SG chaining has _nothing_ to do with hardware. It's all to do with
> > > software and hitting the SG table limit.
> >
> > What's the reason for limiting the SG table size to one page then?
>
> As I say, it's got nothing to do with them ending up being passed to
> hardware. Take a look at their definition:
>
> struct scatterlist {
> #ifdef CONFIG_DEBUG_SG
> 	unsigned long	sg_magic;
> #endif
> 	unsigned long	page_link;
> 	unsigned int	offset;
> 	unsigned int	length;
> 	dma_addr_t	dma_address;
> #ifdef CONFIG_NEED_SG_DMA_LENGTH
> 	unsigned int	dma_length;
> #endif
> };
>
> That clearly isn't hardware specific - hardware won't cope with
> CONFIG_DEBUG_SG being enabled or disabled, or whether the architecture
> supports the dma_length field, or that this structure has developed from
> being:
>
> 	void *addr;
> 	unsigned int length;
> 	unsigned long dma_address;
>
> to the above over the evolution of the kernel. Or that we use the bottom
> two bits of page_link as our own flag bits?
>
> So no, this struct goes nowhere near hardware of any kind. It's merely
> used as a container to pass a list of scatter-gather locations in memory
> internally around within the kernel, especially to dma_map_sg()/
> dma_unmap_sg().
>
> If you look at IDE or ATA code, or even SCSI code, you'll find the same
> pattern. They're passed a scatterlist. They map it for DMA using
> dma_map_sg(). They then walk the scatterlist and extract the DMA
> address and length using sg_dma_address() and sg_dma_len() and create
> the _hardware_ table from that information - and the hardware table very
> much depends on the hardware itself. Once DMA is complete, they unmap
> the DMA region using dma_unmap_sg().
>
> One very good reason that it's limited to one page is because allocations
> larger than one page are prone to failure. Would you want your company
> server failing to read/write data to its storage just because it couldn't
> get a contiguous 8K page for a 5K long scatterlist? I think if Linux
> did that, it wouldn't have a future in the enterprise marketplace.

Of course not, but if the scatterlist is only touched by kernel code, it
doesn't need to be contiguous in memory. It could be allocated with vmalloc().
On Mon, Jun 06, 2011 at 06:54:10PM +0200, Laurent Pinchart wrote:
> Of course not, but if the scatterlist is only touched by kernel code, it
> doesn't need to be contiguous in memory. It could be allocated with vmalloc().

Except vmalloc has a higher latency and a more restrictive API than
other allocators (think about the coherency issues with SMP systems
which may have to do TLB flushing on several cores, etc.)
Hi Russell,

On Monday 06 June 2011 20:00:52 Russell King - ARM Linux wrote:
> On Mon, Jun 06, 2011 at 06:54:10PM +0200, Laurent Pinchart wrote:
> > Of course not, but if the scatterlist is only touched by kernel code, it
> > doesn't need to be contiguous in memory. It could be allocated with
> > vmalloc().
>
> Except vmalloc has a higher latency and a more restrictive API than
> other allocators (think about the coherency issues with SMP systems
> which may have to do TLB flushing on several cores, etc.)

Right, thank you for the explanation. It is now clear to me why we want to use
the page allocator and handle scatter list chaining in the critical paths.

We could still use vmalloc() in the iovmm driver, but that's probably not
worth it if we can enable scatter list chaining for the system. With your
patch scatter list chaining can be enabled at the platform level for the ARM
architecture. What are the platform requirements to enable scatter list
chaining?
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9adc278..cc0dcbf 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -37,6 +37,9 @@ config ARM
 	  Europe.  There is an ARM Linux project with a web page at
 	  <http://www.arm.linux.org.uk/>.
 
+config ARM_HAS_SG_CHAIN
+	bool
+
 config HAVE_PWM
 	bool
 
diff --git a/arch/arm/include/asm/scatterlist.h b/arch/arm/include/asm/scatterlist.h
index 2f87870..cefdb8f 100644
--- a/arch/arm/include/asm/scatterlist.h
+++ b/arch/arm/include/asm/scatterlist.h
@@ -1,6 +1,10 @@
 #ifndef _ASMARM_SCATTERLIST_H
 #define _ASMARM_SCATTERLIST_H
 
+#ifdef CONFIG_ARM_HAS_SG_CHAIN
+#define ARCH_HAS_SG_CHAIN
+#endif
+
 #include <asm/memory.h>
 #include <asm/types.h>
 #include <asm-generic/scatterlist.h>
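For a rough feel of what the option changes for a large table, the arithmetic can be sketched as follows. This is a userspace model with assumed numbers, not the kernel's allocator: the helper names are invented, the page and entry sizes are parameters rather than real values, and the rule that every chunk except the last gives up one slot for the chain link is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* How many entries fit in one page-sized allocation (model only). */
static size_t model_entries_per_page(size_t page_size, size_t entry_size)
{
	assert(entry_size > 0);
	return page_size / entry_size;
}

/*
 * Number of single-page chunks needed for 'nents' data entries when
 * chaining is available. Every chunk except the last is assumed to
 * spend one slot on the link to the next chunk.
 */
static size_t model_chunks_needed(size_t nents, size_t max_per_chunk)
{
	size_t chunks = 0;
	size_t left = nents;

	assert(max_per_chunk >= 2);
	while (left > 0) {
		chunks++;
		if (left <= max_per_chunk)
			break; /* everything left fits in the last chunk */
		left -= max_per_chunk - 1; /* one slot goes to the link */
	}
	return chunks;
}
```

With an assumed 4096-byte page and 32-byte entries, 128 entries fit in one chunk. A request for 129 entries then needs either a single allocation larger than a page (the case the thread says is prone to failure) or two chained single-page chunks, which is what enabling the option is meant to allow.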