Message ID: 1406736193-26685-1-git-send-email-maxime.ripard@free-electrons.com (mailing list archive)
State: Superseded
On Wed, Jul 30, 2014 at 06:03:13PM +0200, Maxime Ripard wrote:
> The dmaengine is neither trivial nor properly documented at the moment, which
> means a lot of trial and error development, which is not that good for such a
> central piece of the system.
>
> Attempt at making such a documentation.

Did you miss Documentation/dmaengine.txt? Lots of this is already covered
there. But yes, I would be really glad to know what isn't, so that we can fix
that.

> + * device_slave_caps
> +   - Isn't that redundant with the cap_mask already?
> +   - Only a few drivers seem to implement it

For audio to know what your channel can do rather than hardcoding it.

> +
> + * dma cookies?

A cookie is the representation of a DMA transaction, a monotonically
incrementing number.
Hi Vinod,

On Wed, Jul 30, 2014 at 09:36:07PM +0530, Vinod Koul wrote:
> On Wed, Jul 30, 2014 at 06:03:13PM +0200, Maxime Ripard wrote:
> > The dmaengine is neither trivial nor properly documented at the moment, which
> > means a lot of trial and error development, which is not that good for such a
> > central piece of the system.
> >
> > Attempt at making such a documentation.
>
> Did you miss Documentation/dmaengine.txt? Lots of this is already covered
> there. But yes, I would be really glad to know what isn't, so that we can fix
> that.

I didn't miss it. But I feel it describes the slave API quite nicely, while
not helping at all whenever you're writing a DMAengine driver.

The first lines of the existing document make it quite clear too.

There's still a bit of duplication, but I don't feel it's such a big deal.

What I'd like to do with the documentation I just sent is basically
have a clear idea, whenever you step into dmaengine, of what you can and
cannot do, and have a reference document explaining what's expected by
the framework, and hopefully have unified drivers that follow this
pattern.

Because, for the moment, we're pretty much left in the dark with
different drivers doing the same thing in completely different ways,
with basically no way to tell whether it's the framework that
requires such behaviour, or the author was just feeling creative.

There are numerous examples of this at the moment:
 - The GFP flags, with different drivers using either GFP_ATOMIC,
   GFP_NOWAIT or GFP_KERNEL in the same functions
 - Having to set device_slave_caps or not?
 - Some drivers use dma_run_dependencies, some others don't
 - That might just be my experience, but judging from previous
   commits, DMA_PRIVATE is completely obscure, and we just set it
   because it was making it work, without knowing what it was
   supposed to do.
 - etc.

And basically, we have no way to tell at the moment which one is
right and which one needs fixing.

The corollary being that it cripples the whole community's ability to
maintain the framework and make it evolve.

> > + * device_slave_caps
> > +   - Isn't that redundant with the cap_mask already?
> > +   - Only a few drivers seem to implement it
> For audio to know what your channel can do rather than hardcoding it.

Ah, yes, I see it now. It's not related to the caps mask at all.

Just out of curiosity, wouldn't it be better to move this to the
framework, and have this information provided through the struct
dma_device? Or would it have some non-trivial side effects?

> > + * dma cookies?
> A cookie is the representation of a DMA transaction, a monotonically
> incrementing number.

Ok, and it identifies a unique dma_async_tx_descriptor, right?

Thanks,
Maxime
On Thu, Jul 31, 2014 at 09:44:40AM +0200, Maxime Ripard wrote:
> Hi Vinod,
>
> On Wed, Jul 30, 2014 at 09:36:07PM +0530, Vinod Koul wrote:
> > On Wed, Jul 30, 2014 at 06:03:13PM +0200, Maxime Ripard wrote:
> > > The dmaengine is neither trivial nor properly documented at the moment, which
> > > means a lot of trial and error development, which is not that good for such a
> > > central piece of the system.
> > >
> > > Attempt at making such a documentation.
> >
> > Did you miss Documentation/dmaengine.txt? Lots of this is already covered
> > there. But yes, I would be really glad to know what isn't, so that we can fix
> > that.
>
> I didn't miss it. But I feel it describes the slave API quite nicely, while
> not helping at all whenever you're writing a DMAengine driver.
>
> The first lines of the existing document make it quite clear too.
>
> There's still a bit of duplication, but I don't feel it's such a big
> deal.

And that made me think that you might have missed it.

I am okay with the idea of having this document, but it needs to co-exist
with the existing one. No point in duplicating, as it can create ambiguity
in the future.

> What I'd like to do with the documentation I just sent is basically
> have a clear idea, whenever you step into dmaengine, of what you can and
> cannot do, and have a reference document explaining what's expected by
> the framework, and hopefully have unified drivers that follow this
> pattern.

Sure, can you please modify this to avoid duplication? I would be happy to
apply that :)

> Because, for the moment, we're pretty much left in the dark with
> different drivers doing the same thing in completely different ways,
> with basically no way to tell whether it's the framework that
> requires such behaviour, or the author was just feeling creative.
>
> There are numerous examples of this at the moment:
>  - The GFP flags, with different drivers using either GFP_ATOMIC,
>    GFP_NOWAIT or GFP_KERNEL in the same functions
>  - Having to set device_slave_caps or not?
>  - Some drivers use dma_run_dependencies, some others don't
>  - That might just be my experience, but judging from previous
>    commits, DMA_PRIVATE is completely obscure, and we just set it
>    because it was making it work, without knowing what it was
>    supposed to do.
>  - etc.

Thanks for highlighting these; we should definitely add them to the
Documentation.

> And basically, we have no way to tell at the moment which one is
> right and which one needs fixing.
>
> The corollary being that it cripples the whole community's ability to
> maintain the framework and make it evolve.
>
> > > + * device_slave_caps
> > > +   - Isn't that redundant with the cap_mask already?
> > > +   - Only a few drivers seem to implement it
> > For audio to know what your channel can do rather than hardcoding it.
>
> Ah, yes, I see it now. It's not related to the caps mask at all.
>
> Just out of curiosity, wouldn't it be better to move this to the
> framework, and have this information provided through the struct
> dma_device? Or would it have some non-trivial side-effects?

Well, the problem is the ability to have this queried uniformly from all
drivers across subsystems. If we can do this, that would be nice.

> > > + * dma cookies?
> > A cookie is the representation of a DMA transaction, a monotonically
> > incrementing number.
>
> Ok, and it identifies a unique dma_async_tx_descriptor, right?

Yup, and it basically represents the transactions you have submitted.
That's why the cookie is allocated at tx_submit.

Thanks
On 07/31/2014 09:44 AM, Maxime Ripard wrote:
[...]
> - Having to set device_slave_caps or not?
Yes. This should in my opinion be mandatory for new drivers. One of the
issues with the DMAengine API is that it is really hard to write generic
drivers that do not already know about the capabilities of the DMA
controller. E.g. if you have a peripheral that is used on SoC A it assumes
that the DMA controller it is connected to has the capabilities of the DMA
controller in SoC A. If the same peripheral is now used on SoC B with a DMA
controller with different capabilities this often ends up with ugly ifdefery
in the peripheral driver. The peripheral driver shouldn't have to know which
specific DMA controller it is connected to (It's a bit as if a GPIO consumer
needed to know which GPIO controller it is connected to). We got away with the
current approach since there is not that much diversity in the mixing of
peripherals and DMA controllers (each vendor pretty much has their own DMA
controller which it uses for their own peripherals). But with more recent
code consolidation we are on a path to generic DMA support within subsystem
frameworks (there is generic DMA support for audio, there is generic DMA
support for SPI and I also have a (currently) out of tree patch for generic
DMA support for IIO). Also these generic drivers need to be able to discover
the capabilities of the DMA controller to be able to make the right decisions.
- Lars
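Concretely, a generic consumer can query this through dma_get_slave_caps().
A minimal sketch of the consumer side follows; chan is assumed to be an
already-requested channel and cfg a struct dma_slave_config being filled in,
and the exact struct dma_slave_caps field names have varied between kernel
versions:

    struct dma_slave_caps caps;
    int ret;

    ret = dma_get_slave_caps(chan, &caps);
    if (ret < 0)
            return ret;     /* the driver doesn't implement device_slave_caps */

    /* Only pick a 4-byte transfer width if the controller supports it */
    if (caps.src_addr_widths & BIT(DMA_SLAVE_BUSWIDTH_4_BYTES))
            cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;

    /* Only rely on pause if the channel actually advertises it */
    if (caps.cmd_pause)
            dmaengine_pause(chan);

This is what lets a subsystem-level DMA helper make decisions without
hardcoding knowledge of any particular controller.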
On Thu, Jul 31, 2014 at 09:44:40AM +0200, Maxime Ripard wrote:
> I didn't miss it. But I feel it describes the slave API quite nicely, while
> not helping at all whenever you're writing a DMAengine driver.

There's actually two documents:

Documentation/crypto/async-tx-api.txt
Documentation/dmaengine.txt

The first is the original DMA engine API for use with offloading things
like memcpy, as well as the crypto API to hardware crypto DMA engines.

The second is documentation for the DMA slave API which modifies some
of the requirements of the first (like, the ability to call the slave
DMA prepare functions from within the callback, which are not permitted
in the async tx API.)

Both documents are relevant for writing a DMA engine driver.

> Because, for the moment, we're pretty much left in the dark with
> different drivers doing the same thing in completely different ways,
> with basically no way to tell whether it's the framework that
> requires such behaviour, or the author was just feeling creative.

DMA engine has lacked a lot of infrastructure for common patterns in
drivers; some of that is solved by my efforts with the virt_dma.c
support, and also various cleanups to existing drivers, but it's not
easy to fix this problem after drivers have been merged.

> There are numerous examples of this at the moment:
>  - The GFP flags, with different drivers using either GFP_ATOMIC,
>    GFP_NOWAIT or GFP_KERNEL in the same functions

The slave prepare APIs should be using GFP_ATOMIC to ensure safety.

>  - Having to set device_slave_caps or not?

device_slave_caps is a relatively new invention, so old drivers don't
implement it. Newer drivers should, and there probably should be some
motivation for older drivers to add it.

>  - Some drivers use dma_run_dependencies, some others don't

Dependent operations are part of the async-tx API, and aren't really
applicable to the slave DMA API. A driver implementing just the slave
DMA API need not concern itself with dependent operations, but a
driver implementing the async-tx APIs should.

>  - That might just be my experience, but judging from previous
>    commits, DMA_PRIVATE is completely obscure, and we just set it
>    because it was making it work, without knowing what it was
>    supposed to do.

DMA_PRIVATE means that the channel is unavailable for async-tx operations
- in other words, it's slave-only. Setting it before registration results
in the private-use count being incremented, disabling the DMA_PRIVATE
manipulation in the channel request/release APIs (requested channels are
unavailable for async-tx operations, which is why that flag is also set
there.)

To put it another way, if a DMA engine driver only provides slave DMA
support, it must set DMA_PRIVATE to mark the channel unavailable for
async-tx operations. If a DMA engine driver's channels can also be used
for async-tx operations, then it should leave the flag clear.

> And basically, we have no way to tell at the moment which one is
> right and which one needs fixing.
>
> The corollary being that it cripples the whole community's ability to
> maintain the framework and make it evolve.

All respect to Vinod (who looks after the slave side of the DMA engine
implementations) and Dan, what it needs is someone who knows the *whole*
DMA engine picture to be in control of the subsystem, and who knows these
details - not just one bit of it. Someone who has the time to put into
reviewing changes to it, and probably more importantly, has the time to
clean up the existing code. Finding someone who fits that is a struggle.

Dan knows the whole picture (or used to when he was working on it at one
of his former employers) but I don't think Dan had the available time to
address various issues with it.

> > A cookie is the representation of a DMA transaction, a monotonically
> > incrementing number.
>
> Ok, and it identifies a unique dma_async_tx_descriptor, right?

You really should not be concerned with DMA cookies anymore since one
of my cleanups - I added a number of helper functions which hide the
manipulation of DMA cookies, and take away from driver writers the
semantics of how cookies themselves are operated upon. virt-dma goes
a little further and provides descriptor lists and methods to look up
a descriptor given a cookie.

The only thing that a driver writer should concern themselves with is
making the appropriate cookie management calls at the relevant point in
the code - in other words, allocating a cookie in the submit callback,
completing a cookie when a descriptor is complete, etc.

Drivers should have no reason whatsoever to be touching the cookie
directly anymore.
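Concretely, with the helpers from drivers/dma/dmaengine.h, the pattern
looks roughly like this (a sketch; the foo_* names, fields and locking
are placeholders, not any particular driver):

    static dma_cookie_t foo_tx_submit(struct dma_async_tx_descriptor *tx)
    {
            struct foo_chan *c = to_foo_chan(tx->chan);
            dma_cookie_t cookie;
            unsigned long flags;

            spin_lock_irqsave(&c->lock, flags);
            cookie = dma_cookie_assign(tx);         /* allocate the cookie */
            list_add_tail(&to_foo_desc(tx)->node, &c->pending);
            spin_unlock_irqrestore(&c->lock, flags);

            return cookie;
    }

    /* ... and in the completion path (e.g. the interrupt handler): */
            dma_cookie_complete(&d->tx);    /* updates chan->completed_cookie */

The driver never reads or writes chan->cookie or chan->completed_cookie
itself; the helpers keep the semantics in one place.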
Hi Lars-Peter,

On Thu, Jul 31, 2014 at 02:44:56PM +0200, Lars-Peter Clausen wrote:
> On 07/31/2014 09:44 AM, Maxime Ripard wrote:
> [...]
> > - Having to set device_slave_caps or not?
>
> Yes. This should in my opinion be mandatory for new drivers.

Ok.

> One of the issues with the DMAengine API is that it is really hard
> to write generic drivers that do not already know about the
> capabilities of the DMA controller. E.g. if you have a peripheral
> that is used on SoC A it assumes that the DMA controller it is
> connected to has the capabilities of the DMA controller in SoC A. If
> the same peripheral is now used on SoC B with a DMA controller with
> different capabilities this often ends up with ugly ifdefery in the
> peripheral driver. The peripheral driver shouldn't have to know
> which specific DMA controller it is connected to (It's a bit as if a
> GPIO consumer needed to know which GPIO controller it is connected
> to). We got away with the current approach since there is not that
> much diversity in the mixing of peripherals and DMA controllers
> (each vendor pretty much has their own DMA controller which it uses
> for their own peripherals). But with more recent code consolidation
> we are on a path to generic DMA support within subsystem frameworks
> (there is generic DMA support for audio, there is generic DMA
> support for SPI and I also have a (currently) out of tree patch for
> generic DMA support for IIO). Also, these generic drivers need to be
> able to discover the capabilities of the DMA controller to be able
> to make the right decisions.

Yeah, I've seen the generic infrastructure in both ASoC and SPI, and
it's great that it's coming to IIO as well. I wasn't aware that it was
relying on device_slave_caps though, and had been misled by the caps
name into thinking that it was related to the cap_mask, which it
obviously is not.

From what you're saying, and judging from the drivers that already
implement it, can't it be moved directly to the framework itself? The
information put there could either be used elsewhere (like
framework-level filtering of invalid directions/bus widths) or be
derived directly from which callbacks are set (in the pause/terminate
case).

I guess that would make generic layers much easier to write, since
you'd always have this information.

Maxime
On Thu, Jul 31, 2014 at 05:26:28PM +0530, Vinod Koul wrote:
> On Thu, Jul 31, 2014 at 09:44:40AM +0200, Maxime Ripard wrote:
> > Hi Vinod,
> >
> > On Wed, Jul 30, 2014 at 09:36:07PM +0530, Vinod Koul wrote:
> > > On Wed, Jul 30, 2014 at 06:03:13PM +0200, Maxime Ripard wrote:
> > > > The dmaengine is neither trivial nor properly documented at the moment, which
> > > > means a lot of trial and error development, which is not that good for such a
> > > > central piece of the system.
> > > >
> > > > Attempt at making such a documentation.
> > >
> > > Did you miss Documentation/dmaengine.txt? Lots of this is already covered
> > > there. But yes, I would be really glad to know what isn't, so that we can fix
> > > that.
> >
> > I didn't miss it. But I feel it describes the slave API quite nicely, while
> > not helping at all whenever you're writing a DMAengine driver.
> >
> > The first lines of the existing document make it quite clear too.
> >
> > There's still a bit of duplication, but I don't feel it's such a big
> > deal.
>
> And that made me think that you might have missed it.
>
> I am okay with the idea of having this document, but it needs to co-exist
> with the existing one. No point in duplicating, as it can create ambiguity
> in the future.

The only duplication I'm seeing is about the device_prep* operations,
which get described in dmaengine.txt too.

There's also a minor one about struct dma_slave_config, but both are
rather generic and point to dmaengine.h, so I guess we won't be at risk
of any real ambiguity.

Do you see anything else?

> > What I'd like to do with the documentation I just sent is basically
> > have a clear idea, whenever you step into dmaengine, of what you can and
> > cannot do, and have a reference document explaining what's expected by
> > the framework, and hopefully have unified drivers that follow this
> > pattern.
>
> Sure, can you please modify this to avoid duplication? I would be happy to
> apply that :)

See above :)

Also, feel free to add anything that you feel like you keep saying
during review. If mistakes keep coming, it's probably worth documenting
what you expect.

> > Because, for the moment, we're pretty much left in the dark with
> > different drivers doing the same thing in completely different ways,
> > with basically no way to tell whether it's the framework that
> > requires such behaviour, or the author was just feeling creative.
> >
> > There are numerous examples of this at the moment:
> >  - The GFP flags, with different drivers using either GFP_ATOMIC,
> >    GFP_NOWAIT or GFP_KERNEL in the same functions
> >  - Having to set device_slave_caps or not?
> >  - Some drivers use dma_run_dependencies, some others don't
> >  - That might just be my experience, but judging from previous
> >    commits, DMA_PRIVATE is completely obscure, and we just set it
> >    because it was making it work, without knowing what it was
> >    supposed to do.
> >  - etc.
>
> Thanks for highlighting these; we should definitely add them to the
> Documentation.

It's quite clear in the case of the GFP flags now, and Lars-Peter and
you cleared up device_slave_caps, but I could still use some help with
DMA_PRIVATE.

> > And basically, we have no way to tell at the moment which one is
> > right and which one needs fixing.
> >
> > The corollary being that it cripples the whole community's ability to
> > maintain the framework and make it evolve.
> >
> > > > + * device_slave_caps
> > > > +   - Isn't that redundant with the cap_mask already?
> > > > +   - Only a few drivers seem to implement it
> > > For audio to know what your channel can do rather than hardcoding it.
> >
> > Ah, yes, I see it now. It's not related to the caps mask at all.
> >
> > Just out of curiosity, wouldn't it be better to move this to the
> > framework, and have this information provided through the struct
> > dma_device? Or would it have some non-trivial side-effects?
>
> Well, the problem is the ability to have this queried uniformly from all
> drivers across subsystems. If we can do this, that would be nice.

I can start on some preliminary work to do just that, and see if it
works for you then.

> > > > + * dma cookies?
> > > A cookie is the representation of a DMA transaction, a monotonically
> > > incrementing number.
> >
> > Ok, and it identifies a unique dma_async_tx_descriptor, right?
>
> Yup, and it basically represents the transactions you have submitted.
> That's why the cookie is allocated at tx_submit.

Ok, thanks.

Maxime
On Thu, Jul 31, 2014 at 02:22:23PM +0100, Russell King - ARM Linux wrote:
> On Thu, Jul 31, 2014 at 09:44:40AM +0200, Maxime Ripard wrote:
> > I didn't miss it. But I feel it describes the slave API quite nicely, while
> > not helping at all whenever you're writing a DMAengine driver.
>
> There's actually two documents:
>
> Documentation/crypto/async-tx-api.txt
> Documentation/dmaengine.txt
>
> The first is the original DMA engine API for use with offloading things
> like memcpy, as well as the crypto API to hardware crypto DMA engines.
>
> The second is documentation for the DMA slave API which modifies some
> of the requirements of the first (like, the ability to call the slave
> DMA prepare functions from within the callback, which are not permitted
> in the async tx API.)
>
> Both documents are relevant for writing a DMA engine driver.

They're both relevant, but clearly not enough.

> > Because, for the moment, we're pretty much left in the dark with
> > different drivers doing the same thing in completely different ways,
> > with basically no way to tell whether it's the framework that
> > requires such behaviour, or the author was just feeling creative.
>
> DMA engine has lacked a lot of infrastructure for common patterns in
> drivers; some of that is solved by my efforts with the virt_dma.c
> support, and also various cleanups to existing drivers, but it's not
> easy to fix this problem after drivers have been merged.
>
> > There are numerous examples of this at the moment:
> >  - The GFP flags, with different drivers using either GFP_ATOMIC,
> >    GFP_NOWAIT or GFP_KERNEL in the same functions
>
> The slave prepare APIs should be using GFP_ATOMIC to ensure safety.

You prove my point then. Vinod asks for GFP_NOWAIT in his reviews. Even
though it doesn't change anything relative to the atomicity of the
operations, the policy is still not the same.

> >  - Having to set device_slave_caps or not?
>
> device_slave_caps is a relatively new invention, so old drivers don't
> implement it. Newer drivers should, and there probably should be some
> motivation for older drivers to add it.

Yep, just discovered that.

> >  - Some drivers use dma_run_dependencies, some others don't
>
> Dependent operations are part of the async-tx API, and aren't really
> applicable to the slave DMA API. A driver implementing just the slave
> DMA API need not concern itself with dependent operations, but a
> driver implementing the async-tx APIs should.

Ok.

> >  - That might just be my experience, but judging from previous
> >    commits, DMA_PRIVATE is completely obscure, and we just set it
> >    because it was making it work, without knowing what it was
> >    supposed to do.
>
> DMA_PRIVATE means that the channel is unavailable for async-tx operations
> - in other words, it's slave-only. Setting it before registration results
> in the private-use count being incremented, disabling the DMA_PRIVATE
> manipulation in the channel request/release APIs (requested channels are
> unavailable for async-tx operations, which is why that flag is also set
> there.)
>
> To put it another way, if a DMA engine driver only provides slave DMA
> support, it must set DMA_PRIVATE to mark the channel unavailable for
> async-tx operations. If a DMA engine driver's channels can also be used
> for async-tx operations, then it should leave the flag clear.

What about channels that can both be used for slave operations and
async-tx (like memcpy)?

> > And basically, we have no way to tell at the moment which one is
> > right and which one needs fixing.
> >
> > The corollary being that it cripples the whole community's ability to
> > maintain the framework and make it evolve.
>
> All respect to Vinod (who looks after the slave side of the DMA engine
> implementations) and Dan, what it needs is someone who knows the *whole*
> DMA engine picture to be in control of the subsystem, and who knows these
> details - not just one bit of it. Someone who has the time to put into
> reviewing changes to it, and probably more importantly, has the time to
> clean up the existing code. Finding someone who fits that is a struggle.
>
> Dan knows the whole picture (or used to when he was working on it at one
> of his former employers) but I don't think Dan had the available time to
> address various issues with it.

Hence why we should document as much as possible, to spread the
knowledge and avoid it being lost when someone disappears or isn't
available anymore.

(And hopefully reduce the number of "errors" that might be made by
submitters, hence reducing the number of review iterations and
lessening the maintainers' load.)

> > > A cookie is the representation of a DMA transaction, a monotonically
> > > incrementing number.
> >
> > Ok, and it identifies a unique dma_async_tx_descriptor, right?
>
> You really should not be concerned with DMA cookies anymore since one
> of my cleanups - I added a number of helper functions which hide the
> manipulation of DMA cookies, and take away from driver writers the
> semantics of how cookies themselves are operated upon. virt-dma goes
> a little further and provides descriptor lists and methods to look up
> a descriptor given a cookie.
>
> The only thing that a driver writer should concern themselves with is
> making the appropriate cookie management calls at the relevant point in
> the code - in other words, allocating a cookie in the submit callback,
> completing a cookie when a descriptor is complete, etc.
>
> Drivers should have no reason whatsoever to be touching the cookie
> directly anymore.

Ok, thanks a lot for this!

Maxime
On 07/31/2014 06:13 PM, Maxime Ripard wrote:
[...]
> From what you're saying, and judging from the drivers that already
> implement it, can't it be moved directly to the framework itself?

What exactly do you mean by moving it directly to the framework? The
slave_caps API is part of the DMAengine framework.

- Lars
On Thu, Jul 31, 2014 at 06:54:11PM +0200, Lars-Peter Clausen wrote:
> On 07/31/2014 06:13 PM, Maxime Ripard wrote:
> [...]
> > From what you're saying, and judging from the drivers that already
> > implement it, can't it be moved directly to the framework itself?
>
> What exactly do you mean by moving it directly to the framework? The
> slave_caps API is part of the DMAengine framework.

Not its implementation, which is defined by each and every driver,
while the behaviour of device_slave_caps is rather generic.
On 07/31/2014 07:37 PM, Maxime Ripard wrote:
> On Thu, Jul 31, 2014 at 06:54:11PM +0200, Lars-Peter Clausen wrote:
>> On 07/31/2014 06:13 PM, Maxime Ripard wrote:
>> [...]
>>> From what you're saying, and judging from the drivers that already
>>> implement it, can't it be moved directly to the framework itself?
>>
>> What exactly do you mean by moving it directly to the framework? The
>> slave_caps API is part of the DMAengine framework.
>
> Not its implementation, which is defined by each and every driver,
> while the behaviour of device_slave_caps is rather generic.

Do you mean something like adding a dma_slave_caps struct field to the
DMA channel that gets initialized when the channel is created, and then
removing the callback? That makes some sense.

I think the main reason why we use a callback right now is that in
earlier revisions of the API it was possible to pass a slave_config,
and the result would be the capabilities of the DMA channel for this
particular config. But this was dropped before the final version was
merged upstream.

- Lars
On Fri, Aug 01, 2014 at 10:00:10AM +0200, Lars-Peter Clausen wrote:
> On 07/31/2014 07:37 PM, Maxime Ripard wrote:
> >On Thu, Jul 31, 2014 at 06:54:11PM +0200, Lars-Peter Clausen wrote:
> >>On 07/31/2014 06:13 PM, Maxime Ripard wrote:
> >>[...]
> >>> From what you're saying, and judging from the drivers that already
> >>>implement it, can't it be moved directly to the framework itself?
> >>
> >>What exactly do you mean by moving it directly to the framework? The
> >>slave_caps API is part of the DMAengine framework.
> >
> >Not its implementation, which is defined by each and every driver,
> >while the behaviour of device_slave_caps is rather generic.
>
> Do you mean something like adding a dma_slave_caps struct field to
> the DMA channel that gets initialized when the channel is created
> and then remove the callback? That makes some sense.

I was rather thinking of something like:
 - Splitting device_control into independent functions
 - Then, knowing if you support pause/resume/terminate is trivial:
   either you implement the callback, or you don't
 - Putting the supported widths and directions into fields of struct
   dma_device, which can eventually be used by the framework to
   filter out invalid configurations before calling the relevant
   callbacks
 - That would then be trivial to get from the framework, without
   calling any callback

> I think the main reason why we use a callback right now is that in
> earlier revisions of the API it was possible to pass a slave_config
> and the result would be the capabilities of the DMA channel for this
> particular config. But this was dropped before the final version was
> merged upstream.

Ah, that would explain it, yes.

Maxime
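To make the proposal above concrete, the split interface could look
something like this. This is purely illustrative: none of these
callbacks or fields existed in struct dma_device at the time, and the
names are made up for the sketch:

    struct dma_device {
            /* ... existing fields ... */

            /* capabilities, queryable without calling into the driver */
            u32 src_addr_widths;    /* bitmask of BIT(dma_slave_buswidth) */
            u32 dst_addr_widths;
            u32 directions;         /* bitmask of dma_transfer_direction */

            /* device_control split into independent callbacks */
            int (*device_config)(struct dma_chan *chan,
                                 struct dma_slave_config *config);
            int (*device_pause)(struct dma_chan *chan);
            int (*device_resume)(struct dma_chan *chan);
            int (*device_terminate_all)(struct dma_chan *chan);
    };

The framework could then derive, say, a cmd_pause capability simply from
device_pause being non-NULL, with no per-driver slave_caps callback
needed at all.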
On Thu, Jul 31, 2014 at 06:41:52PM +0200, Maxime Ripard wrote:
> You prove my point then. Vinod asks for GFP_NOWAIT in his
> reviews. Even though it doesn't change anything relative to the
> atomicity of the operations, the policy is still not the same.

The difference between GFP_NOWAIT and GFP_ATOMIC is whether the
emergency pool can be used. If you want DMA prepare functions to fail
when there is memory pressure, and you can guarantee that all drivers
using the DMA engine API will properly fall back to some other method,
then I suppose using GFP_NOWAIT is reasonable. GFP_ATOMIC just provides
additional protection.

> > >  - That might just be my experience, but judging from previous
> > >    commits, DMA_PRIVATE is completely obscure, and we just set it
> > >    because it was making it work, without knowing what it was
> > >    supposed to do.
> >
> > DMA_PRIVATE means that the channel is unavailable for async-tx operations
> > - in other words, it's slave-only. Setting it before registration results
> > in the private-use count being incremented, disabling the DMA_PRIVATE
> > manipulation in the channel request/release APIs (requested channels are
> > unavailable for async-tx operations, which is why that flag is also set
> > there.)
> >
> > To put it another way, if a DMA engine driver only provides slave DMA
> > support, it must set DMA_PRIVATE to mark the channel unavailable for
> > async-tx operations. If a DMA engine driver's channels can also be used
> > for async-tx operations, then it should leave the flag clear.
>
> What about channels that can both be used for slave operations and
> async-tx (like memcpy)?

Then you don't set DMA_PRIVATE when the DMA engine driver registers with
the core. That then allows the DMA engine core to manage the flag,
setting it when the channel is allocated for slave usage.

One important point here is that async_tx does not have the idea of
channel allocation - a channel which is registered into the DMA engine
core without DMA_PRIVATE set is a candidate for use by the async_tx API.

> Hence why we should document as much as possible, to spread the
> knowledge and avoid it being lost when someone disappears or isn't
> available anymore.
>
> (And hopefully reduce the number of "errors" that might be made by
> submitters, hence reducing the number of review iterations and
> lessening the maintainers' load.)

The bigger issue in the longer run is being able to maintain and improve
the DMA engine implementations and API. One of the blockers to that is
properly understanding all the async_tx API stuff, something which even
I have been unable to do despite knowing a fair amount about DMA engine
itself.

Where I fall down is the exact meaning of the ACK stuff, how the chaining
is supposed to work, and how the dependency between two DMA channels is
handled.

This is why I was unable to resolve DMA engine's multiple mapping of the
same DMA buffers, which has caused problems for ARM.

I've discussed this point in the past with Dan, making the point that
each DMA engine driver should not be that concerned about things like
descriptor list management, cookies, dependent transfers, but we seem
to be ultimately doomed to re-implementing much of that infrastructure
in every single DMA engine driver that comes along.

Each DMA engine driver which gets accepted into mainline makes this
problem worse - it's another driver which does something slightly
differently and which, if we were ever to clean this stuff up, would
need to be fixed. This remains my biggest concern with the DMA engine
code today, and I'm one of the few people who have put effort into
trying to sort that out.

> > Drivers should have no reason whatsoever to be touching the cookie
> > directly anymore.
>
> Ok, thanks a lot for this!

I notice that drivers/dma/sirf-dma.c directly touches it and the
completed cookie as well:

drivers/dma/sirf-dma.c:        dma_cookie_t last_cookie = 0;
drivers/dma/sirf-dma.c:        last_cookie = desc->cookie;
drivers/dma/sirf-dma.c:        schan->chan.completed_cookie = last_cookie;

That's something which should be fixed. I haven't looked too hard
because it's not something I want to get involved with again right now.
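In driver terms, the two DMA_PRIVATE cases look roughly like this
(a sketch; dd stands for the driver's struct dma_device):

    /* Slave-only controller: opt out of async_tx before registering */
    dma_cap_zero(dd->cap_mask);
    dma_cap_set(DMA_SLAVE, dd->cap_mask);
    dma_cap_set(DMA_PRIVATE, dd->cap_mask);
    ret = dma_async_device_register(dd);

    /* Channels also usable for async_tx (e.g. memcpy): leave
     * DMA_PRIVATE clear and let the core set it whenever a channel
     * is privately requested for slave usage */
    dma_cap_zero(dd->cap_mask);
    dma_cap_set(DMA_SLAVE, dd->cap_mask);
    dma_cap_set(DMA_MEMCPY, dd->cap_mask);
    ret = dma_async_device_register(dd);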
On Thu, Jul 31, 2014 at 06:23:30PM +0200, Maxime Ripard wrote:
> On Thu, Jul 31, 2014 at 05:26:28PM +0530, Vinod Koul wrote:
>
> Also, feel free to add anything that you feel like you keep saying
> during review. If mistakes keep coming, it's probably worth
> documenting what you expect.

I think the common issues seen would be:
- prepare calls in atomic context and the usage of GFP_NOWAIT for memory
  allocations
- residue calculation; the situation is much better now, but lots of
  drivers still do it wrong

> > > Because, for the moment, we're pretty much left in the dark with
> > > different drivers doing the same thing in completely different ways,
> > > with basically no way to tell whether it's the framework that
> > > requires such behaviour, or the author was just feeling creative.
> > >
> > > There are numerous examples of this at the moment:
> > >  - The GFP flags, with different drivers using either GFP_ATOMIC,
> > >    GFP_NOWAIT or GFP_KERNEL in the same functions
> > >  - Having to set device_slave_caps or not?
> > >  - Some drivers use dma_run_dependencies, some others don't
> > >  - That might just be my experience, but judging from previous
> > >    commits, DMA_PRIVATE is completely obscure, and we just set it
> > >    because it was making it work, without knowing what it was
> > >    supposed to do.
> > >  - etc.
> >
> > Thanks for highlighting these; we should definitely add them to the
> > Documentation.
>
> It's quite clear in the case of the GFP flags now, and Lars-Peter and you
> cleared up device_slave_caps, but I could still use some help with
> DMA_PRIVATE.
>
> > > And basically, we have no way to tell at the moment which one is
> > > right and which one needs fixing.
> > >
> > > The corollary being that it cripples the whole community's ability to
> > > maintain the framework and make it evolve.
> > >
> > > > > + * device_slave_caps
> > > > > +   - Isn't that redundant with the cap_mask already?
> > > > > +   - Only a few drivers seem to implement it
> > > > For audio to know what your channel can do rather than hardcoding it.
> > >
> > > Ah, yes, I see it now. It's not related to the caps mask at all.
> > >
> > > Just out of curiosity, wouldn't it be better to move this to the
> > > framework, and have this information provided through the struct
> > > dma_device? Or would it have some non-trivial side-effects?
> > Well, the problem is the ability to have this queried uniformly from all
> > drivers across subsystems. If we can do this, that would be nice.
>
> I can start on some preliminary work to do just that, and see if it
> works for you then.

Sure, sounds excellent to me.
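For the first point, the expected shape of a prep callback is roughly
the following (a sketch; the foo_* names are placeholders, and foo_desc
is assumed to embed a struct dma_async_tx_descriptor named tx):

    static struct dma_async_tx_descriptor *
    foo_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
                      unsigned int sg_len, enum dma_transfer_direction dir,
                      unsigned long flags, void *context)
    {
            struct foo_desc *d;

            /* May be called from atomic context: no GFP_KERNEL here,
             * and GFP_NOWAIT rather than GFP_ATOMIC so that we don't
             * dip into the emergency pools. */
            d = kzalloc(sizeof(*d), GFP_NOWAIT);
            if (!d)
                    return NULL;    /* fail gracefully, never sleep */

            /* build the hardware descriptor chain from sgl here */

            return &d->tx;
    }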
On Fri, Aug 01, 2014 at 10:57:07AM +0200, Maxime Ripard wrote:
> On Fri, Aug 01, 2014 at 10:00:10AM +0200, Lars-Peter Clausen wrote:
> > On 07/31/2014 07:37 PM, Maxime Ripard wrote:
> > >On Thu, Jul 31, 2014 at 06:54:11PM +0200, Lars-Peter Clausen wrote:
> > >>On 07/31/2014 06:13 PM, Maxime Ripard wrote:
> > >>[...]
> > >>> From what you're saying, and judging from the drivers that already
> > >>>implement it, can't it be moved directly to the framework itself?
> > >>
> > >>What exactly do you mean by moving it directly to the framework? The
> > >>slave_caps API is part of the DMAengine framework.
> > >
> > >Not its implementation, which is defined by each and every driver,
> > >while the behaviour of device_slave_caps is rather generic.
> >
> > Do you mean something like adding a dma_slave_caps struct field to
> > the DMA channel that gets initialized when the channel is created
> > and then remove the callback? That makes some sense.
>
> I was rather thinking of something like:
>  - Splitting device_control into independent functions

I like this part :)

>  - Then, knowing if you support pause/resume/terminate is trivial:
>    either you implement the callback, or you don't
>  - Putting the supported widths and directions into fields of struct
>    dma_device, which can eventually be used by the framework to
>    filter out invalid configurations before calling the relevant
>    callbacks

That is a good idea.

>  - That would then be trivial to get from the framework, without
>    calling any callback

Yes please.
On Thu, Jul 31, 2014 at 02:22:23PM +0100, Russell King - ARM Linux wrote:
> DMA engine has lacked a lot of infrastructure for common patterns in
> drivers; some of that is solved by my efforts with the virt_dma.c
> support, and also various cleanups to existing drivers, but it's not
> easy to fix this problem after drivers have been merged.

I think list management is something that needs fixing. I see lots of
time and energy being spent by drivers on this. I plan to move
descriptor and list management into the framework in the near future.
That should help drivers get cleaned up a bit more.
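For reference, the existing virt_dma helpers already factor out a fair
amount of this; the typical driver pattern looks roughly like the
following (a sketch based on the helpers in drivers/dma/virt-dma.h; the
foo_* names are placeholders):

    struct foo_desc {
            struct virt_dma_desc vd;    /* list + cookie handled by virt-dma */
            /* hardware-specific fields follow */
    };

    static struct dma_async_tx_descriptor *
    foo_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dst,
                        dma_addr_t src, size_t len, unsigned long flags)
    {
            struct foo_desc *d = kzalloc(sizeof(*d), GFP_NOWAIT);

            if (!d)
                    return NULL;
            /* program the hardware descriptor for dst/src/len here */
            return vchan_tx_prep(to_virt_chan(chan), &d->vd, flags);
    }

    /* in the interrupt handler, with the vchan lock held, once a
     * descriptor finishes: */
            vchan_cookie_complete(&d->vd);  /* cookie + callback handled */

The driver never open-codes tx_submit, pending lists or cookie
completion; virt-dma provides all of that.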
On 08/01/2014 07:15 PM, Vinod Koul wrote:
> On Fri, Aug 01, 2014 at 10:57:07AM +0200, Maxime Ripard wrote:
>> On Fri, Aug 01, 2014 at 10:00:10AM +0200, Lars-Peter Clausen wrote:
>>> On 07/31/2014 07:37 PM, Maxime Ripard wrote:
>>>> On Thu, Jul 31, 2014 at 06:54:11PM +0200, Lars-Peter Clausen wrote:
>>>>> On 07/31/2014 06:13 PM, Maxime Ripard wrote:
>>>>> [...]
>>>>>> From what you're saying, and judging from the drivers that already
>>>>>> implement it, can't it be moved directly to the framework itself?
>>>>>
>>>>> What exactly do you mean by moving it directly to the framework? The
>>>>> slave_caps API is part of the DMAengine framework.
>>>>
>>>> Not its implementation, which is defined by each and every driver,
>>>> while the behaviour of device_slave_caps is rather generic.
>>>
>>> Do you mean something like adding a dma_slave_caps struct field to
>>> the DMA channel that gets initialized when the channel is created
>>> and then remove the callback? That makes some sense.
>>
>> I was rather thinking of something like:
>>  - Splitting device_control into independent functions
> I like this part :)

I started working on this a while ago by splitting out the slave_config
functionality into its own callback. Haven't managed to finalize it
since it wasn't really top priority.

- Lars
On Fri, Aug 01, 2014 at 10:43:06PM +0530, Vinod Koul wrote:
> On Thu, Jul 31, 2014 at 06:23:30PM +0200, Maxime Ripard wrote:
> > On Thu, Jul 31, 2014 at 05:26:28PM +0530, Vinod Koul wrote:
> >
> > Also, feel free to add anything that you feel like you keep saying
> > during review. If mistakes keep coming, it's probably worth
> > documenting what you expect.
>
> I think the common issues seen would be:
> - prepare calls in atomic context and the usage of GFP_NOWAIT for memory
>   allocations

I think we have that part covered already.

> - residue calculation; the situation is much better now, but lots of
>   drivers still do it wrong

What mistake is often made regarding the residue calculation?

> > > > Because, for the moment, we're pretty much left in the dark with
> > > > different drivers doing the same thing in completely different ways,
> > > > with basically no way to tell whether it's the framework that
> > > > requires such behaviour, or the author was just feeling creative.
> > > >
> > > > There are numerous examples of this at the moment:
> > > >  - The GFP flags, with different drivers using either GFP_ATOMIC,
> > > >    GFP_NOWAIT or GFP_KERNEL in the same functions
> > > >  - Having to set device_slave_caps or not?
> > > >  - Some drivers use dma_run_dependencies, some others don't
> > > >  - That might just be my experience, but judging from previous
> > > >    commits, DMA_PRIVATE is completely obscure, and we just set it
> > > >    because it was making it work, without knowing what it was
> > > >    supposed to do.
> > > >  - etc.
> > >
> > > Thanks for highlighting these; we should definitely add them to the
> > > Documentation.
> >
> > It's quite clear in the case of the GFP flags now, and Lars-Peter and you
> > cleared up device_slave_caps, but I could still use some help with
> > DMA_PRIVATE.
> >
> > > > And basically, we have no way to tell at the moment which one is
> > > > right and which one needs fixing.
> > > >
> > > > The corollary being that it cripples the whole community's ability to
> > > > maintain the framework and make it evolve.
> > > >
> > > > > > + * device_slave_caps
> > > > > > +   - Isn't that redundant with the cap_mask already?
> > > > > > +   - Only a few drivers seem to implement it
> > > > > For audio to know what your channel can do rather than hardcoding it.
> > > >
> > > > Ah, yes, I see it now. It's not related to the caps mask at all.
> > > >
> > > > Just out of curiosity, wouldn't it be better to move this to the
> > > > framework, and have this information provided through the struct
> > > > dma_device? Or would it have some non-trivial side-effects?
> > > Well, the problem is the ability to have this queried uniformly from all
> > > drivers across subsystems. If we can do this, that would be nice.
> >
> > I can start on some preliminary work to do just that, and see if it
> > works for you then.
> Sure, sounds excellent to me.

Another question arose while starting on this.

In the case of the call to device_control, especially in the
DMA_SLAVE_CONFIG case (but that also applies to pause/resume), are the
changes supposed to be immediate, or can they happen later?

I actually have in mind the case where we would use a vchan that might
or might not be mapped to a physical channel at the moment the
DMA_SLAVE_CONFIG call is made.

In the case where it's not mapped and not transferring anything, it's
pretty trivial to handle, but in the case where it's actually mapped to
a physical channel, should we push the new configuration to the physical
channel right away, or can it wait until the transfer ends?

Maxime
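On the residue point, the shape the framework generally expects from the
status callback is roughly this (a sketch; the foo_* names are
placeholders and foo_read_hw_residue is a hypothetical helper standing
in for however the hardware reports the remaining byte count):

    static enum dma_status foo_tx_status(struct dma_chan *chan,
                                         dma_cookie_t cookie,
                                         struct dma_tx_state *state)
    {
            struct foo_chan *c = to_foo_chan(chan);
            enum dma_status ret;
            unsigned long flags;
            u32 bytes_left;

            ret = dma_cookie_status(chan, cookie, state);
            if (ret == DMA_COMPLETE)
                    return ret;     /* nothing left to report */

            spin_lock_irqsave(&c->lock, flags);
            bytes_left = foo_read_hw_residue(c, cookie);    /* hypothetical */
            dma_set_residue(state, bytes_left);
            spin_unlock_irqrestore(&c->lock, flags);

            return ret;
    }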
On Fri, Aug 01, 2014 at 03:53:26PM +0100, Russell King - ARM Linux wrote:
> > > >  - That might just be my experience, but judging from previous
> > > >    commits, DMA_PRIVATE is completely obscure, and we just set it
> > > >    because it was making it work, without knowing what it was
> > > >    supposed to do.
> > >
> > > DMA_PRIVATE means that the channel is unavailable for async-tx operations
> > > - in other words, it's slave-only. Setting it before registration results
> > > in the private-use count being incremented, disabling the DMA_PRIVATE
> > > manipulation in the channel request/release APIs (requested channels are
> > > unavailable for async-tx operations, which is why that flag is also set
> > > there.)
> > >
> > > To put it another way, if a DMA engine driver only provides slave DMA
> > > support, it must set DMA_PRIVATE to mark the channel unavailable for
> > > async-tx operations. If a DMA engine driver's channels can also be used
> > > for async-tx operations, then it should leave the flag clear.
> >
> > What about channels that can both be used for slave operations and
> > async-tx (like memcpy)?
>
> Then you don't set DMA_PRIVATE when the DMA engine driver registers with
> the core. That then allows the DMA engine core to manage the flag,
> setting it when the channel is allocated for slave usage.

Then I guess we currently have a bug related to this.

During the development of the A31 driver that recently got merged, with
two subsequent channel allocations, the second would fail if DMA_PRIVATE
wasn't set.

I think it was in dmaengine_get, but I'll have to get back to you on
that whenever I'm back from my holidays.

> One important point here is that async_tx does not have the idea of
> channel allocation - a channel which is registered into the DMA engine
> core without DMA_PRIVATE set is a candidate for use by the async_tx
> API.
>
> > Hence why we should document as much as possible, to spread the
> > knowledge and avoid it being lost when someone disappears or isn't
> > available anymore.
> >
> > (And hopefully reduce the number of "errors" that might be made by
> > submitters, hence reducing the number of review iterations and
> > lessening the maintainers' load.)
>
> The bigger issue in the longer run is being able to maintain and improve
> the DMA engine implementations and API. One of the blockers to that is
> properly understanding all the async_tx API stuff, something which even
> I have been unable to do despite knowing a fair amount about DMA engine
> itself.
>
> Where I fall down is the exact meaning of the ACK stuff, how the chaining
> is supposed to work, and how the dependency between two DMA channels is
> handled.
>
> This is why I was unable to resolve DMA engine's multiple mapping of the
> same DMA buffers, which has caused problems for ARM.
>
> I've discussed this point in the past with Dan, making the point that
> each DMA engine driver should not be that concerned about things like
> descriptor list management, cookies, dependent transfers, but we seem
> to be ultimately doomed to re-implementing much of that infrastructure
> in every single DMA engine driver that comes along.

You can also add vchan scheduling or descriptor allocation. The Atmel
guys seem to have hit a performance issue with dma_pool_alloc, so they
re-developed a descriptor allocation mechanism in their driver. I don't
have much more detail on this, but if that turns out to be true, it
would definitely be something that should be in the framework.

vchan mapping to physical channels also seems to be quite common in the
drivers. That would be a nice addition too.

> Each DMA engine driver which gets accepted into mainline makes this
> problem worse - it's another driver which does something slightly
> differently and which, if we were ever to clean this stuff up, would
> need to be fixed. This remains my biggest concern with the DMA engine
> code today, and I'm one of the few people who have put effort into
> trying to sort that out.

Yep. It's one of mine too, and I'm willing to help if possible.

> > > Drivers should have no reason whatsoever to be touching the cookie
> > > directly anymore.
> >
> > Ok, thanks a lot for this!
>
> I notice that drivers/dma/sirf-dma.c directly touches it and the
> completed cookie as well:
>
> drivers/dma/sirf-dma.c:        dma_cookie_t last_cookie = 0;
> drivers/dma/sirf-dma.c:        last_cookie = desc->cookie;
> drivers/dma/sirf-dma.c:        schan->chan.completed_cookie = last_cookie;
>
> That's something which should be fixed. I haven't looked too hard
> because it's not something I want to get involved with again right
> now.

From a quick look at the code, it seems to be dead code. I guess it
could just be removed.

Maxime
On Fri, Aug 01, 2014 at 08:09:17PM +0200, Lars-Peter Clausen wrote:
> On 08/01/2014 07:15 PM, Vinod Koul wrote:
> >On Fri, Aug 01, 2014 at 10:57:07AM +0200, Maxime Ripard wrote:
> >>On Fri, Aug 01, 2014 at 10:00:10AM +0200, Lars-Peter Clausen wrote:
> >>>On 07/31/2014 07:37 PM, Maxime Ripard wrote:
> >>>>On Thu, Jul 31, 2014 at 06:54:11PM +0200, Lars-Peter Clausen wrote:
> >>>>>On 07/31/2014 06:13 PM, Maxime Ripard wrote:
> >>>>>[...]
> >>>>>> From what you're saying, and judging from the drivers that already
> >>>>>>implement it, can't it be moved directly to the framework itself?
> >>>>>
> >>>>>What exactly do you mean by moving it directly to the framework? The
> >>>>>slave_caps API is part of the DMAengine framework.
> >>>>
> >>>>Not its implementation, which is defined by each and every driver,
> >>>>while the behaviour of device_slave_caps is rather generic.
> >>>
> >>>Do you mean something like adding a dma_slave_caps struct field to
> >>>the DMA channel that gets initialized when the channel is created
> >>>and then remove the callback? That makes some sense.
> >>
> >>I was rather thinking of something like:
> >> - Splitting device_control into independent functions
> >I like this part :)
>
> I started working on this a while ago by splitting out the
> slave_config functionality into its own callback. Haven't managed to
> finalize it since it wasn't really top priority.

Heh, I did the same yesterday... Do you plan on submitting it soon, or
should I keep going?

Maxime
On Sat, Aug 02, 2014 at 04:49:25PM +0200, Maxime Ripard wrote:
> In the case of the call to device_control, especially in the
> DMA_SLAVE_CONFIG case (but that also applies to pause/resume), are the
> changes supposed to be immediate, or can they happen later?

pause/resume are expected to operate synchronously on the channel.
When asking for the channel to pause, the channel must be stopped
before the call returns.

In the case of channel configuration, that should have no effect on
any previously queued transfer - in other words, changing the channel
configuration only has an effect when subsequent transactions are
prepared.
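In consumer terms, the ordering that follows from this is roughly the
following (a sketch; fifo_phys_addr, chan, sgl and sg_len are
placeholders for the consumer's own values):

    struct dma_slave_config cfg = {
            .direction      = DMA_MEM_TO_DEV,
            .dst_addr       = fifo_phys_addr,
            .dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
            .dst_maxburst   = 8,
    };

    /* does not affect descriptors already queued on the channel... */
    dmaengine_slave_config(chan, &cfg);

    /* ...but applies to everything prepared from here on */
    desc = dmaengine_prep_slave_sg(chan, sgl, sg_len, DMA_MEM_TO_DEV,
                                   DMA_PREP_INTERRUPT);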
On Sat, Aug 02, 2014 at 05:11:44PM +0200, Maxime Ripard wrote:
> On Fri, Aug 01, 2014 at 03:53:26PM +0100, Russell King - ARM Linux wrote:
> > > > >  - That might just be my experience, but judging from previous
> > > > >    commits, DMA_PRIVATE is completely obscure, and we just set it
> > > > >    because it was making it work, without knowing what it was
> > > > >    supposed to do.
> > > >
> > > > DMA_PRIVATE means that the channel is unavailable for async-tx operations
> > > > - in other words, it's slave-only. Setting it before registration results
> > > > in the private-use count being incremented, disabling the DMA_PRIVATE
> > > > manipulation in the channel request/release APIs (requested channels are
> > > > unavailable for async-tx operations, which is why that flag is also set
> > > > there.)
> > > >
> > > > To put it another way, if a DMA engine driver only provides slave DMA
> > > > support, it must set DMA_PRIVATE to mark the channel unavailable for
> > > > async-tx operations. If a DMA engine driver's channels can also be used
> > > > for async-tx operations, then it should leave the flag clear.
> > >
> > > What about channels that can both be used for slave operations and
> > > async-tx (like memcpy)?
> >
> > Then you don't set DMA_PRIVATE when the DMA engine driver registers with
> > the core. That then allows the DMA engine core to manage the flag,
> > setting it when the channel is allocated for slave usage.
>
> Then I guess we currently have a bug related to this.
>
> During the development of the A31 driver that recently got merged, with
> two subsequent channel allocations, the second would fail if DMA_PRIVATE
> wasn't set.
>
> I think it was in dmaengine_get, but I'll have to get back to you on
> that whenever I'm back from my holidays.

I think you mean dma_chan_get(). dmaengine_get() is used to obtain
references to the async_tx memcpy/crypto channels, and should not be used
by a driver making use of the slave capabilities.

The two systems are slightly competitive with each other. Slave stuff
generally wants to get hold of specific channels, whereas the async_tx
stuff can generally do with any channel which provides the appropriate
capabilities, and can switch between channels if it needs to perform a
series of operations only supported by separate channels.

> You can also add vchan scheduling or descriptor allocation. The Atmel
> guys seem to have hit a performance issue with dma_pool_alloc, so they
> re-developed a descriptor allocation mechanism in their driver. I don't
> have much more detail on this, but if that turns out to be true, it
> would definitely be something that should be in the framework.

It is entirely legal for a dmaengine driver to maintain its own list of
"free" transaction descriptors (that's actually what many async-tx
drivers do) and is partly why there's the alloc_chan_resources and
free_chan_resources callbacks.

However, they tend to be fairly large structures, and don't always
match what the hardware wants, so using them to describe the individual
scatterlist elements of a transaction is not always a good idea.

> vchan mapping to physical channels also seems to be quite common in the
> drivers. That would be a nice addition too.

It's intentionally not in the vchan interface because that mapping is
device specific.

For example, there are some DMA engine implementations where certain
physical DMA channels should not be used because they have slightly
different properties (for example, they can lock the CPU off the bus
if they're permanently transferring data, such as in a memcpy or
device-to-memory transfer where the device always has data available.)

In any case, I'm not interested in seeing vchan turning into another
"layer" - it must remain a library. Layers are bad news. :)
On Sat, Aug 02, 2014 at 04:29:21PM +0100, Russell King - ARM Linux wrote: > On Sat, Aug 02, 2014 at 05:11:44PM +0200, Maxime Ripard wrote: > > On Fri, Aug 01, 2014 at 03:53:26PM +0100, Russell King - ARM Linux wrote: > > > > > > - That might just be my experience, but judging from previous > > > > > > commits, DMA_PRIVATE is completely obscure, and we just set it > > > > > > because it was making it work, without knowing what it was > > > > > > supposed to do. > > > > > > > > > > DMA_PRIVATE means that the channel is unavailable for async-tx operations > > > > > - in other words, it's slave-only. Setting it before registration results > > > > > in the private-use count being incremented, disabling the DMA_PRIVATE > > > > > manipulation in the channel request/release APIs (requested channels are > > > > > unavailable for async-tx operations, which is why that flag is also set > > > > > there.) > > > > > > > > > > To put it another way, if a DMA engine driver only provides slave DMA > > > > > support, it must set DMA_PRIVATE to mark the channel unavailable for > > > > > async-tx operations. If a DMA engine drivers channels can also be used > > > > > for async-tx operations, then it should leave the flag clear. > > > > > > > > What about channels that can be both used for slave operations and > > > > async-tx (like memcpy) ? > > > > > > Then you don't set DMA_PRIVATE when the DMA engine driver registers with > > > the core. That then allows the DMA engine core to manage the flag, > > > setting it when the channel is allocated for slave usage. > > > > Then I guess we currently have a bug related to this. > > > > During the development of the A31 driver that recently got merged, > > during two subsequent channel allocation, the second would fail if > > DMA_PRIVATE wasn't set. > > > > I think it was on in dmaengine_get, but I'll have to get back to you > > on that whenever I'm back from my holydays. > > I think you mean dma_chan_get(). dmaengine_get() is used to obtain > references to the async_tx memcpy/crypto channels, and should not be used > by a driver making use of the slave capabilities. Like I said, I don't remember where the actual issue was lying, but ignoring DMA_PRIVATE prevented to allocate two channels at the same time in my case. When I'll have access again to my board, I'll try to dig more into it. > There two systems are slightly competitive between each other. Slave > stuff generally want to get hold of specific channels, whereas the > async_tx stuff can generally do with any channel which provides the > appropriate capabilities, and can switch between channels if it needs > to perform a series of operations only supported by separate channels. > > > You can also add vchan scheduling or descriptor allocations. The atmel > > guys seem to have hit a performance issue with dma_pool_alloc, so they > > re-developped a descriptor allocation mechanism in their driver. I > > don't have much more details on this, but if that was to be true, that > > would definitely be something that should be in the framework. > > It is entirely legal for a dmaengine driver to maintain its own list of > "free" transaction descriptors (that's actually what many async-tx > drivers do) and is partly why there's the alloc_chan_resources and > free_chan_resources callbacks. > > However, they tend to be fairly large structures, and don't always > match what the hardware wants, so using them to describe the individual > scatterlist elements of a transaction is not always a good idea. 
I was not really claiming it was illegal, but rather that it could be useful for other drivers too. > > vchan mapping to physical channels also seems to be quite common in the > > drivers. That would be a nice addition too. > It's intentionally not in the vchan interface because that mapping is > device specific. > > For example, there are some DMA engine implementations where certain > physical DMA channels should not be used because they have slightly > different properties (for example, they can lock the CPU off the bus > if they're permanently transferring data, such as in a memcpy or > device-to-memory transfer where the device always has data available.) > > In any case, I'm not interested in seeing vchan turning into another > "layer" - it must remain a library. Layers are bad news. :) Yeah, I was assuming there would be such constraints. Still, handling the trivial case where you can map any vchan to any free physical channel is very painful, and at least a couple of drivers already implement the same dumb logic.
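To make the DMA_PRIVATE rule discussed above concrete, here is a minimal probe sketch for a slave-only controller (foo_probe and the surrounding plumbing are assumed names, not taken from any real driver): the flag is set before registration, so the channels never appear available to async-tx.

	static int foo_probe(struct platform_device *pdev)
	{
		struct dma_device *dd;

		dd = devm_kzalloc(&pdev->dev, sizeof(*dd), GFP_KERNEL);
		if (!dd)
			return -ENOMEM;

		dma_cap_set(DMA_SLAVE, dd->cap_mask);
		/* Slave-only controller: mark every channel unavailable
		 * for async-tx before registering. A driver whose channels
		 * can also serve async-tx would leave DMA_PRIVATE clear
		 * and let the core manage the flag. */
		dma_cap_set(DMA_PRIVATE, dd->cap_mask);

		dd->dev = &pdev->dev;
		INIT_LIST_HEAD(&dd->channels);
		/* ... fill in the device_* callbacks and add channels ... */

		return dma_async_device_register(dd);
	}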
On Sat, Aug 02, 2014 at 04:17:14PM +0100, Russell King - ARM Linux wrote: > On Sat, Aug 02, 2014 at 04:49:25PM +0200, Maxime Ripard wrote: > > In the case of the call to device_control, especially in the > > DMA_SLAVE_CONFIG case, but that also applies to pause/resume, are the > > changes supposed to be immediate or can they happen later? > > pause/resume are expected to operate synchronously on the channel. > When asking for the channel to pause, the channel must be stopped > before the call returns. > > In the case of channel configuration, that should have no effect on > any previously queued transfer - in other words, changing the channel > configuration only has an effect when subsequent transactions are > prepared. Ok, thanks!
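A sketch of those two rules in a device_control implementation (foo_chan, to_foo_chan and foo_hw_pause are hypothetical names): DMA_SLAVE_CONFIG only snapshots the configuration for descriptors prepared later, while DMA_PAUSE must act on the hardware before returning.

	static int foo_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
			       unsigned long arg)
	{
		struct foo_chan *fc = to_foo_chan(chan);

		switch (cmd) {
		case DMA_SLAVE_CONFIG:
			/* Just remember it: descriptors that are already
			 * prepared or queued keep the parameters they were
			 * built with; only later preparations see this. */
			fc->cfg = *(struct dma_slave_config *)arg;
			return 0;
		case DMA_PAUSE:
			/* Synchronous: the channel must be stopped before
			 * this call returns. */
			foo_hw_pause(fc);
			return 0;
		default:
			return -ENOSYS;
		}
	}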
On 08/02/2014 05:13 PM, Maxime Ripard wrote: > On Fri, Aug 01, 2014 at 08:09:17PM +0200, Lars-Peter Clausen wrote: >> On 08/01/2014 07:15 PM, Vinod Koul wrote: >>> On Fri, Aug 01, 2014 at 10:57:07AM +0200, Maxime Ripard wrote: >>>> On Fri, Aug 01, 2014 at 10:00:10AM +0200, Lars-Peter Clausen wrote: >>>>> On 07/31/2014 07:37 PM, Maxime Ripard wrote: >>>>>> On Thu, Jul 31, 2014 at 06:54:11PM +0200, Lars-Peter Clausen wrote: >>>>>>> On 07/31/2014 06:13 PM, Maxime Ripard wrote: >>>>>>> [...] >>>>>>>> From what you're saying, and judging from the drivers that already >>>>>>>> implement it, can't it be moved directly to the framework itself? >>>>>>>> >>>>>>> >>>>>>> What exactly do you mean by moving it directly to the framework? The >>>>>>> slave_caps API is part of the DMAengine framework. >>>>>> >>>>>> Not its implementation, which is defined by each and every driver, >>>>>> while the behaviour of device_slave_caps is rather generic. >>>>>> >>>>> >>>>> Do you mean something like adding a dma_slave_caps struct field to >>>>> the DMA channel that gets initialized when the channel is created >>>>> and then removing the callback? That makes some sense. >>>> >>>> I was rather thinking of something like: >>>> - Splitting device_control into independent functions >>> I like this part :) >> >> I started working on this a while ago by splitting out the >> slave_config functionality into its own callback. Haven't managed to >> finalize it since it wasn't really top priority. > > Heh, I've done the same yesterday... Do you plan on submitting it soon, > or should I keep going? The plan was to submit it eventually. But if you already did it as well, go for it. - Lars
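For reference, the split being discussed above amounts to replacing the untyped device_control(cmd, arg) multiplexer with typed callbacks, roughly along these lines (a sketch only; the field names are assumptions, not merged API):

	struct dma_device {
		/* ... existing fields ... */
		int (*device_config)(struct dma_chan *chan,
				     struct dma_slave_config *config);
		int (*device_pause)(struct dma_chan *chan);
		int (*device_resume)(struct dma_chan *chan);
		int (*device_terminate_all)(struct dma_chan *chan);
	};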
Hi Maxime, On Wed, Jul 30, 2014 at 6:03 PM, Maxime Ripard <maxime.ripard@free-electrons.com> wrote: > The dmaengine is neither trivial nor properly documented at the moment, which > means a lot of trial and error development, which is not that good for such a > central piece of the system. > > Attempt at making such a documentation. Great, thanks! > +Our dma_device structure has a field called caps_mask that holds the > +various type of transaction supported, and you need to modify this types of transactions > +mask using the dma_cap_set function, with various flags depending on > +transaction types you support as an argument. > + * DMA_SLAVE > + - The device can handle device to memory transfers, including > + scatter-gather transfers. > + - While in the mem2mem case we were having two distinct type to types > + deal with a single chunk to copy or a collection of them, here, > + we just have a single transaction type that is supposed to > + handle both. > + * DMA_INTERLEAVE > + - The device support interleaved transfer. Those transfers usually supports > + involve an interleaved set of data, with chunks a few bytes > + wide, where a scatter-gather transfer would be quite > + inefficient. > +The functions that we have to fill in there, and hence have to > +implement, obviously depend on the transaction type you reported as types > +supported. > + * device_issue_pending > + - Takes the first descriptor in the pending queue, and start the starts > + transfer. Whenever that transfer is done, it should move to the > + next transaction in the list. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Sat, Aug 02, 2014 at 04:49:25PM +0200, Maxime Ripard wrote: > > - residue calculation, though the situation is much better now, but still lots > > of drivers do it wrong and folks do get it wrong > > What mistake is often made regarding the residue calculation? - residue is for the descriptor asked about, not for the currently active one - we can store the size locally and return that when that descriptor is not being processed - the txstate argument can be NULL > Another extra question arose while starting this. > > In the case of the call to device_control, especially in the > DMA_SLAVE_CONFIG case, but that also applies to pause/resume, are the > changes supposed to be immediate or can they happen later? > > I actually have in mind the case where we would use a vchan, that > might or might not be actually mapped to a physical channel at the > moment where the DMA_SLAVE_CONFIG call is made. In the case where it's > not mapped and not transferring anything, it's pretty trivial to > handle, but in the case where it's actually mapped to a physical > channel, should we push the new configuration to the physical channel > right away, or can it wait until the transfer ends? Ah no. The model doesn't assume the config will be written at runtime. So the descriptors which are already prepared don't change with a DMA_SLAVE_CONFIG call. Only new descriptors will take those values. But yes, the pause/resume would be synchronous. HTH
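Those three residue rules could be followed with a device_tx_status along these lines (a sketch: dma_cookie_status and dma_set_residue are the real helpers from drivers/dma/dmaengine.h, while the foo_* names are made up):

	static enum dma_status foo_tx_status(struct dma_chan *chan,
					     dma_cookie_t cookie,
					     struct dma_tx_state *txstate)
	{
		struct foo_chan *fc = to_foo_chan(chan);
		enum dma_status status;

		status = dma_cookie_status(chan, cookie, txstate);
		/* txstate can be NULL: the client only wants the status. */
		if (status == DMA_COMPLETE || !txstate)
			return status;

		if (cookie == fc->active_cookie)
			/* The descriptor asked about is the active one:
			 * read what is left from the hardware. */
			dma_set_residue(txstate, foo_hw_bytes_left(fc));
		else
			/* Not the active one: return the full size stored
			 * locally when the descriptor was prepared. */
			dma_set_residue(txstate, foo_stored_size(fc, cookie));

		return status;
	}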
On Wed, Jul 30, 2014 at 06:03:13PM +0200, Maxime Ripard wrote: > The dmaengine is neither trivial nor properly documented at the moment, which > means a lot of trial and error development, which is not that good for such a > central piece of the system. > > Attempt at making such a documentation. Good idea, many questions are asked when writing a new dmaengine driver. For instance I didn't find how to use the DMA_CTRL_ACK flag. - How does this flag have to be managed? For instance, async_tx_ack is used in dmaengine drivers but also in some device drivers. - Is it mandatory to deal with this flag? It seems some dmaengine drivers don't care about it. > > Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> > --- > Documentation/dmaengine-driver.txt | 293 +++++++++++++++++++++++++++++++++++++ > 1 file changed, 293 insertions(+) > create mode 100644 Documentation/dmaengine-driver.txt > > diff --git a/Documentation/dmaengine-driver.txt b/Documentation/dmaengine-driver.txt > new file mode 100644 > index 000000000000..4828b50038c0 > --- /dev/null > +++ b/Documentation/dmaengine-driver.txt > @@ -0,0 +1,293 @@ > +DMAengine controller documentation > +================================== > + > +Hardware Introduction > ++++++++++++++++++++++ > + > +Most of the Slave DMA controllers have the same general principles of > +operation. > + > +They have a given number of channels to use for the DMA transfers, and > +a given number of request lines. > + > +Requests and channels are pretty much orthogonal. Channels can usually > +be used to serve any of the requests. To simplify, channels are the > +entities that will be doing the copy, and requests what endpoints are > +involved. > + > +The request lines actually correspond to physical lines going from the > +DMA-eligible devices to the controller itself. Whenever the device > +wants to start a transfer, it asserts a DMA request (DRQ) by > +asserting that request line. > + > +A very simple DMA controller would only take into account a single > +parameter: the transfer size. At each clock cycle, it would transfer a > +byte of data from one buffer to another, until the transfer size has > +been reached. > + > +That wouldn't work well in the real world, since slave devices might > +need to retrieve a varying number of bits from memory at a > +time. For example, we probably want to transfer 32 bits at a time when > +doing a simple memory copy operation, but our audio device will > +require 16 or 24 bits to be written to its FIFO. This is why most if > +not all of the DMA controllers can adjust this, using a parameter > +called the width. > + > +Moreover, some DMA controllers, whenever the RAM is involved, can > +group the reads or writes in memory into a buffer, so instead of > +having a lot of small memory accesses, which is not really efficient, > +you'll get several bigger transfers. This is done using a parameter > +called the burst size, that defines how many single reads/writes it's > +allowed to group into a single burst. > + > +Our theoretical DMA controller would then only be able to do transfers > +that involve a single contiguous block of data. However, some of the > +transfers we usually have are not contiguous, and we want to copy data from > +non-contiguous buffers to a contiguous buffer, which is called > +scatter-gather. > + > +DMAEngine, at least for mem2dev transfers, requires support for > +scatter-gather.
So we're left with two cases here: either we have a > +quite simple DMA controller that doesn't support it, and we'll have to > +implement it in software, or we have a more advanced DMA controller > +that implements scatter-gather in hardware. > + > +The latter are usually programmed using a collection of chunks to > +transfer, and whenever the transfer is started, the controller will go > +over that collection, doing whatever we programmed there. > + > +This collection is usually either a table or a linked list. You will > +then push either the address of the table and its number of elements, > +or the first item of the list to one channel of the DMA controller, > +and whenever a DRQ is asserted, it will go through the collection > +to know where to fetch the data from. > + > +Either way, the format of this collection is completely dependent on > +your hardware. Each DMA controller will require a different structure, > +but all of them will require, for every chunk, at least the source and > +destination addresses, whether it should increment these addresses or > +not and the three parameters we saw earlier: the burst size, the bus > +width and the transfer size. > + > +One last thing: usually, slave devices won't issue a DRQ by > +default, and you have to enable this in your slave device driver first > +whenever you want to use DMA. > + > +These were just the general memory-to-memory (also called mem2mem) or > +memory-to-device (mem2dev) transfers. Other kinds of transfers might be > +offered by your DMA controller, and are probably already supported by > +dmaengine. > + > +DMAEngine Registration > +++++++++++++++++++++++ > + > +struct dma_device Initialization > +-------------------------------- > + > +Just like any other kernel framework, the whole DMAEngine registration > +relies on the driver filling a structure and registering against the > +framework. In our case, that structure is dma_device. > + > +The first thing you need to do in your driver is to allocate this > +structure. Any of the usual memory allocators will do, but you'll also > +need to initialize a few fields in there: > + > + * chancnt: should be the number of channels your driver is exposing > + to the system. > + This doesn't have to be the number of physical > + channels: some DMA controllers also expose virtual > + channels to the system to overcome the case where you > + have more consumers than physical channels available. > + > + * channels: should be initialized as a list using the > + INIT_LIST_HEAD macro for example > + > + * dev: should hold the pointer to the struct device associated > + with your current driver instance. > + > +Supported transaction types > +--------------------------- > +The next thing you need is to actually set which transaction types your > +device (and driver) supports. > + > +Our dma_device structure has a field called cap_mask that holds the > +various types of transactions supported, and you need to modify this > +mask using the dma_cap_set function, with various flags depending on the > +transaction types you support as an argument.
> + > +All those capabilities are defined in the dma_transaction_type enum, > +in include/linux/dmaengine.h > + > +Currently, the types available are: > + * DMA_MEMCPY > + - The device is able to do memory to memory copies > + > + * DMA_XOR > + - The device is able to perform XOR operations on memory areas > + - Particularly useful to accelerate XOR intensive tasks, such as > + RAID5 > + > + * DMA_XOR_VAL > + - The device is able to perform a parity check using the XOR > + algorithm against a memory buffer. > + > + * DMA_PQ > + - The device is able to perform RAID6 P+Q computations, P being a > + simple XOR, and Q being a Reed-Solomon algorithm. > + > + * DMA_PQ_VAL > + - The device is able to perform a parity check using the RAID6 P+Q > + algorithm against a memory buffer. > + > + * DMA_INTERRUPT > + /* TODO: Is it that the device has one interrupt per channel? */ > + > + * DMA_SG > + - The device supports memory to memory scatter-gather > + transfers. > + - Even though a plain memcpy can look like a particular case of a > + scatter-gather transfer, with a single chunk to transfer, it's a > + distinct transaction type in the mem2mem transfers case > + > + * DMA_PRIVATE > + - The device can have several clients at a time, most likely > + because it has several parallel channels. > + > + * DMA_ASYNC_TX > + - Must not be set by the device, and will be set by the framework > + if needed > + - /* TODO: What is it about? */ > + > + * DMA_SLAVE > + - The device can handle device to memory transfers, including > + scatter-gather transfers. > + - While in the mem2mem case we were having two distinct types to > + deal with a single chunk to copy or a collection of them, here, > + we just have a single transaction type that is supposed to > + handle both. > + > + * DMA_CYCLIC > + - The device can handle cyclic transfers. > + - A cyclic transfer is a transfer where the chunk collection will > + loop over itself, with the last item pointing to the first. It's > + usually used for audio transfers, where you want to operate on a > + single big buffer that you will fill with your audio data. > + > + * DMA_INTERLEAVE > + - The device supports interleaved transfers. Those transfers usually > + involve an interleaved set of data, with chunks a few bytes > + wide, where a scatter-gather transfer would be quite > + inefficient. > + > +These various types will also affect how the source and destination > +addresses change over time: DMA_SLAVE transfers will usually have > +one address that increments while the other does not, > +DMA_CYCLIC will have one address that loops while the other does > +not change, etc. > + > +Device operations > +----------------- > + > +Our dma_device structure also requires a few function pointers in > +order to implement the actual logic, now that we have described what > +operations we were able to perform. > + > +The functions that we have to fill in there, and hence have to > +implement, obviously depend on the transaction types you reported as > +supported. > + > + * device_alloc_chan_resources > + * device_free_chan_resources > + - These functions will be called whenever a driver calls > + dma_request_channel or dma_release_channel for the first/last > + time on the channel associated with that driver. > + - They are in charge of allocating/freeing all the needed > + resources in order for that channel to be useful for your > + driver. > + - These functions can sleep.
> + > + * device_prep_dma_* > + - These functions match the capabilities you registered > + previously. > + - These functions all take the buffer or the scatterlist relevant > + for the transfer being prepared, and should create a hardware > + descriptor or a list of descriptors from it > + - These functions can be called from an interrupt context > + - Any allocation you might do should be using the GFP_NOWAIT > + flag, in order not to potentially sleep, but without depleting > + the emergency pool either. > + > + - It should return a unique instance of the > + dma_async_tx_descriptor structure, that further represents this > + particular transfer. > + > + - This structure can be allocated using the function > + dma_async_tx_descriptor_init. > + - You'll also need to set two fields in this structure: > + + flags: > + TODO: Can it be modified by the driver itself, or > + should it always be the flags passed in the arguments > + > + + tx_submit: A pointer to a function you have to implement, > + that is supposed to push the current descriptor > + to a pending queue, waiting for issue_pending to > + be called. > + > + * device_issue_pending > + - Takes the first descriptor in the pending queue, and starts the > + transfer. Whenever that transfer is done, it should move to the > + next transaction in the list. > + - It should call the registered callback, if any, each time a > + transaction is done. > + - This function can be called in an interrupt context > + > + * device_tx_status > + - Should report the bytes left to go over in the current transfer > + for the given channel > + - Should use dma_set_residue to report it > + - In the case of a cyclic transfer, it should only take into > + account the current period. > + - This function can be called in an interrupt context. > + > + * device_control > + - Used by client drivers to control and configure the channel they > + have a handle on. > + - Called with a command and an argument > + + The command is one of the values listed by the enum > + dma_ctrl_cmd. To date, the valid commands are: > + + DMA_RESUME > + + Restarts a transfer on the channel > + + DMA_PAUSE > + + Pauses a transfer on the channel > + + DMA_TERMINATE_ALL > + + Aborts all the pending and ongoing transfers on the > + channel > + + DMA_SLAVE_CONFIG > + + Reconfigures the channel with the passed configuration > + + FSLDMA_EXTERNAL_START > + + TODO: Why does that even exist? > + + The argument is an opaque unsigned long. This actually is a > + pointer to a struct dma_slave_config that should be used only > + in the DMA_SLAVE_CONFIG case. > + > +Misc notes (stuff that should be documented, but we don't really know > +what to say about it) > +------------------------------------------------------------------ > + * dma_run_dependencies > + - What is it supposed to do/when should it be called? > + - Some drivers seem to implement it at the end of a transfer, but > + not all of them do, so it seems we can get away without it > + > + * device_slave_caps > + - Isn't that redundant with the cap_mask already? > + - Only a few drivers seem to implement it > + > + * dma cookies?
> + > +Glossary > +-------- > + > +Burst: Usually a few contiguous bytes that will be transferred > + at once by the DMA controller > +Chunk: A contiguous collection of bursts > +Transfer: A collection of chunks (be it contiguous or not) > -- > 2.0.2
On Thu, Aug 14, 2014 at 10:53:01AM +0200, Ludovic Desroches wrote: > On Wed, Jul 30, 2014 at 06:03:13PM +0200, Maxime Ripard wrote: > > The dmaengine is neither trivial nor properly documented at the moment, which > > means a lot of trial and error development, which is not that good for such a > > central piece of the system. > > > > Attempt at making such a documentation. > > Good idea, many questions are asked when writing a new dmaengine driver. > > For instance I didn't find how to use the DMA_CTRL_ACK flag. > > - How does this flag have to be managed? For instance, async_tx_ack is used in > dmaengine drivers but also in some device drivers. > - Is it mandatory to deal with this flag? It seems some dmaengine > drivers don't care about it. This is one part of the DMA engine API that even I don't understand. It's got something to do with the async engine API, and seems to control whether a descriptor can have other transactions added to it, and whether a descriptor can be re-used (async-tx engines typically allocate a fixed set of descriptors and recycle them.) I was never able to get to the bottom of what that flag meant or how it should be used with the DMA slave API (and I suspect everyone else just gave up trying to understand it as well.)
On Thu, Aug 14, 2014 at 09:57:53AM +0100, Russell King - ARM Linux wrote: > On Thu, Aug 14, 2014 at 10:53:01AM +0200, Ludovic Desroches wrote: > > On Wed, Jul 30, 2014 at 06:03:13PM +0200, Maxime Ripard wrote: > > > The dmaengine is neither trivial nor properly documented at the moment, which > > > means a lot of trial and error development, which is not that good for such a > > > central piece of the system. > > > > > > Attempt at making such a documentation. > > > > Good idea, many questions are asked when writing a new dmaengine driver. > > > > For instance I didn't find how to use the DMA_CTRL_ACK flag. > > > > - How does this flag have to be managed? For instance, async_tx_ack is used in > > dmaengine drivers but also in some device drivers. > > - Is it mandatory to deal with this flag? It seems some dmaengine > > drivers don't care about it. > This is one part of the DMA engine API that even I don't understand. > > It's got something to do with the async engine API, and seems to control > whether a descriptor can have other transactions > added to it, and whether a descriptor can be re-used (async-tx engines > typically allocate a fixed set of descriptors and recycle them.) Yes, this is my understanding too. The async API IIUC can reuse descriptors, and if the engine doesn't support that, this flag is a way to tell it so. For the slave dmaengine API we need to ignore it. > > I was never able to get to the bottom of what that flag meant or how > it should be used with the DMA slave API (and I suspect everyone else > just gave up trying to understand it as well.)
On Tue, Aug 19, 2014 at 07:15:07PM +0530, Vinod Koul wrote: > On Thu, Aug 14, 2014 at 09:57:53AM +0100, Russell King - ARM Linux wrote: > > It's got something to do with the async engine API, and seems to control > > whether a descriptor can have other transactions > > added to it, and whether a descriptor can be re-used (async-tx engines > > typically allocate a fixed set of descriptors and recycle them.) > Yes, this is my understanding too. The async API IIUC can reuse descriptors, > and if the engine doesn't support that, this flag is a way to tell it so. > > For the slave dmaengine API we need to ignore it. We shouldn't ignore it - ignoring it makes it harder to implement a DMA engine driver which supports both the slave and async APIs, because we then need to know the reason for the channel being requested. Ignoring it until it can be understood and documented is an approach I would agree with though.
On Tue, Aug 19, 2014 at 03:44:23PM +0100, Russell King - ARM Linux wrote: > On Tue, Aug 19, 2014 at 07:15:07PM +0530, Vinod Koul wrote: > > On Thu, Aug 14, 2014 at 09:57:53AM +0100, Russell King - ARM Linux wrote: > > > It's got something to do with the async engine API, and seems to control > > > whether a descriptor can have other transactions > > > added to it, and whether a descriptor can be re-used (async-tx engines > > > typically allocate a fixed set of descriptors and recycle them.) > > Yes, this is my understanding too. The async API IIUC can reuse descriptors, > > and if the engine doesn't support that, this flag is a way to tell it so. > > > > For the slave dmaengine API we need to ignore it. > > We shouldn't ignore it - ignoring it makes it harder to implement a DMA > engine driver which supports both the slave and async APIs, because > we then need to know the reason for the channel being requested. Do you have such usage coming up in the near future? :) > > Ignoring it until it can be understood and documented is an approach I > would agree with though. Certainly. I am asking Dan to add more clarity on these bits to help improve this.
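Under that (still tentative) reading of DMA_CTRL_ACK, a driver recycling its descriptors would only reclaim those the client has acked, something like this sketch (async_tx_test_ack is the real helper from linux/dmaengine.h; the foo_* names and lists are assumptions):

	static void foo_recycle_descriptors(struct foo_chan *fc)
	{
		struct foo_desc *d, *tmp;

		list_for_each_entry_safe(d, tmp, &fc->completed, node) {
			/* DMA_CTRL_ACK set means the client is done with
			 * this descriptor, so it may be reused... */
			if (async_tx_test_ack(&d->txd))
				list_move_tail(&d->node, &fc->free);
			/* ...otherwise it might still be part of an
			 * async-tx dependency chain, so leave it alone. */
		}
	}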
diff --git a/Documentation/dmaengine-driver.txt b/Documentation/dmaengine-driver.txt new file mode 100644 index 000000000000..4828b50038c0 --- /dev/null +++ b/Documentation/dmaengine-driver.txt @@ -0,0 +1,293 @@ +DMAengine controller documentation +================================== + +Hardware Introduction ++++++++++++++++++++++ + +Most of the Slave DMA controllers have the same general principles of +operation. + +They have a given number of channels to use for the DMA transfers, and +a given number of request lines. + +Requests and channels are pretty much orthogonal. Channels can usually +be used to serve any of the requests. To simplify, channels are the +entities that will be doing the copy, and requests what endpoints are +involved. + +The request lines actually correspond to physical lines going from the +DMA-eligible devices to the controller itself. Whenever the device +wants to start a transfer, it asserts a DMA request (DRQ) by +asserting that request line. + +A very simple DMA controller would only take into account a single +parameter: the transfer size. At each clock cycle, it would transfer a +byte of data from one buffer to another, until the transfer size has +been reached. + +That wouldn't work well in the real world, since slave devices might +need to retrieve a varying number of bits from memory at a +time. For example, we probably want to transfer 32 bits at a time when +doing a simple memory copy operation, but our audio device will +require 16 or 24 bits to be written to its FIFO. This is why most if +not all of the DMA controllers can adjust this, using a parameter +called the width. + +Moreover, some DMA controllers, whenever the RAM is involved, can +group the reads or writes in memory into a buffer, so instead of +having a lot of small memory accesses, which is not really efficient, +you'll get several bigger transfers. This is done using a parameter +called the burst size, that defines how many single reads/writes it's +allowed to group into a single burst. + +Our theoretical DMA controller would then only be able to do transfers +that involve a single contiguous block of data. However, some of the +transfers we usually have are not contiguous, and we want to copy data from +non-contiguous buffers to a contiguous buffer, which is called +scatter-gather. + +DMAEngine, at least for mem2dev transfers, requires support for +scatter-gather. So we're left with two cases here: either we have a +quite simple DMA controller that doesn't support it, and we'll have to +implement it in software, or we have a more advanced DMA controller +that implements scatter-gather in hardware. + +The latter are usually programmed using a collection of chunks to +transfer, and whenever the transfer is started, the controller will go +over that collection, doing whatever we programmed there. + +This collection is usually either a table or a linked list. You will +then push either the address of the table and its number of elements, +or the first item of the list to one channel of the DMA controller, +and whenever a DRQ is asserted, it will go through the collection +to know where to fetch the data from. + +Either way, the format of this collection is completely dependent on +your hardware.
Each DMA controller will require a different structure, +but all of them will require, for every chunk, at least the source and +destination addresses, whether it should increment these addresses or +not and the three parameters we saw earlier: the burst size, the bus +width and the transfer size. + +One last thing: usually, slave devices won't issue a DRQ by +default, and you have to enable this in your slave device driver first +whenever you want to use DMA. + +These were just the general memory-to-memory (also called mem2mem) or +memory-to-device (mem2dev) transfers. Other kinds of transfers might be +offered by your DMA controller, and are probably already supported by +dmaengine. + +DMAEngine Registration +++++++++++++++++++++++ + +struct dma_device Initialization +-------------------------------- + +Just like any other kernel framework, the whole DMAEngine registration +relies on the driver filling a structure and registering against the +framework. In our case, that structure is dma_device. + +The first thing you need to do in your driver is to allocate this +structure. Any of the usual memory allocators will do, but you'll also +need to initialize a few fields in there: + + * chancnt: should be the number of channels your driver is exposing + to the system. + This doesn't have to be the number of physical + channels: some DMA controllers also expose virtual + channels to the system to overcome the case where you + have more consumers than physical channels available. + + * channels: should be initialized as a list using the + INIT_LIST_HEAD macro for example + + * dev: should hold the pointer to the struct device associated + with your current driver instance. + +Supported transaction types +--------------------------- +The next thing you need is to actually set which transaction types your +device (and driver) supports. + +Our dma_device structure has a field called cap_mask that holds the +various types of transactions supported, and you need to modify this +mask using the dma_cap_set function, with various flags depending on the +transaction types you support as an argument. + +All those capabilities are defined in the dma_transaction_type enum, +in include/linux/dmaengine.h + +Currently, the types available are: + * DMA_MEMCPY + - The device is able to do memory to memory copies + + * DMA_XOR + - The device is able to perform XOR operations on memory areas + - Particularly useful to accelerate XOR intensive tasks, such as + RAID5 + + * DMA_XOR_VAL + - The device is able to perform a parity check using the XOR + algorithm against a memory buffer. + + * DMA_PQ + - The device is able to perform RAID6 P+Q computations, P being a + simple XOR, and Q being a Reed-Solomon algorithm. + + * DMA_PQ_VAL + - The device is able to perform a parity check using the RAID6 P+Q + algorithm against a memory buffer. + + * DMA_INTERRUPT + /* TODO: Is it that the device has one interrupt per channel? */ + + * DMA_SG + - The device supports memory to memory scatter-gather + transfers. + - Even though a plain memcpy can look like a particular case of a + scatter-gather transfer, with a single chunk to transfer, it's a + distinct transaction type in the mem2mem transfers case + + * DMA_PRIVATE + - The device can have several clients at a time, most likely + because it has several parallel channels. + + * DMA_ASYNC_TX + - Must not be set by the device, and will be set by the framework + if needed + - /* TODO: What is it about?
*/ + + * DMA_SLAVE + - The device can handle device to memory transfers, including + scatter-gather transfers. + - While in the mem2mem case we were having two distinct types to + deal with a single chunk to copy or a collection of them, here, + we just have a single transaction type that is supposed to + handle both. + + * DMA_CYCLIC + - The device can handle cyclic transfers. + - A cyclic transfer is a transfer where the chunk collection will + loop over itself, with the last item pointing to the first. It's + usually used for audio transfers, where you want to operate on a + single big buffer that you will fill with your audio data. + + * DMA_INTERLEAVE + - The device supports interleaved transfers. Those transfers usually + involve an interleaved set of data, with chunks a few bytes + wide, where a scatter-gather transfer would be quite + inefficient. + +These various types will also affect how the source and destination +addresses change over time: DMA_SLAVE transfers will usually have +one address that increments while the other does not, +DMA_CYCLIC will have one address that loops while the other does +not change, etc. + +Device operations +----------------- + +Our dma_device structure also requires a few function pointers in +order to implement the actual logic, now that we have described what +operations we were able to perform. + +The functions that we have to fill in there, and hence have to +implement, obviously depend on the transaction types you reported as +supported. + + * device_alloc_chan_resources + * device_free_chan_resources + - These functions will be called whenever a driver calls + dma_request_channel or dma_release_channel for the first/last + time on the channel associated with that driver. + - They are in charge of allocating/freeing all the needed + resources in order for that channel to be useful for your + driver. + - These functions can sleep. + + * device_prep_dma_* + - These functions match the capabilities you registered + previously. + - These functions all take the buffer or the scatterlist relevant + for the transfer being prepared, and should create a hardware + descriptor or a list of descriptors from it + - These functions can be called from an interrupt context + - Any allocation you might do should be using the GFP_NOWAIT + flag, in order not to potentially sleep, but without depleting + the emergency pool either. + + - It should return a unique instance of the + dma_async_tx_descriptor structure, that further represents this + particular transfer. + + - This structure can be allocated using the function + dma_async_tx_descriptor_init. + - You'll also need to set two fields in this structure: + + flags: + TODO: Can it be modified by the driver itself, or + should it always be the flags passed in the arguments + + + tx_submit: A pointer to a function you have to implement, + that is supposed to push the current descriptor + to a pending queue, waiting for issue_pending to + be called. + + * device_issue_pending + - Takes the first descriptor in the pending queue, and starts the + transfer. Whenever that transfer is done, it should move to the + next transaction in the list. + - It should call the registered callback, if any, each time a + transaction is done.
+ - This function can be called in an interrupt context + + * device_tx_status + - Should report the bytes left to go over in the current transfer + for the given channel + - Should use dma_set_residue to report it + - In the case of a cyclic transfer, it should only take into + account the current period. + - This function can be called in an interrupt context. + + * device_control + - Used by client drivers to control and configure the channel they + have a handle on. + - Called with a command and an argument + + The command is one of the values listed by the enum + dma_ctrl_cmd. To date, the valid commands are: + + DMA_RESUME + + Restarts a transfer on the channel + + DMA_PAUSE + + Pauses a transfer on the channel + + DMA_TERMINATE_ALL + + Aborts all the pending and ongoing transfers on the + channel + + DMA_SLAVE_CONFIG + + Reconfigures the channel with the passed configuration + + FSLDMA_EXTERNAL_START + + TODO: Why does that even exist? + + The argument is an opaque unsigned long. This actually is a + pointer to a struct dma_slave_config that should be used only + in the DMA_SLAVE_CONFIG case. + +Misc notes (stuff that should be documented, but we don't really know +what to say about it) +------------------------------------------------------------------ + * dma_run_dependencies + - What is it supposed to do/when should it be called? + - Some drivers seem to implement it at the end of a transfer, but + not all of them do, so it seems we can get away without it + + * device_slave_caps + - Isn't that redundant with the cap_mask already? + - Only a few drivers seem to implement it + + * dma cookies? + +Glossary +-------- + +Burst: Usually a few contiguous bytes that will be transferred + at once by the DMA controller +Chunk: A contiguous collection of bursts +Transfer: A collection of chunks (be it contiguous or not)
The dmaengine is neither trivial nor properly documented at the moment, which means a lot of trial and error development, which is not that good for such a central piece of the system. Attempt at making such a documentation. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> --- Documentation/dmaengine-driver.txt | 293 +++++++++++++++++++++++++++++++++++++ 1 file changed, 293 insertions(+) create mode 100644 Documentation/dmaengine-driver.txt
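To illustrate the prepare/submit/issue flow the document describes, here is a hedged sketch of the three callbacks working together (the foo_* structures and helpers are assumptions; dma_async_tx_descriptor_init and dma_cookie_assign are the real helpers from the dmaengine headers):

	static dma_cookie_t foo_tx_submit(struct dma_async_tx_descriptor *txd)
	{
		struct foo_chan *fc = to_foo_chan(txd->chan);
		struct foo_desc *d = to_foo_desc(txd);

		/* Queue only: nothing starts until issue_pending. */
		list_add_tail(&d->node, &fc->pending);
		return dma_cookie_assign(txd);
	}

	static struct dma_async_tx_descriptor *
	foo_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
			  unsigned int sg_len, enum dma_transfer_direction dir,
			  unsigned long flags, void *context)
	{
		struct foo_chan *fc = to_foo_chan(chan);
		struct foo_desc *d;

		/* May run in interrupt context: GFP_NOWAIT, as the
		 * document recommends. */
		d = kzalloc(sizeof(*d), GFP_NOWAIT);
		if (!d)
			return NULL;

		/* Build the hardware descriptor(s) from the scatterlist,
		 * using the last DMA_SLAVE_CONFIG snapshot. */
		foo_fill_hw_desc(fc, d, sgl, sg_len, dir, &fc->cfg);

		dma_async_tx_descriptor_init(&d->txd, chan);
		d->txd.flags = flags;
		d->txd.tx_submit = foo_tx_submit;

		return &d->txd;
	}

	static void foo_issue_pending(struct dma_chan *chan)
	{
		struct foo_chan *fc = to_foo_chan(chan);

		/* Start the first pending descriptor if the channel is
		 * idle; the completion handler moves on to the next. */
		if (!fc->active && !list_empty(&fc->pending))
			foo_start(fc, list_first_entry(&fc->pending,
						       struct foo_desc, node));
	}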