Message ID | 1486455406-11202-3-git-send-email-anup.patel@broadcom.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Herbert Xu |
Headers | show |
On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: > The DMAENGINE framework assumes that if PQ offload is supported by a > DMA device then all 256 PQ coefficients are supported. This assumption > does not hold anymore because we now have BCM-SBA-RAID offload engine > which supports PQ offload with limited number of PQ coefficients. > > This patch extends async_tx APIs to handle DMA devices with support > for fewer PQ coefficients. > > Signed-off-by: Anup Patel <anup.patel@broadcom.com> > Reviewed-by: Scott Branden <scott.branden@broadcom.com> I don't like this approach. Define an interface for md to query the offload engine once at the beginning of time. We should not be adding any new extensions to async_tx.
On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: > On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >> The DMAENGINE framework assumes that if PQ offload is supported by a >> DMA device then all 256 PQ coefficients are supported. This assumption >> does not hold anymore because we now have BCM-SBA-RAID offload engine >> which supports PQ offload with limited number of PQ coefficients. >> >> This patch extends async_tx APIs to handle DMA devices with support >> for fewer PQ coefficients. >> >> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >> Reviewed-by: Scott Branden <scott.branden@broadcom.com> > > I don't like this approach. Define an interface for md to query the > offload engine once at the beginning of time. We should not be adding > any new extensions to async_tx. Even if we do capability checks in Linux MD, we still need a way for DMAENGINE drivers to advertise number of PQ coefficients handled by the HW. I agree capability checks should be done once in Linux MD but I don't see why this has to be part of BCM-SBA-RAID driver patches. We need separate patchsets to address limitations of async_tx framework. Regards, Anup
On Tue, Feb 07, 2017 at 02:32:15PM +0530, Anup Patel wrote: > On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: > > On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: > >> The DMAENGINE framework assumes that if PQ offload is supported by a > >> DMA device then all 256 PQ coefficients are supported. This assumption > >> does not hold anymore because we now have BCM-SBA-RAID offload engine > >> which supports PQ offload with limited number of PQ coefficients. > >> > >> This patch extends async_tx APIs to handle DMA devices with support > >> for fewer PQ coefficients. > >> > >> Signed-off-by: Anup Patel <anup.patel@broadcom.com> > >> Reviewed-by: Scott Branden <scott.branden@broadcom.com> > > > > I don't like this approach. Define an interface for md to query the > > offload engine once at the beginning of time. We should not be adding > > any new extensions to async_tx. > > Even if we do capability checks in Linux MD, we still need a way > for DMAENGINE drivers to advertise number of PQ coefficients > handled by the HW. If the question is only for advertising caps, then why not do as done for dma_get_slave_caps(). you can add dma_get_pq_caps() so that clients (md) in this case would know the HW capability. > I agree capability checks should be done once in Linux MD but I don't > see why this has to be part of BCM-SBA-RAID driver patches. We need > separate patchsets to address limitations of async_tx framework. > > Regards, > Anup
On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel <anup.patel@broadcom.com> wrote: > On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: >> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>> The DMAENGINE framework assumes that if PQ offload is supported by a >>> DMA device then all 256 PQ coefficients are supported. This assumption >>> does not hold anymore because we now have BCM-SBA-RAID offload engine >>> which supports PQ offload with limited number of PQ coefficients. >>> >>> This patch extends async_tx APIs to handle DMA devices with support >>> for fewer PQ coefficients. >>> >>> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >>> Reviewed-by: Scott Branden <scott.branden@broadcom.com> >> >> I don't like this approach. Define an interface for md to query the >> offload engine once at the beginning of time. We should not be adding >> any new extensions to async_tx. > > Even if we do capability checks in Linux MD, we still need a way > for DMAENGINE drivers to advertise number of PQ coefficients > handled by the HW. > > I agree capability checks should be done once in Linux MD but I don't > see why this has to be part of BCM-SBA-RAID driver patches. We need > separate patchsets to address limitations of async_tx framework. Right, separate enabling before we pile on new hardware support to a known broken framework.
On Tue, Feb 7, 2017 at 10:12 PM, Vinod Koul <vinod.koul@intel.com> wrote: > On Tue, Feb 07, 2017 at 02:32:15PM +0530, Anup Patel wrote: >> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: >> > On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >> >> The DMAENGINE framework assumes that if PQ offload is supported by a >> >> DMA device then all 256 PQ coefficients are supported. This assumption >> >> does not hold anymore because we now have BCM-SBA-RAID offload engine >> >> which supports PQ offload with limited number of PQ coefficients. >> >> >> >> This patch extends async_tx APIs to handle DMA devices with support >> >> for fewer PQ coefficients. >> >> >> >> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >> >> Reviewed-by: Scott Branden <scott.branden@broadcom.com> >> > >> > I don't like this approach. Define an interface for md to query the >> > offload engine once at the beginning of time. We should not be adding >> > any new extensions to async_tx. >> >> Even if we do capability checks in Linux MD, we still need a way >> for DMAENGINE drivers to advertise number of PQ coefficients >> handled by the HW. > > If the question is only for advertising caps, then why not do as done > for dma_get_slave_caps(). you can add dma_get_pq_caps() so that clients (md) > in this case would know the HW capability. We have large number of possible capabilities for DMA slave such as src_addr_widths, dst_addr_widths, directions, max_burst, residue_granularity, and descriptor_resue. The possible capabilities of PQ offload are: 1. Number of PQ sources handled by PQ offload (Represented by "max_pq" member of "struct dma_device") 2. Number of PQ coefficients handled by PQ offload The above two PQ capabilities are good enough for current PQ HW and future PQ HW so we just need a way to specify number of PQ coefficients. Till now all of the PQ HW always supported all 256 PQ coefficients so we never felt the need of capability to specify PQ coefficients. The BCM-SBA-RAID is the only HW (as far as I know) which does not support all 256 PQ coefficients. Currently, DMAENGINE drivers use dma_set_maxpq() to specify number of PQ sources handled by PQ HW and Linux Async Tx uses dma_maxpq() to get number of PQ sources. On similar lines, we added dma_set_maxpqcoef() to specify number of PQ coefficients and Linux Async Tx uses dma_maxpqcoef() to get number of PQ coefficients. If DMAENGINE driver does not specify PQ coefficients then dma_maxpqcoef() will return 256 assuming all PQ coefficients are supported. This approach is backward compatible to existing DMAENGINE APIs and will not break existing DMAENGINE drivers. If we add dma_get_pq_caps() similar to the dma_get_slave_caps() for PQ capabilities then we will have to use this new method for both of the above PQ capabilities and we have to change all DMAENGINE drivers to use new method of specifying PQ capabilities. I think this is too intrusive and bit overkill because its very very unlikely to see anymore additions to PQ capabilities. Regards, Anup
On Tue, Feb 7, 2017 at 11:46 PM, Dan Williams <dan.j.williams@intel.com> wrote: > On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel <anup.patel@broadcom.com> wrote: >> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>> The DMAENGINE framework assumes that if PQ offload is supported by a >>>> DMA device then all 256 PQ coefficients are supported. This assumption >>>> does not hold anymore because we now have BCM-SBA-RAID offload engine >>>> which supports PQ offload with limited number of PQ coefficients. >>>> >>>> This patch extends async_tx APIs to handle DMA devices with support >>>> for fewer PQ coefficients. >>>> >>>> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >>>> Reviewed-by: Scott Branden <scott.branden@broadcom.com> >>> >>> I don't like this approach. Define an interface for md to query the >>> offload engine once at the beginning of time. We should not be adding >>> any new extensions to async_tx. >> >> Even if we do capability checks in Linux MD, we still need a way >> for DMAENGINE drivers to advertise number of PQ coefficients >> handled by the HW. >> >> I agree capability checks should be done once in Linux MD but I don't >> see why this has to be part of BCM-SBA-RAID driver patches. We need >> separate patchsets to address limitations of async_tx framework. > > Right, separate enabling before we pile on new hardware support to a > known broken framework. Linux Async Tx not broken framework. The issue is: 1. Its not complete enough 2. Its not optimized for very high through-put offload engines Regards, Anup
On Wed, Feb 8, 2017 at 12:57 AM, Anup Patel <anup.patel@broadcom.com> wrote: > On Tue, Feb 7, 2017 at 11:46 PM, Dan Williams <dan.j.williams@intel.com> wrote: >> On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>>> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>>> The DMAENGINE framework assumes that if PQ offload is supported by a >>>>> DMA device then all 256 PQ coefficients are supported. This assumption >>>>> does not hold anymore because we now have BCM-SBA-RAID offload engine >>>>> which supports PQ offload with limited number of PQ coefficients. >>>>> >>>>> This patch extends async_tx APIs to handle DMA devices with support >>>>> for fewer PQ coefficients. >>>>> >>>>> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >>>>> Reviewed-by: Scott Branden <scott.branden@broadcom.com> >>>> >>>> I don't like this approach. Define an interface for md to query the >>>> offload engine once at the beginning of time. We should not be adding >>>> any new extensions to async_tx. >>> >>> Even if we do capability checks in Linux MD, we still need a way >>> for DMAENGINE drivers to advertise number of PQ coefficients >>> handled by the HW. >>> >>> I agree capability checks should be done once in Linux MD but I don't >>> see why this has to be part of BCM-SBA-RAID driver patches. We need >>> separate patchsets to address limitations of async_tx framework. >> >> Right, separate enabling before we pile on new hardware support to a >> known broken framework. > > Linux Async Tx not broken framework. The issue is: > 1. Its not complete enough > 2. Its not optimized for very high through-put offload engines I'm not understanding your point. I'm nak'ing this change to add yet more per-transaction capability checking to async_tx. I don't like the DMA_HAS_FEWER_PQ_COEF flag, especially since it is equal to DMA_HAS_PQ_CONTINUE. I'm not asking for all of async_tx's problems to be fixed before this new hardware support, I'm simply saying we should start the process of moving offload-engine capability checking to the raid code.
On Wed, Feb 8, 2017 at 9:54 PM, Dan Williams <dan.j.williams@intel.com> wrote: > On Wed, Feb 8, 2017 at 12:57 AM, Anup Patel <anup.patel@broadcom.com> wrote: >> On Tue, Feb 7, 2017 at 11:46 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>> On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>>>> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>>>> The DMAENGINE framework assumes that if PQ offload is supported by a >>>>>> DMA device then all 256 PQ coefficients are supported. This assumption >>>>>> does not hold anymore because we now have BCM-SBA-RAID offload engine >>>>>> which supports PQ offload with limited number of PQ coefficients. >>>>>> >>>>>> This patch extends async_tx APIs to handle DMA devices with support >>>>>> for fewer PQ coefficients. >>>>>> >>>>>> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >>>>>> Reviewed-by: Scott Branden <scott.branden@broadcom.com> >>>>> >>>>> I don't like this approach. Define an interface for md to query the >>>>> offload engine once at the beginning of time. We should not be adding >>>>> any new extensions to async_tx. >>>> >>>> Even if we do capability checks in Linux MD, we still need a way >>>> for DMAENGINE drivers to advertise number of PQ coefficients >>>> handled by the HW. >>>> >>>> I agree capability checks should be done once in Linux MD but I don't >>>> see why this has to be part of BCM-SBA-RAID driver patches. We need >>>> separate patchsets to address limitations of async_tx framework. >>> >>> Right, separate enabling before we pile on new hardware support to a >>> known broken framework. >> >> Linux Async Tx not broken framework. The issue is: >> 1. Its not complete enough >> 2. Its not optimized for very high through-put offload engines > > I'm not understanding your point. I'm nak'ing this change to add yet > more per-transaction capability checking to async_tx. I don't like the > DMA_HAS_FEWER_PQ_COEF flag, especially since it is equal to > DMA_HAS_PQ_CONTINUE. I'm not asking for all of async_tx's problems to > be fixed before this new hardware support, I'm simply saying we should > start the process of moving offload-engine capability checking to the > raid code. The DMA_HAS_FEWER_PQ_COEF is not equal to DMA_HAS_PQ_CONTINUE. I will try to drop this patch and take care of unsupported PQ coefficients in BCM-SBA-RAID driver itself even if this means doing some computations in BCM-SBA-RAID driver itself. Regards, Anup
On Thu, Feb 9, 2017 at 1:29 AM, Anup Patel <anup.patel@broadcom.com> wrote: > On Wed, Feb 8, 2017 at 9:54 PM, Dan Williams <dan.j.williams@intel.com> wrote: >> On Wed, Feb 8, 2017 at 12:57 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>> On Tue, Feb 7, 2017 at 11:46 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>>> On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>>> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>>>>> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>>>>> The DMAENGINE framework assumes that if PQ offload is supported by a >>>>>>> DMA device then all 256 PQ coefficients are supported. This assumption >>>>>>> does not hold anymore because we now have BCM-SBA-RAID offload engine >>>>>>> which supports PQ offload with limited number of PQ coefficients. >>>>>>> >>>>>>> This patch extends async_tx APIs to handle DMA devices with support >>>>>>> for fewer PQ coefficients. >>>>>>> >>>>>>> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >>>>>>> Reviewed-by: Scott Branden <scott.branden@broadcom.com> >>>>>> >>>>>> I don't like this approach. Define an interface for md to query the >>>>>> offload engine once at the beginning of time. We should not be adding >>>>>> any new extensions to async_tx. >>>>> >>>>> Even if we do capability checks in Linux MD, we still need a way >>>>> for DMAENGINE drivers to advertise number of PQ coefficients >>>>> handled by the HW. >>>>> >>>>> I agree capability checks should be done once in Linux MD but I don't >>>>> see why this has to be part of BCM-SBA-RAID driver patches. We need >>>>> separate patchsets to address limitations of async_tx framework. >>>> >>>> Right, separate enabling before we pile on new hardware support to a >>>> known broken framework. >>> >>> Linux Async Tx not broken framework. The issue is: >>> 1. Its not complete enough >>> 2. Its not optimized for very high through-put offload engines >> >> I'm not understanding your point. I'm nak'ing this change to add yet >> more per-transaction capability checking to async_tx. I don't like the >> DMA_HAS_FEWER_PQ_COEF flag, especially since it is equal to >> DMA_HAS_PQ_CONTINUE. I'm not asking for all of async_tx's problems to >> be fixed before this new hardware support, I'm simply saying we should >> start the process of moving offload-engine capability checking to the >> raid code. > > The DMA_HAS_FEWER_PQ_COEF is not equal to > DMA_HAS_PQ_CONTINUE. #define DMA_HAS_PQ_CONTINUE (1 << 15 #define DMA_HAS_FEWER_PQ_COEF (1 << 15) > I will try to drop this patch and take care of unsupported PQ > coefficients in BCM-SBA-RAID driver itself even if this means > doing some computations in BCM-SBA-RAID driver itself. That should be nak'd as well, please do capability detection in a routine that is common to all raid engines.
On Thu, Feb 9, 2017 at 10:14 PM, Dan Williams <dan.j.williams@intel.com> wrote: > On Thu, Feb 9, 2017 at 1:29 AM, Anup Patel <anup.patel@broadcom.com> wrote: >> On Wed, Feb 8, 2017 at 9:54 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>> On Wed, Feb 8, 2017 at 12:57 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>> On Tue, Feb 7, 2017 at 11:46 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>>>> On Tue, Feb 7, 2017 at 1:02 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>>>> On Tue, Feb 7, 2017 at 1:57 PM, Dan Williams <dan.j.williams@intel.com> wrote: >>>>>>> On Tue, Feb 7, 2017 at 12:16 AM, Anup Patel <anup.patel@broadcom.com> wrote: >>>>>>>> The DMAENGINE framework assumes that if PQ offload is supported by a >>>>>>>> DMA device then all 256 PQ coefficients are supported. This assumption >>>>>>>> does not hold anymore because we now have BCM-SBA-RAID offload engine >>>>>>>> which supports PQ offload with limited number of PQ coefficients. >>>>>>>> >>>>>>>> This patch extends async_tx APIs to handle DMA devices with support >>>>>>>> for fewer PQ coefficients. >>>>>>>> >>>>>>>> Signed-off-by: Anup Patel <anup.patel@broadcom.com> >>>>>>>> Reviewed-by: Scott Branden <scott.branden@broadcom.com> >>>>>>> >>>>>>> I don't like this approach. Define an interface for md to query the >>>>>>> offload engine once at the beginning of time. We should not be adding >>>>>>> any new extensions to async_tx. >>>>>> >>>>>> Even if we do capability checks in Linux MD, we still need a way >>>>>> for DMAENGINE drivers to advertise number of PQ coefficients >>>>>> handled by the HW. >>>>>> >>>>>> I agree capability checks should be done once in Linux MD but I don't >>>>>> see why this has to be part of BCM-SBA-RAID driver patches. We need >>>>>> separate patchsets to address limitations of async_tx framework. >>>>> >>>>> Right, separate enabling before we pile on new hardware support to a >>>>> known broken framework. >>>> >>>> Linux Async Tx not broken framework. The issue is: >>>> 1. Its not complete enough >>>> 2. Its not optimized for very high through-put offload engines >>> >>> I'm not understanding your point. I'm nak'ing this change to add yet >>> more per-transaction capability checking to async_tx. I don't like the >>> DMA_HAS_FEWER_PQ_COEF flag, especially since it is equal to >>> DMA_HAS_PQ_CONTINUE. I'm not asking for all of async_tx's problems to >>> be fixed before this new hardware support, I'm simply saying we should >>> start the process of moving offload-engine capability checking to the >>> raid code. >> >> The DMA_HAS_FEWER_PQ_COEF is not equal to >> DMA_HAS_PQ_CONTINUE. > > #define DMA_HAS_PQ_CONTINUE (1 << 15 > #define DMA_HAS_FEWER_PQ_COEF (1 << 15) You are only looking at the values of these flags. The semantics of both these flags are different and both flags are set in different members of "struct dma_device" The DMA_HAS_PQ_CONTINUE is set in "max_pq" whereas DMA_HAS_FEWER_PQ_COEF is set in "max_pqcoef". When DMA_HAS_PQ_CONTINUE is set in "max_pq", it means that PQ HW is capable of taking P & Q computed previous txn as input. If DMA_HAS_PQ_CONTINUE is not supported the async_pq() will pass P & Q computed by previous txn as sources with coef as g^0. When DMA_HAS_FEWER_PQ_COEF is set in "max_pqcoef", it means the PQ HW is not capable of handling all 256 coefs. > >> I will try to drop this patch and take care of unsupported PQ >> coefficients in BCM-SBA-RAID driver itself even if this means >> doing some computations in BCM-SBA-RAID driver itself. > > That should be nak'd as well, please do capability detection in a > routine that is common to all raid engines. Thanks for NAKing this patch. This motivated me to find clean work-around for handling unsupported PQ coefs in BCM-SBA-RAID driver. Let's assume max number of PQ coefs supported by PQ HW is m coefs. Now for any coef n > m, we can use RAID6 math to get g^n = (g^m)*(g^m)*....*(g^k) where k <= m. Using the above fact, we can create chained txn for each source of PQ request in BCM-SBA-RAID driver where each txn will only compute PQ from one source using above described RAID6 math. Also, each txn in chained txn will depend on output of previous txn because we are computing PQ from one source at a time. I will drop this patch and send updated BCM-SBA-RAID driver with above described work-around for handling unsupported PQ coefs. Regards, Anup
diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c index f83de99..16c6526 100644 --- a/crypto/async_tx/async_pq.c +++ b/crypto/async_tx/async_pq.c @@ -187,6 +187,9 @@ async_gen_syndrome(struct page **blocks, unsigned int offset, int disks, BUG_ON(disks > 255 || !(P(blocks, disks) || Q(blocks, disks))); + if (device && dma_maxpqcoef(device) < src_cnt) + device = NULL; + if (device) unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOWAIT); diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c index 8fab627..2916f95 100644 --- a/crypto/async_tx/async_raid6_recov.c +++ b/crypto/async_tx/async_raid6_recov.c @@ -352,6 +352,7 @@ async_raid6_2data_recov(int disks, size_t bytes, int faila, int failb, { void *scribble = submit->scribble; int non_zero_srcs, i; + struct dma_chan *chan = async_dma_find_channel(DMA_PQ); BUG_ON(faila == failb); if (failb < faila) @@ -359,12 +360,15 @@ async_raid6_2data_recov(int disks, size_t bytes, int faila, int failb, pr_debug("%s: disks: %d len: %zu\n", __func__, disks, bytes); + if (chan && dma_maxpqcoef(chan->device) < RAID6_PQ_MAX_COEF) + chan = NULL; + /* if a dma resource is not available or a scribble buffer is not * available punt to the synchronous path. In the 'dma not * available' case be sure to use the scribble buffer to * preserve the content of 'blocks' as the caller intended. */ - if (!async_dma_find_channel(DMA_PQ) || !scribble) { + if (!chan || !scribble) { void **ptrs = scribble ? scribble : (void **) blocks; async_tx_quiesce(&submit->depend_tx); @@ -432,15 +436,19 @@ async_raid6_datap_recov(int disks, size_t bytes, int faila, void *scribble = submit->scribble; int good_srcs, good, i; struct page *srcs[2]; + struct dma_chan *chan = async_dma_find_channel(DMA_PQ); pr_debug("%s: disks: %d len: %zu\n", __func__, disks, bytes); + if (chan && dma_maxpqcoef(chan->device) < RAID6_PQ_MAX_COEF) + chan = NULL; + /* if a dma resource is not available or a scribble buffer is not * available punt to the synchronous path. In the 'dma not * available' case be sure to use the scribble buffer to * preserve the content of 'blocks' as the caller intended. */ - if (!async_dma_find_channel(DMA_PQ) || !scribble) { + if (!chan || !scribble) { void **ptrs = scribble ? scribble : (void **) blocks; async_tx_quiesce(&submit->depend_tx); diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h index feee6ec..d938a8b 100644 --- a/include/linux/dmaengine.h +++ b/include/linux/dmaengine.h @@ -24,6 +24,7 @@ #include <linux/scatterlist.h> #include <linux/bitmap.h> #include <linux/types.h> +#include <linux/raid/pq.h> #include <asm/page.h> /** @@ -668,6 +669,7 @@ struct dma_filter { * @cap_mask: one or more dma_capability flags * @max_xor: maximum number of xor sources, 0 if no capability * @max_pq: maximum number of PQ sources and PQ-continue capability + * @max_pqcoef: maximum number of PQ coefficients, 0 if all supported * @copy_align: alignment shift for memcpy operations * @xor_align: alignment shift for xor operations * @pq_align: alignment shift for pq operations @@ -727,11 +729,13 @@ struct dma_device { dma_cap_mask_t cap_mask; unsigned short max_xor; unsigned short max_pq; + unsigned short max_pqcoef; enum dmaengine_alignment copy_align; enum dmaengine_alignment xor_align; enum dmaengine_alignment pq_align; enum dmaengine_alignment fill_align; #define DMA_HAS_PQ_CONTINUE (1 << 15) + #define DMA_HAS_FEWER_PQ_COEF (1 << 15) int dev_id; struct device *dev; @@ -1122,6 +1126,21 @@ static inline int dma_maxpq(struct dma_device *dma, enum dma_ctrl_flags flags) BUG(); } +static inline void dma_set_maxpqcoef(struct dma_device *dma, + unsigned short max_pqcoef) +{ + if (max_pqcoef < RAID6_PQ_MAX_COEF) { + dma->max_pqcoef = max_pqcoef; + dma->max_pqcoef |= DMA_HAS_FEWER_PQ_COEF; + } +} + +static inline unsigned short dma_maxpqcoef(struct dma_device *dma) +{ + return (dma->max_pqcoef & DMA_HAS_FEWER_PQ_COEF) ? + (dma->max_pqcoef & ~DMA_HAS_FEWER_PQ_COEF) : RAID6_PQ_MAX_COEF; +} + static inline size_t dmaengine_get_icg(bool inc, bool sgl, size_t icg, size_t dir_icg) { diff --git a/include/linux/raid/pq.h b/include/linux/raid/pq.h index 30f9453..f3a04bb 100644 --- a/include/linux/raid/pq.h +++ b/include/linux/raid/pq.h @@ -15,6 +15,9 @@ #ifdef __KERNEL__ +/* Max number of PQ coefficients */ +#define RAID6_PQ_MAX_COEF 256 + /* Set to 1 to use kernel-wide empty_zero_page */ #define RAID6_USE_EMPTY_ZERO_PAGE 0 #include <linux/blkdev.h>