[v6,05/14] crypto: marvell/CESA: add TDMA support

Message ID 1434527142-3609-6-git-send-email-boris.brezillon@free-electrons.com (mailing list archive)
State Changes Requested
Delegated to: Herbert Xu
Commit Message

Boris BREZILLON June 17, 2015, 7:45 a.m. UTC
The CESA IP supports CPU offload through a dedicated DMA engine (TDMA)
which can control the crypto block.
In this mode, all the required data (operation metadata and payload
data) is transferred using DMA, and the results are retrieved through
DMA when possible (hash results are not retrieved through DMA yet).
This reduces CPU involvement and improves performance in most cases
(for small requests, the cost of DMA preparation might exceed the
performance gain).

Note that some CESA IPs do not embed this dedicated DMA engine, so the
feature is enabled on a per-platform basis.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
---
 drivers/crypto/Kconfig          |   1 +
 drivers/crypto/marvell/Makefile |   2 +-
 drivers/crypto/marvell/cesa.c   |  68 +++++++
 drivers/crypto/marvell/cesa.h   | 229 ++++++++++++++++++++++
 drivers/crypto/marvell/cipher.c | 166 +++++++++++++++-
 drivers/crypto/marvell/hash.c   | 416 +++++++++++++++++++++++++++++++++++++++-
 drivers/crypto/marvell/tdma.c   | 224 ++++++++++++++++++++++
 7 files changed, 1090 insertions(+), 16 deletions(-)
 create mode 100644 drivers/crypto/marvell/tdma.c

Comments

Herbert Xu June 17, 2015, 9:50 a.m. UTC | #1
On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
>
> +	ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
> +			 DMA_TO_DEVICE);
> +	if (!ret)
> +		return -ENOMEM;
> +
> +	creq->src_nents = ret;

DMA-API-HOWTO says that you must retain the original nents and
use it when you call dma_unmap_sg.  So I'm afraid one more repost
is needed :)

Thanks,
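
For readers following the review point above, here is a minimal userspace sketch of the rule being cited: keep the original entry count for the unmap call, and store the count returned by the map call separately. Everything named `toy_*` is hypothetical and only models the contract; it is not the kernel DMA API, and the coalescing policy shown is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a scatterlist entry; NOT the kernel's struct scatterlist. */
struct toy_sg {
	unsigned long addr;     /* CPU-side address of the segment */
	unsigned int length;    /* CPU-side length of the segment */
	unsigned long dma_addr; /* filled in by toy_map_sg() */
	unsigned int dma_len;   /* filled in by toy_map_sg() */
};

/* Hypothetical request context, loosely modelled on the driver's creq. */
struct toy_req {
	int src_nents;     /* original count, needed again at unmap time */
	int src_dma_nents; /* count returned by the map call */
};

/*
 * Hypothetical model of a mapping call: merges address-contiguous
 * entries and returns the number of DMA segments, which may be smaller
 * than nents. Returns 0 on failure.
 */
static int toy_map_sg(struct toy_sg *sg, int nents)
{
	int i, mapped = 0;

	assert(sg != NULL);
	if (nents <= 0)
		return 0;

	for (i = 0; i < nents; i++) {
		if (mapped &&
		    sg[mapped - 1].dma_addr + sg[mapped - 1].dma_len == sg[i].addr) {
			/* Contiguous with the previous DMA segment: merge. */
			sg[mapped - 1].dma_len += sg[i].length;
		} else {
			sg[mapped].dma_addr = sg[i].addr;
			sg[mapped].dma_len = sg[i].length;
			mapped++;
		}
	}

	/* Modelling choice of this sketch: unused tail entries get length 0. */
	for (i = mapped; i < nents; i++)
		sg[i].dma_len = 0;

	return mapped;
}

/* Map the source list without overwriting the original entry count. */
static int toy_req_map(struct toy_req *req, struct toy_sg *sg)
{
	int ret = toy_map_sg(sg, req->src_nents);

	if (!ret)
		return -1;

	req->src_dma_nents = ret; /* src_nents stays as it was */
	return 0;
}
```

In driver terms, this corresponds to leaving creq->src_nents untouched after dma_map_sg() and passing that same value to dma_unmap_sg() later.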
Boris BREZILLON June 17, 2015, 11:33 a.m. UTC | #2
On Wed, 17 Jun 2015 17:50:01 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
> >
> > +	ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
> > +			 DMA_TO_DEVICE);
> > +	if (!ret)
> > +		return -ENOMEM;
> > +
> > +	creq->src_nents = ret;
> 
> DMA-API-HOWTO says that you must retain the original nents and
> use it when you call dma_unmap_sg.  So I'm afraid one more repost
> is needed :)

My bad (again :-/).
Actually, I think I don't need to save the dma_map_sg return val, since
I'm using the sg_next function to iterate over the scatterlist. Am I
right ?
IOW, is the ->map_sg() function (in dma_map_ops) supposed to merge the
contiguous entries and then flag the unused entries with the is_chain
flag ?
If that's not the case, and the ->map_sg() just marks the merged entries
as empty (length = 0), then I'll have to rework my iterator algorithm.
Herbert Xu June 17, 2015, 12:25 p.m. UTC | #3
On Wed, Jun 17, 2015 at 01:33:24PM +0200, Boris Brezillon wrote:
>
> Actually, I think I don't need to save the dma_map_sg return val, since
> I'm using the sg_next function to iterate over the scatterlist. Am I
> right ?
> IOW, is the ->map_sg() function (in dma_map_ops) supposed to merge the
> contiguous entries and then flag the unused entries with the is_chain
> flag ?

Right.  If you're simply using sg_dma_len() in conjunction with
sg_next() then you don't need the new length.

Cheers,
Russell King - ARM Linux June 18, 2015, 9:04 a.m. UTC | #4
On Wed, Jun 17, 2015 at 05:50:01PM +0800, Herbert Xu wrote:
> On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
> >
> > +	ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
> > +			 DMA_TO_DEVICE);
> > +	if (!ret)
> > +		return -ENOMEM;
> > +
> > +	creq->src_nents = ret;
> 
> DMA-API-HOWTO says that you must retain the original nents and
> use it when you call dma_unmap_sg.  So I'm afraid one more repost
> is needed :)

It's worse than that...  You're right on that point, but there's an
additional point.

If dma_map_sg() coalesces scatterlist entries, then ret will be smaller
than src_nents, and ret indicates how many scatterlist entries to be
walked during DMA - you should not use src_nents for that.  I couldn't
see where the driver used that information.  In fact, the driver seems
to be capable of walking more than src_nents/ret numbers of scatterlist
entries: it just keeps going with sg_next() until it hits the end of
the allocated scatterlist.
Boris BREZILLON June 18, 2015, 9:33 a.m. UTC | #5
Hi Russell,

On Thu, 18 Jun 2015 10:04:00 +0100
Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:

> On Wed, Jun 17, 2015 at 05:50:01PM +0800, Herbert Xu wrote:
> > On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
> > >
> > > +	ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
> > > +			 DMA_TO_DEVICE);
> > > +	if (!ret)
> > > +		return -ENOMEM;
> > > +
> > > +	creq->src_nents = ret;
> > 
> > DMA-API-HOWTO says that you must retain the original nents and
> > use it when you call dma_unmap_sg.  So I'm afraid one more repost
> > is needed :)
> 
> It's worse than that...  You're right on that point, but there's an
> additional point.
> 
> If dma_map_sg() coalesces scatterlist entries, then ret will be smaller
> than src_nents, and ret indicates how many scatterlist entries to be
> walked during DMA - you should not use src_nents for that.  I couldn't
> see where the driver used that information.  In fact, the driver seems
> to be capable of walking more than src_nents/ret numbers of scatterlist
> entries: it just keeps going with sg_next() until it hits the end of
> the allocated scatterlist.

Yes, I realized that, and I never used the value returned by
dma_map_sg() to walk the scatterlist anyway: I was using the sg_next()
and sg->length value (which I replaced by sg_dma_len() in v7 as
suggested by Herbert).
So assigning the dma_map_sg() return value to ->src_nents was just a
silly mistake caused by a careless reading of the DMA-API-HOWTO.

Am I missing something else ?

Best Regards,

Boris
Herbert Xu June 18, 2015, 9:37 a.m. UTC | #6
On Thu, Jun 18, 2015 at 10:04:00AM +0100, Russell King - ARM Linux wrote:
>
> If dma_map_sg() coalesces scatterlist entries, then ret will be smaller
> than src_nents, and ret indicates how many scatterlist entries to be
> walked during DMA - you should not use src_nents for that.  I couldn't
> see where the driver used that information.  In fact, the driver seems
> to be capable of walking more than src_nents/ret numbers of scatterlist
> entries: it just keeps going with sg_next() until it hits the end of
> the allocated scatterlist.

I think he should be OK even though he throws away the return value
because he's stopping his walk once the sum of the sg_dma_len() values
reaches his end goal.

Cheers,
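
The byte-count-driven walk described above can be sketched in plain C as follows. `toy_dma_seg` and `toy_walk` are hypothetical names standing in for a mapped scatterlist and the driver's iterator; the explicit `max_segs` bound is an addition of this sketch, not something taken from the patch.

```c
#include <assert.h>

/* Toy stand-in for one mapped DMA segment (address and length only). */
struct toy_dma_seg {
	unsigned long dma_addr;
	unsigned int dma_len;
};

/*
 * Walk the segments until 'total' bytes have been covered or the list
 * runs out. The number of segments touched is stored in *visited and
 * the number of bytes covered is returned, so a caller can detect a
 * list that was too short for the request.
 */
static unsigned int toy_walk(const struct toy_dma_seg *seg, int max_segs,
			     unsigned int total, int *visited)
{
	unsigned int done = 0;
	int i;

	assert(seg && visited);

	for (i = 0; i < max_segs && done < total; i++) {
		unsigned int chunk = seg[i].dma_len;

		/* Never take more than what the request still needs. */
		if (chunk > total - done)
			chunk = total - done;

		done += chunk;
	}

	*visited = i;
	return done;
}
```

Because the loop stops on the byte total, the walker does not need the segment count returned by the mapping call. The extra bound only guards against walking past the end when the list is shorter than the request.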
Russell King - ARM Linux June 18, 2015, 9:48 a.m. UTC | #7
On Thu, Jun 18, 2015 at 11:33:24AM +0200, Boris Brezillon wrote:
> Hi Russell,
> 
> On Thu, 18 Jun 2015 10:04:00 +0100
> Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> 
> > On Wed, Jun 17, 2015 at 05:50:01PM +0800, Herbert Xu wrote:
> > > On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
> > > >
> > > > +	ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
> > > > +			 DMA_TO_DEVICE);
> > > > +	if (!ret)
> > > > +		return -ENOMEM;
> > > > +
> > > > +	creq->src_nents = ret;
> > > 
> > > DMA-API-HOWTO says that you must retain the original nents and
> > > use it when you call dma_unmap_sg.  So I'm afraid one more repost
> > > is needed :)
> > 
> > It's worse than that...  You're right on that point, but there's an
> > additional point.
> > 
> > If dma_map_sg() coalesces scatterlist entries, then ret will be smaller
> > than src_nents, and ret indicates how many scatterlist entries to be
> > walked during DMA - you should not use src_nents for that.  I couldn't
> > see where the driver used that information.  In fact, the driver seems
> > to be capable of walking more than src_nents/ret numbers of scatterlist
> > entries: it just keeps going with sg_next() until it hits the end of
> > the allocated scatterlist.
> 
> Yes, I realized that, and I never used the value returned by
> dma_map_sg() to walk the scatterlist anyway: I was using the sg_next()
> and sg->length value (which I replaced by sg_dma_len() in v7 as
> suggested by Herbert).
> So assigning the dma_map_sg() return value to ->src_nents was just a
> silly mistake caused by a careless reading of the DMA-API-HOWTO.
> 
> Am I missing something else ?

Yes.  'ret' should be used to indicate the number of scatterlist entries
to walk for DMA purposes after the scatterlist has been mapped.  For PIO
purposes, using src_nents is still acceptable.

As Herbert points out, you're stopping after the sum of transferred bytes
matches, so I suppose that's fine.

One other point though: you should use sg_dma_address() rather than
dereferencing sg->dma_address directly.
Boris BREZILLON June 18, 2015, 9:52 a.m. UTC | #8
On Thu, 18 Jun 2015 10:48:03 +0100
Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:

> On Thu, Jun 18, 2015 at 11:33:24AM +0200, Boris Brezillon wrote:
> > Hi Russell,
> > 
> > On Thu, 18 Jun 2015 10:04:00 +0100
> > Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> > 
> > > On Wed, Jun 17, 2015 at 05:50:01PM +0800, Herbert Xu wrote:
> > > > On Wed, Jun 17, 2015 at 09:45:33AM +0200, Boris Brezillon wrote:
> > > > >
> > > > > +	ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
> > > > > +			 DMA_TO_DEVICE);
> > > > > +	if (!ret)
> > > > > +		return -ENOMEM;
> > > > > +
> > > > > +	creq->src_nents = ret;
> > > > 
> > > > DMA-API-HOWTO says that you must retain the original nents and
> > > > use it when you call dma_unmap_sg.  So I'm afraid one more repost
> > > > is needed :)
> > > 
> > > It's worse than that...  You're right on that point, but there's an
> > > additional point.
> > > 
> > > If dma_map_sg() coalesces scatterlist entries, then ret will be smaller
> > > than src_nents, and ret indicates how many scatterlist entries to be
> > > walked during DMA - you should not use src_nents for that.  I couldn't
> > > see where the driver used that information.  In fact, the driver seems
> > > to be capable of walking more than src_nents/ret numbers of scatterlist
> > > entries: it just keeps going with sg_next() until it hits the end of
> > > the allocated scatterlist.
> > 
> > Yes, I realized that, and I never used the value returned by
> > dma_map_sg() to walk the scatterlist anyway: I was using the sg_next()
> > and sg->length value (which I replaced by sg_dma_len() in v7 as
> > suggested by Herbert).
> > So assigning the dma_map_sg() return value to ->src_nents was just a
> > silly mistake caused by a careless reading of the DMA-API-HOWTO.
> > 
> > Am I missing something else ?
> 
> Yes.  'ret' should be used to indicate the number of scatterlist entries
> to walk for DMA purposes after the scatterlist has been mapped.  For PIO
> purposes, using src_nents is still acceptable.
> 
> As Herbert points out, you're stopping after the sum of transferred bytes
> matches, so I suppose that's fine.
> 
> One other point though: you should use sg_dma_address() rather than
> dereferencing sg->dma_address directly.
> 

Okay, I'll fix that before submitting a new version.

Thanks,

Boris

Patch

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index cbc3d3d..cdca762 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -185,6 +185,7 @@  config CRYPTO_DEV_MARVELL_CESA
 	help
 	  This driver allows you to utilize the Cryptographic Engines and
 	  Security Accelerator (CESA) which can be found on the Armada 370.
+	  This driver supports CPU offload through DMA transfers.
 
 	  This driver is aimed at replacing the mv_cesa driver. This will only
 	  happen once it has received proper testing.
diff --git a/drivers/crypto/marvell/Makefile b/drivers/crypto/marvell/Makefile
index 68d0982..0c12b13 100644
--- a/drivers/crypto/marvell/Makefile
+++ b/drivers/crypto/marvell/Makefile
@@ -1,2 +1,2 @@ 
 obj-$(CONFIG_CRYPTO_DEV_MARVELL_CESA) += marvell-cesa.o
-marvell-cesa-objs := cesa.o cipher.o hash.o
+marvell-cesa-objs := cesa.o cipher.o hash.o tdma.o
diff --git a/drivers/crypto/marvell/cesa.c b/drivers/crypto/marvell/cesa.c
index 76a6943..986f024 100644
--- a/drivers/crypto/marvell/cesa.c
+++ b/drivers/crypto/marvell/cesa.c
@@ -184,6 +184,7 @@  static const struct mv_cesa_caps armada_370_caps = {
 	.ncipher_algs = ARRAY_SIZE(armada_370_cipher_algs),
 	.ahash_algs = armada_370_ahash_algs,
 	.nahash_algs = ARRAY_SIZE(armada_370_ahash_algs),
+	.has_tdma = true,
 };
 
 static const struct of_device_id mv_cesa_of_match_table[] = {
@@ -192,6 +193,66 @@  static const struct of_device_id mv_cesa_of_match_table[] = {
 };
 MODULE_DEVICE_TABLE(of, mv_cesa_of_match_table);
 
+static void
+mv_cesa_conf_mbus_windows(struct mv_cesa_engine *engine,
+			  const struct mbus_dram_target_info *dram)
+{
+	void __iomem *iobase = engine->regs;
+	int i;
+
+	for (i = 0; i < 4; i++) {
+		writel(0, iobase + CESA_TDMA_WINDOW_CTRL(i));
+		writel(0, iobase + CESA_TDMA_WINDOW_BASE(i));
+	}
+
+	for (i = 0; i < dram->num_cs; i++) {
+		const struct mbus_dram_window *cs = dram->cs + i;
+
+		writel(((cs->size - 1) & 0xffff0000) |
+		       (cs->mbus_attr << 8) |
+		       (dram->mbus_dram_target_id << 4) | 1,
+		       iobase + CESA_TDMA_WINDOW_CTRL(i));
+		writel(cs->base, iobase + CESA_TDMA_WINDOW_BASE(i));
+	}
+}
+
+static int mv_cesa_dev_dma_init(struct mv_cesa_dev *cesa)
+{
+	struct device *dev = cesa->dev;
+	struct mv_cesa_dev_dma *dma;
+
+	if (!cesa->caps->has_tdma)
+		return 0;
+
+	dma = devm_kzalloc(dev, sizeof(*dma), GFP_KERNEL);
+	if (!dma)
+		return -ENOMEM;
+
+	dma->tdma_desc_pool = dmam_pool_create("tdma_desc", dev,
+					sizeof(struct mv_cesa_tdma_desc),
+					16, 0);
+	if (!dma->tdma_desc_pool)
+		return -ENOMEM;
+
+	dma->op_pool = dmam_pool_create("cesa_op", dev,
+					sizeof(struct mv_cesa_op_ctx), 16, 0);
+	if (!dma->op_pool)
+		return -ENOMEM;
+
+	dma->cache_pool = dmam_pool_create("cesa_cache", dev,
+					   CESA_MAX_HASH_BLOCK_SIZE, 1, 0);
+	if (!dma->cache_pool)
+		return -ENOMEM;
+
+	dma->padding_pool = dmam_pool_create("cesa_padding", dev, 72, 1, 0);
+	if (!dma->padding_pool)
+		return -ENOMEM;
+
+	cesa->dma = dma;
+
+	return 0;
+}
+
 static int mv_cesa_get_sram(struct platform_device *pdev, int idx)
 {
 	struct mv_cesa_dev *cesa = platform_get_drvdata(pdev);
@@ -299,6 +360,10 @@  static int mv_cesa_probe(struct platform_device *pdev)
 	if (IS_ERR(cesa->regs))
 		return -ENOMEM;
 
+	ret = mv_cesa_dev_dma_init(cesa);
+	if (ret)
+		return ret;
+
 	dram = mv_mbus_dram_info_nooverlap();
 
 	platform_set_drvdata(pdev, cesa);
@@ -347,6 +412,9 @@  static int mv_cesa_probe(struct platform_device *pdev)
 
 		engine->regs = cesa->regs + CESA_ENGINE_OFF(i);
 
+		if (dram && cesa->caps->has_tdma)
+			mv_cesa_conf_mbus_windows(&cesa->engines[i], dram);
+
 		writel(0, cesa->engines[i].regs + CESA_SA_INT_STATUS);
 		writel(CESA_SA_CFG_STOP_DIG_ERR,
 		       cesa->engines[i].regs + CESA_SA_CFG);
diff --git a/drivers/crypto/marvell/cesa.h b/drivers/crypto/marvell/cesa.h
index f68057c..1a48323 100644
--- a/drivers/crypto/marvell/cesa.h
+++ b/drivers/crypto/marvell/cesa.h
@@ -6,6 +6,7 @@ 
 #include <crypto/internal/hash.h>
 
 #include <linux/crypto.h>
+#include <linux/dmapool.h>
 
 #define CESA_ENGINE_OFF(i)			(((i) * 0x2000))
 
@@ -267,11 +268,94 @@  struct mv_cesa_op_ctx {
 	} ctx;
 };
 
+/* TDMA descriptor flags */
+#define CESA_TDMA_DST_IN_SRAM			BIT(31)
+#define CESA_TDMA_SRC_IN_SRAM			BIT(30)
+#define CESA_TDMA_TYPE_MSK			GENMASK(29, 0)
+#define CESA_TDMA_DUMMY				0
+#define CESA_TDMA_DATA				1
+#define CESA_TDMA_OP				2
+
+/**
+ * struct mv_cesa_tdma_desc - TDMA descriptor
+ * @byte_cnt:	number of bytes to transfer
+ * @src:	DMA address of the source
+ * @dst:	DMA address of the destination
+ * @next_dma:	DMA address of the next TDMA descriptor
+ * @cur_dma:	DMA address of this TDMA descriptor
+ * @next:	pointer to the next TDMA descriptor
+ * @op:		CESA operation attached to this TDMA descriptor
+ * @data:	raw data attached to this TDMA descriptor
+ * @flags:	flags describing the TDMA transfer. See the
+ *		"TDMA descriptor flags" section above
+ *
+ * TDMA descriptor used to create a transfer chain describing a crypto
+ * operation.
+ */
+struct mv_cesa_tdma_desc {
+	u32 byte_cnt;
+	u32 src;
+	u32 dst;
+	u32 next_dma;
+	u32 cur_dma;
+	struct mv_cesa_tdma_desc *next;
+	union {
+		struct mv_cesa_op_ctx *op;
+		void *data;
+	};
+	u32 flags;
+};
+
+/**
+ * struct mv_cesa_sg_dma_iter - scatter-gather iterator
+ * @dir:	transfer direction
+ * @sg:		scatter list
+ * @offset:	current position in the scatter list
+ * @op_offset:	current position in the crypto operation
+ *
+ * Iterator used to iterate over a scatterlist while creating a TDMA chain for
+ * a crypto operation.
+ */
+struct mv_cesa_sg_dma_iter {
+	enum dma_data_direction dir;
+	struct scatterlist *sg;
+	unsigned int offset;
+	unsigned int op_offset;
+};
+
+/**
+ * struct mv_cesa_dma_iter - crypto operation iterator
+ * @len:	the crypto operation length
+ * @offset:	current position in the crypto operation
+ * @op_len:	sub-operation length (the crypto engine can only act on 2kb
+ *		chunks)
+ *
+ * Iterator used to create a TDMA chain for a given crypto operation.
+ */
+struct mv_cesa_dma_iter {
+	unsigned int len;
+	unsigned int offset;
+	unsigned int op_len;
+};
+
+/**
+ * struct mv_cesa_tdma_chain - TDMA chain
+ * @first:	first entry in the TDMA chain
+ * @last:	last entry in the TDMA chain
+ *
+ * Stores a TDMA chain for a specific crypto operation.
+ */
+struct mv_cesa_tdma_chain {
+	struct mv_cesa_tdma_desc *first;
+	struct mv_cesa_tdma_desc *last;
+};
+
 struct mv_cesa_engine;
 
 /**
  * struct mv_cesa_caps - CESA device capabilities
  * @engines:		number of engines
+ * @has_tdma:		whether this device has a TDMA block
  * @cipher_algs:	supported cipher algorithms
  * @ncipher_algs:	number of supported cipher algorithms
  * @ahash_algs:		supported hash algorithms
@@ -281,6 +365,7 @@  struct mv_cesa_engine;
  */
 struct mv_cesa_caps {
 	int nengines;
+	bool has_tdma;
 	struct crypto_alg **cipher_algs;
 	int ncipher_algs;
 	struct ahash_alg **ahash_algs;
@@ -288,6 +373,24 @@  struct mv_cesa_caps {
 };
 
 /**
+ * struct mv_cesa_dev_dma - DMA pools
+ * @tdma_desc_pool:	TDMA desc pool
+ * @op_pool:		crypto operation pool
+ * @cache_pool:		data cache pool (used by hash implementation when the
+ *			hash request is smaller than the hash block size)
+ * @padding_pool:	padding pool (used by hash implementation when hardware
+ *			padding cannot be used)
+ *
+ * Structure containing the different DMA pools used by this driver.
+ */
+struct mv_cesa_dev_dma {
+	struct dma_pool *tdma_desc_pool;
+	struct dma_pool *op_pool;
+	struct dma_pool *cache_pool;
+	struct dma_pool *padding_pool;
+};
+
+/**
  * struct mv_cesa_dev - CESA device
  * @caps:	device capabilities
  * @regs:	device registers
@@ -295,6 +398,7 @@  struct mv_cesa_caps {
  * @lock:	device lock
  * @queue:	crypto request queue
  * @engines:	array of engines
+ * @dma:	dma pools
  *
  * Structure storing CESA device information.
  */
@@ -306,6 +410,7 @@  struct mv_cesa_dev {
 	spinlock_t lock;
 	struct crypto_queue queue;
 	struct mv_cesa_engine *engines;
+	struct mv_cesa_dev_dma *dma;
 };
 
 /**
@@ -391,9 +496,11 @@  struct mv_cesa_hmac_ctx {
 /**
  * enum mv_cesa_req_type - request type definitions
  * @CESA_STD_REQ:	standard request
+ * @CESA_DMA_REQ:	DMA request
  */
 enum mv_cesa_req_type {
 	CESA_STD_REQ,
+	CESA_DMA_REQ,
 };
 
 /**
@@ -407,6 +514,27 @@  struct mv_cesa_req {
 };
 
 /**
+ * struct mv_cesa_tdma_req - CESA TDMA request
+ * @base:	base information
+ * @chain:	TDMA chain
+ */
+struct mv_cesa_tdma_req {
+	struct mv_cesa_req base;
+	struct mv_cesa_tdma_chain chain;
+};
+
+/**
+ * struct mv_cesa_sg_std_iter - CESA scatter-gather iterator for standard
+ *				requests
+ * @iter:	sg mapping iterator
+ * @offset:	current offset in the SG entry mapped in memory
+ */
+struct mv_cesa_sg_std_iter {
+	struct sg_mapping_iter iter;
+	unsigned int offset;
+};
+
+/**
  * struct mv_cesa_ablkcipher_std_req - cipher standard request
  * @base:	base information
  * @op:		operation context
@@ -430,6 +558,7 @@  struct mv_cesa_ablkcipher_std_req {
 struct mv_cesa_ablkcipher_req {
 	union {
 		struct mv_cesa_req base;
+		struct mv_cesa_tdma_req dma;
 		struct mv_cesa_ablkcipher_std_req std;
 	} req;
 	int src_nents;
@@ -447,6 +576,20 @@  struct mv_cesa_ahash_std_req {
 };
 
 /**
+ * struct mv_cesa_ahash_dma_req - DMA hash request
+ * @base:		base information
+ * @padding:		padding buffer
+ * @padding_dma:	DMA address of the padding buffer
+ * @cache_dma:		DMA address of the cache buffer
+ */
+struct mv_cesa_ahash_dma_req {
+	struct mv_cesa_tdma_req base;
+	u8 *padding;
+	dma_addr_t padding_dma;
+	dma_addr_t cache_dma;
+};
+
+/**
  * struct mv_cesa_ahash_req - hash request
  * @req:		type specific request information
  * @cache:		cache buffer
@@ -460,6 +603,7 @@  struct mv_cesa_ahash_std_req {
 struct mv_cesa_ahash_req {
 	union {
 		struct mv_cesa_req base;
+		struct mv_cesa_ahash_dma_req dma;
 		struct mv_cesa_ahash_std_req std;
 	} req;
 	struct mv_cesa_op_ctx op_tmpl;
@@ -543,6 +687,91 @@  static inline u32 mv_cesa_get_int_mask(struct mv_cesa_engine *engine)
 
 int mv_cesa_queue_req(struct crypto_async_request *req);
 
+/* TDMA functions */
+
+static inline void mv_cesa_req_dma_iter_init(struct mv_cesa_dma_iter *iter,
+					     unsigned int len)
+{
+	iter->len = len;
+	iter->op_len = min(len, CESA_SA_SRAM_PAYLOAD_SIZE);
+	iter->offset = 0;
+}
+
+static inline void mv_cesa_sg_dma_iter_init(struct mv_cesa_sg_dma_iter *iter,
+					    struct scatterlist *sg,
+					    enum dma_data_direction dir)
+{
+	iter->op_offset = 0;
+	iter->offset = 0;
+	iter->sg = sg;
+	iter->dir = dir;
+}
+
+static inline unsigned int
+mv_cesa_req_dma_iter_transfer_len(struct mv_cesa_dma_iter *iter,
+				  struct mv_cesa_sg_dma_iter *sgiter)
+{
+	return min(iter->op_len - sgiter->op_offset,
+		   sgiter->sg->length - sgiter->offset);
+}
+
+bool mv_cesa_req_dma_iter_next_transfer(struct mv_cesa_dma_iter *chain,
+					struct mv_cesa_sg_dma_iter *sgiter,
+					unsigned int len);
+
+static inline bool mv_cesa_req_dma_iter_next_op(struct mv_cesa_dma_iter *iter)
+{
+	iter->offset += iter->op_len;
+	iter->op_len = min(iter->len - iter->offset,
+			   CESA_SA_SRAM_PAYLOAD_SIZE);
+
+	return iter->op_len;
+}
+
+void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq);
+
+static inline int mv_cesa_dma_process(struct mv_cesa_tdma_req *dreq,
+				      u32 status)
+{
+	if (!(status & CESA_SA_INT_ACC0_IDMA_DONE))
+		return -EINPROGRESS;
+
+	if (status & CESA_SA_INT_IDMA_OWN_ERR)
+		return -EINVAL;
+
+	return 0;
+}
+
+void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+			 struct mv_cesa_engine *engine);
+
+void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq);
+
+static inline void
+mv_cesa_tdma_desc_iter_init(struct mv_cesa_tdma_chain *chain)
+{
+	memset(chain, 0, sizeof(*chain));
+}
+
+struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
+					const struct mv_cesa_op_ctx *op_templ,
+					bool skip_ctx,
+					gfp_t flags);
+
+int mv_cesa_dma_add_data_transfer(struct mv_cesa_tdma_chain *chain,
+				  dma_addr_t dst, dma_addr_t src, u32 size,
+				  u32 flags, gfp_t gfp_flags);
+
+int mv_cesa_dma_add_dummy_launch(struct mv_cesa_tdma_chain *chain,
+				 u32 flags);
+
+int mv_cesa_dma_add_dummy_end(struct mv_cesa_tdma_chain *chain, u32 flags);
+
+int mv_cesa_dma_add_op_transfers(struct mv_cesa_tdma_chain *chain,
+				 struct mv_cesa_dma_iter *dma_iter,
+				 struct mv_cesa_sg_dma_iter *sgiter,
+				 gfp_t gfp_flags);
+
 /* Algorithm definitions */
 
 extern struct ahash_alg mv_sha1_alg;
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index e6eea48..a1f4013 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -21,6 +21,48 @@  struct mv_cesa_aes_ctx {
 	struct crypto_aes_ctx aes;
 };
 
+struct mv_cesa_ablkcipher_dma_iter {
+	struct mv_cesa_dma_iter base;
+	struct mv_cesa_sg_dma_iter src;
+	struct mv_cesa_sg_dma_iter dst;
+};
+
+static inline void
+mv_cesa_ablkcipher_req_iter_init(struct mv_cesa_ablkcipher_dma_iter *iter,
+				 struct ablkcipher_request *req)
+{
+	mv_cesa_req_dma_iter_init(&iter->base, req->nbytes);
+	mv_cesa_sg_dma_iter_init(&iter->src, req->src, DMA_TO_DEVICE);
+	mv_cesa_sg_dma_iter_init(&iter->dst, req->dst, DMA_FROM_DEVICE);
+}
+
+static inline bool
+mv_cesa_ablkcipher_req_iter_next_op(struct mv_cesa_ablkcipher_dma_iter *iter)
+{
+	iter->src.op_offset = 0;
+	iter->dst.op_offset = 0;
+
+	return mv_cesa_req_dma_iter_next_op(&iter->base);
+}
+
+static inline void
+mv_cesa_ablkcipher_dma_cleanup(struct ablkcipher_request *req)
+{
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+
+	dma_unmap_sg(cesa_dev->dev, req->dst, creq->dst_nents, DMA_FROM_DEVICE);
+	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
+	mv_cesa_dma_cleanup(&creq->req.dma);
+}
+
+static inline void mv_cesa_ablkcipher_cleanup(struct ablkcipher_request *req)
+{
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_ablkcipher_dma_cleanup(req);
+}
+
 static void mv_cesa_ablkcipher_std_step(struct ablkcipher_request *req)
 {
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
@@ -77,7 +119,11 @@  static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 	struct mv_cesa_engine *engine = sreq->base.engine;
 	int ret;
 
-	ret = mv_cesa_ablkcipher_std_process(ablkreq, status);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		ret = mv_cesa_dma_process(&creq->req.dma, status);
+	else
+		ret = mv_cesa_ablkcipher_std_process(ablkreq, status);
+
 	if (ret)
 		return ret;
 
@@ -90,8 +136,21 @@  static int mv_cesa_ablkcipher_process(struct crypto_async_request *req,
 static void mv_cesa_ablkcipher_step(struct crypto_async_request *req)
 {
 	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(ablkreq);
 
-	mv_cesa_ablkcipher_std_step(ablkreq);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->req.dma);
+	else
+		mv_cesa_ablkcipher_std_step(ablkreq);
+}
+
+static inline void
+mv_cesa_ablkcipher_dma_prepare(struct ablkcipher_request *req)
+{
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+
+	mv_cesa_dma_prepare(dreq, dreq->base.engine);
 }
 
 static inline void
@@ -115,12 +174,18 @@  static inline void mv_cesa_ablkcipher_prepare(struct crypto_async_request *req,
 
 	creq->req.base.engine = engine;
 
-	mv_cesa_ablkcipher_std_prepare(ablkreq);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_ablkcipher_dma_prepare(ablkreq);
+	else
+		mv_cesa_ablkcipher_std_prepare(ablkreq);
 }
 
 static inline void
 mv_cesa_ablkcipher_req_cleanup(struct crypto_async_request *req)
 {
+	struct ablkcipher_request *ablkreq = ablkcipher_request_cast(req);
+
+	mv_cesa_ablkcipher_cleanup(ablkreq);
 }
 
 static const struct mv_cesa_req_ops mv_cesa_ablkcipher_req_ops = {
@@ -166,6 +231,86 @@  static int mv_cesa_aes_setkey(struct crypto_ablkcipher *cipher, const u8 *key,
 	return 0;
 }
 
+static int mv_cesa_ablkcipher_dma_req_init(struct ablkcipher_request *req,
+				const struct mv_cesa_op_ctx *op_templ)
+{
+	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
+	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
+		      GFP_KERNEL : GFP_ATOMIC;
+	struct mv_cesa_tdma_req *dreq = &creq->req.dma;
+	struct mv_cesa_ablkcipher_dma_iter iter;
+	struct mv_cesa_tdma_chain chain;
+	bool skip_ctx = false;
+	int ret;
+
+	dreq->base.type = CESA_DMA_REQ;
+	dreq->chain.first = NULL;
+	dreq->chain.last = NULL;
+
+	ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
+			 DMA_TO_DEVICE);
+	if (!ret)
+		return -ENOMEM;
+
+	creq->src_nents = ret;
+
+	ret = dma_map_sg(cesa_dev->dev, req->dst, creq->dst_nents,
+			 DMA_FROM_DEVICE);
+	if (!ret) {
+		ret = -ENOMEM;
+		goto err_unmap_src;
+	}
+
+	creq->dst_nents = ret;
+
+	mv_cesa_tdma_desc_iter_init(&chain);
+	mv_cesa_ablkcipher_req_iter_init(&iter, req);
+
+	do {
+		struct mv_cesa_op_ctx *op;
+
+		op = mv_cesa_dma_add_op(&chain, op_templ, skip_ctx, flags);
+		if (IS_ERR(op)) {
+			ret = PTR_ERR(op);
+			goto err_free_tdma;
+		}
+		skip_ctx = true;
+
+		mv_cesa_set_crypt_op_len(op, iter.base.op_len);
+
+		/* Add input transfers */
+		ret = mv_cesa_dma_add_op_transfers(&chain, &iter.base,
+						   &iter.src, flags);
+		if (ret)
+			goto err_free_tdma;
+
+		/* Add dummy desc to launch the crypto operation */
+		ret = mv_cesa_dma_add_dummy_launch(&chain, flags);
+		if (ret)
+			goto err_free_tdma;
+
+		/* Add output transfers */
+		ret = mv_cesa_dma_add_op_transfers(&chain, &iter.base,
+						   &iter.dst, flags);
+		if (ret)
+			goto err_free_tdma;
+
+	} while (mv_cesa_ablkcipher_req_iter_next_op(&iter));
+
+	dreq->chain = chain;
+
+	return 0;
+
+err_free_tdma:
+	mv_cesa_dma_cleanup(dreq);
+	dma_unmap_sg(cesa_dev->dev, req->dst, creq->dst_nents, DMA_FROM_DEVICE);
+
+err_unmap_src:
+	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
+
+	return ret;
+}
+
 static inline int
 mv_cesa_ablkcipher_std_req_init(struct ablkcipher_request *req,
 				const struct mv_cesa_op_ctx *op_templ)
@@ -186,6 +331,7 @@  static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	struct mv_cesa_ablkcipher_req *creq = ablkcipher_request_ctx(req);
 	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req);
 	unsigned int blksize = crypto_ablkcipher_blocksize(tfm);
+	int ret;
 
 	if (!IS_ALIGNED(req->nbytes, blksize))
 		return -EINVAL;
@@ -196,7 +342,13 @@  static int mv_cesa_ablkcipher_req_init(struct ablkcipher_request *req,
 	mv_cesa_update_op_cfg(tmpl, CESA_SA_DESC_CFG_OP_CRYPT_ONLY,
 			      CESA_SA_DESC_CFG_OP_MSK);
 
-	return mv_cesa_ablkcipher_std_req_init(req, tmpl);
+	/* TODO: add a threshold for DMA usage */
+	if (cesa_dev->caps->has_tdma)
+		ret = mv_cesa_ablkcipher_dma_req_init(req, tmpl);
+	else
+		ret = mv_cesa_ablkcipher_std_req_init(req, tmpl);
+
+	return ret;
 }
 
 static int mv_cesa_aes_op(struct ablkcipher_request *req,
@@ -230,7 +382,11 @@  static int mv_cesa_aes_op(struct ablkcipher_request *req,
 	if (ret)
 		return ret;
 
-	return mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base);
+	if (ret && ret != -EINPROGRESS)
+		mv_cesa_ablkcipher_cleanup(req);
+
+	return ret;
 }
 
 static int mv_cesa_ecb_aes_encrypt(struct ablkcipher_request *req)
diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 2d33f68..e7e0774 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -16,6 +16,47 @@ 
 
 #include "cesa.h"
 
+struct mv_cesa_ahash_dma_iter {
+	struct mv_cesa_dma_iter base;
+	struct mv_cesa_sg_dma_iter src;
+};
+
+static inline void
+mv_cesa_ahash_req_iter_init(struct mv_cesa_ahash_dma_iter *iter,
+			    struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+	unsigned int len = req->nbytes;
+
+	if (!creq->last_req)
+		len = (len + creq->cache_ptr) & ~CESA_HASH_BLOCK_SIZE_MSK;
+
+	mv_cesa_req_dma_iter_init(&iter->base, len);
+	mv_cesa_sg_dma_iter_init(&iter->src, req->src, DMA_TO_DEVICE);
+	iter->src.op_offset = creq->cache_ptr;
+}
+
+static inline bool
+mv_cesa_ahash_req_iter_next_op(struct mv_cesa_ahash_dma_iter *iter)
+{
+	iter->src.op_offset = 0;
+
+	return mv_cesa_req_dma_iter_next_op(&iter->base);
+}
+
+static inline int mv_cesa_ahash_dma_alloc_cache(struct mv_cesa_ahash_req *creq,
+						gfp_t flags)
+{
+	struct mv_cesa_ahash_dma_req *dreq = &creq->req.dma;
+
+	creq->cache = dma_pool_alloc(cesa_dev->dma->cache_pool, flags,
+				     &dreq->cache_dma);
+	if (!creq->cache)
+		return -ENOMEM;
+
+	return 0;
+}
+
 static inline int mv_cesa_ahash_std_alloc_cache(struct mv_cesa_ahash_req *creq,
 						gfp_t flags)
 {
@@ -31,11 +72,23 @@  static int mv_cesa_ahash_alloc_cache(struct ahash_request *req)
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
 		      GFP_KERNEL : GFP_ATOMIC;
+	int ret;
 
 	if (creq->cache)
 		return 0;
 
-	return mv_cesa_ahash_std_alloc_cache(creq, flags);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		ret = mv_cesa_ahash_dma_alloc_cache(creq, flags);
+	else
+		ret = mv_cesa_ahash_std_alloc_cache(creq, flags);
+
+	return ret;
+}
+
+static inline void mv_cesa_ahash_dma_free_cache(struct mv_cesa_ahash_req *creq)
+{
+	dma_pool_free(cesa_dev->dma->cache_pool, creq->cache,
+		      creq->req.dma.cache_dma);
 }
 
 static inline void mv_cesa_ahash_std_free_cache(struct mv_cesa_ahash_req *creq)
@@ -48,16 +101,69 @@  static void mv_cesa_ahash_free_cache(struct mv_cesa_ahash_req *creq)
 	if (!creq->cache)
 		return;
 
-	mv_cesa_ahash_std_free_cache(creq);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_ahash_dma_free_cache(creq);
+	else
+		mv_cesa_ahash_std_free_cache(creq);
 
 	creq->cache = NULL;
 }
 
+static int mv_cesa_ahash_dma_alloc_padding(struct mv_cesa_ahash_dma_req *req,
+					   gfp_t flags)
+{
+	if (req->padding)
+		return 0;
+
+	req->padding = dma_pool_alloc(cesa_dev->dma->padding_pool, flags,
+				      &req->padding_dma);
+	if (!req->padding)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void mv_cesa_ahash_dma_free_padding(struct mv_cesa_ahash_dma_req *req)
+{
+	if (!req->padding)
+		return;
+
+	dma_pool_free(cesa_dev->dma->padding_pool, req->padding,
+		      req->padding_dma);
+	req->padding = NULL;
+}
+
+static inline void mv_cesa_ahash_dma_last_cleanup(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+
+	mv_cesa_ahash_dma_free_padding(&creq->req.dma);
+}
+
+static inline void mv_cesa_ahash_dma_cleanup(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+
+	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
+	mv_cesa_dma_cleanup(&creq->req.dma.base);
+}
+
+static inline void mv_cesa_ahash_cleanup(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_ahash_dma_cleanup(req);
+}
+
 static void mv_cesa_ahash_last_cleanup(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
 
 	mv_cesa_ahash_free_cache(creq);
+
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_ahash_dma_last_cleanup(req);
 }
 
 static int mv_cesa_ahash_pad_len(struct mv_cesa_ahash_req *creq)
@@ -183,6 +289,14 @@  static int mv_cesa_ahash_std_process(struct ahash_request *req, u32 status)
 	return 0;
 }
 
+static inline void mv_cesa_ahash_dma_prepare(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+	struct mv_cesa_tdma_req *dreq = &creq->req.dma.base;
+
+	mv_cesa_dma_prepare(dreq, dreq->base.engine);
+}
+
 static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
@@ -197,8 +311,12 @@  static void mv_cesa_ahash_std_prepare(struct ahash_request *req)
 static void mv_cesa_ahash_step(struct crypto_async_request *req)
 {
 	struct ahash_request *ahashreq = ahash_request_cast(req);
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(ahashreq);
 
-	mv_cesa_ahash_std_step(ahashreq);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_dma_step(&creq->req.dma.base);
+	else
+		mv_cesa_ahash_std_step(ahashreq);
 }
 
 static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
@@ -209,7 +327,11 @@  static int mv_cesa_ahash_process(struct crypto_async_request *req, u32 status)
 	unsigned int digsize;
 	int ret, i;
 
-	ret = mv_cesa_ahash_std_process(ahashreq, status);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		ret = mv_cesa_dma_process(&creq->req.dma.base, status);
+	else
+		ret = mv_cesa_ahash_std_process(ahashreq, status);
+
 	if (ret == -EINPROGRESS)
 		return ret;
 
@@ -243,7 +365,10 @@  static void mv_cesa_ahash_prepare(struct crypto_async_request *req,
 
 	creq->req.base.engine = engine;
 
-	mv_cesa_ahash_std_prepare(ahashreq);
+	if (creq->req.base.type == CESA_DMA_REQ)
+		mv_cesa_ahash_dma_prepare(ahashreq);
+	else
+		mv_cesa_ahash_std_prepare(ahashreq);
 
 	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(ahashreq));
 	for (i = 0; i < digsize / 4; i++)
@@ -258,6 +383,8 @@  static void mv_cesa_ahash_req_cleanup(struct crypto_async_request *req)
 
 	if (creq->last_req)
 		mv_cesa_ahash_last_cleanup(ahashreq);
+
+	mv_cesa_ahash_cleanup(ahashreq);
 }
 
 static const struct mv_cesa_req_ops mv_cesa_ahash_req_ops = {
@@ -325,14 +452,269 @@  static int mv_cesa_ahash_cache_req(struct ahash_request *req, bool *cached)
 	return 0;
 }
 
+static struct mv_cesa_op_ctx *
+mv_cesa_ahash_dma_add_cache(struct mv_cesa_tdma_chain *chain,
+			    struct mv_cesa_ahash_dma_iter *dma_iter,
+			    struct mv_cesa_ahash_req *creq,
+			    gfp_t flags)
+{
+	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
+	struct mv_cesa_op_ctx *op = NULL;
+	int ret;
+
+	if (!creq->cache_ptr)
+		return NULL;
+
+	ret = mv_cesa_dma_add_data_transfer(chain,
+					    CESA_SA_DATA_SRAM_OFFSET,
+					    ahashdreq->cache_dma,
+					    creq->cache_ptr,
+					    CESA_TDMA_DST_IN_SRAM,
+					    flags);
+	if (ret)
+		return ERR_PTR(ret);
+
+	if (!dma_iter->base.op_len) {
+		op = mv_cesa_dma_add_op(chain, &creq->op_tmpl, false, flags);
+		if (IS_ERR(op))
+			return op;
+
+		mv_cesa_set_mac_op_frag_len(op, creq->cache_ptr);
+
+		/* Add dummy desc to launch crypto operation */
+		ret = mv_cesa_dma_add_dummy_launch(chain, flags);
+		if (ret)
+			return ERR_PTR(ret);
+	}
+
+	return op;
+}
+
+static struct mv_cesa_op_ctx *
+mv_cesa_ahash_dma_add_data(struct mv_cesa_tdma_chain *chain,
+			   struct mv_cesa_ahash_dma_iter *dma_iter,
+			   struct mv_cesa_ahash_req *creq,
+			   gfp_t flags)
+{
+	struct mv_cesa_op_ctx *op;
+	int ret;
+
+	op = mv_cesa_dma_add_op(chain, &creq->op_tmpl, false, flags);
+	if (IS_ERR(op))
+		return op;
+
+	mv_cesa_set_mac_op_frag_len(op, dma_iter->base.op_len);
+
+	if ((mv_cesa_get_op_cfg(&creq->op_tmpl) & CESA_SA_DESC_CFG_FRAG_MSK) ==
+	    CESA_SA_DESC_CFG_FIRST_FRAG)
+		mv_cesa_update_op_cfg(&creq->op_tmpl,
+				      CESA_SA_DESC_CFG_MID_FRAG,
+				      CESA_SA_DESC_CFG_FRAG_MSK);
+
+	/* Add input transfers */
+	ret = mv_cesa_dma_add_op_transfers(chain, &dma_iter->base,
+					   &dma_iter->src, flags);
+	if (ret)
+		return ERR_PTR(ret);
+
+	/* Add dummy desc to launch crypto operation */
+	ret = mv_cesa_dma_add_dummy_launch(chain, flags);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return op;
+}
+
+static struct mv_cesa_op_ctx *
+mv_cesa_ahash_dma_last_req(struct mv_cesa_tdma_chain *chain,
+			   struct mv_cesa_ahash_dma_iter *dma_iter,
+			   struct mv_cesa_ahash_req *creq,
+			   struct mv_cesa_op_ctx *op,
+			   gfp_t flags)
+{
+	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
+	unsigned int len, trailerlen, padoff = 0;
+	int ret;
+
+	if (!creq->last_req)
+		return op;
+
+	if (op && creq->len <= CESA_SA_DESC_MAC_SRC_TOTAL_LEN_MAX) {
+		u32 frag = CESA_SA_DESC_CFG_NOT_FRAG;
+
+		if ((mv_cesa_get_op_cfg(op) & CESA_SA_DESC_CFG_FRAG_MSK) !=
+		    CESA_SA_DESC_CFG_FIRST_FRAG)
+			frag = CESA_SA_DESC_CFG_LAST_FRAG;
+
+		mv_cesa_update_op_cfg(op, frag, CESA_SA_DESC_CFG_FRAG_MSK);
+
+		return op;
+	}
+
+	ret = mv_cesa_ahash_dma_alloc_padding(ahashdreq, flags);
+	if (ret)
+		return ERR_PTR(ret);
+
+	trailerlen = mv_cesa_ahash_pad_req(creq, ahashdreq->padding);
+
+	if (op) {
+		len = min(CESA_SA_SRAM_PAYLOAD_SIZE - dma_iter->base.op_len,
+			  trailerlen);
+		if (len) {
+			ret = mv_cesa_dma_add_data_transfer(chain,
+						CESA_SA_DATA_SRAM_OFFSET +
+						dma_iter->base.op_len,
+						ahashdreq->padding_dma,
+						len, CESA_TDMA_DST_IN_SRAM,
+						flags);
+			if (ret)
+				return ERR_PTR(ret);
+
+			mv_cesa_update_op_cfg(op, CESA_SA_DESC_CFG_MID_FRAG,
+					      CESA_SA_DESC_CFG_FRAG_MSK);
+			mv_cesa_set_mac_op_frag_len(op,
+					dma_iter->base.op_len + len);
+			padoff += len;
+		}
+	}
+
+	if (padoff >= trailerlen)
+		return op;
+
+	if ((mv_cesa_get_op_cfg(&creq->op_tmpl) & CESA_SA_DESC_CFG_FRAG_MSK) !=
+	    CESA_SA_DESC_CFG_FIRST_FRAG)
+		mv_cesa_update_op_cfg(&creq->op_tmpl,
+				      CESA_SA_DESC_CFG_MID_FRAG,
+				      CESA_SA_DESC_CFG_FRAG_MSK);
+
+	op = mv_cesa_dma_add_op(chain, &creq->op_tmpl, false, flags);
+	if (IS_ERR(op))
+		return op;
+
+	mv_cesa_set_mac_op_frag_len(op, trailerlen - padoff);
+
+	ret = mv_cesa_dma_add_data_transfer(chain,
+					    CESA_SA_DATA_SRAM_OFFSET,
+					    ahashdreq->padding_dma +
+					    padoff,
+					    trailerlen - padoff,
+					    CESA_TDMA_DST_IN_SRAM,
+					    flags);
+	if (ret)
+		return ERR_PTR(ret);
+
+	/* Add dummy desc to launch crypto operation */
+	ret = mv_cesa_dma_add_dummy_launch(chain, flags);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return op;
+}
+
+static int mv_cesa_ahash_dma_req_init(struct ahash_request *req)
+{
+	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
+		      GFP_KERNEL : GFP_ATOMIC;
+	struct mv_cesa_ahash_dma_req *ahashdreq = &creq->req.dma;
+	struct mv_cesa_tdma_req *dreq = &ahashdreq->base;
+	struct mv_cesa_tdma_chain chain;
+	struct mv_cesa_ahash_dma_iter iter;
+	struct mv_cesa_op_ctx *op = NULL;
+	int ret;
+
+	dreq->chain.first = NULL;
+	dreq->chain.last = NULL;
+
+	if (creq->src_nents) {
+		/* Keep creq->src_nents untouched: dma_unmap_sg() must be
+		 * called with the nents passed to dma_map_sg(). */
+		ret = dma_map_sg(cesa_dev->dev, req->src, creq->src_nents,
+				 DMA_TO_DEVICE);
+		if (!ret) {
+			ret = -ENOMEM;
+			goto err;
+		}
+	}
+
+	mv_cesa_tdma_desc_iter_init(&chain);
+	mv_cesa_ahash_req_iter_init(&iter, req);
+
+	op = mv_cesa_ahash_dma_add_cache(&chain, &iter,
+					 creq, flags);
+	if (IS_ERR(op)) {
+		ret = PTR_ERR(op);
+		goto err_free_tdma;
+	}
+
+	do {
+		if (!iter.base.op_len)
+			break;
+
+		op = mv_cesa_ahash_dma_add_data(&chain, &iter,
+						creq, flags);
+		if (IS_ERR(op)) {
+			ret = PTR_ERR(op);
+			goto err_free_tdma;
+		}
+	} while (mv_cesa_ahash_req_iter_next_op(&iter));
+
+	op = mv_cesa_ahash_dma_last_req(&chain, &iter, creq, op, flags);
+	if (IS_ERR(op)) {
+		ret = PTR_ERR(op);
+		goto err_free_tdma;
+	}
+
+	if (op) {
+		/* Add dummy desc to wait for crypto operation end */
+		ret = mv_cesa_dma_add_dummy_end(&chain, flags);
+		if (ret)
+			goto err_free_tdma;
+	}
+
+	if (!creq->last_req)
+		creq->cache_ptr = req->nbytes + creq->cache_ptr -
+				  iter.base.len;
+	else
+		creq->cache_ptr = 0;
+
+	dreq->chain = chain;
+
+	return 0;
+
+err_free_tdma:
+	dreq->chain = chain;
+	mv_cesa_dma_cleanup(dreq);
+	dma_unmap_sg(cesa_dev->dev, req->src, creq->src_nents, DMA_TO_DEVICE);
+err:
+	mv_cesa_ahash_last_cleanup(req);
+
+	return ret;
+}
+
 static int mv_cesa_ahash_req_init(struct ahash_request *req, bool *cached)
 {
 	struct mv_cesa_ahash_req *creq = ahash_request_ctx(req);
+	int ret;
+
+	if (cesa_dev->caps->has_tdma)
+		creq->req.base.type = CESA_DMA_REQ;
+	else
+		creq->req.base.type = CESA_STD_REQ;
 
-	creq->req.base.type = CESA_STD_REQ;
 	creq->src_nents = sg_nents_for_len(req->src, req->nbytes);
 
-	return mv_cesa_ahash_cache_req(req, cached);
+	ret = mv_cesa_ahash_cache_req(req, cached);
+	if (ret)
+		return ret;
+
+	if (*cached)
+		return 0;
+
+	if (creq->req.base.type == CESA_DMA_REQ)
+		ret = mv_cesa_ahash_dma_req_init(req);
+
+	return ret;
 }
 
 static int mv_cesa_ahash_update(struct ahash_request *req)
@@ -349,7 +731,13 @@  static int mv_cesa_ahash_update(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	return mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base);
+	if (ret && ret != -EINPROGRESS) {
+		mv_cesa_ahash_cleanup(req);
+		return ret;
+	}
+
+	return ret;
 }
 
 static int mv_cesa_ahash_final(struct ahash_request *req)
@@ -370,7 +758,11 @@  static int mv_cesa_ahash_final(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	return mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base);
+	if (ret && ret != -EINPROGRESS)
+		mv_cesa_ahash_cleanup(req);
+
+	return ret;
 }
 
 static int mv_cesa_ahash_finup(struct ahash_request *req)
@@ -391,7 +783,11 @@  static int mv_cesa_ahash_finup(struct ahash_request *req)
 	if (cached)
 		return 0;
 
-	return mv_cesa_queue_req(&req->base);
+	ret = mv_cesa_queue_req(&req->base);
+	if (ret && ret != -EINPROGRESS)
+		mv_cesa_ahash_cleanup(req);
+
+	return ret;
 }
 
 static int mv_cesa_sha1_init(struct ahash_request *req)
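
Note on the length arithmetic in hash.c: for a request that is not the last one, mv_cesa_ahash_req_iter_init() rounds (req->nbytes + creq->cache_ptr) down to a multiple of the hash block size, and mv_cesa_ahash_dma_req_init() then stores the remainder back into creq->cache_ptr. The sketch below reproduces only that arithmetic in userspace. It assumes a 64-byte block (mask 63); the real CESA_HASH_BLOCK_SIZE_MSK is defined in cesa.h and is not visible in this hunk, and the names HASH_BLOCK_SIZE_MSK, struct split and split_update are hypothetical.

```c
#include <assert.h>

/* Assumption: 64-byte hash block, so the low six bits are masked off. */
#define HASH_BLOCK_SIZE_MSK 63u

struct split {
	unsigned int op_len;	/* bytes handed to the engine now */
	unsigned int carry;	/* bytes kept in the cache for the next update */
};

/*
 * cached: bytes already held in the cache (creq->cache_ptr in the driver)
 * nbytes: bytes carried by the new request (req->nbytes in the driver)
 *
 * Only the "not the last request" case is modelled here.
 */
static struct split split_update(unsigned int cached, unsigned int nbytes)
{
	struct split s;

	assert(cached <= HASH_BLOCK_SIZE_MSK);

	/* Round the total down to a whole number of blocks. */
	s.op_len = (nbytes + cached) & ~HASH_BLOCK_SIZE_MSK;
	/* Whatever does not fill a block waits for the next request. */
	s.carry = nbytes + cached - s.op_len;

	return s;
}
```

For example, with 10 cached bytes and a 100-byte update, one 64-byte block goes to the engine and 46 bytes stay cached.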
diff --git a/drivers/crypto/marvell/tdma.c b/drivers/crypto/marvell/tdma.c
new file mode 100644
index 0000000..9d6793c
--- /dev/null
+++ b/drivers/crypto/marvell/tdma.c
@@ -0,0 +1,224 @@ 
+/*
+ * Provide TDMA helper functions used by cipher and hash algorithm
+ * implementations.
+ *
+ * Author: Boris Brezillon <boris.brezillon@free-electrons.com>
+ * Author: Arnaud Ebalard <arno@natisbad.org>
+ *
+ * This work is based on an initial version written by
+ * Sebastian Andrzej Siewior < sebastian at breakpoint dot cc >
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include "cesa.h"
+
+bool mv_cesa_req_dma_iter_next_transfer(struct mv_cesa_dma_iter *iter,
+					struct mv_cesa_sg_dma_iter *sgiter,
+					unsigned int len)
+{
+	if (!sgiter->sg)
+		return false;
+
+	sgiter->op_offset += len;
+	sgiter->offset += len;
+	if (sgiter->offset == sgiter->sg->length) {
+		if (sg_is_last(sgiter->sg))
+			return false;
+		sgiter->offset = 0;
+		sgiter->sg = sg_next(sgiter->sg);
+	}
+
+	if (sgiter->op_offset == iter->op_len)
+		return false;
+
+	return true;
+}
+
+void mv_cesa_dma_step(struct mv_cesa_tdma_req *dreq)
+{
+	struct mv_cesa_engine *engine = dreq->base.engine;
+
+	writel(0, engine->regs + CESA_SA_CFG);
+
+	mv_cesa_set_int_mask(engine, CESA_SA_INT_ACC0_IDMA_DONE);
+	writel(CESA_TDMA_DST_BURST_128B | CESA_TDMA_SRC_BURST_128B |
+	       CESA_TDMA_NO_BYTE_SWAP | CESA_TDMA_EN,
+	       engine->regs + CESA_TDMA_CONTROL);
+
+	writel(CESA_SA_CFG_ACT_CH0_IDMA | CESA_SA_CFG_MULTI_PKT |
+	       CESA_SA_CFG_CH0_W_IDMA | CESA_SA_CFG_PARA_DIS,
+	       engine->regs + CESA_SA_CFG);
+	writel(dreq->chain.first->cur_dma,
+	       engine->regs + CESA_TDMA_NEXT_ADDR);
+	writel(CESA_SA_CMD_EN_CESA_SA_ACCL0, engine->regs + CESA_SA_CMD);
+}
+
+void mv_cesa_dma_cleanup(struct mv_cesa_tdma_req *dreq)
+{
+	struct mv_cesa_tdma_desc *tdma;
+
+	for (tdma = dreq->chain.first; tdma;) {
+		struct mv_cesa_tdma_desc *old_tdma = tdma;
+
+		if (tdma->flags & CESA_TDMA_OP)
+			dma_pool_free(cesa_dev->dma->op_pool, tdma->op,
+				      le32_to_cpu(tdma->src));
+
+		tdma = tdma->next;
+		dma_pool_free(cesa_dev->dma->tdma_desc_pool, old_tdma,
+			      le32_to_cpu(old_tdma->cur_dma));
+	}
+
+	dreq->chain.first = NULL;
+	dreq->chain.last = NULL;
+}
+
+void mv_cesa_dma_prepare(struct mv_cesa_tdma_req *dreq,
+			 struct mv_cesa_engine *engine)
+{
+	struct mv_cesa_tdma_desc *tdma;
+
+	for (tdma = dreq->chain.first; tdma; tdma = tdma->next) {
+		if (tdma->flags & CESA_TDMA_DST_IN_SRAM)
+			tdma->dst = cpu_to_le32(tdma->dst + engine->sram_dma);
+
+		if (tdma->flags & CESA_TDMA_SRC_IN_SRAM)
+			tdma->src = cpu_to_le32(tdma->src + engine->sram_dma);
+
+		if (tdma->flags & CESA_TDMA_OP)
+			mv_cesa_adjust_op(engine, tdma->op);
+	}
+}
+
+static struct mv_cesa_tdma_desc *
+mv_cesa_dma_add_desc(struct mv_cesa_tdma_chain *chain, gfp_t flags)
+{
+	struct mv_cesa_tdma_desc *new_tdma = NULL;
+	dma_addr_t dma_handle;
+
+	new_tdma = dma_pool_alloc(cesa_dev->dma->tdma_desc_pool, flags,
+				  &dma_handle);
+	if (!new_tdma)
+		return ERR_PTR(-ENOMEM);
+
+	memset(new_tdma, 0, sizeof(*new_tdma));
+	new_tdma->cur_dma = cpu_to_le32(dma_handle);
+	if (chain->last) {
+		chain->last->next_dma = new_tdma->cur_dma;
+		chain->last->next = new_tdma;
+	} else {
+		chain->first = new_tdma;
+	}
+
+	chain->last = new_tdma;
+
+	return new_tdma;
+}
+
+struct mv_cesa_op_ctx *mv_cesa_dma_add_op(struct mv_cesa_tdma_chain *chain,
+					const struct mv_cesa_op_ctx *op_templ,
+					bool skip_ctx,
+					gfp_t flags)
+{
+	struct mv_cesa_tdma_desc *tdma;
+	struct mv_cesa_op_ctx *op;
+	dma_addr_t dma_handle;
+
+	tdma = mv_cesa_dma_add_desc(chain, flags);
+	if (IS_ERR(tdma))
+		return ERR_CAST(tdma);
+
+	op = dma_pool_alloc(cesa_dev->dma->op_pool, flags, &dma_handle);
+	if (!op)
+		return ERR_PTR(-ENOMEM);
+
+	*op = *op_templ;
+
+	tdma = chain->last;
+	tdma->op = op;
+	tdma->byte_cnt = (skip_ctx ? sizeof(op->desc) : sizeof(*op)) | BIT(31);
+	tdma->src = cpu_to_le32(dma_handle);
+	tdma->flags = CESA_TDMA_DST_IN_SRAM | CESA_TDMA_OP;
+
+	return op;
+}
+
+int mv_cesa_dma_add_data_transfer(struct mv_cesa_tdma_chain *chain,
+				  dma_addr_t dst, dma_addr_t src, u32 size,
+				  u32 flags, gfp_t gfp_flags)
+{
+	struct mv_cesa_tdma_desc *tdma;
+
+	tdma = mv_cesa_dma_add_desc(chain, gfp_flags);
+	if (IS_ERR(tdma))
+		return PTR_ERR(tdma);
+
+	tdma->byte_cnt = size | BIT(31);
+	tdma->src = src;
+	tdma->dst = dst;
+
+	flags &= (CESA_TDMA_DST_IN_SRAM | CESA_TDMA_SRC_IN_SRAM);
+	tdma->flags = flags | CESA_TDMA_DATA;
+
+	return 0;
+}
+
+int mv_cesa_dma_add_dummy_launch(struct mv_cesa_tdma_chain *chain,
+				 u32 flags)
+{
+	struct mv_cesa_tdma_desc *tdma;
+
+	tdma = mv_cesa_dma_add_desc(chain, flags);
+	if (IS_ERR(tdma))
+		return PTR_ERR(tdma);
+
+	return 0;
+}
+
+int mv_cesa_dma_add_dummy_end(struct mv_cesa_tdma_chain *chain, u32 flags)
+{
+	struct mv_cesa_tdma_desc *tdma;
+
+	tdma = mv_cesa_dma_add_desc(chain, flags);
+	if (IS_ERR(tdma))
+		return PTR_ERR(tdma);
+
+	tdma->byte_cnt = BIT(31);
+
+	return 0;
+}
+
+int mv_cesa_dma_add_op_transfers(struct mv_cesa_tdma_chain *chain,
+				 struct mv_cesa_dma_iter *dma_iter,
+				 struct mv_cesa_sg_dma_iter *sgiter,
+				 gfp_t gfp_flags)
+{
+	u32 flags = sgiter->dir == DMA_TO_DEVICE ?
+		    CESA_TDMA_DST_IN_SRAM : CESA_TDMA_SRC_IN_SRAM;
+	unsigned int len;
+
+	do {
+		dma_addr_t dst, src;
+		int ret;
+
+		len = mv_cesa_req_dma_iter_transfer_len(dma_iter, sgiter);
+		if (sgiter->dir == DMA_TO_DEVICE) {
+			dst = CESA_SA_DATA_SRAM_OFFSET + sgiter->op_offset;
+			src = sgiter->sg->dma_address + sgiter->offset;
+		} else {
+			dst = sgiter->sg->dma_address + sgiter->offset;
+			src = CESA_SA_DATA_SRAM_OFFSET + sgiter->op_offset;
+		}
+
+		ret = mv_cesa_dma_add_data_transfer(chain, dst, src, len,
+						    flags, gfp_flags);
+		if (ret)
+			return ret;
+
+	} while (mv_cesa_req_dma_iter_next_transfer(dma_iter, sgiter, len));
+
+	return 0;
+}
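
Note on the descriptor chain in tdma.c: mv_cesa_dma_add_desc() appends each new descriptor to a first/last pair, and mv_cesa_dma_cleanup() walks the chain while freeing it. The sketch below shows the same append-and-free pattern in userspace, with calloc()/free() standing in for dma_pool_alloc()/dma_pool_free(). The names struct desc, struct chain, chain_add_desc and chain_cleanup are hypothetical, and the DMA handles and op contexts that the driver also tracks are left out.

```c
#include <assert.h>
#include <stdlib.h>

struct desc {
	struct desc *next;	/* CPU-side link to the following descriptor */
	unsigned int byte_cnt;
};

struct chain {
	struct desc *first;
	struct desc *last;
};

/* Append a zeroed descriptor to the chain; returns NULL on allocation failure. */
static struct desc *chain_add_desc(struct chain *chain)
{
	struct desc *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;

	if (chain->last)
		chain->last->next = d;
	else
		chain->first = d;

	chain->last = d;

	return d;
}

/* Free every descriptor and reset the chain; returns how many were freed. */
static unsigned int chain_cleanup(struct chain *chain)
{
	struct desc *d = chain->first;
	unsigned int freed = 0;

	while (d) {
		struct desc *old = d;

		d = d->next;	/* read the link before freeing the node */
		free(old);
		freed++;
	}

	chain->first = NULL;
	chain->last = NULL;

	return freed;
}
```

The cleanup only frees what is reachable from the chain it is given, so an error path has to pass the chain that actually holds the descriptors built so far.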