[RFC,v2,2/3] crypto: Introduce CRYPTO_ALG_BULK flag

Message ID	47e9ddd8c9ea9ad9e29c8cb027d19d8459ea1479.1464346333.git.baolin.wang@linaro.org (mailing list archive)
State	New, archived
Headers	Return-Path: <linux-block-owner@kernel.org>
From: Baolin Wang <baolin.wang@linaro.org>
To: axboe@kernel.dk, agk@redhat.com, snitzer@redhat.com, dm-devel@redhat.com, herbert@gondor.apana.org.au, davem@davemloft.net
Cc: ebiggers3@gmail.com, js1304@gmail.com, tadeusz.struk@intel.com, smueller@chronox.de, standby24x7@gmail.com, shli@kernel.org, dan.j.williams@intel.com, martin.petersen@oracle.com, sagig@mellanox.com, kent.overstreet@gmail.com, keith.busch@intel.com, tj@kernel.org, ming.lei@canonical.com, broonie@kernel.org, arnd@arndb.de, linux-crypto@vger.kernel.org, linux-block@vger.kernel.org, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, baolin.wang@linaro.org
Subject: [RFC v2 2/3] crypto: Introduce CRYPTO_ALG_BULK flag
Date: Fri, 27 May 2016 19:11:23 +0800
Message-Id: <47e9ddd8c9ea9ad9e29c8cb027d19d8459ea1479.1464346333.git.baolin.wang@linaro.org>
In-Reply-To: <cover.1464346333.git.baolin.wang@linaro.org>
References: <cover.1464346333.git.baolin.wang@linaro.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk

(Exiting) Baolin Wang May 27, 2016, 11:11 a.m. UTC

Some cipher hardware engines prefer to process a bulk block rather than the
single sector (512 bytes) that dm-crypt creates, because these engines can
handle the initialization vectors (IVs) by themselves within one bulk block.
This means we can grow a request by merging requests, rather than always
submitting 512 bytes, and thus increase the hardware engine's processing speed.

So introduce the 'CRYPTO_ALG_BULK' flag to indicate that a cipher supports bulk
mode.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 include/crypto/skcipher.h |    7 +++++++
 include/linux/crypto.h    |    6 ++++++
 2 files changed, 13 insertions(+)
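
To make the idea of the flag concrete, here is a minimal user-space sketch of how a capability bit like this could be tested. It is not the code from this patch: `DEMO_ALG_BULK`, its bit value, `struct demo_alg` and `demo_alg_is_bulk()` are hypothetical stand-ins. In the kernel, bits of this kind live in the `cra_flags` field of `struct crypto_alg`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit value, chosen only for this sketch. */
#define DEMO_ALG_BULK 0x00004000u

/* Stand-in for an algorithm descriptor; only the flags word matters here. */
struct demo_alg {
	const char *name;
	uint32_t flags;
};

/* Returns true when the algorithm advertises bulk-mode support. */
static bool demo_alg_is_bulk(const struct demo_alg *alg)
{
	assert(alg != NULL);
	return (alg->flags & DEMO_ALG_BULK) != 0;
}
```

A caller such as dm-crypt could then branch on the result: map the whole bio into one request when the bit is set, and fall back to per-sector requests otherwise.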

Herbert Xu June 2, 2016, 8:26 a.m. UTC | #1

On Fri, May 27, 2016 at 07:11:23PM +0800, Baolin Wang wrote:
> Now some cipher hardware engines prefer to handle bulk block rather than one
> sector (512 bytes) created by dm-crypt, cause these cipher engines can handle
> the intermediate values (IV) by themselves in one bulk block. This means we
> can increase the size of the request by merging request rather than always 512
> bytes and thus increase the hardware engine processing speed.
> 
> So introduce 'CRYPTO_ALG_BULK' flag to indicate this cipher can support bulk
> mode.
> 
> Signed-off-by: Baolin Wang <baolin.wang@linaro.org>

I think a better approach would be to explicitly move the IV generation
into the crypto API, similar to how we handle IPsec.  Once you do
that then every algorithm can be handled through the bulk interface.

Cheers,

(Exiting) Baolin Wang June 3, 2016, 6:48 a.m. UTC | #2

Hi Herbert,

On 2 June 2016 at 16:26, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, May 27, 2016 at 07:11:23PM +0800, Baolin Wang wrote:
>> Now some cipher hardware engines prefer to handle bulk block rather than one
>> sector (512 bytes) created by dm-crypt, cause these cipher engines can handle
>> the intermediate values (IV) by themselves in one bulk block. This means we
>> can increase the size of the request by merging request rather than always 512
>> bytes and thus increase the hardware engine processing speed.
>>
>> So introduce 'CRYPTO_ALG_BULK' flag to indicate this cipher can support bulk
>> mode.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
>
> I think a better approach would be to explicitly move the IV generation
> into the crypto API, similar to how we handle IPsec.  Once you do
> that then every algorithm can be handled through the bulk interface.
>

Sorry for the late reply.
Even if we move the IV generation into the crypto API, we still cannot
handle every algorithm with the bulk interface, because we also need
different methods to map either one whole bio or one sector, depending
on whether the algorithm supports bulk mode. Please correct
me if I misunderstand your point. Thanks.


> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Herbert Xu June 3, 2016, 6:51 a.m. UTC | #3

On Fri, Jun 03, 2016 at 02:48:34PM +0800, Baolin Wang wrote:
>
> If we move the IV generation into the crypto API, we also can not
> handle every algorithm with the bulk interface. Cause we also need to
> use different methods to map one whole bio or map one sector according
> to the algorithm whether can support bulk mode or not. Please correct
> me if I misunderstand your points. Thanks.

Which ones can't be handled this way?

Cheers,

(Exiting) Baolin Wang June 3, 2016, 7:10 a.m. UTC | #4

On 3 June 2016 at 14:51, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Jun 03, 2016 at 02:48:34PM +0800, Baolin Wang wrote:
>>
>> If we move the IV generation into the crypto API, we also can not
>> handle every algorithm with the bulk interface. Cause we also need to
>> use different methods to map one whole bio or map one sector according
>> to the algorithm whether can support bulk mode or not. Please correct
>> me if I misunderstand your points. Thanks.
>
> Which ones can't be handled this way?

What I mean is that bulk mode and sector mode differ not only in how the
IV is handled, but also in how the data is mapped with
scatterlists.
So we have two paths in dm-crypt (crypt_convert_block() and
crypt_convert_bulk_block()) to handle the data, and we cannot handle
every algorithm with the bulk interface.

>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Herbert Xu June 3, 2016, 7:54 a.m. UTC | #5

On Fri, Jun 03, 2016 at 03:10:31PM +0800, Baolin Wang wrote:
> On 3 June 2016 at 14:51, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > On Fri, Jun 03, 2016 at 02:48:34PM +0800, Baolin Wang wrote:
> >>
> >> If we move the IV generation into the crypto API, we also can not
> >> handle every algorithm with the bulk interface. Cause we also need to
> >> use different methods to map one whole bio or map one sector according
> >> to the algorithm whether can support bulk mode or not. Please correct
> >> me if I misunderstand your points. Thanks.
> >
> > Which ones can't be handled this way?
> 
> What I mean is bulk mode and sector mode's difference is not only the
> IV handling method, but also the method to map the data with
> scatterlists.
> Then we have two processes in dm-crypt ( crypt_convert_block() and
> crypt_convert_bulk_block() ) to handle the data, so we can not handle
> every algorithm with the bulk interface.

As I asked, which algorithm can't you handle through the bulk
interface, assuming it did all the requisite magic to generate
the correct IV?

Cheers,

(Exiting) Baolin Wang June 3, 2016, 8:15 a.m. UTC | #6

On 3 June 2016 at 15:54, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Jun 03, 2016 at 03:10:31PM +0800, Baolin Wang wrote:
>> On 3 June 2016 at 14:51, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>> > On Fri, Jun 03, 2016 at 02:48:34PM +0800, Baolin Wang wrote:
>> >>
>> >> If we move the IV generation into the crypto API, we also can not
>> >> handle every algorithm with the bulk interface. Cause we also need to
>> >> use different methods to map one whole bio or map one sector according
>> >> to the algorithm whether can support bulk mode or not. Please correct
>> >> me if I misunderstand your points. Thanks.
>> >
>> > Which ones can't be handled this way?
>>
>> What I mean is bulk mode and sector mode's difference is not only the
>> IV handling method, but also the method to map the data with
>> scatterlists.
>> Then we have two processes in dm-crypt ( crypt_convert_block() and
>> crypt_convert_bulk_block() ) to handle the data, so we can not handle
>> every algorithm with the bulk interface.
>
> As I asked, which algorithm can't you handle through the bulk
> interface, assuming it did all the requisite magic to generate
> the correct IV?

Take the cbc(aes) algorithm: it cannot be handled through the bulk
interface, because it needs to map the data sector by sector.
If we also handled cbc(aes) with the bulk interface, we would need
to divide the sg table into sectors and allocate request
memory for each sector. As Mike pointed out, this is in the IO mapping
path, where we try to avoid memory allocations at all costs because of the
risk of deadlock when issuing IO to stacked block devices (dm-crypt
could be part of a much more elaborate IO stack). That would
make things messier, I think.

>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Herbert Xu June 3, 2016, 8:21 a.m. UTC | #7

On Fri, Jun 03, 2016 at 04:15:28PM +0800, Baolin Wang wrote:
>
> Suppose the cbc(aes) algorithm, which can not be handled through bulk
> interface, it need to map the data sector by sector.
> If we also handle the cbc(aes) algorithm with bulk interface, we need
> to divide the sg table into sectors and need to allocate request
> memory for each divided sectors (As Mike pointed out  this is in the
> IO mapping
> path and we try to avoid memory allocations at all costs -- due to the
> risk of deadlock when issuing IO to stacked block devices (dm-crypt
> could be part of a much more elaborate IO stack). ), that will
> introduce more messy things I think.

Perhaps I'm not making myself very clear.  If you move the IV
generation into the crypto API, those crypto API algorithms will
be operating at the sector level.

For example, assuming you're doing lmk, then the algorithm would
be called lmk(cbc(aes)) and it will take as its input one or more
sectors and for each sector it should generate an IV and operate
on it.
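
One way to read this suggestion, as a user-space sketch rather than kernel code: the wrapper accepts a buffer of one or more sectors and walks it in place, deriving an IV for each sector as it goes. Every name here (`demo_make_iv()`, `demo_toy_cipher()`, `demo_process_sectors()`) is hypothetical. The IV derivation is modelled on dm-crypt's plain64 generator (the little-endian sector number, zero padded), not on lmk, and the "cipher" is a plain XOR used only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u
#define DEMO_IV_SIZE 16u

/* Per-sector IV: sector number in little-endian order, zero padded. */
static void demo_make_iv(uint64_t sector, uint8_t iv[DEMO_IV_SIZE])
{
	for (size_t i = 0; i < DEMO_IV_SIZE; i++)
		iv[i] = (i < 8) ? (uint8_t)(sector >> (8 * i)) : 0;
}

/* Toy stand-in for the inner cipher: XOR with the IV. NOT real encryption. */
static void demo_toy_cipher(uint8_t *data, size_t len,
			    const uint8_t iv[DEMO_IV_SIZE])
{
	for (size_t i = 0; i < len; i++)
		data[i] ^= iv[i % DEMO_IV_SIZE];
}

/*
 * Process a buffer holding one or more whole sectors in place.
 * A fresh IV is derived for every sector; no per-sector request
 * object is allocated. Returns the number of sectors processed.
 */
static size_t demo_process_sectors(uint8_t *buf, size_t len,
				   uint64_t first_sector)
{
	uint8_t iv[DEMO_IV_SIZE];
	size_t n = 0;

	assert(len % DEMO_SECTOR_SIZE == 0);
	for (size_t off = 0; off < len; off += DEMO_SECTOR_SIZE, n++) {
		demo_make_iv(first_sector + n, iv);
		demo_toy_cipher(buf + off, DEMO_SECTOR_SIZE, iv);
	}
	return n;
}
```

The caller submits a single buffer, and the sector boundaries are handled inside the loop, using one IV buffer on the stack.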

Cheers,

(Exiting) Baolin Wang June 3, 2016, 9:23 a.m. UTC | #8

On 3 June 2016 at 16:21, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Jun 03, 2016 at 04:15:28PM +0800, Baolin Wang wrote:
>>
>> Suppose the cbc(aes) algorithm, which can not be handled through bulk
>> interface, it need to map the data sector by sector.
>> If we also handle the cbc(aes) algorithm with bulk interface, we need
>> to divide the sg table into sectors and need to allocate request
>> memory for each divided sectors (As Mike pointed out  this is in the
>> IO mapping
>> path and we try to avoid memory allocations at all costs -- due to the
>> risk of deadlock when issuing IO to stacked block devices (dm-crypt
>> could be part of a much more elaborate IO stack). ), that will
>> introduce more messy things I think.
>
> Perhaps I'm not making myself very clear.  If you move the IV
> generation into the crypto API, those crypto API algorithms will
> be operating at the sector level.

Yeah, IV generation is OK. But it is not only about the IV. For example:

(1) The ecb(aes) algorithm does not need IV generation,
so it can support bulk mode:
Assuming one 64K bio comes in, we can map the whole bio with one
sg table in dm-crypt (assume it uses 16 scatterlists from the sg table),
then call 'skcipher_request_set_crypt()' to set up one
request with the mapped sg table, which is sent to the crypto driver
to be handled.

(2) The cbc(aes) algorithm needs IV generation sector
by sector, so it cannot support bulk mode and cannot use the bulk
interface:
Assuming one 64K bio comes in, we should map the bio sector by
sector, with one scatterlist at a time. Each time we call
'skcipher_request_set_crypt()' to set up one request with only
one mapped scatterlist, until the whole bio has been handled.

(3) As you suggest, if we also use the bulk interface for the cbc(aes)
algorithm, assuming it did all the requisite magic to generate the
correct IV:
Assuming one 64K bio comes in, we can map the whole bio with one sg
table in the crypt_convert_bulk_block() function. But if we send this bulk
request to the crypto layer, it has to be divided into small
requests, each one sector in size (512 bytes) with the correct IV.
We would need to allocate memory for those small requests, which is
not good in the IO mapping path. And how would each small request
connect back to dm-crypt (how do we signal that the request is
done)?

Thus we should not handle every algorithm with the bulk interface.
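
The request counts behind cases (1) and (2) can be sketched as follows. This is a user-space illustration only: `struct demo_seg` is a hypothetical stand-in for a scatterlist entry, and `demo_requests_needed()` is not a dm-crypt function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SECTOR_SIZE 512u

/* Stand-in for one scatterlist entry: only its length matters here. */
struct demo_seg {
	size_t len;
};

/*
 * Count how many cipher requests a bio would need.
 * bulk == true:  one request covers every segment (case 1).
 * bulk == false: one request per 512-byte sector (case 2).
 */
static size_t demo_requests_needed(const struct demo_seg *segs,
				   size_t nsegs, bool bulk)
{
	size_t total = 0;

	for (size_t i = 0; i < nsegs; i++)
		total += segs[i].len;

	assert(total % DEMO_SECTOR_SIZE == 0);
	if (total == 0)
		return 0;
	return bulk ? 1 : total / DEMO_SECTOR_SIZE;
}
```

For the 64K bio used in the examples, taken as 16 segments of 4096 bytes each, this gives 1 request in bulk mode and 128 requests in sector mode.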

>
> For example, assuming you're doing lmk, then the algorithm would
> be called lmk(cbc(aes)) and it will take as its input one or more
> sectors and for each sector it should generate an IV and operate
> on it.

>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Herbert Xu June 3, 2016, 10:09 a.m. UTC | #9

On Fri, Jun 03, 2016 at 05:23:59PM +0800, Baolin Wang wrote:
>
> Assuming one 64K size bio coming, we can map the whole bio with one sg
> table in crypt_convert_bulk_block() function. But if we send this bulk
> request to crypto layer, we should divide the bulk request into small
> requests, and each small request should be one sector size (512 bytes)
> with assuming the correct IV, but we need to allocate small requests
> memory for the division, which will not good for IO mapping, and how
> each small request connect to dm-crypt (how to notify the request is
> done?)?

Why won't it be good? The actual AES block size is 16 and yet we
have no trouble when you feed it a block of 512 bytes.

Cheers,

(Exiting) Baolin Wang June 3, 2016, 10:47 a.m. UTC | #10

On 3 June 2016 at 18:09, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, Jun 03, 2016 at 05:23:59PM +0800, Baolin Wang wrote:
>>
>> Assuming one 64K size bio coming, we can map the whole bio with one sg
>> table in crypt_convert_bulk_block() function. But if we send this bulk
>> request to crypto layer, we should divide the bulk request into small
>> requests, and each small request should be one sector size (512 bytes)
>> with assuming the correct IV, but we need to allocate small requests
>> memory for the division, which will not good for IO mapping, and how
>> each small request connect to dm-crypt (how to notify the request is
>> done?)?
>
> Why won't it be good? The actual AES block size is 16 and yet we

Like I said, we should avoid memory allocation in the IO path to improve
efficiency. The other issue is how the divided small requests
(whose memory is allocated in the crypto layer) connect with dm-crypt, since
dm-crypt sends just one bulk request to the crypto layer, but it is
divided into small requests there.

> have no trouble when you feed it a block of 512 bytes.

That's right.
