mbox series

[v4,00/12] RISC-V: support some cryptography accelerations

Message ID 20230711153743.1970625-1-heiko@sntech.de (mailing list archive)
Headers show
Series RISC-V: support some cryptography accelerations | expand

Message

Heiko Stuebner July 11, 2023, 3:37 p.m. UTC
From: Heiko Stuebner <heiko.stuebner@vrull.eu>

This series provides cryptographic implementations using the vector
crypto extensions.

v13 of the vector patchset dropped the patches for in-kernel usage of
vector instructions, I picked the ones from v12 over into this series
for now.

My basic goal was to not re-invent cryptographic code, so the heavy
lifting is done by those perl-asm scripts used in openssl and the perl
code used here-in stems from code that is targetted at openssl [0] and is
unmodified from there to limit needed review effort.

With a matching qemu (there are patches for vector-crypto flying around)
the in-kernel crypto-selftests (also the extended ones) are very happy
so far.


changes in v4:
- split off from scalar crypto patches but base on top of them
- adapt to pending openssl code [0] using the now frozen vector crypto
  extensions - with all its changes
  [0] https://github.com/openssl/openssl/pull/20149

changes in v3:
- rebase on top of 6.3-rc2
- rebase on top of vector-v14 patchset
- add the missing Co-developed-by mentions to showcase
  the people that did the actual openSSL crypto code

changes in v2:
- rebased on 6.2 + zbb series, so don't include already
  applied changes anymore
- refresh code picked from openssl as that side matures
- more algorithms (SHA512, AES, SM3, SM4)

Greentime Hu (2):
  riscv: Add support for kernel mode vector
  riscv: Add vector extension XOR implementation

Heiko Stuebner (10):
  RISC-V: add helper function to read the vector VLEN
  RISC-V: add vector crypto extension detection
  RISC-V: crypto: update perl include with helpers for vector (crypto)
    instructions
  RISC-V: crypto: add Zvbb+Zvbc accelerated GCM GHASH implementation
  RISC-V: crypto: add Zvkg accelerated GCM GHASH implementation
  RISC-V: crypto: add a vector-crypto-accelerated SHA256 implementation
  RISC-V: crypto: add a vector-crypto-accelerated SHA512 implementation
  RISC-V: crypto: add Zvkned accelerated AES encryption implementation
  RISC-V: crypto: add Zvksed accelerated SM4 encryption implementation
  RISC-V: crypto: add Zvksh accelerated SM3 hash implementation

 arch/riscv/crypto/Kconfig                     |  68 ++-
 arch/riscv/crypto/Makefile                    |  44 +-
 arch/riscv/crypto/aes-riscv-glue.c            | 168 ++++++
 arch/riscv/crypto/aes-riscv64-zvkned.pl       | 530 ++++++++++++++++++
 arch/riscv/crypto/ghash-riscv64-glue.c        | 245 ++++++++
 arch/riscv/crypto/ghash-riscv64-zvbb-zvbc.pl  | 380 +++++++++++++
 arch/riscv/crypto/ghash-riscv64-zvkg.pl       | 168 ++++++
 arch/riscv/crypto/riscv.pm                    | 433 +++++++++++++-
 arch/riscv/crypto/sha256-riscv64-glue.c       | 115 ++++
 .../crypto/sha256-riscv64-zvbb-zvknha.pl      | 314 +++++++++++
 arch/riscv/crypto/sha512-riscv64-glue.c       | 106 ++++
 .../crypto/sha512-riscv64-zvbb-zvknhb.pl      | 377 +++++++++++++
 arch/riscv/crypto/sm3-riscv64-glue.c          | 112 ++++
 arch/riscv/crypto/sm3-riscv64-zvksh.pl        | 225 ++++++++
 arch/riscv/crypto/sm4-riscv64-glue.c          | 162 ++++++
 arch/riscv/crypto/sm4-riscv64-zvksed.pl       | 300 ++++++++++
 arch/riscv/include/asm/hwcap.h                |   9 +
 arch/riscv/include/asm/vector.h               |  28 +
 arch/riscv/include/asm/xor.h                  |  82 +++
 arch/riscv/kernel/Makefile                    |   1 +
 arch/riscv/kernel/cpu.c                       |   8 +
 arch/riscv/kernel/cpufeature.c                |  50 ++
 arch/riscv/kernel/kernel_mode_vector.c        | 132 +++++
 arch/riscv/lib/Makefile                       |   1 +
 arch/riscv/lib/xor.S                          |  81 +++
 25 files changed, 4136 insertions(+), 3 deletions(-)
 create mode 100644 arch/riscv/crypto/aes-riscv-glue.c
 create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned.pl
 create mode 100644 arch/riscv/crypto/ghash-riscv64-zvbb-zvbc.pl
 create mode 100644 arch/riscv/crypto/ghash-riscv64-zvkg.pl
 create mode 100644 arch/riscv/crypto/sha256-riscv64-glue.c
 create mode 100644 arch/riscv/crypto/sha256-riscv64-zvbb-zvknha.pl
 create mode 100644 arch/riscv/crypto/sha512-riscv64-glue.c
 create mode 100644 arch/riscv/crypto/sha512-riscv64-zvbb-zvknhb.pl
 create mode 100644 arch/riscv/crypto/sm3-riscv64-glue.c
 create mode 100644 arch/riscv/crypto/sm3-riscv64-zvksh.pl
 create mode 100644 arch/riscv/crypto/sm4-riscv64-glue.c
 create mode 100644 arch/riscv/crypto/sm4-riscv64-zvksed.pl
 create mode 100644 arch/riscv/include/asm/xor.h
 create mode 100644 arch/riscv/kernel/kernel_mode_vector.c
 create mode 100644 arch/riscv/lib/xor.S

Comments

Eric Biggers July 13, 2023, 7:40 a.m. UTC | #1
On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> This series provides cryptographic implementations using the vector
> crypto extensions.
> 
> v13 of the vector patchset dropped the patches for in-kernel usage of
> vector instructions, I picked the ones from v12 over into this series
> for now.
> 
> My basic goal was to not re-invent cryptographic code, so the heavy
> lifting is done by those perl-asm scripts used in openssl and the perl
> code used here-in stems from code that is targetted at openssl [0] and is
> unmodified from there to limit needed review effort.
> 
> With a matching qemu (there are patches for vector-crypto flying around)
> the in-kernel crypto-selftests (also the extended ones) are very happy
> so far.

Where does this patchset apply to?  I tried torvalds/master, linux-next/master,
riscv/for-next, and cryptodev/master.  Nothing worked.  When sending a
patch(set), please always use the '--base' option to 'git format-patch', or
explicitly mention where it applies to, or provide a link to a git repo.

- Eric
Eric Biggers July 14, 2023, 6:27 a.m. UTC | #2
On Thu, Jul 13, 2023 at 12:40:42AM -0700, Eric Biggers wrote:
> On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > This series provides cryptographic implementations using the vector
> > crypto extensions.
> > 
> > v13 of the vector patchset dropped the patches for in-kernel usage of
> > vector instructions, I picked the ones from v12 over into this series
> > for now.
> > 
> > My basic goal was to not re-invent cryptographic code, so the heavy
> > lifting is done by those perl-asm scripts used in openssl and the perl
> > code used here-in stems from code that is targetted at openssl [0] and is
> > unmodified from there to limit needed review effort.
> > 
> > With a matching qemu (there are patches for vector-crypto flying around)
> > the in-kernel crypto-selftests (also the extended ones) are very happy
> > so far.
> 
> Where does this patchset apply to?  I tried torvalds/master, linux-next/master,
> riscv/for-next, and cryptodev/master.  Nothing worked.  When sending a
> patch(set), please always use the '--base' option to 'git format-patch', or
> explicitly mention where it applies to, or provide a link to a git repo.
> 

Hi Heiko, any update on this?  I would like to review, and maybe test, this
patchset but there's no way for me to do so.

- Eric
Heiko Stuebner July 14, 2023, 7:02 a.m. UTC | #3
Hi Eric,

Am Freitag, 14. Juli 2023, 08:27:08 CEST schrieb Eric Biggers:
> On Thu, Jul 13, 2023 at 12:40:42AM -0700, Eric Biggers wrote:
> > On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
> > > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > > 
> > > This series provides cryptographic implementations using the vector
> > > crypto extensions.
> > > 
> > > v13 of the vector patchset dropped the patches for in-kernel usage of
> > > vector instructions, I picked the ones from v12 over into this series
> > > for now.
> > > 
> > > My basic goal was to not re-invent cryptographic code, so the heavy
> > > lifting is done by those perl-asm scripts used in openssl and the perl
> > > code used here-in stems from code that is targetted at openssl [0] and is
> > > unmodified from there to limit needed review effort.
> > > 
> > > With a matching qemu (there are patches for vector-crypto flying around)
> > > the in-kernel crypto-selftests (also the extended ones) are very happy
> > > so far.
> > 
> > Where does this patchset apply to?  I tried torvalds/master, linux-next/master,
> > riscv/for-next, and cryptodev/master.  Nothing worked.  When sending a
> > patch(set), please always use the '--base' option to 'git format-patch', or
> > explicitly mention where it applies to, or provide a link to a git repo.
> > 
> 
> Hi Heiko, any update on this?  I would like to review, and maybe test, this
> patchset but there's no way for me to do so.

sorry about that. As you said, this should've been mentioned in the
cover-letter.

This patchset goes on top of the v6 scalar one [0] which in turn
goes on top of the arch-random patchset [1] and that in turn sits
on top of 6.5-rc1 for me.


Heiko


[0] https://lore.kernel.org/r/20230709154243.1582671-1-heiko@sntech.de
[1] https://lore.kernel.org/r/20230709115549.2666557-1-sameo@rivosinc.com
Eric Biggers July 21, 2023, 5:12 a.m. UTC | #4
Hi Heiko,

On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> This series provides cryptographic implementations using the vector
> crypto extensions.
> 
> v13 of the vector patchset dropped the patches for in-kernel usage of
> vector instructions, I picked the ones from v12 over into this series
> for now.
> 
> My basic goal was to not re-invent cryptographic code, so the heavy
> lifting is done by those perl-asm scripts used in openssl and the perl
> code used here-in stems from code that is targetted at openssl [0] and is
> unmodified from there to limit needed review effort.
> 
> With a matching qemu (there are patches for vector-crypto flying around)
> the in-kernel crypto-selftests (also the extended ones) are very happy
> so far.
> 
> 
> changes in v4:
> - split off from scalar crypto patches but base on top of them
> - adapt to pending openssl code [0] using the now frozen vector crypto
>   extensions - with all its changes
>   [0] https://github.com/openssl/openssl/pull/20149
> 
> changes in v3:
> - rebase on top of 6.3-rc2
> - rebase on top of vector-v14 patchset
> - add the missing Co-developed-by mentions to showcase
>   the people that did the actual openSSL crypto code
> 
> changes in v2:
> - rebased on 6.2 + zbb series, so don't include already
>   applied changes anymore
> - refresh code picked from openssl as that side matures
> - more algorithms (SHA512, AES, SM3, SM4)
> 
> Greentime Hu (2):
>   riscv: Add support for kernel mode vector
>   riscv: Add vector extension XOR implementation
> 
> Heiko Stuebner (10):
>   RISC-V: add helper function to read the vector VLEN
>   RISC-V: add vector crypto extension detection
>   RISC-V: crypto: update perl include with helpers for vector (crypto)
>     instructions
>   RISC-V: crypto: add Zvbb+Zvbc accelerated GCM GHASH implementation
>   RISC-V: crypto: add Zvkg accelerated GCM GHASH implementation
>   RISC-V: crypto: add a vector-crypto-accelerated SHA256 implementation
>   RISC-V: crypto: add a vector-crypto-accelerated SHA512 implementation
>   RISC-V: crypto: add Zvkned accelerated AES encryption implementation
>   RISC-V: crypto: add Zvksed accelerated SM4 encryption implementation
>   RISC-V: crypto: add Zvksh accelerated SM3 hash implementation
> 
>  arch/riscv/crypto/Kconfig                     |  68 ++-
>  arch/riscv/crypto/Makefile                    |  44 +-
>  arch/riscv/crypto/aes-riscv-glue.c            | 168 ++++++
>  arch/riscv/crypto/aes-riscv64-zvkned.pl       | 530 ++++++++++++++++++
>  arch/riscv/crypto/ghash-riscv64-glue.c        | 245 ++++++++
>  arch/riscv/crypto/ghash-riscv64-zvbb-zvbc.pl  | 380 +++++++++++++
>  arch/riscv/crypto/ghash-riscv64-zvkg.pl       | 168 ++++++
>  arch/riscv/crypto/riscv.pm                    | 433 +++++++++++++-
>  arch/riscv/crypto/sha256-riscv64-glue.c       | 115 ++++
>  .../crypto/sha256-riscv64-zvbb-zvknha.pl      | 314 +++++++++++
>  arch/riscv/crypto/sha512-riscv64-glue.c       | 106 ++++
>  .../crypto/sha512-riscv64-zvbb-zvknhb.pl      | 377 +++++++++++++
>  arch/riscv/crypto/sm3-riscv64-glue.c          | 112 ++++
>  arch/riscv/crypto/sm3-riscv64-zvksh.pl        | 225 ++++++++
>  arch/riscv/crypto/sm4-riscv64-glue.c          | 162 ++++++
>  arch/riscv/crypto/sm4-riscv64-zvksed.pl       | 300 ++++++++++
>  arch/riscv/include/asm/hwcap.h                |   9 +
>  arch/riscv/include/asm/vector.h               |  28 +
>  arch/riscv/include/asm/xor.h                  |  82 +++
>  arch/riscv/kernel/Makefile                    |   1 +
>  arch/riscv/kernel/cpu.c                       |   8 +
>  arch/riscv/kernel/cpufeature.c                |  50 ++
>  arch/riscv/kernel/kernel_mode_vector.c        | 132 +++++
>  arch/riscv/lib/Makefile                       |   1 +
>  arch/riscv/lib/xor.S                          |  81 +++
>  25 files changed, 4136 insertions(+), 3 deletions(-)
>  create mode 100644 arch/riscv/crypto/aes-riscv-glue.c
>  create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned.pl
>  create mode 100644 arch/riscv/crypto/ghash-riscv64-zvbb-zvbc.pl
>  create mode 100644 arch/riscv/crypto/ghash-riscv64-zvkg.pl
>  create mode 100644 arch/riscv/crypto/sha256-riscv64-glue.c
>  create mode 100644 arch/riscv/crypto/sha256-riscv64-zvbb-zvknha.pl
>  create mode 100644 arch/riscv/crypto/sha512-riscv64-glue.c
>  create mode 100644 arch/riscv/crypto/sha512-riscv64-zvbb-zvknhb.pl
>  create mode 100644 arch/riscv/crypto/sm3-riscv64-glue.c
>  create mode 100644 arch/riscv/crypto/sm3-riscv64-zvksh.pl
>  create mode 100644 arch/riscv/crypto/sm4-riscv64-glue.c
>  create mode 100644 arch/riscv/crypto/sm4-riscv64-zvksed.pl
>  create mode 100644 arch/riscv/include/asm/xor.h
>  create mode 100644 arch/riscv/kernel/kernel_mode_vector.c
>  create mode 100644 arch/riscv/lib/xor.S
> 

Thanks for working on this patchset!  I'm glad to see that you and others are
working on this and the code in OpenSSL.  And thanks for running all the kernel
crypto self-tests and verifying that they pass.

I'm still a bit worried about there being two competing sets of crypto
extensions for RISC-V: scalar and vector.

However the vector crypto extensions are moving forwards (they were recently
frozen), from what I've heard are being implemented in CPUs, and based on this
patchset implementations of most algorithms are ready already.

So I'm wondering: do you still think that it's valuable to continue with your
other patchset that adds GHASH acceleration using the scalar extensions (which
this patchset is still based on)?  

I'm wondering if we should be 100% focused on the vector extensions for now to
avoid fragmentation of effort.

It's just not super clear to me what is driving the scalar crypto support right
now.  Maybe embedded systems?  Maybe it was just a mistep, perhaps due to being
started before the CPU even had a vector unit?  I don't know.  If you do indeed
have a strong reason for it, then you can go ahead -- I just wanted to make sure
we don't end up doing twice as much work unnecessarily.

- Eric
Eric Biggers Sept. 14, 2023, 12:11 a.m. UTC | #5
On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> 
> This series provides cryptographic implementations using the vector
> crypto extensions.
> 
> v13 of the vector patchset dropped the patches for in-kernel usage of
> vector instructions, I picked the ones from v12 over into this series
> for now.
> 
> My basic goal was to not re-invent cryptographic code, so the heavy
> lifting is done by those perl-asm scripts used in openssl and the perl
> code used here-in stems from code that is targetted at openssl [0] and is
> unmodified from there to limit needed review effort.
> 
> With a matching qemu (there are patches for vector-crypto flying around)
> the in-kernel crypto-selftests (also the extended ones) are very happy
> so far.

Hi Heiko!  Are you still working on this patchset?  And which of its
prerequisites still haven't been merged upstream?

- Eric
Charlie Jenkins Sept. 14, 2023, 1:10 a.m. UTC | #6
On Wed, Sep 13, 2023 at 05:11:44PM -0700, Eric Biggers wrote:
> On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > 
> > This series provides cryptographic implementations using the vector
> > crypto extensions.
> > 
> > v13 of the vector patchset dropped the patches for in-kernel usage of
> > vector instructions, I picked the ones from v12 over into this series
> > for now.
> > 
> > My basic goal was to not re-invent cryptographic code, so the heavy
> > lifting is done by those perl-asm scripts used in openssl and the perl
> > code used here-in stems from code that is targetted at openssl [0] and is
> > unmodified from there to limit needed review effort.
> > 
> > With a matching qemu (there are patches for vector-crypto flying around)
> > the in-kernel crypto-selftests (also the extended ones) are very happy
> > so far.
> 
> Hi Heiko!  Are you still working on this patchset?  And which of its
> prerequisites still haven't been merged upstream?
> 
> - Eric
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
It is my understanding that Heiko is taking a break from development, I
don't think he will be working on this soon.

- Charlie
He-Jie Shih Sept. 15, 2023, 1:48 a.m. UTC | #7
On Sep 14, 2023, at 09:10, Charlie Jenkins <charlie@rivosinc.com> wrote:

> On Wed, Sep 13, 2023 at 05:11:44PM -0700, Eric Biggers wrote:
>> On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
>> 
>> Hi Heiko!  Are you still working on this patchset?  And which of its
>> prerequisites still haven't been merged upstream?
>> 
>> - Eric
> It is my understanding that Heiko is taking a break from development, I
> don't think he will be working on this soon.

We would like to take over these RISC-V vector crypto implementations.
Our proposed implementations is under reviewing in OpenSSL pr.
And I will check the gluing parts between Linux kernel and OpenSSL.

-Jerry
Jerry Shih Sept. 15, 2023, 3:21 a.m. UTC | #8
On Sep 15, 2023, at 09:48, He-Jie Shih <bignose1007@gmail.com> wrote:

> On Sep 14, 2023, at 09:10, Charlie Jenkins <charlie@rivosinc.com> wrote:
> 
>> On Wed, Sep 13, 2023 at 05:11:44PM -0700, Eric Biggers wrote:
>>> On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
>>> 
>>> Hi Heiko!  Are you still working on this patchset?  And which of its
>>> prerequisites still haven't been merged upstream?
>>> 
>>> - Eric
>> It is my understanding that Heiko is taking a break from development, I
>> don't think he will be working on this soon.
> 
> We would like to take over these RISC-V vector crypto implementations.
> Our proposed implementations is under reviewing in OpenSSL pr.
> And I will check the gluing parts between Linux kernel and OpenSSL.

The OpenSSL PR is at [1].
And we are from SiFive.

-Jerry

[1]
https://github.com/openssl/openssl/pull/21923
Eric Biggers Oct. 6, 2023, 7:47 p.m. UTC | #9
On Fri, Sep 15, 2023 at 11:21:28AM +0800, Jerry Shih wrote:
> On Sep 15, 2023, at 09:48, He-Jie Shih <bignose1007@gmail.com> wrote:
> 
> > On Sep 14, 2023, at 09:10, Charlie Jenkins <charlie@rivosinc.com> wrote:
> > 
> >> On Wed, Sep 13, 2023 at 05:11:44PM -0700, Eric Biggers wrote:
> >>> On Tue, Jul 11, 2023 at 05:37:31PM +0200, Heiko Stuebner wrote:
> >>> 
> >>> Hi Heiko!  Are you still working on this patchset?  And which of its
> >>> prerequisites still haven't been merged upstream?
> >>> 
> >>> - Eric
> >> It is my understanding that Heiko is taking a break from development, I
> >> don't think he will be working on this soon.
> > 
> > We would like to take over these RISC-V vector crypto implementations.
> > Our proposed implementations is under reviewing in OpenSSL pr.
> > And I will check the gluing parts between Linux kernel and OpenSSL.
> 
> The OpenSSL PR is at [1].
> And we are from SiFive.
> 
> -Jerry
> 
> [1]
> https://github.com/openssl/openssl/pull/21923

Hi Jerry, I'm wondering if you have an update on this?  Do you need any help?

I'm also wondering about riscv.pm and the choice of generating the crypto
instructions from .words instead of using the assembler.  It makes it
significantly harder to review the code, IMO.  Can we depend on assembler
support for these instructions, or is that just not ready yet?

- Eric
He-Jie Shih Oct. 6, 2023, 9:01 p.m. UTC | #10
On Oct 7, 2023, at 03:47, Eric Biggers <ebiggers@kernel.org> wrote:
> On Fri, Sep 15, 2023 at 11:21:28AM +0800, Jerry Shih wrote:
>> On Sep 15, 2023, at 09:48, He-Jie Shih <bignose1007@gmail.com> wrote:
>> The OpenSSL PR is at [1].
>> And we are from SiFive.
>> 
>> -Jerry
>> 
>> [1]
>> https://github.com/openssl/openssl/pull/21923
> 
> Hi Jerry, I'm wondering if you have an update on this?  Do you need any help?

We have specialized aes-cbc/ecb/ctr patch locally and pass the `testmgr` test
cases. But the test patterns in `testmgr` are quite simple, I think it doesn't test the
corner case(e.g. aes-xts with tail element).

For aes-xts, I'm trying to update the implementation from OpenSSL. The design
philosophy is different between OpenSSL and linux. In linux crypto, the data will
be split into `scatterlist`. I need to preserve the aes-xts's iv for each scatterlist
entry call. And I'm thinking about how to handle the tail data in a simple way.
By the way, the `xts(aes)` implementation for arm and x86 are using
`cra_blocksize= AES_BLOCK_SIZE`. I don't know why we need to handle the tail
element. I think we will hit `EINVAL` error in `skcipher_walk_next()` if the data size
it not be a multiple of block size.

Overall, we will have
1) aes cipher
2) aes with cbc/ecb/ctr/xts mode
3) sha256/512 for `vlen>=128` platform
4) sm3 for `vlen>=128` platform
5) sm4
6) ghash
7) `chacha20` stream cipher

The vector crypto pr in OpenSSL is under reviewing, we are still updating the
perl file into linux.

The most complicated `gcm(aes)` mode will be in our next plan.

> I'm also wondering about riscv.pm and the choice of generating the crypto
> instructions from .words instead of using the assembler.  It makes it
> significantly harder to review the code, IMO.  Can we depend on assembler
> support for these instructions, or is that just not ready yet?

I have asked the same question before[1]. The reason is that Openssl could use
very old compiler for compiling. Thus, the assembler might not know the standard
rvv 1.0[2] and other vector crypto[3] instructions. That's why we use opcode for all
vector instructions. IMO, I would prefer to use opcode for `vector crypto` only. The
gcc-12 and clang-14 are already supporting rvv 1.0. Actually, I just read the `perl`
file instead of the actually generated opcode for OpenSSL pr reviewing. And it's
not hard to read the perl code.


Thanks,
- Jerry

[1]
https://github.com/openssl/openssl/pull/20149#discussion_r1244655440
[2]
https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
[3
https://github.com/riscv/riscv-crypto/blob/main/doc/vector/riscv-crypto-spec-vector.adoc]
Ard Biesheuvel Oct. 6, 2023, 11:33 p.m. UTC | #11
On Fri, 6 Oct 2023 at 23:01, He-Jie Shih <bignose1007@gmail.com> wrote:
>
> On Oct 7, 2023, at 03:47, Eric Biggers <ebiggers@kernel.org> wrote:
> > On Fri, Sep 15, 2023 at 11:21:28AM +0800, Jerry Shih wrote:
> >> On Sep 15, 2023, at 09:48, He-Jie Shih <bignose1007@gmail.com> wrote:
> >> The OpenSSL PR is at [1].
> >> And we are from SiFive.
> >>
> >> -Jerry
> >>
> >> [1]
> >> https://github.com/openssl/openssl/pull/21923
> >
> > Hi Jerry, I'm wondering if you have an update on this?  Do you need any help?
>
> We have specialized aes-cbc/ecb/ctr patch locally and pass the `testmgr` test
> cases. But the test patterns in `testmgr` are quite simple, I think it doesn't test the
> corner case(e.g. aes-xts with tail element).
>

There should be test cases for that.

> For aes-xts, I'm trying to update the implementation from OpenSSL. The design
> philosophy is different between OpenSSL and linux. In linux crypto, the data will
> be split into `scatterlist`. I need to preserve the aes-xts's iv for each scatterlist
> entry call.

Yes, this applies to all block ciphers that take an IV.

> And I'm thinking about how to handle the tail data in a simple way.

The RISC-V vector ISA is quite advanced, so there may be a better
trick using predicates etc but otherwise, I suppose you could reuse
the same trick that other asm implementations use, which is to use
unaligned loads and stores for the final blocks, and to use a vector
permute with a permute table to shift the bytes in the registers. But
this is not performance critical, given that existing in-kernel users
use sector or page size inputs only.

> By the way, the `xts(aes)` implementation for arm and x86 are using
> `cra_blocksize= AES_BLOCK_SIZE`. I don't know why we need to handle the tail
> element. I think we will hit `EINVAL` error in `skcipher_walk_next()` if the data size
> it not be a multiple of block size.
>

No, both XTS and CBC-CTS permit inputs that are not a multiple of the
block size, and will use some form of ciphertext stealing for the
final tail. There is a generic CTS template that wraps CBC but
combining them in the same way (e.g., using vector permute) will speed
up things substantially. *However*, I'm not sure how relevant CBC-CTS
is in the kernel, given that only fscrypt uses it IIRC but actually
prefers something else so for new systems perhaps you shouldn't
bother.

> Overall, we will have
> 1) aes cipher
> 2) aes with cbc/ecb/ctr/xts mode
> 3) sha256/512 for `vlen>=128` platform
> 4) sm3 for `vlen>=128` platform
> 5) sm4
> 6) ghash
> 7) `chacha20` stream cipher
>
> The vector crypto pr in OpenSSL is under reviewing, we are still updating the
> perl file into linux.
>
> The most complicated `gcm(aes)` mode will be in our next plan.
>
> > I'm also wondering about riscv.pm and the choice of generating the crypto
> > instructions from .words instead of using the assembler.  It makes it
> > significantly harder to review the code, IMO.  Can we depend on assembler
> > support for these instructions, or is that just not ready yet?
>
> I have asked the same question before[1]. The reason is that Openssl could use
> very old compiler for compiling. Thus, the assembler might not know the standard
> rvv 1.0[2] and other vector crypto[3] instructions. That's why we use opcode for all
> vector instructions. IMO, I would prefer to use opcode for `vector crypto` only. The
> gcc-12 and clang-14 are already supporting rvv 1.0. Actually, I just read the `perl`
> file instead of the actually generated opcode for OpenSSL pr reviewing. And it's
> not hard to read the perl code.
>

I understand the desire to reuse code, and OpenSSL already relies on
so-called perlasm for this, but I think this is not a great choice,
and I actually think this was a mistake for RISC-V. OpenSSL relies on
perlasm for things like emittting different function pro-/epilogues
depending on the calling convention (SysV versus MS on x86_64, for
instance), but RISC-V does not have that much variety, and already
supports the insn_r / insn_i pseudo instructions to emit arbitrary
opcodes while still supporting named registers as usual. [Maybe my
experience does not quite extrapolate to the vector ISA, but I managed
to implement scalar AES [0] using the insn_r and insn_i pseudo
instructions (which are generally provided by the assembler but Linux
has fallback CPP macros for them as well), and this results on much
more maintainable code IMO.[

We are using some of the OpenSSL perlasm in the kernel already (and
some of it was introduced by me) but I don't think we should blindly
reuse all of the RISC-V code if  some of it can straight-forwardly be
written as normal .S files.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=riscv-scalar-aes
Eric Biggers Oct. 7, 2023, 9:30 p.m. UTC | #12
On Sat, Oct 07, 2023 at 05:01:45AM +0800, He-Jie Shih wrote:
> Reply-To: 20231006194741.GA68531@google.com
> X-Mailer: Apple Mail (2.3445.9.7)

Can you please fix your email client configuration?  Your emails have a bogus
Reply-To header, which makes replies not be sent to you by default.

- Eric
Eric Biggers Oct. 7, 2023, 10:16 p.m. UTC | #13
On Sat, Oct 07, 2023 at 01:33:48AM +0200, Ard Biesheuvel wrote:
> On Fri, 6 Oct 2023 at 23:01, He-Jie Shih <bignose1007@gmail.com> wrote:
> >
> > On Oct 7, 2023, at 03:47, Eric Biggers <ebiggers@kernel.org> wrote:
> > > On Fri, Sep 15, 2023 at 11:21:28AM +0800, Jerry Shih wrote:
> > >> On Sep 15, 2023, at 09:48, He-Jie Shih <bignose1007@gmail.com> wrote:
> > >> The OpenSSL PR is at [1].
> > >> And we are from SiFive.
> > >>
> > >> -Jerry
> > >>
> > >> [1]
> > >> https://github.com/openssl/openssl/pull/21923
> > >
> > > Hi Jerry, I'm wondering if you have an update on this?  Do you need any help?
> >
> > We have specialized aes-cbc/ecb/ctr patch locally and pass the `testmgr` test
> > cases. But the test patterns in `testmgr` are quite simple, I think it doesn't test the
> > corner case(e.g. aes-xts with tail element).
> >
> 
> There should be test cases for that.

Yes, non-block-aligned AES-XTS test vectors should be added to crypto/testmgr.h.
Though, that case should be already covered by the randomized tests enabled by
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y, which I very strongly recommend enabling
during development.

> 
> > For aes-xts, I'm trying to update the implementation from OpenSSL. The design
> > philosophy is different between OpenSSL and linux. In linux crypto, the data will
> > be split into `scatterlist`. I need to preserve the aes-xts's iv for each scatterlist
> > entry call.
> 
> Yes, this applies to all block ciphers that take an IV.
> 
> > And I'm thinking about how to handle the tail data in a simple way.
> 
> The RISC-V vector ISA is quite advanced, so there may be a better
> trick using predicates etc but otherwise, I suppose you could reuse
> the same trick that other asm implementations use, which is to use
> unaligned loads and stores for the final blocks, and to use a vector
> permute with a permute table to shift the bytes in the registers. But
> this is not performance critical, given that existing in-kernel users
> use sector or page size inputs only.
> 
> > By the way, the `xts(aes)` implementation for arm and x86 are using
> > `cra_blocksize= AES_BLOCK_SIZE`. I don't know why we need to handle the tail
> > element. I think we will hit `EINVAL` error in `skcipher_walk_next()` if the data size
> > it not be a multiple of block size.
> >
> 
> No, both XTS and CBC-CTS permit inputs that are not a multiple of the
> block size, and will use some form of ciphertext stealing for the
> final tail. There is a generic CTS template that wraps CBC but
> combining them in the same way (e.g., using vector permute) will speed
> up things substantially. *However*, I'm not sure how relevant CBC-CTS
> is in the kernel, given that only fscrypt uses it IIRC but actually
> prefers something else so for new systems perhaps you shouldn't
> bother.
> 
> > Overall, we will have
> > 1) aes cipher
> > 2) aes with cbc/ecb/ctr/xts mode
> > 3) sha256/512 for `vlen>=128` platform
> > 4) sm3 for `vlen>=128` platform
> > 5) sm4
> > 6) ghash
> > 7) `chacha20` stream cipher
> >
> > The vector crypto pr in OpenSSL is under reviewing, we are still updating the
> > perl file into linux.
> >
> > The most complicated `gcm(aes)` mode will be in our next plan.
> >
> > > I'm also wondering about riscv.pm and the choice of generating the crypto
> > > instructions from .words instead of using the assembler.  It makes it
> > > significantly harder to review the code, IMO.  Can we depend on assembler
> > > support for these instructions, or is that just not ready yet?
> >
> > I have asked the same question before[1]. The reason is that Openssl could use
> > very old compiler for compiling. Thus, the assembler might not know the standard
> > rvv 1.0[2] and other vector crypto[3] instructions. That's why we use opcode for all
> > vector instructions. IMO, I would prefer to use opcode for `vector crypto` only. The
> > gcc-12 and clang-14 are already supporting rvv 1.0. Actually, I just read the `perl`
> > file instead of the actually generated opcode for OpenSSL pr reviewing. And it's
> > not hard to read the perl code.
> >
> 
> I understand the desire to reuse code, and OpenSSL already relies on
> so-called perlasm for this, but I think this is not a great choice,
> and I actually think this was a mistake for RISC-V. OpenSSL relies on
> perlasm for things like emittting different function pro-/epilogues
> depending on the calling convention (SysV versus MS on x86_64, for
> instance), but RISC-V does not have that much variety, and already
> supports the insn_r / insn_i pseudo instructions to emit arbitrary
> opcodes while still supporting named registers as usual. [Maybe my
> experience does not quite extrapolate to the vector ISA, but I managed
> to implement scalar AES [0] using the insn_r and insn_i pseudo
> instructions (which are generally provided by the assembler but Linux
> has fallback CPP macros for them as well), and this results on much
> more maintainable code IMO.[
> 
> We are using some of the OpenSSL perlasm in the kernel already (and
> some of it was introduced by me) but I don't think we should blindly
> reuse all of the RISC-V code if  some of it can straight-forwardly be
> written as normal .S files.
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=riscv-scalar-aes

I'm not a huge fan of perlasm either.  Normal .S files can be much easier to
understand, and they do still support basic features like macros.  (Of course,
this only works if the .S file is the real source code.  If the real source code
is the perlasm, dumping it to a .S file doesn't make it more readable.)

Ultimately, it needs to decided on an algorithm-by-algorithm basis whether it
makes sense to use the .pl file directly from OpenSSL or write a normal .S file.
Sharing code can save time, but it can also waste time if/when things don't
match up and need to be changed for the kernel anyway.  If you look at the other
architectures, sharing the OpenSSL .pl file is most common for Poly1305 and
SHA-2.  It's rarer for AES modes.

In any case, regardless of .pl or .S, it would be nice to rely on the assembler
for the mapping from readable instruction to 32-bit words.  Yes, I understand
that the algorithm code reads mostly the some either way, but it introduces
nonstandard notation (e.g. due to having to avoid the period character) and a
possibility for error.  It's not the 1940s anymore; we should be able to have an
assembler.  Why not make OpenSSL and Linux only enable this code when the
assembler supports it?  Note that Linux already does this for many of the x86
extensions, so there is precedent for this; see arch/x86/Kconfig.assembler.

- Eric
Jerry Shih Oct. 31, 2023, 2:17 a.m. UTC | #14
On Oct 7, 2023, at 03:47, Eric Biggers <ebiggers@kernel.org> wrote:
> On Fri, Sep 15, 2023 at 11:21:28AM +0800, Jerry Shih wrote:
>> On Sep 15, 2023, at 09:48, He-Jie Shih <bignose1007@gmail.com> wrote:
>> 
>> The OpenSSL PR is at [1].
>> And we are from SiFive.
>> 
>> -Jerry
>> 
>> [1]
>> https://github.com/openssl/openssl/pull/21923
> 
> Hi Jerry, I'm wondering if you have an update on this?  Do you need any help?

The RISC-V vector crypto OpenSSL pr[1] is merged.
And we also sent the vector-crypto patch based on Heiko's and OpenSSL
works.
Here is the link:
https://lore.kernel.org/all/20231025183644.8735-1-jerry.shih@sifive.com/

[1]
https://github.com/openssl/openssl/pull/21923

> I'm also wondering about riscv.pm and the choice of generating the crypto
> instructions from .words instead of using the assembler.  It makes it
> significantly harder to review the code, IMO.  Can we depend on assembler
> support for these instructions, or is that just not ready yet?
> 
> - Eric

There is no public assembler supports the vector-crypto asm mnemonics.
We should still use `opcode` for vector-crypto instructions. But we might
use asm for standard rvv parts.
In order to reuse the codes in OpenSSL as much as possible,  we still use
the `riscv.pm` for all standard rvv and vector-crypto instructions. If the asm
mnemonic is still a better approach,  I will `rewrite` all standard rvv parts
with asm mnemonics in next patch.

-Jerry
Eric Biggers Nov. 2, 2023, 4:03 a.m. UTC | #15
Hi Jerry,

(Just so you know, you still need to fix your email configuration.  Your emails
have a bogus Reply-To header, which makes replies not be sent to you by default.
I had to manually set the "To:" address when replying.)

On Tue, Oct 31, 2023 at 10:17:11AM +0800, Jerry Shih wrote:
> 
> The RISC-V vector crypto OpenSSL pr[1] is merged.
> And we also sent the vector-crypto patch based on Heiko's and OpenSSL
> works.
> Here is the link:
> https://lore.kernel.org/all/20231025183644.8735-1-jerry.shih@sifive.com/
> 
> [1]
> https://github.com/openssl/openssl/pull/21923

Awesome, thanks!

> 
> > I'm also wondering about riscv.pm and the choice of generating the crypto
> > instructions from .words instead of using the assembler.  It makes it
> > significantly harder to review the code, IMO.  Can we depend on assembler
> > support for these instructions, or is that just not ready yet?
> > 
> > - Eric
> 
> There is no public assembler supports the vector-crypto asm mnemonics.
> We should still use `opcode` for vector-crypto instructions. But we might
> use asm for standard rvv parts.
> In order to reuse the codes in OpenSSL as much as possible,  we still use
> the `riscv.pm` for all standard rvv and vector-crypto instructions. If the asm
> mnemonic is still a better approach,  I will `rewrite` all standard rvv parts
> with asm mnemonics in next patch.

Tip-of-tree gcc + binutils seems to support them.  Building some of the sample
code from the riscv-crypto repository:

    $ riscv64-linux-gnu-as --version
    GNU assembler (GNU Binutils) 2.41.50.20231021
    $ riscv64-linux-gnu-gcc --version
    riscv64-linux-gnu-gcc (GCC) 14.0.0 20231021 (experimental)
    $ riscv64-linux-gnu-gcc -march=rv64ivzvkned -c riscv-crypto/doc/vector/code-samples/zvkned.s

And tip-of-tree clang supports them experimentally:

    $ clang --version
    clang version 18.0.0 (https://github.com/llvm/llvm-project 30416f39be326b403e19f23da387009736483119)
    $ clang -menable-experimental-extensions -target riscv64-linux-gnu -march=rv64ivzvkned1 -c riscv-crypto/doc/vector/code-samples/zvkned.s

It would be nice to use a real assembler, so that people won't have to worry
about potential mistakes or inconsistencies in the perl-based "assembler".  Also
keep in mind that if we allow people to compile this code without the real
assembler support from the beginning, it might end up staying that way for quite
a while in order to avoid breaking the build for people.

Ultimately it's up to you though; I think that you and others who have been
working on RISC-V crypto can make the best decision about what to do here.  I
also don't want this patchset to be delayed waiting for other projects, so maybe
that indeed means the perl-based "assembler" needs to be used for now.

- Eric
Eric Biggers Nov. 21, 2023, 11:51 p.m. UTC | #16
On Wed, Nov 01, 2023 at 09:03:33PM -0700, Eric Biggers wrote:
> > 
> > There is no public assembler supports the vector-crypto asm mnemonics.
> > We should still use `opcode` for vector-crypto instructions. But we might
> > use asm for standard rvv parts.
> > In order to reuse the codes in OpenSSL as much as possible,  we still use
> > the `riscv.pm` for all standard rvv and vector-crypto instructions. If the asm
> > mnemonic is still a better approach,  I will `rewrite` all standard rvv parts
> > with asm mnemonics in next patch.
> 
> Tip-of-tree gcc + binutils seems to support them.  Building some of the sample
> code from the riscv-crypto repository:
> 
>     $ riscv64-linux-gnu-as --version
>     GNU assembler (GNU Binutils) 2.41.50.20231021
>     $ riscv64-linux-gnu-gcc --version
>     riscv64-linux-gnu-gcc (GCC) 14.0.0 20231021 (experimental)
>     $ riscv64-linux-gnu-gcc -march=rv64ivzvkned -c riscv-crypto/doc/vector/code-samples/zvkned.s
> 
> And tip-of-tree clang supports them experimentally:
> 
>     $ clang --version
>     clang version 18.0.0 (https://github.com/llvm/llvm-project 30416f39be326b403e19f23da387009736483119)
>     $ clang -menable-experimental-extensions -target riscv64-linux-gnu -march=rv64ivzvkned1 -c riscv-crypto/doc/vector/code-samples/zvkned.s
> 
> It would be nice to use a real assembler, so that people won't have to worry
> about potential mistakes or inconsistencies in the perl-based "assembler".  Also
> keep in mind that if we allow people to compile this code without the real
> assembler support from the beginning, it might end up staying that way for quite
> a while in order to avoid breaking the build for people.
> 
> Ultimately it's up to you though; I think that you and others who have been
> working on RISC-V crypto can make the best decision about what to do here.  I
> also don't want this patchset to be delayed waiting for other projects, so maybe
> that indeed means the perl-based "assembler" needs to be used for now.
> 

Just wanted to bump up this discussion again.  In binutils, the vector crypto
v1.0.0 support was released 4 months ago in 2.41.  See the NEWS file at
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob_plain;f=binutils/NEWS;hb=refs/heads/binutils-2_41-branch

    * The RISC-V port now supports the following new standard extensions:
      - Zicond (conditional zero instructions)
      - Zfa (additional floating-point instructions)
      - Zvbb, Zvbc, Zvkg, Zvkned, Zvknh[ab], Zvksed, Zvksh, Zvkn, Zvknc, Zvkng,
        Zvks, Zvksc, Zvkg, Zvkt (vector crypto instructions)

That's every extension listed in the vector crypto v1.0.0 specification
(https://github.com/riscv/riscv-crypto/releases/download/v1.0.0/riscv-crypto-spec-vector.pdf).

LLVM still has the vector crypto extensions marked as "experimental" extensions.
However, there is an open pull request to mark them non-experimental:
https://github.com/llvm/llvm-project/pull/69000

Could we just go ahead and remove riscv.pm, develop with binutils for now, and
prioritize getting the LLVM support finished?

- Eric
Jerry Shih Nov. 22, 2023, 7:58 a.m. UTC | #17
On Nov 22, 2023, at 07:51, Eric Biggers <ebiggers@kernel.org> wrote:
> On Wed, Nov 01, 2023 at 09:03:33PM -0700, Eric Biggers wrote:
>> 
>> It would be nice to use a real assembler, so that people won't have to worry
>> about potential mistakes or inconsistencies in the perl-based "assembler".  Also
>> keep in mind that if we allow people to compile this code without the real
>> assembler support from the beginning, it might end up staying that way for quite
>> a while in order to avoid breaking the build for people.
>> 
>> Ultimately it's up to you though; I think that you and others who have been
>> working on RISC-V crypto can make the best decision about what to do here.  I
>> also don't want this patchset to be delayed waiting for other projects, so maybe
>> that indeed means the perl-based "assembler" needs to be used for now.
> 
> Just wanted to bump up this discussion again.  In binutils, the vector crypto
> v1.0.0 support was released 4 months ago in 2.41.  See the NEWS file at
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob_plain;f=binutils/NEWS;hb=refs/heads/binutils-2_41-branch
> 
>    * The RISC-V port now supports the following new standard extensions:
>      - Zicond (conditional zero instructions)
>      - Zfa (additional floating-point instructions)
>      - Zvbb, Zvbc, Zvkg, Zvkned, Zvknh[ab], Zvksed, Zvksh, Zvkn, Zvknc, Zvkng,
>        Zvks, Zvksc, Zvkg, Zvkt (vector crypto instructions)
> 
> That's every extension listed in the vector crypto v1.0.0 specification
> (https://github.com/riscv/riscv-crypto/releases/download/v1.0.0/riscv-crypto-spec-vector.pdf).

It doesn't fit all v1.0 spec.
The `Zvkb` is missed in binutils. It's the subset of `Zvbb`. We needs some extra
works if user just try to use `Zvkb`.
https://github.com/riscv/riscv-crypto/blob/main/doc/vector/riscv-crypto-vector-zvkb.adoc
Some crypto algorithms are already checking for `Zvkb` instead of `Zvbb`.

> LLVM still has the vector crypto extensions marked as "experimental" extensions.
> However, there is an open pull request to mark them non-experimental:
> https://github.com/llvm/llvm-project/pull/69000
> 
> Could we just go ahead and remove riscv.pm, develop with binutils for now, and
> prioritize getting the LLVM support finished?

Yes, we could.
But we need to handle the extensions checking for toolchains like:
https://github.com/torvalds/linux/commit/b6fcdb191e36f82336f9b5e126d51c02e7323480
I could do that, but I think I need some times to test the builds. And it will introduce
more dependency patch which is not related to actual crypto algorithms and the
gluing code in kernel. I will send another patch for toolchain part after the v2 patch.
And I am working for v2 patch with your new review comments. The v2 will still
use `perlasm`.
And we could move this discussion to this thread.
https://lore.kernel.org/all/20231025183644.8735-1-jerry.shih@sifive.com/

-Jerry
Eric Biggers Nov. 22, 2023, 11:42 p.m. UTC | #18
On Wed, Nov 22, 2023 at 03:58:17PM +0800, Jerry Shih wrote:
> On Nov 22, 2023, at 07:51, Eric Biggers <ebiggers@kernel.org> wrote:
> > On Wed, Nov 01, 2023 at 09:03:33PM -0700, Eric Biggers wrote:
> >> 
> >> It would be nice to use a real assembler, so that people won't have to worry
> >> about potential mistakes or inconsistencies in the perl-based "assembler".  Also
> >> keep in mind that if we allow people to compile this code without the real
> >> assembler support from the beginning, it might end up staying that way for quite
> >> a while in order to avoid breaking the build for people.
> >> 
> >> Ultimately it's up to you though; I think that you and others who have been
> >> working on RISC-V crypto can make the best decision about what to do here.  I
> >> also don't want this patchset to be delayed waiting for other projects, so maybe
> >> that indeed means the perl-based "assembler" needs to be used for now.
> > 
> > Just wanted to bump up this discussion again.  In binutils, the vector crypto
> > v1.0.0 support was released 4 months ago in 2.41.  See the NEWS file at
> > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob_plain;f=binutils/NEWS;hb=refs/heads/binutils-2_41-branch
> > 
> >    * The RISC-V port now supports the following new standard extensions:
> >      - Zicond (conditional zero instructions)
> >      - Zfa (additional floating-point instructions)
> >      - Zvbb, Zvbc, Zvkg, Zvkned, Zvknh[ab], Zvksed, Zvksh, Zvkn, Zvknc, Zvkng,
> >        Zvks, Zvksc, Zvkg, Zvkt (vector crypto instructions)
> > 
> > That's every extension listed in the vector crypto v1.0.0 specification
> > (https://github.com/riscv/riscv-crypto/releases/download/v1.0.0/riscv-crypto-spec-vector.pdf).
> 
> It doesn't fit all v1.0 spec.
> The `Zvkb` is missed in binutils. It's the subset of `Zvbb`. We needs some extra
> works if user just try to use `Zvkb`.
> https://github.com/riscv/riscv-crypto/blob/main/doc/vector/riscv-crypto-vector-zvkb.adoc
> Some crypto algorithms are already checking for `Zvkb` instead of `Zvbb`.

Yeah, that's unfortunate that Zvkb got missed in binutils.  However, since all
Zvkb instructions are part of Zvbb, which is supported, assembling Zvkb
instructions should still work --- right?

> > LLVM still has the vector crypto extensions marked as "experimental" extensions.
> > However, there is an open pull request to mark them non-experimental:
> > https://github.com/llvm/llvm-project/pull/69000
> > 
> > Could we just go ahead and remove riscv.pm, develop with binutils for now, and
> > prioritize getting the LLVM support finished?
> 
> Yes, we could.
> But we need to handle the extensions checking for toolchains like:
> https://github.com/torvalds/linux/commit/b6fcdb191e36f82336f9b5e126d51c02e7323480
> I could do that, but I think I need some times to test the builds. And it will introduce
> more dependency patch which is not related to actual crypto algorithms and the
> gluing code in kernel. I will send another patch for toolchain part after the v2 patch.
> And I am working for v2 patch with your new review comments. The v2 will still
> use `perlasm`.

Note that perlasm (.pl) vs assembly (.S), and ".inst" vs real assembler
instructions, are actually separate concerns.  We could use real assembler
instructions while still using perlasm.  Or we could use assembly while still
using macros that generate the instructions as .inst.

My preference is indeed both: assembly (.S) with real assembler instructions.
That would keep things more straightforward.

We do not necessarily need to do both before merging the code, though.  It will
be beneficial to get this code merged sooner rather than later, so that other
people can work on improving it.

I would prioritize the change to use real assembler instructions.  I do think
it's worth thinking about getting that change in from the beginning, so that the
toolchain prerequisites are properly in place from the beginning and people can
properly account for them and prioritize the toolchain work as needed.

- Eric
Christoph Müllner Nov. 23, 2023, 12:36 a.m. UTC | #19
On Thu, Nov 23, 2023 at 12:43 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Wed, Nov 22, 2023 at 03:58:17PM +0800, Jerry Shih wrote:
> > On Nov 22, 2023, at 07:51, Eric Biggers <ebiggers@kernel.org> wrote:
> > > On Wed, Nov 01, 2023 at 09:03:33PM -0700, Eric Biggers wrote:
> > >>
> > >> It would be nice to use a real assembler, so that people won't have to worry
> > >> about potential mistakes or inconsistencies in the perl-based "assembler".  Also
> > >> keep in mind that if we allow people to compile this code without the real
> > >> assembler support from the beginning, it might end up staying that way for quite
> > >> a while in order to avoid breaking the build for people.
> > >>
> > >> Ultimately it's up to you though; I think that you and others who have been
> > >> working on RISC-V crypto can make the best decision about what to do here.  I
> > >> also don't want this patchset to be delayed waiting for other projects, so maybe
> > >> that indeed means the perl-based "assembler" needs to be used for now.
> > >
> > > Just wanted to bump up this discussion again.  In binutils, the vector crypto
> > > v1.0.0 support was released 4 months ago in 2.41.  See the NEWS file at
> > > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob_plain;f=binutils/NEWS;hb=refs/heads/binutils-2_41-branch
> > >
> > >    * The RISC-V port now supports the following new standard extensions:
> > >      - Zicond (conditional zero instructions)
> > >      - Zfa (additional floating-point instructions)
> > >      - Zvbb, Zvbc, Zvkg, Zvkned, Zvknh[ab], Zvksed, Zvksh, Zvkn, Zvknc, Zvkng,
> > >        Zvks, Zvksc, Zvkg, Zvkt (vector crypto instructions)
> > >
> > > That's every extension listed in the vector crypto v1.0.0 specification
> > > (https://github.com/riscv/riscv-crypto/releases/download/v1.0.0/riscv-crypto-spec-vector.pdf).
> >
> > It doesn't fit all v1.0 spec.
> > The `Zvkb` is missed in binutils. It's the subset of `Zvbb`. We needs some extra
> > works if user just try to use `Zvkb`.
> > https://github.com/riscv/riscv-crypto/blob/main/doc/vector/riscv-crypto-vector-zvkb.adoc
> > Some crypto algorithms are already checking for `Zvkb` instead of `Zvbb`.
>
> Yeah, that's unfortunate that Zvkb got missed in binutils.  However, since all
> Zvkb instructions are part of Zvbb, which is supported, assembling Zvkb
> instructions should still work --- right?

Not forgotten, but the Zvkb extension did not exist when the patchset
was merged.
RISC-V extension support is typically merged when specifications are "frozen".
This means a high bar for changes, but they are possible until the
spec is ratified.
Often nothing is changed until ratification, but here Zvkb has been
(re-)introduced.

I was not aware of this untils I read this email, so I just wrote a
patch that fills the gap:
  https://sourceware.org/pipermail/binutils/2023-November/130762.html

Thanks for reporting!

BR
Christoph

>
> > > LLVM still has the vector crypto extensions marked as "experimental" extensions.
> > > However, there is an open pull request to mark them non-experimental:
> > > https://github.com/llvm/llvm-project/pull/69000
> > >
> > > Could we just go ahead and remove riscv.pm, develop with binutils for now, and
> > > prioritize getting the LLVM support finished?
> >
> > Yes, we could.
> > But we need to handle the extensions checking for toolchains like:
> > https://github.com/torvalds/linux/commit/b6fcdb191e36f82336f9b5e126d51c02e7323480
> > I could do that, but I think I need some times to test the builds. And it will introduce
> > more dependency patch which is not related to actual crypto algorithms and the
> > gluing code in kernel. I will send another patch for toolchain part after the v2 patch.
> > And I am working for v2 patch with your new review comments. The v2 will still
> > use `perlasm`.
>
> Note that perlasm (.pl) vs assembly (.S), and ".inst" vs real assembler
> instructions, are actually separate concerns.  We could use real assembler
> instructions while still using perlasm.  Or we could use assembly while still
> using macros that generate the instructions as .inst.
>
> My preference is indeed both: assembly (.S) with real assembler instructions.
> That would keep things more straightforward.
>
> We do not necessarily need to do both before merging the code, though.  It will
> be beneficial to get this code merged sooner rather than later, so that other
> people can work on improving it.
>
> I would prioritize the change to use real assembler instructions.  I do think
> it's worth thinking about getting that change in from the beginning, so that the
> toolchain prerequisites are properly in place from the beginning and people can
> properly account for them and prioritize the toolchain work as needed.
>
> - Eric
Eric Biggers Nov. 28, 2023, 8:19 p.m. UTC | #20
On Thu, Nov 23, 2023 at 01:36:34AM +0100, Christoph Müllner wrote:
> On Thu, Nov 23, 2023 at 12:43 AM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Wed, Nov 22, 2023 at 03:58:17PM +0800, Jerry Shih wrote:
> > > On Nov 22, 2023, at 07:51, Eric Biggers <ebiggers@kernel.org> wrote:
> > > > On Wed, Nov 01, 2023 at 09:03:33PM -0700, Eric Biggers wrote:
> > > >>
> > > >> It would be nice to use a real assembler, so that people won't have to worry
> > > >> about potential mistakes or inconsistencies in the perl-based "assembler".  Also
> > > >> keep in mind that if we allow people to compile this code without the real
> > > >> assembler support from the beginning, it might end up staying that way for quite
> > > >> a while in order to avoid breaking the build for people.
> > > >>
> > > >> Ultimately it's up to you though; I think that you and others who have been
> > > >> working on RISC-V crypto can make the best decision about what to do here.  I
> > > >> also don't want this patchset to be delayed waiting for other projects, so maybe
> > > >> that indeed means the perl-based "assembler" needs to be used for now.
> > > >
> > > > Just wanted to bump up this discussion again.  In binutils, the vector crypto
> > > > v1.0.0 support was released 4 months ago in 2.41.  See the NEWS file at
> > > > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob_plain;f=binutils/NEWS;hb=refs/heads/binutils-2_41-branch
> > > >
> > > >    * The RISC-V port now supports the following new standard extensions:
> > > >      - Zicond (conditional zero instructions)
> > > >      - Zfa (additional floating-point instructions)
> > > >      - Zvbb, Zvbc, Zvkg, Zvkned, Zvknh[ab], Zvksed, Zvksh, Zvkn, Zvknc, Zvkng,
> > > >        Zvks, Zvksc, Zvkg, Zvkt (vector crypto instructions)
> > > >
> > > > That's every extension listed in the vector crypto v1.0.0 specification
> > > > (https://github.com/riscv/riscv-crypto/releases/download/v1.0.0/riscv-crypto-spec-vector.pdf).
> > >
> > > It doesn't fit all v1.0 spec.
> > > The `Zvkb` is missed in binutils. It's the subset of `Zvbb`. We needs some extra
> > > works if user just try to use `Zvkb`.
> > > https://github.com/riscv/riscv-crypto/blob/main/doc/vector/riscv-crypto-vector-zvkb.adoc
> > > Some crypto algorithms are already checking for `Zvkb` instead of `Zvbb`.
> >
> > Yeah, that's unfortunate that Zvkb got missed in binutils.  However, since all
> > Zvkb instructions are part of Zvbb, which is supported, assembling Zvkb
> > instructions should still work --- right?
> 
> Not forgotten, but the Zvkb extension did not exist when the patchset
> was merged.
> RISC-V extension support is typically merged when specifications are "frozen".
> This means a high bar for changes, but they are possible until the
> spec is ratified.
> Often nothing is changed until ratification, but here Zvkb has been
> (re-)introduced.
> 
> I was not aware of this untils I read this email, so I just wrote a
> patch that fills the gap:
>   https://sourceware.org/pipermail/binutils/2023-November/130762.html
> 

Thanks Christoph!  That binutils patch looks good to me.

- Eric