-
Notifications
You must be signed in to change notification settings - Fork 858
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zvk vector crypto support (v5) #1303
Zvk vector crypto support (v5) #1303
Conversation
@aswaterman Andrew, this is my first PR in this repository. As such I am quite unfamiliar with the review process. Don't hesitate to point me in the right direction. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't look at the details of the instructions, but generally I'd say this uses too many macros when compilable C++ code would suffice. (The existing instructions also overuse macros, IMO; let's try to improve that, not make it worse.)
@aswaterman is there a good example of another extension that uses custom data types and/or helper functions instead of macros?
do { \ | ||
require_vector(true); \ | ||
require_extension(EXT_ZVBB); \ | ||
} while (0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Zvbb requires V, that should be enforced in isa_parser.cc
. It does not need to be enforced here in every instruction. (Same comment for all of this family of macros; it looks like none of these macros are needed.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the point here is to raise an exception when an instruction is executed without proper support. I am not that familiar with spike, but I would imagine require_extension(EXT_ZVBB);
is required to raise the exception if the instruction is not supported. The dependency on V (which may be V
or Zve*
for Zvbb
I think) should indeed be handled when the ISA support is initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The require_vector(true)
checks sstatus.VS
and vtype.vill
. So, even though the ISA parser should enforce the extension dependency, this code (or something similar) is still necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
require_vector(true)
checkssstatus.VS
andvtype.vill
. So, even though the ISA parser should enforce the extension dependency, this code (or something similar) is still necessary.
Good point. Should those checks be part of require_extension(EXT_ZVBB)
, do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, the current formulation is OK.
The only problem with it, I think, is that it assumes Zvbb requires V. This isn't strictly true; Zvbb only requires a subset of V: Zve64x, IIRC. But since Spike doesn't support any of the V subsets currently, that is not a problem currently.
Furthermore, if Spike is augmented to support Zve64x etc., then essentially all of the vector instructions will need to be retooled in the same manner as this one. In other words, this approach doesn't create a new type of technical debt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the discussion. I updated the documentation of those new require_xyz macros to state
// Ensures that the ... extension is present,
// and the vector unit is enabled and in a valid state.```
This captures the semantics of require_vector(true) more clearly than the previous "and that vector is present".
riscv/zvk_ext_macros.h
Outdated
// Ensures that the vector instruction is not using a mask. | ||
#define require_no_vmask require(insn.v_vm() == 1) | ||
|
||
// Ensures that the current VSEW is the given N number of bits. | ||
#define require_vsew(N) require(P.VU.vsew == (N)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two are so trivial and self-expanatory that I'd say they should just be inlined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree for require_vsew()
and got rid of them.
I have a stronger attachment to require_no_vmask
as I find it a lot clearer than require(insn.v_vm() == 1)
, especially since VM is a highly overloaded acronym and "VM == 1" meaning "vector mask is not used" is very counter-intuitive.
It is a case where someone intimate with the vector spec may find the "insn.v_vm() == 1" very clear but newcomers or more casual contributors will either misinterpret or need to go back to the spec to understand. When I wrote that code I very much needed the more explicit semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you still want me to remove the require_no_vmask
.
// containing element groups of 8 uint32_t values (EGW=256, EEW=32, EGS=8), | ||
// *ignoring* the current SEW that applies to the vectors. | ||
// | ||
// IMPORTANT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this info important? Isn't it obvious from reading the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of this documentation is for most readers to not have to read the code. I also want to express what semantics are intended, vs. what the code is actually encoding.
The V macros are inscrutable for multiple reasons. For me the main one is the lack of documentation. One has to expand the macros level-by-level to figure what the uses of those macros is achieving (and even then...). I want readers of the Zvk code to only have to read the macro documentation to get an 80% understanding of what is happening.
// into the (mutable) variables named by the W arguments, provided in | ||
// "Little Endian" (LE) order, i.e., from the least significant (W0) | ||
// to the most significant (W3). | ||
#define EXTRACT_EGU32x4_WORDS_LE(X, W0, W1, W2, W3) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of the macros seem like they would be better off as methods on a class EGU32x4_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm 100% with you on the desire to reduce macro usage. I wish I had known that when I started writing that code, instead of following the vector idioms.
EGU32x4_t ends up being std:array<uint32_t, 4>, and using instance methods is not going to work as-is. We can use freestanding methods. We could hide the std::array<> as an implementation detail of a class but it will require forwarding quite a few operations.
Disabling P extension doesn't sound good. I'm pretty sure this is not the first time we've had conflicting opcodes, though I don't remember how we dealt with it last time. |
@scottj97 As a very small example, adding the various I fully agree that, in cases where we can use functions, possibly with templates, rather than macros, we should. Although Spike is full of examples to the contrary, we are trying to move away from that style. |
As for handling overlapping opcodes, the problematic ones need to be added to the overlap list; here's an example: 087626c#diff-19e2a333b37ddb8e4a5ec7706ce201da3561f4ff223756fa8b9e692ea4679a8fR10 |
Also please fix the CI failure:
|
@egouriou-rivos can you tell us how you've tested this? |
Just a heads-up for anyone following this, as this is needed for ACT testing. The SAIL implementation is written for Vector Crypto Draft v20230303 which leads to incorrect ACT testing. In order to have some(not all Extensions are possible) valid results between DUT(Spike) & Sail Signatures you'd have to apply a fair amount of workarounds so the ACT tooling can match the correct Extensions between Spike <-> SAIL. Successful Test reports could be generated only for the following extensions:
Failing Test reports, with the reason, are:
|
5fd534f
to
0be690f
Compare
I am overjoyed to read this, as getting into the vector code for this extension was quite frustrating due to those layers of (unhygienic) macros and the utter lack of documentation of their intent. Unfortunately the Zvk implementation strove to fit within the existing idioms we found, with the exception of (over-)documenting. I will not have the time to perform the suggested cleanup until later this year. I feel that such a cleanup needs to involve the V instructions and will represent a significant effort. As such, I would like to see it decoupled from landing the Zvk support. |
This should be fixed now. |
I pushed updated commits with the following changes:
With this, the PR is up to date with the current specification, which should allow for the spec to move into freeze for this deliverable. Issues I intend to address in upcoming versions:
I would like to defer the cleanup from C-style macros to more idiomatic C++ to subsequent PRs. As discussed above, this PR was built over many months following the example of the V extension which was considered idiomatic. I wish I had checked when that work started. |
One issue I am aware of and would like recommendation on: the element-group accessor
I worry that no such trick will allow us to fix |
Most of the validation for these changes have been done by running the tests / proof-of-concept code in riscv/riscv-crypto#310. Thanks to the NIST and SM test vectors I am pretty confident in the soundness of the crypto logic itself (mistakes show up quickly). In the absence of a SAIL model and ACT support for vector I did not automate validation that the Spike logic matches the spec pseudo-code, but Ken and I spent quite a bit of time looking for discrepancies, especially related to endianness issues. I am more wary of issues related to details of the RISC-V vector specification, or my misunderstandings of Spike internals. |
0be690f
to
c130ebc
Compare
I uploaded a new version of this series. Differences from v2:
As mentioned above, the one known issue is the handling of WORDS_BIGENDIAN. I'd appreciate if people familiar with big-endian and the vector unit register layout could give some thought as to whether elt_group<>(...) is implementable for big-endian or if we need to change that interface. |
c130ebc
to
2e4e026
Compare
I uploaded a new version of this series (v4). The only difference from the previous version is in 'isa_parsing.cc' in the commit titled 'Zvk extension parsing':
This matches the changes that went into the specification in version 0.9.5, 2023-05-09 |
8842b8a
to
54f2d0b
Compare
No worries about the delay, and thanks again for the reviews and suggestions. I rebased on top of current /master. No changes to the commits, only resolving merge issues. I regenerated riscv/encoding.h from riscv-opcodes /master (3ca60c5). Note that the previous encoding file was generated from a commit (8d70e77) that I could not find in the riscv-opcodes upstream repository. It is not clear to me what else we are picking up between those two versions. |
Thanks! There's a new batch of CI failures, but it looks like just a missing |
4fa1262
to
57ece8d
Compare
I uploaded a new patch series against the new /master (7e50839) that addresses a couple of issues:
I missed both issues as I spend most of my time compiling macOS with Clang. The issues showed up when using GCC on Linux. Hopefully this will please the CI gods. |
I believe the issues @scottj97 raised have been addressed or are now moot.
Checking those CI failures. I probably need to annotate some unused variables from the V macros, or derive new loop macros that do not declare those variables in the first place. I'll also rebase. |
Feel free to deploy the |
Also, note I just merged #1364, which bumped |
The previous order lacks any obvious logic. Alphabetical order, while making it difficult to create interesting groupings, makes it easy to find which extensions are compiled in. Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Zvk is the short name for the Vector Cryptography Instruction Set Extension Specification being defined at <https://github.com/riscv/riscv-crypto/tree/master/doc/vector>. This commit adds support for parsing/enabling the Zvk extensions (Zvbb, Zvbc, Zvkg, Zvkned, Zvknha, Zvknhb, Zvksed, Zvksh, Zvkt) and the combo extensions (Zvkn, Zvknc, Zvkng, Zvks, Zvksc, Zvksg). This is an early commit in a series implementing Zvk. No instructions are actually defined here, only infastructure that will support the coming extensions. The encodings for Zvk instructions have some conflicts with Zpn encodings. This commit marks those Zpn instructions as overlapping, and adds checks to error out if conflicting extensions are enabled. Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Introduce types and macros useful across multiple Zvk sub-extensions, including Zvbb and Zvbc. Those will be used by upcoming per-sub-extension commits. In particular we introduce "Element Group" types and loop macros handling those element groups. The concept of element group is described in <https://github.com/riscv/riscv-crypto/blob/master/doc/vector/riscv-crypto-vector-element-groups.adoc>. Note that the element group access method is not implemented for WORDS_BIGENDIAN setup. As such, isa_parser.cc is modified to emit an error when WORDS_BIGENDIAN is defined and extensions using element groups are enabled. Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Implement the proposed instructions in Zvbb: - vandn.{vv,vx}, vector bitwise and-not - vbrev.v, vector bit reverse in element - vbrev8.v, vector bit reverse in bytes - vrev8.v, vector byte reverse - vctz.v, vector count trailing zeros - vclz.v, vector count leading zeros - vcpop.v, vector population count - vrol.{vv,vx}, vector rotate left - vror.{vi,vv,vx}, vector rotate right - vwsll.{vi,vv,vx} vector widening shift left logical A new instruction field, 'zimm6', is introduced, encoded in bits [15, 19] and [26].. It is used by "vror.vi" to encode a shift immediate in [0, 63]. Co-authored-by: Raghav Gupta <rgupta@rivosinc.com> Co-authored-by: Stanislaw Kardach <kda@semihalf.com> Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Implement the Zvbc instructions - vclmul.{vv,vx}, vector carryless multiply low - vclmulh.{vv,vx}, vector carryless multiply high Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Implement the proposed instruction in Zvkg, vghmac.vv, Vector Carryless Multiply Accumulate over GHASH Galois-Field. The instruction performs one step of GHASH routine as described in "NIST Special Publication 800-38D" a.k.a the AES-GCM specification. The logic was written to closely track the pseudo-code in the Zvk specification. Signed-off-by: Eric Gouriou <ego@rivosinc.com> Co-authored-by: Kornel Duleba <mindal@semihalf.com> Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Implement the instructions part of the Zvknha and Zvknhb sub-extensions: - vsha2ms.vv, message schedule - vsha2ch.vv / vsha2cl.vv, compression rounds A header files for common macros is added. Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Implement the Zvkned extension, "NIST Suite: Vector AES Encryption & Decryption (Single Round)". - vaeskf1.vi: AES forward key scheduling, AES-128. - vaeskf2.vi: AES forward key scheduling, AES-256. - vaesz.vs: AES encryption/decryption, 0-th round. - vaesdm.{vs,vv}: AES decryption, middle rounds. - vaesdf.{vs,vv}: AES decryption, final round. - vaesem.{vs,vv}: AES encryption, middle rounds. - vaesef.{vs,vv}: AES encryption, final round. An extension specific header containing common logic is added. Co-authored-by: Stanislaw Kardach <kda@semihalf.com> Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Implement the Zvksed sub-extension, "ShangMi Suite: SM4 Block Cipher": - vsm4k.vi, vector SM4 key expansion, - vsm4r.{vs,vv}, vector SM4 rounds. This also introduces a header for common vector SM4 logic. Co-authored-by: Raghav Gupta <rgupta@rivosinc.com> Co-authored-by: Albert Jakieła <aja@semihalf.com> Signed-off-by: Eric Gouriou <ego@rivosinc.com>
Implement the Zvksh sub-extension, "ShangMi Suite: SM3 Hash Function Instructions": - vsm3me.vv, message expansion, - vsm3c.vi, compression rounds. This also introduces a SM3 specific header for common logic. Co-authored-by: Raghav Gupta <rgupta@rivosinc.com> Co-authored-by: Albert Jakieła <aja@semihalf.com> Co-authored-by: Kornel Dulęba <mindal@semihalf.com> Signed-off-by: Eric Gouriou <ego@rivosinc.com>
57ece8d
to
a55f96a
Compare
I uploaded a new version. It took me a little while to reproduce the failures. Changes since last version:
It would have been easier to sprinkle more UNUSED, but I appreciated the warnings emitted in the CI build. They did not reveal significant issues but some sloppiness. With that I was able to build cleanly on macOS+Clang and Linux+GCC when adding the same warnings/errors that are used in the CI. |
Thanks for working through the remaining issues, @egouriou-rivos. Time to pull the trigger. |
Woohoo! Thank you. |
[Updated 2023-06-02]
This patch series implements Zvk at version Version v0.9.7, 31 May, 2023, from https://github.com/riscv/riscv-crypto/tree/master/doc/vector:
no-instruction extension (only accepted as part of the ISA string, no-op otherwise):
and combo extensions (expanded to the underlying extensions):
Some Zvk instructions conflict with instructions part of Zvpn, part of the proposed packed SIMD extensions. The conflicting instructions are listed in the overlapped file and only enabled when their extension is declared. Extension in conflicts are detected in isa_parser.cc, triggering errors.
The implementation of element groups does not support WORDS_BIGENDIAN correctly. An error is emitted when WORDS_BIGENDIAN is enabled together with extensions using element group accesses.