[i1] Implement packed_storage layout encoding attribute #19354
A small next step is to allow casting from a non-attributed tensor to an attributed tensor, such as bitcasting:
Bitcasting unpacked i1 <-> packed i1 won't work, as they are different storage formats. You could do i8/i16/i32/etc. <-> packed i1, though; otherwise you need ext/trunc ops.
@benvanik Sorry, I should have said bitcasting only tensor types. Tensor types are virtual.
On second thought, I agree with @benvanik that a "bitcast" is actually a reinterpretation. We need a simple "cast" in this case, or an actual bitcast such as "tensor<8xi8> -> tensor<64xi1, #packed>" (notice the bit sizes are the same), which does not alter the underlying storage layout.
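The "same bit sizes" condition above can be checked mechanically. The following is a minimal sketch, assuming a packed tensor occupies exactly its logical bit width per element; `TensorDesc` and `isStoragePreservingBitcast` are hypothetical names used for illustration and are not part of IREE.

```cpp
#include <cstdint>

// Hypothetical description of a tensor type: element count plus the number
// of bits each element occupies in storage.
struct TensorDesc {
  std::uint64_t numElements;
  std::uint64_t storageBitsPerElement;
};

// Total storage footprint in bits.
inline std::uint64_t totalStorageBits(const TensorDesc &t) {
  return t.numElements * t.storageBitsPerElement;
}

// A bitcast only reinterprets existing bits, so it is valid only when both
// sides occupy exactly the same number of bits.
inline bool isStoragePreservingBitcast(const TensorDesc &from,
                                       const TensorDesc &to) {
  return totalStorageBits(from) == totalStorageBits(to);
}
```

Under these assumptions, `tensor<8xi8>` (8 elements at 8 bits) and `tensor<64xi1, #packed>` (64 elements at 1 bit) both occupy 64 bits, so the check passes. An unpacked i1 tensor stored one element per byte (64 elements at 8 bits) would not pass against the packed form and would need ext/trunc instead.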
There are some changes that are not part of the PR, e.g., the changes in EncodingUtils and HALToStream/Patterns. We could easily lose track of what's happening in the PR chain; I can't really follow what is needed and what is not needed in this case.
Can you drop such changes from the PR? I think the scope of this PR can be "Introduce packed_storage encoding attribute + Add support for lowering stream.tensor.sizeof with the encoding".
btw, do you have a prototype or a doc to explain how e2e works? I mostly want to prefetch if there is something bad in the design.
The thing is: those parts in I can separate those parts and put them into another PR, but I feel that is even more... confusing and segregated. But that is only my opinion, please let me know what you think.
The change to the materialization pass is fine. I'm not sure about the HAL part; it would be good to add a test. btw, can those parts be landed with an e2e test? Does that sound better?
I am actually working on that. To enable e2e tests for it, I had to build a constant packed tensor (or bitcast from another tensor). I am halfway through it, so perhaps I can add those parts to this PR (but that will make things more complicated).
I see your point, thanks! My concern was about the change in
LGTM, just a few nits!
* Make `packed_storage` a type of `iree_encoding` attribute, and make type converters accept it.
* `i1` tensors with `#iree_encoding.packed_storage` will be interpreted as packed i1 type, same as specifying `--iree-experimental-packed-i1-storage`.
* `--iree-experimental-packed-i1-storage` is kept for testing purposes. We can drop this option after the frontend enables emitting `i1` tensors with attributes.
Signed-off-by: Alan Li <me@alanli.org>
@benvanik Can you review again?
Are we sure packed_storage is the right attribute long-term? It feels like everything should be packed_storage and only the special case (i1 stored expanded in i8) should be called out explicitly. Otherwise we should really have every tensor in the program include packed_storage except i1-in-i8 and that's definitely not what we want.
I'd expect an "unpacked_storage" (though we should have a better name) that indicates which storage type overrides the logical element type. For example, #encoding.element_type<i8> on an i1 tensor indicates it's encoded in i8s even if logically treated as i1. In this form it also extends to other bit widths (not just i1/i8) and makes the logic for calculating size something that can be done generically (see the comment about an interface method on EncodingAttr instead of all the hardcoded isPacked checks): you have the tensor element type and the encoded element type and can use the alignment helpers to get the total physical storage size as required.
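To make the generic size calculation concrete, here is a minimal sketch of the arithmetic, assuming the encoded element width (when present) simply replaces the logical width and the total is rounded up to whole bytes. The helper names and the convention that `0` means "no override" are assumptions made for this example; they are not IREE's alignment helpers.

```cpp
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment` (alignment > 0).
inline std::uint64_t alignTo(std::uint64_t value, std::uint64_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

// Bits each element occupies in storage. `encodedBits == 0` is this sketch's
// way of saying "no storage element type override", in which case elements
// are stored densely at their logical width.
inline std::uint64_t effectiveStorageBits(std::uint64_t logicalBits,
                                          std::uint64_t encodedBits) {
  return encodedBits != 0 ? encodedBits : logicalBits;
}

// Total physical size in bytes, rounded up to a whole byte.
inline std::uint64_t physicalStorageBytes(std::uint64_t numElements,
                                          std::uint64_t logicalBits,
                                          std::uint64_t encodedBits = 0) {
  std::uint64_t totalBits =
      numElements * effectiveStorageBits(logicalBits, encodedBits);
  return alignTo(totalBits, 8) / 8;
}
```

With this model, 12 i1 elements with no override occupy 2 bytes, while the same tensor with an i8 storage override occupies 12 bytes. Neither case needs an isPacked-style boolean.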
@@ -0,0 +1,22 @@
// RUN: iree-opt --split-input-file --iree-stream-encode-host-tensors %s | FileCheck %s
since this uses the same flags as the main test you can move it back into it (encode_host_tensors.mlir) - each individual .mlir file adds test overhead and makes it harder to manage the tests
// RUN: iree-opt --split-input-file --iree-stream-encode-host-tensors %s | FileCheck %s

#packed = #iree_encoding.packed_storage
func.func @unaligned_i1_size() -> index {
use util.func within the IREE codebase if outside of codegen (func.func is converted to util.func very early on)
@@ -100,6 +101,13 @@ struct ConvertTensorImportOp
RankedTensorType tensorType,
ValueRange dynamicDims,
OpBuilder &builder) {
// If the encoding attr is about packed storage then we don't need
This is not true - the shape and element type are still useful metadata even if the encoding is opaque. Whether a 4x2xi1 is packed or unpacked doesn't change that its logical shape is 4x2 and the element type is i1.
We may have to fix the buffer_diagnostics.c impl of iree_hal_modules_buffer_view_assert to ignore the encoding check if IREE_HAL_ENCODING_TYPE_OPAQUE - currently it just does actual_encoding_type != IREE_HAL_ENCODING_TYPE_OPAQUE but should also do && expected_encoding_type != IREE_HAL_ENCODING_TYPE_OPAQUE.
You can change this to pass IREE_HAL_ENCODING_TYPE_OPAQUE for expectedEncodingType if the tensor has a non-null encoding.
(we wouldn't want packed storage to be the special case for behavior change here - that removes the benefits of using an attribute to separate parts of the codebase: every single possible tensor encoding attribute cannot be checked here)
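The suggested assert behavior can be sketched as follows. This is an illustration only: `EncodingType`, `kEncodingOpaque`, `kEncodingDenseRowMajor`, and `encodingCheckPasses` are hypothetical stand-ins with made-up values, not the actual definitions in buffer_diagnostics.c or the HAL headers.

```cpp
#include <cstdint>

// Hypothetical stand-in for a HAL encoding type value. The numeric values
// are invented for this sketch.
using EncodingType = std::uint32_t;
constexpr EncodingType kEncodingOpaque = 0;
constexpr EncodingType kEncodingDenseRowMajor = 1;

// Returns true when the encoding assertion should pass. The comparison is
// skipped when either the actual or the expected encoding is opaque;
// otherwise the two encodings must match exactly.
inline bool encodingCheckPasses(EncodingType actual, EncodingType expected) {
  if (actual == kEncodingOpaque || expected == kEncodingOpaque) return true;
  return actual == expected;
}
```

In this form the import pattern can pass the opaque value as the expected encoding for any tensor with a non-null encoding attribute, without knowing which attribute it is, while shape and element type are still checked separately.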
@benvanik So the design is to make the iree_encoding.packed_storage attribute transient and temporary. By the time we finish emulating narrow types, this attribute will be gone, as it only indicates how we emulate this specific tensor.
I am wondering if this still applies?
// Two paths to enable packed storage for i1 tensors: the attribute or cl
// option. The cl option will be dropped once frontend supports emitting
// tensors with attributes.
bool isPackedStorage =
We shouldn't be doing boolean checks on the encoding, but asking the encoding attr what the size is - that's why we have it. None of this file should ever know about packed encodings, just that there's an encoding attr and it has a method that allows computing the bit width.
Concretely: EncodingAttr should have a getAlignedStorageElementType or some other method that defaults to what legalizeStorageElementTypeImpl is doing (legalizeStorageElementType has always bothered me, it doesn't mean what it thinks it means!).
The flag and all checks for isPackedStorage need to be removed from this file. needToPackSubByteElementBitWidth was always a hack, and it was promised when it was added that it would be cleaned up - and yet here we are :)
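A rough sketch of the shape being asked for, where callers query the encoding instead of testing a boolean. The class and method names are hypothetical and this is plain C++, not the MLIR attribute interface machinery. The byte-rounding default is an assumption made for the example and is not a description of what legalizeStorageElementTypeImpl actually does.

```cpp
#include <cstdint>

// Hypothetical interface standing in for an encoding attribute that knows
// how its elements are stored.
struct StorageEncoding {
  virtual ~StorageEncoding() = default;
  // Default (assumed for this sketch): widen each element to a whole number
  // of bytes, so i1 is stored as 8 bits.
  virtual std::uint64_t storageBitWidth(std::uint64_t logicalBits) const {
    return (logicalBits + 7) / 8 * 8;
  }
};

// Packed storage keeps elements at their logical bit width.
struct PackedStorageEncoding final : StorageEncoding {
  std::uint64_t storageBitWidth(std::uint64_t logicalBits) const override {
    return logicalBits;
  }
};

// The caller never asks "is this packed?". It asks the encoding for the
// storage width and rounds the total up to whole bytes.
inline std::uint64_t sizeInBytes(const StorageEncoding &encoding,
                                 std::uint64_t numElements,
                                 std::uint64_t logicalBits) {
  std::uint64_t totalBits =
      numElements * encoding.storageBitWidth(logicalBits);
  return (totalBits + 7) / 8;
}
```

Adding a new storage scheme then means adding another subclass; `sizeInBytes` and any other caller stay unchanged.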
Point taken. Creating an issue to track it: #19618
@benvanik That is another way to think of the problem, and I have to admit this looks cleaner. For historical reasons IREE expands I honestly don't know if we should revert such a convention, and it requires a bit of coordination to make everything "packed" by default. I'd like to involve more people in such a discussion, as when we internally decide to introduce an attribute for packed
As for this PR, I do have some future PRs dependent on it, so I prefer to address the issue separately.
About this comment: #19354 (comment), if you don't mind, I would like to address that in a later patch, as doing everything here in one place would force a chain of changes across the subsequent PRs.
* Make `packed_storage` a type of `iree_encoding` attribute, and make type converters accept it.
* `i1` tensors with `#iree_encoding.packed_storage` will be interpreted as packed i1 type, same as specifying `--iree-experimental-packed-i1-storage`. Other i1 tensors are treated as a non-packed datatype and will be extended.
* `--iree-experimental-packed-i1-storage` is kept for testing purposes. We can drop this option after the frontend enables emitting `i1` tensors with attributes.