Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[i1] Implement packed_storage layout encoding attribute #19354

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lialan
Copy link
Contributor

@lialan lialan commented Dec 3, 2024

  • make packed_storage as a type of iree_encoding attribute, and make type converters accept it.
  • i1 tensors with #iree_encoding.packed_storage will be interpreted as packed i1 type, same as specifying --iree-experimental-packed-i1-storage. Other i1 tensors are treated as non-packed datatype, and will be extended.
  • --iree-experimental-packed-i1-storage are kept for testing purposes.
    • We can drop this option after frontend enables emitting i1 tensors with attributes.

@lialan lialan self-assigned this Dec 3, 2024
@lialan lialan linked an issue Dec 3, 2024 that may be closed by this pull request
@lialan lialan changed the title Implement i1_packed_storage layout encoding attribute Implement packed_storage layout encoding attribute Dec 5, 2024
@lialan lialan force-pushed the lialan/i1_attr branch 3 times, most recently from 1f2e200 to 8588cfd Compare December 13, 2024 03:37
@lialan lialan marked this pull request as ready for review December 13, 2024 05:28
@lialan lialan force-pushed the lialan/i1_attr branch 2 times, most recently from bac621f to 349a6bd Compare December 13, 2024 08:31
@lialan lialan changed the title Implement packed_storage layout encoding attribute [i1] Implement packed_storage layout encoding attribute Dec 13, 2024
@lialan
Copy link
Contributor Author

lialan commented Dec 16, 2024

A small next step is to allow casting from non-attributed tensor to attributed tensor, such as bitcasting: flow.bitcast tensor<?xi1> -> tensor<?xi1, #packed> so that we can completely retire the command line option.

@benvanik
Copy link
Collaborator

bit casting unpacked i1 <-> packed i1 won't work as they are different storage formats - you could do i8/i16/i32/etc <-> packed i1, though - otherwise you need ext/trunc ops

@lialan
Copy link
Contributor Author

lialan commented Dec 16, 2024

bit casting unpacked i1 <-> packed i1 won't work as they are different storage formats - you could do i8/i16/i32/etc <-> packed i1, though - otherwise you need ext/trunc ops

@benvanik Sorry I should have said bitcasting only tensor types. Tensor types are virtual.

@lialan lialan force-pushed the lialan/i1_attr branch 2 times, most recently from 1b8a414 to 97bdadb Compare December 16, 2024 05:30
@lialan
Copy link
Contributor Author

lialan commented Dec 16, 2024

bit casting unpacked i1 <-> packed i1 won't work as they are different storage formats - you could do i8/i16/i32/etc <-> packed i1, though - otherwise you need ext/trunc ops

On a second thought, I agree with @benvanik that a "bitcast" is actually reinterpretation, we need a simple "cast" in this case, or actual bitcasting such as: "tensor<8xi8> -> tensor<64xi1, #packed>" (notice the bit sizes are the same) which does not alter the underlying storage layout.

Copy link
Contributor

@hanhanW hanhanW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some changes are not part of the PR. E.g., the changes in EncodingUtils and HALToStream/Patterns. We could easily mis-track what's happening in the PR chain; I can't really follow what is needed and what is not needed in this case.
Can you drop such changes from the PR? I think the scope of this PR can be "Introduce packed_storage encoding attribute + Add support for lowering stream.tensor.sizeof with the encoding".

btw, do you have a prototype or a doc to explain how e2e works? I mostly want to prefetch if there is something bad in the design.

@lialan
Copy link
Contributor Author

lialan commented Dec 16, 2024

There are some changes are not part of the PR. E.g., the changes in EncodingUtils and HALToStream/Patterns. We could easily mis-track what's happening in the PR chain; I can't really follow what is needed and what is not needed in this case. Can you drop such changes from the PR? I think the scope of this PR can be "Introduce packed_storage encoding attribute + Add support for lowering stream.tensor.sizeof with the encoding".

btw, do you have a prototype or a doc to explain how e2e works? I mostly want to prefetch if there is something bad in the design.

The thing is: those parts in EncodingUtils and HALToStream/Patterns make the new attribute support complete. i.e. without those updates in those parts, however small they are, the added attribute wouldn't be handled properly. As you can see changes in those parts are mostly trying to bypass existing checks on encoding attribute. So I feel like this is somewhat the "minimal set" we should have for adding a new attribute.

I can separate those parts and put them into another PR, but I feel that is even more ..... confusing and segregated. But that is only my opinion, please let me know what you think.

@hanhanW
Copy link
Contributor

hanhanW commented Dec 16, 2024

As you can see changes in those parts are mostly trying to bypass existing checks on encoding attribute.
I can separate those parts and put them into another PR, but I feel that is even more ..... confusing and segregated. But that is only my opinion, please let me know what you think.

The change of materialization pass is fine. I'm not sure about HAL part, it would be good to add a test.

btw, can those part be landed with an e2e test? Does it sound better?

@lialan
Copy link
Contributor Author

lialan commented Dec 16, 2024

The change of materialization pass is fine. I'm not sure about HAL part, it would be good to add a test.

btw, can those part be landed with an e2e test? Does it sound better?

I am actually working on that. To enable e2e tests for it I had to build a constant packed tensor (or bitcast from another tensor). I am half way through it so perhaps I can add those parts into this PR (but that will make things more complicated).

@hanhanW
Copy link
Contributor

hanhanW commented Dec 17, 2024

I am actually working on that. To enable e2e tests for it I had to build a constant packed tensor (or bitcast from another tensor). I am half way through it so perhaps I can add those parts into this PR (but that will make things more complicated).

I see your point, thanks! My concern was about the change in HALToStream/Patterns.cpp because I'm not familiar with that part. As Ben said, please add a comment to explain why. So the newbie like me can understand it better. The PR looks okay to me, please address Ben's comments.

@lialan
Copy link
Contributor Author

lialan commented Jan 3, 2025

@benvanik @hanhanW Please review again

Copy link
Contributor

@hanhanW hanhanW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just few nits!

tests/e2e/subbyte_types/BUILD.bazel Outdated Show resolved Hide resolved
compiler/src/iree/compiler/Utils/ElementPackingUtils.cpp Outdated Show resolved Hide resolved
compiler/src/iree/compiler/Utils/ElementPackingUtils.cpp Outdated Show resolved Hide resolved
* make `packed_storage` as a type of `iree_encoding` attribute, and make type converters accept it.
* `i1` tensors with `#iree_encoding.packed_storage` will be interpreted as packed i1 type, same as specifying `--iree-experimental-packed-i1-storage`.
* `--iree-experimental-packed-i1-storage` are kept for testing purposes. We can drop this option after frontend enables emitting `i1` tensors with attributes.

Signed-off-by: Alan Li <me@alanli.org>
@lialan
Copy link
Contributor Author

lialan commented Jan 6, 2025

@benvanik Can you review again?

Copy link
Collaborator

@benvanik benvanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure packed_storage is the right attribute long-term? It feels like everything should be packed_storage and only the special case (i1 stored expanded in i8) should be called out explicitly. Otherwise we should really have every tensor in the program include packed_storage except i1-in-i8 and that's definitely not what we want.

I'd expect an "unpacked_storage" (though we should have a better name) that indicated what the storage type was that is overriding the logical element type. For example, #encoding.element_type<i8> on an i1 tensor indicates it's encoded in i8s even if logically treated as i1. In this form it also extends to other bit widths (not just i1/i8) and makes the logic for calculating size something that can be done generically (see the comment about an interface methods on EncodingAttr instead of all the hardcoded isPacked checks) - you have the tensor element type and the encoded element type and can use the alignment helpers to get the total physical storage size as required.

@@ -0,0 +1,22 @@
// RUN: iree-opt --split-input-file --iree-stream-encode-host-tensors %s | FileCheck %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this uses the same flags as the main test you can move it back into it (encode_host_tensors.mlir) - each individual .mlir file adds test overhead and makes it harder to manage the tests

// RUN: iree-opt --split-input-file --iree-stream-encode-host-tensors %s | FileCheck %s

#packed = #iree_encoding.packed_storage
func.func @unaligned_i1_size() -> index {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use util.func within the IREE codebase if outside of codegen (func.func is converted to util.func very early on)

@@ -100,6 +101,13 @@ struct ConvertTensorImportOp
RankedTensorType tensorType,
ValueRange dynamicDims,
OpBuilder &builder) {
// If the encoding attr is about packed storage then we don't need
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not true - the shape and element type are still useful metadata even if the encoding is opaque. Whether a 4x2xi1 is packed or unpacked doesn't change that it's logical shape is 4x2 and the element type is i1.

We may have to fix the buffer_diagnostics.c impl of iree_hal_modules_buffer_view_assert to ignore the encoding check if IREE_HAL_ENCODING_TYPE_OPAQUE - currently it just does actual_encoding_type != IREE_HAL_ENCODING_TYPE_OPAQUE but should also do && expected_encoding_type != IREE_HAL_ENCODING_TYPE_OPAQUE.

You can change this to pass IREE_HAL_ENCODING_TYPE_OPAQUE for expectedEncodingType if the tensor has a non-null encoding.

(we wouldn't want packed storage to be the special case for behavior change here - that removes the benefits of using an attribute to separate parts of the codebase: every single possible tensor encoding attribute cannot be checked here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benvanik So the design is to make the iree_encoding.packed_storage attribute to be transient and temporary. By the time we finish emulating narrow types, this attribute will be gone, as it only indicates how we emulate this specific tensor.

I am wondering if this still applies?

// Two paths to enable packed storage for i1 tensors: the attribute or cl
// option. The cl option will be dropped once frontend supports emitting
// tensors with attributes.
bool isPackedStorage =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be doing boolean checks on the encoding, but asking the encoding attr what the size is - that's why we have it. None of this file should ever know about packed encodings, just that there's an encoding attr and it has a method that allows computing the bit width.

Concretely: EncodingAttr should have a getAlignedStorageElementType or some other methods that defaults to what legalizeStorageElementTypeImpl is doing (legalizeStorageElementType has always bothered me, it doesn't mean what it thinks it means!).

The flag and all checks for isPackedStorage need to be removed from this file. needToPackSubByteElementBitWidth was always a hack and the it was promised when it was added that it would be cleaned up - and yet here we are :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken. Creating an issue to track it: #19618

@lialan
Copy link
Contributor Author

lialan commented Jan 7, 2025

I'd expect an "unpacked_storage" (though we should have a better name) that indicated what the storage type was that is overriding the logical element type.

@benvanik That is another way to think of the problem, and I have to admit this looks cleaner. For historical reasons IREE expands i1 into i8, and I always thought that is IREE's data layout convention.

I honestly don't know if we should revert such convention, and it requires a bit of coordination to make everything by default "packed". I'd like to involve more people in such discussion, as when we internally decide to introduce an attribute for packed i1, the unpacked attribute was never on the table.

As for this PR, I do have some future PRs dependent on it, so I prefer to address the issue separately.

About this comment: #19354 (comment), if you don't mind, I would like to address that in a later patch, as a chain of changes needs to be taken for subsequent PRs if we do everything now here in one place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Support packed i1 storage through encodings
3 participants