[i1] Implement `packed_storage` layout encoding attribute #19354

lialan · 2024-12-03T17:14:47Z

make packed_storage as a type of iree_encoding attribute, and make type converters accept it.
i1 tensors with #iree_encoding.packed_storage will be interpreted as packed i1 type, same as specifying --iree-experimental-packed-i1-storage. Other i1 tensors are treated as non-packed datatype, and will be extended.
--iree-experimental-packed-i1-storage are kept for testing purposes.
- We can drop this option after frontend enables emitting i1 tensors with attributes.

lialan · 2024-12-16T01:56:53Z

A small next step is to allow casting from non-attributed tensor to attributed tensor, such as bitcasting: flow.bitcast tensor<?xi1> -> tensor<?xi1, #packed> so that we can completely retire the command line option.

benvanik · 2024-12-16T02:02:12Z

bit casting unpacked i1 <-> packed i1 won't work as they are different storage formats - you could do i8/i16/i32/etc <-> packed i1, though - otherwise you need ext/trunc ops

lialan · 2024-12-16T03:20:36Z

bit casting unpacked i1 <-> packed i1 won't work as they are different storage formats - you could do i8/i16/i32/etc <-> packed i1, though - otherwise you need ext/trunc ops

@benvanik Sorry I should have said bitcasting only tensor types. Tensor types are virtual.

lialan · 2024-12-16T07:28:10Z

bit casting unpacked i1 <-> packed i1 won't work as they are different storage formats - you could do i8/i16/i32/etc <-> packed i1, though - otherwise you need ext/trunc ops

On a second thought, I agree with @benvanik that a "bitcast" is actually reinterpretation, we need a simple "cast" in this case, or actual bitcasting such as: "tensor<8xi8> -> tensor<64xi1, #packed>" (notice the bit sizes are the same) which does not alter the underlying storage layout.

hanhanW

There are some changes are not part of the PR. E.g., the changes in EncodingUtils and HALToStream/Patterns. We could easily mis-track what's happening in the PR chain; I can't really follow what is needed and what is not needed in this case.
Can you drop such changes from the PR? I think the scope of this PR can be "Introduce packed_storage encoding attribute + Add support for lowering stream.tensor.sizeof with the encoding".

btw, do you have a prototype or a doc to explain how e2e works? I mostly want to prefetch if there is something bad in the design.

compiler/src/iree/compiler/Codegen/Common/EncodingUtils.cpp

compiler/src/iree/compiler/Dialect/Encoding/IR/EncodingAttrs.td

compiler/src/iree/compiler/DispatchCreation/SetEncoding.cpp

compiler/src/iree/compiler/Dialect/Stream/Conversion/HALToStream/Patterns.cpp

compiler/src/iree/compiler/Dialect/Stream/Transforms/test/encode_host_tensors_packing_i1.mlir

compiler/src/iree/compiler/Codegen/Common/EncodingUtils.cpp

lialan · 2024-12-16T10:56:12Z

There are some changes are not part of the PR. E.g., the changes in EncodingUtils and HALToStream/Patterns. We could easily mis-track what's happening in the PR chain; I can't really follow what is needed and what is not needed in this case. Can you drop such changes from the PR? I think the scope of this PR can be "Introduce packed_storage encoding attribute + Add support for lowering stream.tensor.sizeof with the encoding".

btw, do you have a prototype or a doc to explain how e2e works? I mostly want to prefetch if there is something bad in the design.

The thing is: those parts in EncodingUtils and HALToStream/Patterns make the new attribute support complete. i.e. without those updates in those parts, however small they are, the added attribute wouldn't be handled properly. As you can see changes in those parts are mostly trying to bypass existing checks on encoding attribute. So I feel like this is somewhat the "minimal set" we should have for adding a new attribute.

I can separate those parts and put them into another PR, but I feel that is even more ..... confusing and segregated. But that is only my opinion, please let me know what you think.

hanhanW · 2024-12-16T13:22:58Z

As you can see changes in those parts are mostly trying to bypass existing checks on encoding attribute.
I can separate those parts and put them into another PR, but I feel that is even more ..... confusing and segregated. But that is only my opinion, please let me know what you think.

The change of materialization pass is fine. I'm not sure about HAL part, it would be good to add a test.

btw, can those part be landed with an e2e test? Does it sound better?

lialan · 2024-12-16T14:18:00Z

The change of materialization pass is fine. I'm not sure about HAL part, it would be good to add a test.

btw, can those part be landed with an e2e test? Does it sound better?

I am actually working on that. To enable e2e tests for it I had to build a constant packed tensor (or bitcast from another tensor). I am half way through it so perhaps I can add those parts into this PR (but that will make things more complicated).

compiler/src/iree/compiler/Dialect/Stream/Conversion/HALToStream/Patterns.cpp

compiler/src/iree/compiler/Utils/ElementPackingUtils.h

hanhanW · 2024-12-17T05:22:47Z

I am actually working on that. To enable e2e tests for it I had to build a constant packed tensor (or bitcast from another tensor). I am half way through it so perhaps I can add those parts into this PR (but that will make things more complicated).

I see your point, thanks! My concern was about the change in HALToStream/Patterns.cpp because I'm not familiar with that part. As Ben said, please add a comment to explain why. So the newbie like me can understand it better. The PR looks okay to me, please address Ben's comments.

lialan · 2025-01-03T02:12:23Z

@benvanik @hanhanW Please review again

hanhanW

LGTM, just few nits!

compiler/src/iree/compiler/Codegen/Common/EncodingUtils.cpp

tests/e2e/subbyte_types/BUILD.bazel

compiler/src/iree/compiler/Utils/ElementPackingUtils.cpp

* make `packed_storage` as a type of `iree_encoding` attribute, and make type converters accept it. * `i1` tensors with `#iree_encoding.packed_storage` will be interpreted as packed i1 type, same as specifying `--iree-experimental-packed-i1-storage`. * `--iree-experimental-packed-i1-storage` are kept for testing purposes. We can drop this option after frontend enables emitting `i1` tensors with attributes. Signed-off-by: Alan Li <me@alanli.org>

lialan · 2025-01-06T09:13:37Z

@benvanik Can you review again?

benvanik

Are we sure packed_storage is the right attribute long-term? It feels like everything should be packed_storage and only the special case (i1 stored expanded in i8) should be called out explicitly. Otherwise we should really have every tensor in the program include packed_storage except i1-in-i8 and that's definitely not what we want.

I'd expect an "unpacked_storage" (though we should have a better name) that indicated what the storage type was that is overriding the logical element type. For example, #encoding.element_type<i8> on an i1 tensor indicates it's encoded in i8s even if logically treated as i1. In this form it also extends to other bit widths (not just i1/i8) and makes the logic for calculating size something that can be done generically (see the comment about an interface methods on EncodingAttr instead of all the hardcoded isPacked checks) - you have the tensor element type and the encoded element type and can use the alignment helpers to get the total physical storage size as required.

benvanik · 2025-01-06T15:41:26Z

...er/src/iree/compiler/Dialect/Stream/Transforms/test/encode_host_tensors_packing_i1_attr.mlir

@@ -0,0 +1,22 @@
+// RUN: iree-opt --split-input-file --iree-stream-encode-host-tensors %s | FileCheck %s


since this uses the same flags as the main test you can move it back into it (encode_host_tensors.mlir) - each individual .mlir file adds test overhead and makes it harder to manage the tests

benvanik · 2025-01-06T15:41:53Z

...er/src/iree/compiler/Dialect/Stream/Transforms/test/encode_host_tensors_packing_i1_attr.mlir

+// RUN: iree-opt --split-input-file --iree-stream-encode-host-tensors %s | FileCheck %s
+
+#packed = #iree_encoding.packed_storage
+func.func @unaligned_i1_size() -> index {


use util.func within the IREE codebase if outside of codegen (func.func is converted to util.func very early on)

benvanik · 2025-01-06T15:48:22Z

compiler/src/iree/compiler/Dialect/Stream/Conversion/HALToStream/Patterns.cpp

@@ -100,6 +101,13 @@ struct ConvertTensorImportOp
                                               RankedTensorType tensorType,
                                               ValueRange dynamicDims,
                                               OpBuilder &builder) {
+    // If the encoding attr is about packed storage then we don't need


This is not true - the shape and element type are still useful metadata even if the encoding is opaque. Whether a 4x2xi1 is packed or unpacked doesn't change that it's logical shape is 4x2 and the element type is i1.

We may have to fix the buffer_diagnostics.c impl of iree_hal_modules_buffer_view_assert to ignore the encoding check if IREE_HAL_ENCODING_TYPE_OPAQUE - currently it just does actual_encoding_type != IREE_HAL_ENCODING_TYPE_OPAQUE but should also do && expected_encoding_type != IREE_HAL_ENCODING_TYPE_OPAQUE.

You can change this to pass IREE_HAL_ENCODING_TYPE_OPAQUE for expectedEncodingType if the tensor has a non-null encoding.

(we wouldn't want packed storage to be the special case for behavior change here - that removes the benefits of using an attribute to separate parts of the codebase: every single possible tensor encoding attribute cannot be checked here)

@benvanik So the design is to make the iree_encoding.packed_storage attribute to be transient and temporary. By the time we finish emulating narrow types, this attribute will be gone, as it only indicates how we emulate this specific tensor.

I am wondering if this still applies?

benvanik · 2025-01-06T15:53:39Z

compiler/src/iree/compiler/Utils/ElementPackingUtils.cpp

+  // Two paths to enable packed storage for i1 tensors: the attribute or cl
+  // option. The cl option will be dropped once frontend supports emitting
+  // tensors with attributes.
+  bool isPackedStorage =


We shouldn't be doing boolean checks on the encoding, but asking the encoding attr what the size is - that's why we have it. None of this file should ever know about packed encodings, just that there's an encoding attr and it has a method that allows computing the bit width.

Concretely: EncodingAttr should have a getAlignedStorageElementType or some other methods that defaults to what legalizeStorageElementTypeImpl is doing (legalizeStorageElementType has always bothered me, it doesn't mean what it thinks it means!).

The flag and all checks for isPackedStorage need to be removed from this file. needToPackSubByteElementBitWidth was always a hack and the it was promised when it was added that it would be cleaned up - and yet here we are :)

Point taken. Creating an issue to track it: #19618

lialan · 2025-01-07T16:21:44Z

I'd expect an "unpacked_storage" (though we should have a better name) that indicated what the storage type was that is overriding the logical element type.

@benvanik That is another way to think of the problem, and I have to admit this looks cleaner. For historical reasons IREE expands i1 into i8, and I always thought that is IREE's data layout convention.

I honestly don't know if we should revert such convention, and it requires a bit of coordination to make everything by default "packed". I'd like to involve more people in such discussion, as when we internally decide to introduce an attribute for packed i1, the unpacked attribute was never on the table.

As for this PR, I do have some future PRs dependent on it, so I prefer to address the issue separately.

About this comment: #19354 (comment), if you don't mind, I would like to address that in a later patch, as a chain of changes needs to be taken for subsequent PRs if we do everything now here in one place.

lialan self-assigned this Dec 3, 2024

lialan linked an issue Dec 3, 2024 that may be closed by this pull request

[Feature Request] Support packed i1 storage through encodings #19220

Open

lialan mentioned this pull request Dec 5, 2024

General support of i1 mask in attention #19380

Open

lialan changed the title ~~Implement i1_packed_storage layout encoding attribute~~ Implement packed_storage layout encoding attribute Dec 5, 2024

lialan mentioned this pull request Dec 5, 2024

[Feature Request] Frontend to emit i1 packed tensor for attention masks #19382

Open

lialan force-pushed the lialan/i1_attr branch 3 times, most recently from 1f2e200 to 8588cfd Compare December 13, 2024 03:37

lialan marked this pull request as ready for review December 13, 2024 05:28

lialan requested review from hanhanW, MaheshRavishankar, IanWood1, bjacob and benvanik as code owners December 13, 2024 05:28

lialan force-pushed the lialan/i1_attr branch 2 times, most recently from bac621f to 349a6bd Compare December 13, 2024 08:31

lialan changed the title ~~Implement packed_storage layout encoding attribute~~ [i1] Implement packed_storage layout encoding attribute Dec 13, 2024

lialan force-pushed the lialan/i1_attr branch 2 times, most recently from 1b8a414 to 97bdadb Compare December 16, 2024 05:30

hanhanW mentioned this pull request Dec 16, 2024

Adding e2e tests for i1 mask attentions #19312

Merged

hanhanW requested changes Dec 16, 2024

View reviewed changes

lialan force-pushed the lialan/i1_attr branch from 97bdadb to f53923d Compare December 16, 2024 09:11

hanhanW reviewed Dec 16, 2024

View reviewed changes

compiler/src/iree/compiler/Codegen/Common/EncodingUtils.cpp Outdated Show resolved Hide resolved

benvanik requested changes Dec 16, 2024

View reviewed changes

compiler/src/iree/compiler/Dialect/Stream/Conversion/HALToStream/Patterns.cpp Outdated Show resolved Hide resolved

compiler/src/iree/compiler/Utils/ElementPackingUtils.h Outdated Show resolved Hide resolved

compiler/src/iree/compiler/Utils/ElementPackingUtils.h Outdated Show resolved Hide resolved

lialan requested review from benvanik and hanhanW December 19, 2024 13:31

lialan force-pushed the lialan/i1_attr branch 3 times, most recently from d26a392 to 7ac00fe Compare December 29, 2024 09:10

lialan mentioned this pull request Dec 31, 2024

[i1] Remove command line option to enable packed storage #19528

Open

hanhanW approved these changes Jan 6, 2025

View reviewed changes

lialan force-pushed the lialan/i1_attr branch from 7ac00fe to 7fc957d Compare January 6, 2025 08:39

benvanik requested changes Jan 6, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[i1] Implement `packed_storage` layout encoding attribute #19354

[i1] Implement `packed_storage` layout encoding attribute #19354

lialan commented Dec 3, 2024 •

edited

Loading

lialan commented Dec 16, 2024

benvanik commented Dec 16, 2024

lialan commented Dec 16, 2024

lialan commented Dec 16, 2024

hanhanW left a comment

lialan commented Dec 16, 2024

hanhanW commented Dec 16, 2024

lialan commented Dec 16, 2024

hanhanW commented Dec 17, 2024

lialan commented Jan 3, 2025

hanhanW left a comment

lialan commented Jan 6, 2025

benvanik left a comment

benvanik Jan 6, 2025

benvanik Jan 6, 2025

benvanik Jan 6, 2025

lialan Jan 7, 2025

benvanik Jan 6, 2025

lialan Jan 7, 2025

lialan commented Jan 7, 2025

		@@ -0,0 +1,22 @@
		// RUN: iree-opt --split-input-file --iree-stream-encode-host-tensors %s \| FileCheck %s

[i1] Implement packed_storage layout encoding attribute #19354

Are you sure you want to change the base?

[i1] Implement packed_storage layout encoding attribute #19354

Conversation

lialan commented Dec 3, 2024 • edited Loading

lialan commented Dec 16, 2024

benvanik commented Dec 16, 2024

lialan commented Dec 16, 2024

lialan commented Dec 16, 2024

hanhanW left a comment

Choose a reason for hiding this comment

lialan commented Dec 16, 2024

hanhanW commented Dec 16, 2024

lialan commented Dec 16, 2024

hanhanW commented Dec 17, 2024

lialan commented Jan 3, 2025

hanhanW left a comment

Choose a reason for hiding this comment

lialan commented Jan 6, 2025

benvanik left a comment

Choose a reason for hiding this comment

benvanik Jan 6, 2025

Choose a reason for hiding this comment

benvanik Jan 6, 2025

Choose a reason for hiding this comment

benvanik Jan 6, 2025

Choose a reason for hiding this comment

lialan Jan 7, 2025

Choose a reason for hiding this comment

benvanik Jan 6, 2025

Choose a reason for hiding this comment

lialan Jan 7, 2025

Choose a reason for hiding this comment

lialan commented Jan 7, 2025

[i1] Implement `packed_storage` layout encoding attribute #19354

[i1] Implement `packed_storage` layout encoding attribute #19354

lialan commented Dec 3, 2024 •

edited

Loading