Merge OpenAI Triton commit 4dac289
#265
Merged
AMD is enabled by default, but is not yet ready for use (it has not been tested). A lot of work will be needed to make everything robust and maintainable.
Solves triton-lang/triton#2898. With the [MLIR VS Code](https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.vscode-mlir) plugin, here is how the result looks: <img width="1195" alt="image" src="https://github.com/openai/triton/assets/23236638/529c02a0-6448-4221-90fc-78d5d416356e"> Further work is needed to make the file extension `.mlir` rather than `.ttlr`.
…o avoid being dependent on numpy by default (#2904) Fixes triton-lang/triton#2899 .
…tup.py (#2906) Init submodule before trying to check if something is in it
…ods of defining target link libraries (#2907) CMake requires that the `target_link_libraries` calls for a given target either all use the PUBLIC/PRIVATE keywords or all omit them. Mixing the two forms is not supported.
…s (#2908)
* Adding a new `tl.clamp(x, min, max, propagate_nan)` function to the Triton language. It is lowered to `minimum(maximum(x, min), max)` in the general case, and to `min.xorsign.abs` inline assembly when `clamp(x, -limit, limit)` is detected.
* Refactoring the `tl.PropagateNan` enum so that it is defined directly in MLIR and exported to the Python frontend.
* New tests for clamp and symmetric clamp.
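The clamp lowering described in this commit message can be illustrated with a small scalar model in plain Python. This is only a sketch, not the Triton implementation: the function names `clamp_reference` and `symmetric_clamp_reference` are hypothetical, the non-propagating NaN behavior is an assumption based on IEEE-754 minNum/maxNum-style semantics, and the symmetric variant assumes `limit >= 0`.

```python
import math


def clamp_reference(x, lo, hi, propagate_nan=False):
    """Scalar model of clamp as min(max(x, lo), hi). Hypothetical helper."""
    if math.isnan(x):
        # Assumption: without NaN propagation, max(NaN, lo) yields lo
        # (minNum/maxNum-style semantics), so the final result is lo.
        return x if propagate_nan else lo
    return min(max(x, lo), hi)


def symmetric_clamp_reference(x, limit):
    """Scalar model of clamp(x, -limit, limit), assuming limit >= 0."""
    # The magnitude is min(|x|, |limit|); because limit is non-negative,
    # the resulting sign is simply the sign of x.
    return math.copysign(min(abs(x), limit), x)
```

For non-NaN inputs and a non-negative limit, the two helpers agree, which is why the symmetric case can be handled by a single specialized instruction instead of a min/max pair.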
…… (#2910) …rong type
Those tests are deprecated, since we have comprehensive test_conversions now
…ing swap file (#2912)
…t now (#2911) PR triton-lang/triton#2887 removed `third_party/triton_shared`, so the corresponding test should be removed as well. Otherwise it will fail all the CI tests (which is what currently happens).
On Hopper, when storing an mma tensor to shared memory, we can use stmatrix to reduce the number of store instructions. This gives a very small improvement to the epilogue for fp16 output. It will later be combined with cp.async.bulk to improve performance further.
`DistributedEncodingTrait::getCTAOrder()` returns a SmallVector by value. That temporary is destroyed as soon as it has been assigned to `ref`, which leaves `ref` as a dangling reference. To prevent that, we now use a vector instead of an array reference.
whitneywhtsang force-pushed the whitneywhtsang/merge branch 2 times, most recently from e505031 to 7e28281 on January 16, 2024 03:41
whitneywhtsang changed the title from "Merge OpenAI Triton commit 9a38395" to "Merge OpenAI Triton commit 4dac289" on Jan 16, 2024
etiotto approved these changes on Jan 16, 2024
whitneywhtsang force-pushed the whitneywhtsang/merge branch from 7e28281 to 7f911ad on January 16, 2024 15:32
This PR changes the Triton base from bbfdc0d to 4dac289 (Jan 11).
Please do not squash and merge this PR.