[SYCL] Add Marlin Kernel for SYCL runtime #33
Draft
@Godofnothing Thanks for creating this repository and supporting faster GEMMs.
I am currently working on an AutoGPTQ extension for the SYCL runtime (AutoGPTQ/AutoGPTQ#638). Since the asm instructions for Marlin are built from here, I propose adding an analogous SYCL counterpart to this repository.
I believe this addition would help us (Intel and SYCL in general) actively benchmark against the PTX ISA and check for performance gaps. It would also open avenues for using the SYCL runtime on non-Intel hardware. [Creating a draft PR now]
Also tagging @fxmarty (AutoGPTQ) for info. Thanks.