forked from intel/llvm
[SYCL][Graph] Update doc for UR PR moving reset commands to a dedicated cmd-list #357
Closed
Currently, the `phaseParity` argument of `nvgpu.mbarrier.try_wait.parity` has type `index`. This can cause problems if it is passed any value other than 0 or 1, because the PTX instruction only accepts an even or odd phase. This PR makes the `phaseParity` argument `i1` to avoid misuse.

Here is the information from the PTX doc:
```
The .parity variant of the instructions test for the completion of the phase indicated by the operand phaseParity, which is the integer parity of either the current phase or the immediately preceding phase of the mbarrier object. An even phase has integer parity 0 and an odd phase has integer parity of 1. So the valid values of phaseParity operand are 0 and 1.
```
See for more information: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-mbarrier-try-wait
…81239) This function will be useful when we change the behavior of record-type prvalues so that they directly initialize the associated result object. See also the comment here for more details: https://github.com/llvm/llvm-project/blob/9e73656af524a2c592978aec91de67316c5ce69f/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h#L354 As part of this patch, we document and assert that synthetic fields may not have reference type. There is no practical use case for this: A `StorageLocation` may not have reference type, and a synthetic field of the corresponding non-reference type can serve the same purpose.
llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to update the second expression. Fixes #76545. This is #78606 rebased and with the addition of DPValue handling. Note the addition of --try-experimental-debuginfo-iterators in the tests and some shuffling of code in MemoryTaggingSupport.cpp.
The strictfp attribute has the requirement that "LLVM will not introduce any new floating-point instructions that may trap". The llvm.is.fpclass intrinsic is documented as "The function never raises floating-point exceptions", and the fcmp instruction may raise one, so we can't transform the former into the latter in functions with the strictfp attribute.
…#81585) This reverts commit a034e65. Some protobuf users reported that this patch caused a significant compile-time regression because `TailDuplicator` works poorly with a specific pattern. We will reland it once the codegen issue is fixed.
…ugprone-unused-local-non-trivial-variable (#81563)
…(#81482) Use templates instead. Part of <llvm/llvm-project#62629>.
This patch adds full support for linking SystemZ (ELF s390x) object files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a few additional changes to common code:
- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and R_PLT_GOTREL) needed to support certain s390x relocations. I chose not to use a platform-specific name since nothing in the definition of these relocs is actually platform-specific; it is well possible that other platforms will need the same.
- A couple of tweaks to TLS relocation handling, as the particular semantics of the s390x versions differ slightly. See comments in the code.

This was tested by building and testing >1500 Fedora packages, with only a handful of failures; as these also have issues when building with LLD on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
The motivation here was a suggestion over in Compiler Explorer. You can already use `-mllvm` to do this, but since gfortran supports `-masm`, I figured I'd try to add it.

This is done by flang expanding `-masm` into `-mllvm x86-asm-syntax=` and passing that to fc1, which then collects all the `-mllvm` options and forwards them on.

The code to expand it comes from clang `Clang::AddX86TargetArgs` (there are some other places doing the same thing too). However, I've removed the `-inline-asm` that clang adds, as Fortran doesn't have inline assembly. So `-masm` for flang purely changes the style of assembly output.

```
$ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu
<...>
pushq %rbp
$ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu -masm=att
<...>
pushq %rbp
$ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu -masm=intel
<...>
push rbp
```

The test is adapted from `clang/test/Driver/masm.c` by removing the clang-cl related lines and changing the 32-bit triples to 64-bit triples, since flang doesn't support 32-bit targets.
…(#80991) Although the assumption is reasonable for a normal implementation, it seems that some esoteric implementations do not return a T&. This should be handled correctly and the values propagated.

Co-authored-by: martinboehme <mboehme@google.com>
… (#80966) The 1-D case directly maps to LLVM intrinsics. The n-D case will be handled by unrolling to 1-D first (in a later patch). Depends on: #80965
Without this I would hit errors with libstdc++-12 like: /usr/include/c++/12/bits/stl_iterator_base_funcs.h:230:5: note: candidate template ignored: substitution failure [with _InputIterator = llvm::const_set_bits_iterator_impl<llvm::BitVector>]: argument may not have 'void' type next(_InputIterator __x, typename ^
…oading directives (#81081) This patch adds support for the depend clause in a number of OpenMP directives/constructs related to offloading. Specifically, it adds the handling of the depend clause when it is used with the following constructs:
- target
- target enter data
- target update
- target exit data
Fix crash raised in comments for 5c9f768
…1500) Adds a test to help document Linalg Ops that are currently not supported by the vectoriser (i.e. the logic to vectorise these is missing). The list is not exhaustive.
Common backends (LLVM, SPIR-V) only support 1D vectors. The LLVM conversion handles ND vectors (N >= 2) as `array<array<... vector>>`, and the SPIR-V conversion doesn't handle them at all at the moment. Sometimes it's preferable to treat multidimensional vectors as linearized 1D. Add a pass to do this. Only constants and simple elementwise ops are supported for now. @krzysz00 I've extracted your result type conversion code from LegalizeToF32 and moved it to a common place. Also, add a ConversionPattern class operating on traits.
This fixes a crash when lowering an extract_subvector like:

    t0:v1i64 = extract_subvector t1:v2i64, 1

Whilst we never need a vslidedown with M1 on scalable vector types, we might need to do it for v1i64/v1f64, since the smallest container type for it is nxv1i64/nxv1f64.

The lowering code is still correct for this case, but the assertion was too strict. The actual invariant we're relying on is that ContainerSubVecVT's LMUL <= M1, not < M1. Hence why we handled v2i32 fine, because its container type was nxv1i32 and MF2.
Allocate storage and initialize it with the given APValue contents.
…#80735) zOS doesn't support aligned allocation, so mark these testcases as unsupported. Continuation of https://reviews.llvm.org/D102798
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make sure they are not initialized as zero. FIXME: Could we make `CoverageMappingRegion` a smart tagged union?
… (#81602) In a few places we test whether sets (i.e. sorted ranges) intersect by computing the set_intersection and then testing whether it is empty. For this purpose it should be more efficient to use a std::vector instead of a std::set to hold the result of the set_intersection, since insertion is simpler.
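A minimal sketch of the pattern described in that commit message, assuming two sorted ranges of `int`. The helper name `sortedRangesIntersect` is hypothetical and not taken from the patch; it only illustrates collecting the `std::set_intersection` result in a `std::vector` through `std::back_inserter` instead of inserting into a `std::set`.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper (not from the patch): returns true if two sorted
// ranges share at least one element.
bool sortedRangesIntersect(const std::vector<int> &A,
                           const std::vector<int> &B) {
  // std::set_intersection requires both inputs to be sorted.
  assert(std::is_sorted(A.begin(), A.end()) &&
         std::is_sorted(B.begin(), B.end()));

  // Appending to a vector is cheaper than inserting into a std::set,
  // and the output of set_intersection is already sorted and unique
  // with respect to the inputs.
  std::vector<int> Common;
  std::set_intersection(A.begin(), A.end(), B.begin(), B.end(),
                        std::back_inserter(Common));
  return !Common.empty();
}
```

If only the yes/no answer is needed, a hand-written merge loop that stops at the first common element would avoid building the result at all; the sketch keeps the `set_intersection` form because that is what the commit message describes.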
Just emit their satisfaction state, which is what the current interpreter does as well.
'serial', 'parallel', and 'kernel' constructs are all considered 'Compute' constructs. This patch creates the AST type, plus the required infrastructure for such a type, plus some base types that will be useful in the future for breaking this up. The only difference between the three is the 'kind' (plus some minor clause legalization rules, but those can be differentiated easily enough), so rather than representing them as separate AST nodes, it seems to make sense to make them the same. Additionally, no clause AST functionality is being implemented yet, as that fits better in a separate patch, and this is enough to get the 'naked' constructs implemented. This is otherwise an 'NFC' patch, as it doesn't alter execution at all, so there aren't any tests. I did this to break up the review workload and to get feedback on the layout.
For now just convert BB with convertFromNewDbgValues, will figure out something smarter a bit later. I've updated several tests with dbg.declare intrinsic adding --experimental-debuginfo-iterators=1 to check if it works. Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@0e87aefecf7c500
The SPIR-V Specification allows `OpConstantNull` types to be scalar or vector booleans, integers, or floats. Update an assert for this and add a SPIR-V -> LLVM IR test. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9ec969c1c379bde
Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@262395da9234fe4
…supported (intel#12700) Final PR in the series of intel#12636. Refer to it for a description. After a discussion with @AlexeySachkov we've decided it's best not to rewrite USM and syclcompat tests with buffers/accessors. For USM, the reason is obvious, and for syclcompat you can reach out to Alexey. Therefore, these tests are handled using if statements or by requiring the aspect to be supported. Once this PR is merged, malloc_shared will throw if usm_shared_allocations is not supported, which conforms to the spec.
…#12730) This is a generalization of the existing workarounds: https://github.com/intel/llvm/blob/sycl/sycl/plugins/unified_runtime/CMakeLists.txt#L40-L54 etc.
…s.txt (intel#12714) Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.6 to 42.0.0 to resolve identified security vulnerability in 3rd party dependency. Refer to [cryptography's changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst).
Despite there being a unit test for the default context, there was none to affirm the new default configuration.
Some clean-up for SYCL-Graph E2E tests:
* Remove redundant `Event` variables that are initialized over loop iterations but never used.
* Remove all instances of the no immediate command-list property, and use environment variable instead to test both paths.
* Always use FileCheck leak checking rather than `CHECK-NOT: Leak`.
* Remove unnecessary threading code from `Inputs/basic_usm.cpp`
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
… time properties (intel#12675)
EwanC
approved these changes
Feb 19, 2024
I've nitpicked the language a bit, but this is a nice improvement to the documentation. The new diagram is clearer too
…2680) Improves management of inter-partition dependencies, so that only required dependencies are added. As removing these dependencies can result in multiple execution paths, we have added a map to track all events returned from submitted partitions. All these events are linked to the main event returned to the user. Adds tests.
Grad flag was set to 0x3 (meaning Lod + Bias) instead of 0x4. See https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Image_Operands Signed-off-by: Victor Lomuller <victor@codeplay.com>
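To see why 0x3 cannot mean Grad, here is a small sketch of the first three bits of the SPIR-V Image Operands mask. The bit values come from the specification linked above; the enum and the `hasGrad` helper are illustrative names chosen for this sketch, not identifiers from the patch.

```cpp
#include <cstdint>

// First three entries of the SPIR-V "Image Operands" bit mask.
// Names are illustrative; only the numeric values come from the spec.
enum ImageOperandsMask : std::uint32_t {
  ImageOperandsBias = 0x1,
  ImageOperandsLod = 0x2,
  ImageOperandsGrad = 0x4,
};

// 0x3 is not a single operand: it decodes as Bias | Lod.
static_assert((ImageOperandsBias | ImageOperandsLod) == 0x3,
              "0x3 means Lod + Bias");
static_assert(ImageOperandsGrad == 0x4, "Grad has its own bit");

// Hypothetical helper: true if the mask requests explicit gradients.
constexpr bool hasGrad(std::uint32_t Mask) {
  return (Mask & ImageOperandsGrad) != 0;
}
```

Because each operand owns one bit, a value with two bits set always describes two operands, which is how the wrong constant ended up selecting Lod and Bias together.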
Bring the fix for MaxRegsPerBlock check from oneapi-src/unified-runtime#1299 to `intel/llvm`. No changes needed other than updating the UR repo hash. --------- Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
`LoaderConfig` is created and stored in a local pointer but never released after use, so it is leaked. This patch releases the `LoaderConfig` once it is no longer needed.
Old builtins implementation is going to be removed in the next ABI breaking window and that helper is only used there.
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0 to 42.0.2.

Changelog (sourced from [cryptography's changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)):

42.0.2 - 2024-01-30
* Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL 3.2.1.
* Fixed an issue that prevented the use of Python buffer protocol objects in ``sign`` and ``verify`` methods on asymmetric keys.
* Fixed an issue with incorrect keyword-argument naming with ``EllipticCurvePrivateKey`` :meth:`~cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePrivateKey.exchange`, ``X25519PrivateKey`` :meth:`~cryptography.hazmat.primitives.asymmetric.x25519.X25519PrivateKey.exchange`, ``X448PrivateKey`` :meth:`~cryptography.hazmat.primitives.asymmetric.x448.X448PrivateKey.exchange`, and ``DHPrivateKey`` :meth:`~cryptography.hazmat.primitives.asymmetric.dh.DHPrivateKey.exchange`.

42.0.1 - 2024-01-24
* Fixed an issue with incorrect keyword-argument naming with ``EllipticCurvePrivateKey`` :meth:`~cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePrivateKey.sign`.
* Resolved compatibility issue with loading certain RSA public keys in :func:`~cryptography.hazmat.primitives.serialization.load_pem_public_key`.

Commits:
* [`2202123`](https://github.com/pyca/cryptography/commit/2202123b50de1b8788f909a3e5afe350c56ad81e) changelog and version bump 42.0.2 (#10268)
* [`f7032bd`](https://github.com/pyca/cryptography/commit/f7032bdd409838f67fc2b93343f897fb5f397d80) bump openssl in CI (#10298) (#10299)
* [`002e886`](https://github.com/pyca/cryptography/commit/002e886f16d8857151c09b11dc86b35f2ac9aec3) Fixes #10294 -- correct accidental change to exchange kwarg (#10295) (#10296)
* [`92fa9f2`](https://github.com/pyca/cryptography/commit/92fa9f2f606caea5d499c825e832be5bac6f0c23) support bytes-like consistently across our asym sign/verify APIs (#10260) (#1...
* [`6478f7e`](https://github.com/pyca/cryptography/commit/6478f7e28be54b51931277235de01b249ceabd96) explicitly support bytes-like for signature/data in RSA sign/verify (#10259) ...
* [`4bb8596`](https://github.com/pyca/cryptography/commit/4bb8596ae02d95bb054dbcf55e8771379dbe0c19) fix the release script (#10233) (#10254)
* [`337437d`](https://github.com/pyca/cryptography/commit/337437dc2e62772bde4ad5544f4b1db9ee7572d9) 42.0.1 bump (#10252)
* [`56255de`](https://github.com/pyca/cryptography/commit/56255de6b2d1a2d2e502b0275231ca81907f33f1) allow SPKI RSA keys to be parsed even if they have an incorrect delimiter (#1...
* [`12f038b`](https://github.com/pyca/cryptography/commit/12f038b38af76e36efe8cef09597010c97647e8f) fixes #10237 -- correct EC sign parameter name (#10239) (#10240)
* See full diff in [compare view](https://github.com/pyca/cryptography/compare/42.0.0...42.0.2)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
…tel#12748) Warnings fixed:
- deprecated scatter_rgba
- deprecated get_cl_code
- deprecated lsc_fence
- deprecated uchar type usage
- deprecated get_access on HOST
- deprecated get_pointer
- usage of isfinite with -ffast-math
- deprecated dpas_argument_type::s1
- deprecated gpu_selector()

Also, the memory alloc/free in the histogram*.cpp tests was updated to make potential memory leaks easier to avoid.

Signed-off-by: Klochkov, Vyacheslav N <vyacheslav.n.klochkov@intel.com>
Scheduled drivers uplift Co-authored-by: GitHub Actions <actions@github.com>
…ed cmd-list Update the design doc. Update the UR tag.
mfrancepillois force-pushed the maxime/UR-improve-ZE-enqueue-delay branch from 2840382 to 8e21a1d (February 20, 2024 17:06)
Upstream PR intel#12770