Merge from main with loop wrappers + composite support + liboffload #71
Commits on Apr 22, 2024
[nfc][llvm] Fix a typo in MathExtras.h testing (llvm#89653)
I made a small typo when writing a test for MathExtras.h, sorry!
Commit 7c58546
[libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the tests (llvm#87094)
This also adds a few tests that were missing.
Commit 83bc7b5
Commit 8482dbd
Reapply "[compiler-rt][ctx_instr] Add `ctx_profile` component" (llvm#89625)
This reverts commit 8b2ba6a. The build errors (see below) were likely due to the same issue PR llvm#88074 fixed. Addressed by following that PR.
https://lab.llvm.org/buildbot/#/builders/165/builds/52789
https://lab.llvm.org/buildbot/#/builders/91/builds/25273
Commit a3e7a12
[Frontend][OpenMP] Add missing "return" statement after 40137ff
When responding to review comments, `return {}` was accidentally replaced by `std::nullptr` instead of `return std::nullptr`.
Commit b8ff08d
[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (llvm#89263)
This implements a RISCV specific version of the SHL_ADD node proposed in llvm#88791. If that lands, the infrastructure from this patch should seamlessly switch over to the generic DAG node. I'm posting this separately because I've run out of useful multiply strength reduction work to do without having a way to represent MUL X, 3/5/9 as a single instruction.
The majority of this change is moving two sets of patterns out of tablegen and into the post-legalize combine. The major reason for this is that I have an upcoming change which needs to reuse the expansion logic, but it also helps common up some code between zba and the THeadBa variants.
On the test changes, there are a couple of major categories:
* We chose a different lowering for mul x, 25. The new lowering involves one fewer register and the same critical path, so this seems like a win.
* The order of the two multiplies changes in (3,5,9)*(3,5,9) in some cases. I don't believe this matters.
* I'm removing the one use restriction on the multiply. This restriction doesn't really make sense to me, and the test changes appear positive.
Commit 5a7c80c
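As a rough illustration of the strength reduction this commit message talks about, a multiply by 3, 5, or 9 can be written as one shift-and-add, and a multiply by 25 as two of them. The sketch below is plain C++ with hypothetical helper names (`shlAdd`, `mulBy3`, `mulBy25`, and so on); it is not the LLVM implementation and does not claim to match the lowering the backend actually selects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper with the shape of a shift-left-and-add: (x << shamt) + y.
// With y == x it multiplies x by 3, 5, or 9 for shamt = 1, 2, 3.
constexpr std::uint64_t shlAdd(std::uint64_t x, unsigned shamt, std::uint64_t y) {
  return (x << shamt) + y;
}

constexpr std::uint64_t mulBy3(std::uint64_t x) { return shlAdd(x, 1, x); }
constexpr std::uint64_t mulBy5(std::uint64_t x) { return shlAdd(x, 2, x); }
constexpr std::uint64_t mulBy9(std::uint64_t x) { return shlAdd(x, 3, x); }

// 25 = 5 * 5, so two dependent shift-adds are enough (one possible choice).
constexpr std::uint64_t mulBy25(std::uint64_t x) { return mulBy5(mulBy5(x)); }
```

Because the arithmetic is unsigned, the helpers wrap on overflow exactly as an ordinary multiply would.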
[mlir][test] Reorganize the test dialect (llvm#89424)
This PR massively reorganizes the Test dialect's source files. It moves manually-written op hooks into `TestOpDefs.cpp`, moves format custom directive parsers and printers into `TestFormatUtils`, adds missing comment blocks, and moves around where generated source files are included for types, attributes, enums, etc. into their own source file. This will hopefully help navigate the test dialect source code, but also speeds up compile time of the test dialect by putting generated source files into separate compilation units. This also sets up the test dialect to shard its op definitions, done in the next PR.
Commit e95e94a
[Frontend][OpenMP] Add suggested brackets in array initialization
Fixes -Werror build after 40137ff.
Commit 14e6f63
[flang] Don't emit conversion error for max(a,b, optionalCharacter) (llvm#88156)
A recent patch added an error message for whole optional dummy argument usage as optional arguments (third or later) to MAX and MIN when those names required type conversion, since that conversion only works when the optional arguments are present. This check shouldn't care about character lengths. Make it so.
Commit e8572d0
[flang] Improve error reporting for procedures determined by usage (llvm#88184)
When a symbol is known to be a procedure due to its being referenced as a function or subroutine, improve the error messages that appear if the symbol is also used as an object by attaching the source location of its procedural use. Also, for errors spotted in name resolution due to how a given symbol has been used, don't unconditionally set the symbol's error flag (which is otherwise generally a good idea, to prevent cascades of errors), so that more unrelated errors related to usage will appear.
Commit cb1b846
[flang] Fix spurious overflow warning folding exponentiation by integer powers (llvm#88188)
The code that folds exponentiation by an integer power can report a spurious overflow warning because it calculates one last unnecessary square of the base value. 10.**(+/-32) exposes the problem -- the value of 10.**64 is calculated but not needed. Rearrange the implementation to only calculate squares that are necessary. Fixes llvm#88151.
Commit 31505c4
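The idea of "only calculate squares that are necessary" can be sketched with ordinary exponentiation by squaring. The function below is a standalone illustration, not flang's folding code: `powBySquaring` and its `squaresOut` counter are hypothetical names, and it handles only non-negative exponents. The base is squared only while a higher exponent bit is still pending, so for an exponent of 32 the value base**64 is never formed.

```cpp
#include <cassert>
#include <cstdint>

// Computes base**exponent by repeated squaring.
// squaresOut (hypothetical instrumentation) receives the number of squarings.
double powBySquaring(double base, std::uint32_t exponent,
                     int *squaresOut = nullptr) {
  double result = 1.0;
  int squares = 0;
  while (exponent != 0) {
    if (exponent & 1u)
      result *= base;   // this bit contributes the current power of base
    exponent >>= 1;
    if (exponent != 0) { // square only if a higher bit still needs it
      base *= base;
      ++squares;
    }
  }
  if (squaresOut)
    *squaresOut = squares;
  return result;
}
```

For an exponent of 32 this performs five squarings (up to base**32) and stops, where a loop that squares unconditionally would perform a sixth.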
[lldb][Core] Fix pointless if condition (llvm#89650)
Addresses llvm#85984
Signed-off-by: Troy-Butler <squintik@outlook.com>
Co-authored-by: Troy-Butler <squintik@outlook.com>
Commit 2987fca
Commit ce1b678
[test][GWP-ASan] Only add check-gwp_asan when its dependencies are built (llvm#89164)
Currently, `check-gwp_asan` is added whether or not its dependencies are built. This is wrong and causes a cmake error when scudo is not built. This patch includes the target in the dependencies check.
Commit 6884c1f
[flang] Fix crash on erroneous program (llvm#88192)
Constant folding had a CHECK on array subscript rank that should more gracefully handle a bad program with a subscript that is a matrix or higher rank. Fixes llvm#88112.
Commit 138524e
[flang] Fix bogus error on statement function (llvm#89402)
When a statement function in a nested scope has a name that clashes with a name that exists in the host scope, the compiler can handle it correctly (with a portability warning)... unless the host scope acquired the name via USE association. Fix. Fixes llvm#88678.
Commit 59bf49a
[libc] Clean up alternate test framework support (llvm#89659)
This replaces the old macros LIBC_COPT_TEST_USE_FUCHSIA and LIBC_COPT_TEST_USE_PIGWEED with LIBC_COPT_TEST_ZXTEST and LIBC_COPT_TEST_GTEST, respectively. These are really not about whether the code is in the Fuchsia build or in the Pigweed build, but just about what test framework is being used. The gtest framework can be used in many contexts, and the zxtest framework is not always what's used in the Fuchsia build. The test/UnitTest/Test.h wrapper header now provides the macro LIBC_TEST_HAS_MATCHERS() for use in `#if` conditionals on use of gmock-style matchers, replacing `#if` conditionals that test the framework selection macros directly.
Commit d2be982
[flang] Make proc characterization error conditional for generics (llvm#89429)
When the characteristics of a procedure depend on a procedure that hasn't yet been defined, the compiler currently emits an unconditional error message. This includes the case of a procedure whose characteristics depend, perhaps indirectly, on itself. However, in the case where the characteristics of a procedure are needed to resolve a generic, we should not emit an error for a hitherto undefined procedure -- either the call will resolve to another specific procedure, in which case the error is spurious, or it won't, and then an error will issue anyway. Fixes llvm#88677.
Commit cb26391
[docs] Rewrite cmake LLVM_RAM_PER_*_JOB description (llvm#88570)
Rewrite `LLVM_PARALLEL_{}_JOBS` and `LLVM_RAM_PER_{}_JOB` documentation.
Commit 2f77757
[flang] C_LOC is PURE (llvm#89437)
The standard defines C_LOC as being PURE (actually SIMPLE now in F'2023); characterize it appropriately. Fixes llvm#88747.
Commit fde5e47
[flang] Complete implementation of OUT_OF_RANGE() (llvm#89334)
The intrinsic function OUT_OF_RANGE() lacks support in lowering and the runtime. This patch obviates a need for any such support by implementing OUT_OF_RANGE() via rewriting in semantics. This rewriting of OUT_OF_RANGE() calls replaces the existing code that folds OUT_OF_RANGE() calls with constant arguments.
Some changes and fixes were necessary outside of OUT_OF_RANGE()'s folding code (now rewriting code), whose testing exposed some other issues worth fixing.
- The common::RealDetails<> template class was recoded in terms of a new base class with a constexpr constructor, so that the characteristics of the various REAL kinds could be queried dynamically as well. This affected some client usage.
- There were bugs in the code that folds TRANSFER() when the type of X or MOLD was REAL(10) -- this is a type that occupies 16 bytes per element in execution memory but only 10 bytes (was 12) in the data of std::vector<Scalar<>> in a Constant<>.
- Folds of REAL->REAL conversions weren't preserving infinities.
Commit 1444e5a
Temporarily remove `clang_rt.ctx_profile` target
Trying to address the build failure on the `clang-ve-ninja` bot, which appears hard to repro locally. The target isn't needed currently (there are unit tests exercising the new functionality). Removing it for now to green-ify the build bot.
Commit 579efe0
[GlobalISel] matchSDivByConst should use isNullValue() (llvm#89666)
It has been using isZeroValue(), which is for floats, not integers.
Commit 5fef5e6
[ORC] Unify task dispatch across ExecutionSession and ExecutorProcessControl.
Updates ExecutionSession to use the ExecutorProcessControl object's TaskDispatcher rather than having a separate dispatch function. This gives the TaskDispatcher a global view of all tasks to be executed, and provides a single point to wait on for tasks to complete when shutting down the JIT.
Commit 6094b3b
[flang] Fix build warning (llvm#89686)
A recent patch had three declared but unused variables in it, triggering a warning in some build bots. Remove them.
Commit 2e2ac6f
Revert "[ORC] Unify task dispatch across ExecutionSession and ExecutorProcessControl."
This reverts commit 6094b3b. Multiple bots are broken.
Commit a28557a
[hwasan] Add intrinsics for fixed shadow on Aarch64 (llvm#89319)
This patch introduces HWASan memaccess intrinsics that assume a fixed shadow (with the offset provided by --hwasan-mapping-offset=...), with and without short granule support. The behavior of HWASan is not meaningfully changed by this patch; future work ("Optimize outlined memaccess for fixed shadow on Aarch64": llvm#88544) will make HWASan use these intrinsics. We currently only support lowering the LLVM IR intrinsic to AArch64. The test case is adapted from hwasan-check-memaccess.ll.
Commit 365bddf
Update CHECK lines in tests after 14e6f63 added new output causing the tests to fail on multiple bots. (llvm#89689)
Update the check lines added in llvm#87247 after 14e6f63 updated the output causing the tests to fail. This should hopefully unbreak the bots failing due to these two tests failing.
Commit 8f54ed2
Commits on Apr 23, 2024
Commit 28cea99
Make createReadOrMaskedRead and isValidMaskedInputVector vector utilities (llvm#89119)
Made createReadOrMaskedRead and isValidMaskedInputVector utility functions so that they are accessible outside of the CU. Needed by the new IREE TopK implementation.
Commit 30d4f6a
Revert "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (llvm#89263)"
This reverts commit 5a7c80c. Noticed failures with the following command:
$ llc -mtriple=riscv64 -mattr=+m,+xtheadba -verify-machineinstrs < test/CodeGen/RISCV/rv64zba.ll
I think I know the cause and will likely reland with a fix tomorrow.
Commit dc3f943
Re-apply "[ORC] Unify task dispatch across ExecutionSession and..." w…
Commit 1effa19
[AIX][TLS][clang] Add -maix-small-local-dynamic-tls clang option (llvm#88829)
This patch adds the clang portion of an AIX-specific option to inform the compiler that it can use a faster access sequence for the local-dynamic TLS model (formally named aix-small-local-dynamic-tls). This patch mainly references Amy's work on small local-exec TLS support.
Commit 16efd2a
[lldb][DAP] Fix test failure from llvm#73393 (llvm#89692)
llvm#73393 introduced a mandatory column field. Update test for that.
Commit aa89c1b
Revert "Re-apply [ORC] Unify task dispatch across ExecutionSession and..."
This reverts commit 1effa19 while I investigate the test failure at https://lab.llvm.org/buildbot/#/builders/285/builds/888.
Commit e7efd37
Commit ff153bd
Commit 28d85e2
[lldb] Replace condition that always evaluates to false (llvm#89685)
Addresses issue llvm#87243. The current code incorrectly checks the validity of `obj` twice when it should be checking the new `str_obj` pointer.
Signed-off-by: Troy-Butler <squintik@outlook.com>
Co-authored-by: Troy-Butler <squintik@outlook.com>
Commit af8445e
[SimplifyQuery] Avoid PatternMatch.h include (NFC)
Move the one method that uses it out of line. This is primarily to reduce the number of files to rebuild when changing PatternMatch.h.
Commit f8a19a8
This patch fixes: third-party/unittest/googletest/include/gtest/gtest.h:1379:11: error: comparison of integers of different signs: 'const int' and 'const unsigned long' [-Werror,-Wsign-compare]
Commit 4127a69
Commit 34ee77c
[ADT] Remove StringRef::{startswith,endswith} (llvm#89548)
These functions have been deprecated since: commit 5ac1295 Author: Kazu Hirata <kazu@google.com> Date: Sun Dec 17 15:52:50 2023 -0800
Commit 4ec9a66
[RISCV][TableGen] Generate RISCVTargetParserDef.inc from the new RISCVExtension tblgen information. (llvm#89335)
Instead of using RISCVISAInfo's extension information, use the extension found in tblgen after llvm#89326. We still need to use RISCVISAInfo code to get the sorting rules for the ISA string.
The ISA string we generate now is not quite the same extension we had before. No implied extensions are included in the generated string unless they are explicitly listed in RISCVProcessors.td. This primarily affects Zicsr being implied by F, V implying Zve*, and Zvl*b implying a smaller Zvl*b. All of these implications should be picked up when the string is used by the frontend. The benefit is that we get a more manageable ISA string for humans to deal with.
This is a step towards generating RISCVISAInfo's extension list from tblgen.
Commit b64e483
[SimplifyCFG] Check alignment when speculating stores
When speculating a store based on a preceding load/store, we need to ensure that the speculated store does not have a higher alignment (which might only be guaranteed by the branch condition). There are various ways in which this could be strengthened (we could get or enforce the alignment), but for now just do the simple check against the preceding load/store. Fixes llvm#89672.
Commit 8838874
[clang][CodeGen][NFC] Make ConstExprEmitter a ConstStmtVisitor (llvm#89041)
No reason for this to not be one. This gets rid of a few const_casts.
Commit e5f9de8
[RISCV] Sink some repeated code into parseVTypeToken. NFC (llvm#89694)
Both calls to parseVTypeToken were preceded by a check for an Identifier token and a call to getIdentifier. Sink those into parseVTypeToken to reduce repetition.
Commit 25a391c
[NFC] [Serialization] Use semantical type DeclID instead of raw type 'uint32_t'
This patch tries to use DeclID in the code base to avoid using the raw type 'uint32_t'. It is problematic to use the raw type 'uint32_t' if we want to change the type of DeclID some day.
Commit 07b1177
[FunctionAttrs] Fix incorrect noundef inference with poison attrs (llvm#89348)
Currently, when inferring noundef, we only check that the return value is not undef/poison. However, we fail to account for the possibility that a poison-generating return attribute will convert the value to poison, and then violate the noundef attribute, resulting in immediate UB. For the relevant return attributes (align, nonnull and range), check whether we can trivially re-prove the relevant property, otherwise do not infer noundef. This fixes the FunctionAttrs side of llvm#88026.
Commit a2ccd5d
[NFC] Remove unused LocalRedeclarationsInfo from ASTBitcodes.h
As the title suggested.
Commit 02d00ec
[memprof] Omit the key length for the record table (llvm#89527)
The record table has a constant key length, so we don't need to serialize or deserialize it for every key-data pair. Omitting the key length saves 0.06% of the indexed MemProf file size. Note that it's OK to change the format because Version2 is still under development.
Commit b28f4d4
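The saving described here comes from not repeating a value the reader already knows. The sketch below is a toy serializer, not the MemProf on-disk format: `Entry`, `appendU64`, and `serialize` are hypothetical names, and the 8-byte key and 8-byte length fields are assumptions made for illustration. It writes each key-data pair either with or without a per-entry key length so the size difference can be measured.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical key-data pair: a fixed 8-byte key and a variable-length payload.
struct Entry {
  std::uint64_t key;
  std::vector<std::uint8_t> data;
};

// Appends a 64-bit value in little-endian byte order.
void appendU64(std::vector<std::uint8_t> &out, std::uint64_t v) {
  for (int i = 0; i < 8; ++i)
    out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Serializes each entry as [key length?][data length][key][data].
// With writeKeyLength == false the reader is assumed to know keys are 8 bytes.
std::vector<std::uint8_t> serialize(const std::vector<Entry> &entries,
                                    bool writeKeyLength) {
  std::vector<std::uint8_t> out;
  for (const Entry &e : entries) {
    if (writeKeyLength)
      appendU64(out, sizeof(e.key));
    appendU64(out, e.data.size());
    appendU64(out, e.key);
    out.insert(out.end(), e.data.begin(), e.data.end());
  }
  return out;
}
```

In this toy layout the two encodings differ by exactly one length field per entry; how much that matters in practice depends on the payload sizes.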
[MLIR] Harmonize the behavior of the folding API functions (llvm#88508)
This commit changes `OpBuilder::tryFold` to behave more similarly to `Operation::fold`. Concretely, this ensures that even an in-place fold returns `success`. This is necessary to fix a bug in the dialect conversion that occurred when an in-place folding made an operation legal. The dialect conversion infrastructure did not check if the result of an in-place folding legalized the operation and just went ahead and tried to apply pattern anyways. The added test contains a simplified version of a breakage we observed downstream.
Commit 4513050
Reapply "[clang][dataflow] Model conditional operator correctly." with fixes (llvm#89596)
I reverted llvm#89213 because it was causing buildbots to fail with assertion failures. Embarrassingly, it turns out I had been running tests locally in `Release` mode, i.e. with `assert()` compiled away. This PR re-lands llvm#89213 with fixes for the failing assertions.
Commit 9ba6961
[mlir][linalg] Add patterns to convert matmul to transposed variants (llvm#89075)
This adds patterns to convert from the Linalg matmul and batch_matmul ops to the transposed variants. By default the LHS matrix is transposed. Our work enabling a lowering path from linalg.matmul to ArmSME has revealed the current lowering results in non-contiguous memory accesses for the A matrix and very poor performance. These patterns provide a simple option to fix this.
Commit 7922534
[NFC] [Serialization] Remove unused readVisibleDeclContextStorage from ASTRecordReader.h
As the title suggested.
Commit 87a2159
[NFC] Rename hlsl semantics to hlsl annotations (llvm#89309)
The attribute name "HLSLSemantics" is confusing, because semantics aren't always the annotation that are applied to specific variables. The name for this attribute needs to be less specific. This PR changes the attribute name from HLSLSemantic to HLSLAnnotation, and changes the associated function and variable names to support this conceptual change. The HLSLAnnotation attribute will never be output in ast-dump due to it being parsed for the attribute that it represents. There is no functional change, so there are no accompanying tests.
Commit eaab97a
Commit 561b3de
[clang] Set correct FPOptions if attribute 'optnone' presents (llvm#85605)
Attribute `optnone` must turn off all optimizations including fast-math ones. However, AST nodes in an 'optnone' function still had fast-math flags. This change fixes the FP options before the function body is parsed.
Commit a046242
[flang] handle intrinsic interfaces in FunctionRef::GetType (llvm#89583)
User functions may be declared with an interface that is a specific intrinsic. In such case, there is no result type available from the procedure symbol (at least without using evaluate::Probe), and FunctionRef::GetType() returned nullopt. This caused lowering to crash. The result type of specific intrinsic procedures is always a lengthless intrinsic type, so it is fully defined in the template argument of FunctionRef. Use it.
Commit 35159c2
[GlobalISel] Expand IRTranslator docs. NFC (llvm#89186)
Add some more details about how calls are lowered and what APIs are available.
Commit 3ea9ed4
EmitC: Add emitc.global and emitc.get_global (llvm#145) (llvm#88701)
This adds
- `emitc.global` and `emitc.get_global` ops to model global variables similar to how `memref.global` and `memref.get_global` work.
- translation of those ops to C++
- lowering of `memref.global` and `memref.get_global` into those ops

Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Commit 6548465
[clang][ExtractAPI] Serialize platform specific unavailable attribute in symbol graphs (llvm#89277)
rdar://125622225
Commit 05c1447
[analyzer] Fix performance of getTaintedSymbolsImpl() (llvm#89606)
Previously the function
```
std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State, const MemRegion *Reg, TaintTagType K, bool returnFirstOnly)
```
(one of the 4 overloaded variants under this name) was handling element regions in a highly inefficient manner: it performed the "also examine the super-region" step twice. (Once in the branch for element regions, and once in the more general branch for all `SubRegion`s -- note that `ElementRegion` is a subclass of `SubRegion`.)
As pointer arithmetic produces `ElementRegion`s, it's not too difficult to get a chain of N nested element regions where this inefficient recursion would produce 2^N calls.
This commit is essentially NFC, apart from the performance improvements and the removal of (probably irrelevant) duplicate entries from the return value of `getTaintedSymbols()` calls.
Fixes llvm#89045
Commit ce763bf
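The exponential blow-up is easy to reproduce on a toy model. The sketch below does not use the analyzer's classes: `Region`, `visitTwice`, `visitOnce`, and `makeChain` are hypothetical stand-ins that only count calls. Recursing into the super-region from two branches roughly doubles the work at each nesting level, while recursing once keeps it linear.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a chain of nested regions; not the analyzer's MemRegion.
struct Region {
  const Region *super = nullptr;
};

// Shape of the slow version: the super-region is examined in two branches.
std::uint64_t visitTwice(const Region *r) {
  if (!r)
    return 0;
  std::uint64_t calls = 1;
  calls += visitTwice(r->super); // branch specific to element regions
  calls += visitTwice(r->super); // general branch for all sub-regions
  return calls;
}

// Shape of the fixed version: the super-region is examined once.
std::uint64_t visitOnce(const Region *r) {
  if (!r)
    return 0;
  return 1 + visitOnce(r->super);
}

// Links `depth` regions in `storage` into a chain and returns the innermost.
const Region *makeChain(Region *storage, int depth) {
  for (int i = 1; i < depth; ++i)
    storage[i].super = &storage[i - 1];
  return depth > 0 ? &storage[depth - 1] : nullptr;
}
```

In this model a chain of N regions costs 2^N - 1 visits with the doubled recursion and N visits with the single one.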
[LV] Add additional cost model tests with inductions and truncates.
Add test coverage for additional cases not covered by current tests with multiple inductions and truncates.
Commit 55fc5eb
[DWARF] Add option to add linkage_names to call_origin declaration refs (llvm#89640)
If -mllvm -add-linkage-names-to-external-call-origins is true then add DW_AT_linkage_name attributes to DW_TAG_subprogram DIEs referenced by DW_AT_call_origin attributes that would otherwise be omitted.
A debugger may use DW_AT_call_origin attributes to determine whether any frames in a callstack are missing due to optimisations (e.g. tail calls). For example, say a() calls b(), which tail-calls c(), and you stop in your debugger in c(). The callstack looks like this:
c()
a()
Looking "up" from c(), call site information can be found in a(). This includes a DW_AT_call_origin referencing b()'s subprogram DIE, which means the call at this call site was to b(), not c() where we are currently stopped. This indicates b()'s frame has been lost due to optimisation (or is misleading due to ICF). This patch makes it easier for a debugger to check whether the referenced DIE describes the target function or not, for example by comparing the referenced function name to the current frame.
There's already an option to apply DW_AT_linkage_name in a targeted manner: -dwarf-linkage-names=Abstract, which limits adding DW_AT_linkage_names to abstract subprogram DIEs (this is default for SCE tuning). The new flag shouldn't affect non-SCE-tuned behaviour whether it is enabled or not because the non-SCE-tuned behaviour is to always add linkage names to subprogram DIEs.
Commit 0e44ffe
[PAC][MC][AArch64] Fix error message for AUTH_ABS64 reloc with ILP32 (llvm#89563)
The `LP64 eqv:` should say that the equivalent is `AUTH_ABS64` rather than `ABS64` when trying to emit an AUTH absolute reloc with ILP32.
Commit da57609
[WebAssembly] Enable multivalue return when multivalue ABI is used (l…
…lvm#88492) Multivalue feature of WebAssembly has been standardized for several years now. I think it makes sense to be able to enable it in the feature section by default for our clang/llvm-produced binaries so that the multivalue feature can be used as necessary when necessary within our toolchain and also when running other optimizers (e.g. wasm-opt) after the LLVM code generation. But some WebAssembly toolchains, such as Emscripten, do not provide both mulvalue-returning and not-multivalue-returning versions of libraries. Also allowing the uses of multivalue in the features section does not necessarily mean we generate them whenever we can to the fullest, which is a different code generation / optimization option. So this makes the lowering of multivalue returns conditional on the use of 'experimental-mv' target ABI. This ABI is turned off by default and turned on by passing `-Xclang -target-abi -Xclang experimental-mv` to `clang`, or `-target-abi experimental-mv` to `clang -cc1` or `llc`. But the purpose of this PR is not tying the multivalue lowering to this specific 'experimental-mv'. 'experimental-mv' is just one multivalue ABI we currently have, and it is still experimental, meaning it is not very well optimized or tuned for performance. (e.g. it does not have the limitation of the max number of multivalue-lowered values, which can be detrimental to performance.) We may change the name of this ABI, or improve it, or add a new multivalue ABI in the future. Also I heard that WASI is planning to add their multivalue ABI soon. So the plan is, whenever any one of multivalue ABIs is enabled, we enable the lowering of multivalue returns in the backend. We currently have only 'experimental-mv' in the repo so we only check for that in this PR. Related past discussions: llvm#82714 WebAssembly/tool-conventions#223 (comment)
Commit c921ac7
[NFC] [Serialization] Turn type alias LocalDeclID into class
Previously, LocalDeclID and GlobalDeclID were defined as:
```
using LocalDeclID = DeclID;
using GlobalDeclID = DeclID;
```
This is concerning because we may mix up LocalDeclID and GlobalDeclID without noticing. There is also a FIXME saying this. This patch turns LocalDeclID into a class to improve type safety.
Commit b8e3b2a
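The type-safety argument in the commit above can be illustrated with a small sketch. It is not the Clang code: `DeclID` is modelled as a plain integer, and `LocalID`, `GlobalID`, and `toGlobal` are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the underlying ID type.
using DeclID = std::uint32_t;

// With plain aliases, both names denote the same type, so the compiler
// cannot catch a local ID being passed where a global one is expected.
using AliasLocalID = DeclID;
using AliasGlobalID = DeclID;
static_assert(std::is_same<AliasLocalID, AliasGlobalID>::value,
              "aliases are interchangeable");

// Thin wrapper classes make the two kinds of ID distinct types.
class LocalID {
public:
  explicit LocalID(DeclID Value) : Value(Value) {}
  DeclID get() const { return Value; }

private:
  DeclID Value;
};

class GlobalID {
public:
  explicit GlobalID(DeclID Value) : Value(Value) {}
  DeclID get() const { return Value; }

private:
  DeclID Value;
};

static_assert(!std::is_convertible<LocalID, GlobalID>::value,
              "a local ID no longer converts silently to a global ID");

// Hypothetical translation step: the only way from local to global is an
// explicit, named function.
inline GlobalID toGlobal(LocalID Local, DeclID ModuleBase) {
  return GlobalID(ModuleBase + Local.get());
}
```

With the wrappers in place, passing a `LocalID` to a function that expects a `GlobalID` is a compile error rather than a silent bug.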
[WebAssembly] Make RefTypeMem2Local recognize target-features (llvm#88916)
Currently we check `Subtarget->hasReferenceTypes()` to decide whether to run the `RefTypeMem2Local` pass:
https://github.com/llvm/llvm-project/blob/6133878227efc30355c02c2f089e06ce58231a3d/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp#L491-L495
This works fine when `-mattr=+reference-types` is given on the command line (of `llc`, or of `wasm-ld` in case of LTO). This also works fine if the backend is called by Clang, because Clang's feature set will be passed to the backend when creating a `TargetMachine`:
https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/clang/lib/CodeGen/BackendUtil.cpp#L549-L550
https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/clang/lib/CodeGen/BackendUtil.cpp#L561-L562
But if the backend compilation is called by `llc`, a `TargetMachine` is created here:
https://github.com/llvm/llvm-project/blob/bf1ad1d267b1f911cb9846403d2c3d3250a40870/llvm/tools/llc/llc.cpp#L554-L555
And if the backend is called by `wasm-ld`'s LTO, a `TargetMachine` is created here:
https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/llvm/lib/LTO/LTOBackend.cpp#L513
At this point, in both places, the created `TargetMachine` only has access to target features given on the command line with `-mattr=` and doesn't have access to the bitcode functions' `target-features` attribute. We later gather the target features used by functions and store that info in the `TargetMachine` in `CoalesceFeaturesAndStripAtomics`,
https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp#L202-L206
but this runs in the pass pipeline driven by the pass manager, so it has not run by the time we check `Subtarget->hasReferenceTypes()` in `WebAssemblyPassConfig::addISelPrepare`.
So currently `RefTypeMem2Local` would not run on functions with the `"target-features"="+reference-types"` attribute if the backend is called by `llc` or `wasm-ld`. This patch makes the `RefTypeMem2Local` pass run unconditionally and checks the `target-features` function attribute to decide whether to run the pass on each function. This allows the pass to run with `wasm-ld` + LTO and `llc`, even if `-mattr=+reference-types` is not explicitly given on the command line again, as long as `+reference-types` is in the function's `target-features` attribute.
This also covers the case where we give the target features on the command line, like `llc -mattr=+reference-types`, and not in the bitcode function's attribute, because attributes given on the command line will be stored in the function's attributes anyway:
https://github.com/llvm/llvm-project/blob/bd28889732e14ac6baca686c3ec99a82fc9cd89d/llvm/lib/CodeGen/CommandFlags.cpp#L673-L674
https://github.com/llvm/llvm-project/blob/bd28889732e14ac6baca686c3ec99a82fc9cd89d/llvm/lib/CodeGen/CommandFlags.cpp#L732-L733
With this PR, the following tests pass:
- `lto0.test_externref_emjs`
- `thinlto0.test_externref_emjs`
- `lto0.test_externref_emjs_dynlink`
- `thinlto0.test_externref_emjs_dynlnk`
These currently fail but don't get checked in the CI. I think they used to pass but started to fail after llvm#83196, because we used to run mem2reg even with `-O0` before that. (`ltoN` (N > 0) tests are not affected because they run mem2reg anyway, so they don't need `RefTypeMem2Local`.)
Commit a22ffe5
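The per-function decision described in the commit above can be modelled with a small string check. This is only a sketch under assumptions: it treats the `target-features` attribute as a comma-separated string such as `"+a,-b"`, and `hasEnabledFeature` is a hypothetical helper, not an LLVM API. The real pass may well query the per-function subtarget instead.

```cpp
#include <cassert>
#include <string>

// Returns true if the comma-separated feature string (for example
// "+mutable-globals,+reference-types") enables the named feature.
// Hypothetical helper for illustration; not part of LLVM.
inline bool hasEnabledFeature(const std::string &Features,
                              const std::string &Name) {
  const std::string Wanted = "+" + Name;
  std::string::size_type Start = 0;
  while (Start <= Features.size()) {
    // Find the end of the current comma-separated entry.
    std::string::size_type End = Features.find(',', Start);
    if (End == std::string::npos)
      End = Features.size();
    // Compare the whole entry, so "+reference-types-x" does not match.
    if (Features.compare(Start, End - Start, Wanted) == 0)
      return true;
    Start = End + 1;
  }
  return false;
}
```

A pass that runs unconditionally could then skip any function for which `hasEnabledFeature(attr, "reference-types")` is false, which mirrors the behaviour the commit message describes.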
[mlir][bazel] drop unnecessary rule
llvm#75960 added a bazel rule for generating enums for the async dialects, but there are no enums defined, and no cmake rule for that. Delete this rule.
Commit d5093aa
Revert "[mlir][linalg] Enable fuse consumer" (llvm#89722)
Reverts llvm#85528. This was committed without tests, despite reviewers requesting tests to be added. The post-commit discussion leans towards revert, which would be consistent with the policy.
Commit f220c35
Revert b28f4d4 "[memprof] Omit the key length for the record table (llvm#89527)"
Breaks on EXPENSIVE_CHECKS builds, which still use the static ReadKeyDataLength implementation in several locations.
Commit 20cb2ed
I misunderstood what the function is looking up
Commit dbcfb43
Commit a68ea36
[flang][OpenMP] Support reduction of allocatable variables (llvm#88392)
Both arrays and trivial scalars are supported. Both cases must use by-ref reductions because both are boxed. My understanding of the standards is that OpenMP says this should follow the rules of the intrinsic reduction operators in Fortran, and Fortran says that unallocated allocatable variables can only be referenced to allocate them or to test whether they are already allocated. Therefore we do not need a null pointer check in the combiner region.
Commit 8cc34fa
[bazel] Add a bazel flag to enable building MLIR with CUDA support (llvm#88856)
This makes it possible to specify `--@llvm-project//mlir:enable_cuda=true` on the bazel command line and get a build that includes NVIDIA GPU support in MLIR.
Commit bc72048
Commit 719112c
[mlir][linalg] Move transpose_matmul to targeted transform op (llvm#89717)
More targeted than a blanket "apply everywhere" pattern. Follow up to llvm#89075 to address @ftynse's feedback.
Commit be1c72d
[NFC] [Serialization] Turn type alias GlobalDeclID into a class
Successor of b8e3b2a. This patch also converts the type alias GlobalDeclID to a class to improve readability and type safety.
Commit b467c6b
[mlir][aarch64] Remove LIT config for lli (llvm#89545)
This change will only affect MLIR integration tests to be run on AArch64. When originally introduced, these tests would run with `lli`. Those tests have since been updated to use `mlir-cpu-runner` instead, see e.g.:
* https://reviews.llvm.org/D155405
* https://reviews.llvm.org/D146917
This patch removes all the leftover `lli` configuration in LIT that's currently not needed (and is unlikely to be needed any time soon).
Commit 132bf4a
[VectorCombine][X86] Add test showing foldShuffleOfShuffles folding shuffles that would be better separate
On AVX+ targets a broadcast load can be treated as free.
Commit b4c6607
[CostModel][X86] Add costs test coverage for broadcast loads
Broadcast shuffles can be free if fed from a one-use load.
Commit a9e8730
[CostModel][X86] Broadcast shuffles can be free if they are from a one-use load
AVX1+ can handle 32/64-bit broadcast loads, AVX2+ can handle all broadcast loads (we should be able to improve isLegalBroadcastLoad to handle more of this type matching).
Commit f89f670
[LLVM][CodeGen][AArch64] Simplify lowering for predicate inserts. (llvm#89072)
The original code has an invalid use of UZP1 because the result vector type does not match its input vector types. Rather than insert extra nop casts I figure it would be better to use CONCAT_VECTORS because that's the operation we're performing.
NOTE: This is a step to enable more asserts in verifyTargetSDNode.
Commit 34caafe
RenameIndependentSubregs: Add missing sub-range for new IMPLICIT_DEFs (llvm#89050)
Existing sub-ranges are correctly updated because a new IMPLICIT_DEF is added, but the sub-range for the IMPLICIT_DEF itself is missing. Because of the missing sub-range in the live intervals for the IMPLICIT_DEF, the register allocator does not know that the IMPLICIT_DEF rewrites its virtual sub-registers and can end up assigning overlapping physical registers to them. This results in deleting instructions that were defined by sub-registers overwritten by the IMPLICIT_DEF, as they are now dead.
Commit d610a51
[LLVM][CodeGen][SVE] rev(whilelo(a,b)) -> whilehi(b,a). (llvm#88294)
Add similar isel patterns for lt, gt and hi comparison types.
Commit a9689c6
[VPlan] Skip extending ICmp results in truncateToMinimalBitwidth.
Results of icmp don't need extending after truncating their operands, as the result will always be i1. Skip them during extending. Fixes llvm#79742 Fixes llvm#85185
Commit 17fb3e8
[VectorCombine] foldShuffleOfShuffles - add missing arguments to getShuffleCost calls.
Ensure the getShuffleCost arguments/instruction args are populated - minor extension to llvm#88743 to help improve shuffle costs for certain corner cases (e.g. shuffles of loads)
Commit 7f4f237
Commit 8a631d7
Commit bac5d8e
[clang-tidy] Avoid overflow when dumping unsigned integer values (llvm#85060)
Some options take the maximum unsigned integer value as default, but they are being dumped to a string as integers. This makes -dump-config write invalid '-1' values for these options. This change fixes this issue by using utostr if the option is unsigned.
Fixes llvm#60217
Commit c52b18d
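The '-1' symptom described above is easy to reproduce in isolation. The sketch below is not the clang-tidy code: `dumpAsSigned` and `dumpAsUnsigned` are hypothetical names, and `std::to_string` stands in for the `utostr` helper that the commit message mentions.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Mimics the buggy path: the unsigned option value is converted to a signed
// integer before being printed. On two's-complement targets the maximum
// unsigned value then prints as "-1".
inline std::string dumpAsSigned(unsigned Value) {
  return std::to_string(static_cast<int>(Value));
}

// Mimics the fixed path: the value is formatted as unsigned, so the maximum
// value round-trips as a large positive number.
inline std::string dumpAsUnsigned(unsigned Value) {
  return std::to_string(Value);
}
```

Dumping `std::numeric_limits<unsigned>::max()` through the first function produces a value that a config parser expecting an unsigned option would reject, which is the failure mode the fix targets.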
Make default initialization explicit
Coverity (a static analysis tool) reported that the emitted 'Features' variable inside emitComputeAvailableFeatures in TableGen might be uninitialized. Silence this warning by adding brackets for the default initialization. Adapt test cases to take the additional brackets into account.
Commit b817451
[InstCombine] Fold fcmp into select (llvm#86482)
This patch simplifies `fcmp (select Cond, C1, C2), C3` patterns in ceres:
Alive2: https://alive2.llvm.org/ce/z/fWh_sD
```
define i1 @src(double %x) {
  %cmp1 = fcmp ord double %x, 0.000000e+00
  %sel = select i1 %cmp1, double 0xFFFFFFFFFFFFFFFF, double 0.000000e+00
  %cmp2 = fcmp oeq double %sel, 0.000000e+00
  ret i1 %cmp2
}

define i1 @tgt(double %x) {
  %cmp1 = fcmp uno double %x, 0.000000e+00
  ret i1 %cmp1
}
```
Commit 9fb7a73
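The equivalence of `@src` and `@tgt` above can be checked by hand with ordinary floating-point code. The following is a rough model, assuming IEEE semantics without fast-math flags; the function names `src` and `tgt` simply mirror the IR, and a quiet NaN stands in for the `0xFFFFFFFFFFFFFFFF` bit pattern (which is itself a NaN).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Mirrors @src: ordered compare, select, then ordered-equal compare.
inline bool src(double X) {
  // fcmp ord %x, 0.0: true when neither operand is NaN.
  bool Ordered = !std::isnan(X);
  // select: a NaN constant when ordered, 0.0 otherwise.
  double Sel = Ordered ? std::numeric_limits<double>::quiet_NaN() : 0.0;
  // fcmp oeq %sel, 0.0: always false for NaN, true for 0.0.
  return Sel == 0.0;
}

// Mirrors @tgt: fcmp uno %x, 0.0 is true exactly when %x is NaN.
inline bool tgt(double X) { return std::isnan(X); }
```

The select arm that compares equal to `0.0` is only taken when `%x` is NaN, so the whole expression collapses to an unordered compare on `%x`.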
Pre-commit reproducer for argument copy elision related bug
Adding test case related to llvm#89060. It shows that after argument copy elision the scheduler may reorder a load of the input argument and a store to the same fixed stack entry (the fixed stack entry that is reused for the local variable).
Commit 56ed3dd
[SelectionDAG] Mark frame index as "aliased" at argument copy elision (llvm#89712)
This is a fix for miscompiles reported in llvm#89060.
After argument copy elision the IR value for the eliminated alloca aliases with the fixed stack object. This patch makes sure that we mark the fixed stack object as being aliased with IR values, so that schedulers, for example, do not reorder accesses to the fixed stack object. This could otherwise happen when there is a mix of MemOperands referring to the shared fixed stack slot via both the IR value for the elided alloca and a fixed stack pseudo source value (as would be the case when lowering the arguments).
Commit d8b253b
[Flang][OpenMP] Add restriction about subobjects to firstprivate and lastprivate (llvm#89608)
The OpenMP 5.2 standard (Section 5.3) defines privatization for list items. Section 3.2.1 of the standard defines list items to exclude variables that are part of other variables. This patch adds the restriction to firstprivate and lastprivate; it was previously added for private.
Fixes llvm#67227
Note: The specific checks that are added here are explicitly called out in OpenMP 4.0 (https://www.openmp.org/wp-content/uploads/OpenMP4.0.0.pdf) Sections 2.14.3.4 and 2.14.3.5, but in later standards they have become implicit through other definitions.
Commit 0661af8
[DAGCombiner] Pre-commit test case for miscompile bug in combineShiftOfShiftedLogic
DAGCombiner is trying to fold shl over binops, and in the process combining it with another shl. However, it needs to be more careful to ensure that the sum of the shift counts fits in the type used for the shift amount. For example, X86 is using i8 as the shift amount type, so we need to make sure that the sum of the shift amounts isn't greater than 255.
The fix will be applied in a later commit. This only pre-commits the test case to show that we currently get the wrong result.
The bug was found when testing the C23 BitInt feature.
Commit 5fd9bbd
[DAGCombiner] Fix miscompile bug in combineShiftOfShiftedLogic (llvm#89616)
Ensure that the sum of the shift amounts does not overflow the shift amount type when combining shifts in combineShiftOfShiftedLogic.
Solves a miscompile bug found when testing the C23 BitInt feature.
Targets like X86 that only use an i8 for shift amounts after legalization seem to be extra susceptible to bugs like this, as it isn't legal to shift more than 255 steps.
Commit f9b419b
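The overflow described in the two DAGCombiner commits above can be shown with plain integers. This is a sketch only: `naiveCombinedAmount` and `canCombineShiftAmounts` are hypothetical names, and the real fix in `combineShiftOfShiftedLogic` operates on DAG constants rather than on these types.

```cpp
#include <cassert>
#include <cstdint>

// Naive combine: the two shift counts are added in an 8-bit shift-amount
// type, so a sum above 255 wraps around silently.
inline std::uint8_t naiveCombinedAmount(std::uint8_t C1, std::uint8_t C2) {
  return static_cast<std::uint8_t>(C1 + C2);
}

// Guarded combine: add in a wider type and only allow the fold when the sum
// is representable in a shift-amount type of ShiftAmtBits bits.
// Assumes the inputs are small enough that the 64-bit sum cannot overflow.
inline bool canCombineShiftAmounts(std::uint64_t C1, std::uint64_t C2,
                                   unsigned ShiftAmtBits) {
  const std::uint64_t Sum = C1 + C2;
  const std::uint64_t Max =
      ShiftAmtBits >= 64 ? ~0ULL : ((1ULL << ShiftAmtBits) - 1);
  return Sum <= Max;
}
```

With counts of 200 and 100, the 8-bit sum wraps to a small value, which would turn an over-wide shift into a valid-looking but wrong one; the guarded check rejects the fold instead.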
[X86] getTargetShuffleMask - update to take a SDValue instead of a SDNode. NFC.
Also just get the value type from the SDValue instead of passing it separately.
Commit 304dfe1
[Clang][Parser] Don't always destroy template annotations at the end of a declaration (llvm#89494)
Since [6163aa9](llvm@6163aa9#diff-3a7ef0bff7d2b73b4100de636f09ea68b72eda191b39c8091a6a1765d917c1a2), we have introduced an optimization that almost always destroys TemplateIdAnnotations at the end of a function declaration. This doesn't always work properly: a lambda within a default template argument could also result in such deallocation and hence a use-after-free bug while building a type constraint on the template parameter.
This patch adds another flag to the parser to tell apart cases when we shouldn't do such cleanups eagerly. A bit complicated as it is, this retains the optimization on a highly templated function with lots of generic lambdas.
Note the test doesn't always trigger a conspicuous bug/crash even with a debug build. But a sanitizer build can detect them, I believe.
Fixes llvm#67235
Fixes llvm#89127
Commit 8ab3caf
[VPlan] Ignore incoming values with constant false mask. (llvm#89384)
Ignore incoming values with constant false masks when trying to simplify VPBlendRecipes. As a follow-on optimization, we should also be able to drop all incoming values with false masks by creating a new VPBlendRecipe with those operands dropped. PR: llvm#89384
Commit dadf6f2
AtomicExpand: Emit or with constant on RHS
This will save later code from commuting it.
Commit 31af5e9
Commit 70d3ddb
[libc++] Add some private headers to libcxx.imp (llvm#89568)
llvm#78295 dropped private headers in top level directory from libcxx.imp. This PR re-adds them to libcxx.imp.
Commit b926f75
[RemoveDIs][MLIR] Don't process debug records in the LLVM-IR translator (llvm#89735)
We are almost ready to enable the use of debug records everywhere in LLVM by default; part of the prep-work for this means ensuring that every tool supports them. Every tool in the `llvm/` project supports them, front-ends that use the `DIBuilder` will support them, and as far as I can tell, the only other tool in the LLVM repo that needs to support them but doesn't is `mlir-translate`. This patch trivially unblocks them by converting from debug records to debug intrinsics before translating a module.
Commit 670ac23
Commit a9e3fbf
[AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (llvm#89622)
As well as flipping the sense of the bit, GFX12 moved it from bit 0 to bit 1 in the encoded simm16 operand.
Commit e0a763c
[SLP]Fix PR89635: do not try to vectorize single-gather alternate node.
No need to try to vectorize single gather/buildvector with alternate opcode graph, it is not profitable. In other cases, need to use last instruction for inserting the vectorized code.
Commit b4a0fd4
Commit 282ab54
Reapply "[Clang][Sema] placement new initializes typedef array with correct size (llvm#83124)" (llvm#89036)
When in-place new-ing a local variable of an array of trivial type, the generated code now calls 'memset' with the correct size of the array. Earlier it was generating the wrong size (the square of the typedef array size, plus the size).
The cause: given `typedef TYPE TArray[8]; TArray x;`, the type of the declarator is TArray[8]. In SemaExprCXX.cpp::BuildCXXNew we check whether it is a typedef of constant size, and if so we get the original type; this works fine for non-dependent cases. But in the template case we go through TreeTransform.h:TransformCXXNEWExpr, where we again check the allocated type, which is TArray[8], and it stays that way. So ArraySize=(TArray[8] type, alloc TArray[8*type]), hence the squared size allocation. ArraySize gets calculated earlier in TreeTransform.h, so the if(!ArraySize) condition was failing.
Fix: I changed that condition to if(ArraySize).
Fixes llvm#41441
---------
Co-authored-by: erichkeane <ekeane@nvidia.com>
Commit 74cab54
[SystemZ][z/OS] Make z/OS personality function known (llvm#89679)
This change adds the z/OS personality function to the list of known EH personality functions. It enables removal of the EH data/labels if the personality function is not invoked.
Commit d5022d9
[VPlan] Add scalar inferencing support for Not and Or insns (llvm#89160)
Fixes llvm#87394. PR: llvm#89160
Commit adb0126
[libc++][ranges] P2387R3: Pipe support for user-defined range adaptors (llvm#89148)
This patch finalizes the std::ranges::range_adaptor_closure class template from https://wg21.link/P2387R3.
    // [range.adaptor.object], range adaptor objects
    template<class D>
      requires is_class_v<D> && same_as<D, remove_cv_t<D>>
    class range_adaptor_closure { };
The current implementation of __range_adaptor_closure was introduced in ee44dd8 and has served as the foundation for the range adaptors in libc++ for a while. This patch keeps its implementation, with the exception of the following changes:
- __range_adaptor_closure now includes the missing constraints `is_class_v<D> && same_as<D, remove_cv_t<D>>` to restrict the type of class that can inherit from it. (https://eel.is/c++draft/ranges.syn)
- The operator| of __range_adaptor_closure no longer requires its first argument to model viewable_range. (https://eel.is/c++draft/range.adaptor.object#1)
- The _RangeAdaptorClosure concept is refined to exclude cases where T models range or where T has base classes of type range_adaptor_closure<U> for another type U. (https://eel.is/c++draft/range.adaptor.object#2)
Commit c108653
[mlir][linalg] Add runtime verification for linalg ops (llvm#89342)
This commit implements runtime verification for LinalgStructuredOps using the existing `RuntimeVerifiableOpInterface`. The verification checks that the runtime sizes of the operands match the runtime sizes inferred by composing the loop ranges with the op's indexing maps.
Commit 8317d36
clang/win: Add a flag to disable default-linking of compiler-rt libraries (llvm#89642)
For ASan, users already manually have to pass in the path to the lib, and for other libraries they have to pass in the path to the libpath.
With LLVM's unreliable name of the lib (due to LLVM_ENABLE_PER_TARGET_RUNTIME_DIR confusion and whatnot), it's useful to be able to opt in to just explicitly passing the paths to the libs everywhere.
Follow-up of sorts to https://reviews.llvm.org/D65543, and to llvm#87866.
Commit 1d7086e
Reapply "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (llvm#89263)"
Changes since original commit:
* Rebase over improved test coverage for theadba
* Revert change to use TargetConstant as it appears to prevent the uimm2 clause from matching in the XTheadBa patterns.
* Fix an order of operands bug in the THeadBa pattern visible in the new test coverage.
Original commit message follows:
This implements a RISCV specific version of the SHL_ADD node proposed in llvm#88791. If that lands, the infrastructure from this patch should seamlessly switch over to the generic DAG node. I'm posting this separately because I've run out of useful multiply strength reduction work to do without having a way to represent MUL X, 3/5/9 as a single instruction.
The majority of this change is moving two sets of patterns out of tablegen and into the post-legalize combine. The major reason for this is that I have an upcoming change which needs to reuse the expansion logic, but it also helps common up some code between zba and the THeadBa variants.
On the test changes, there are a couple of major categories:
* We chose a different lowering for mul x, 25. The new lowering involves one fewer register and the same critical path, so this seems like a win.
* The order of the two multiplies changes in (3,5,9)*(3,5,9) in some cases. I don't believe this matters.
* I'm removing the one use restriction on the multiply. This restriction doesn't really make sense to me, and the test changes appear positive.
Commit 03760ad
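The reason a shift-and-add node can represent `MUL X, 3/5/9` as a single operation is just arithmetic: x*3 = (x << 1) + x, x*5 = (x << 2) + x, and x*9 = (x << 3) + x. The sketch below models that in plain C++. The function names are hypothetical, and the `mul25` decomposition is only one possible choice, not necessarily the lowering the patch selects.

```cpp
#include <cassert>
#include <cstdint>

// Models a shift-left-and-add operation: (X << ShAmt) + Y.
// Unsigned arithmetic is used so that wraparound is well defined.
inline std::uint64_t shlAdd(std::uint64_t X, unsigned ShAmt, std::uint64_t Y) {
  return (X << ShAmt) + Y;
}

// Multiplications by 3, 5 and 9 each map to one shift-and-add.
inline std::uint64_t mul3(std::uint64_t X) { return shlAdd(X, 1, X); }
inline std::uint64_t mul5(std::uint64_t X) { return shlAdd(X, 2, X); }
inline std::uint64_t mul9(std::uint64_t X) { return shlAdd(X, 3, X); }

// One way to build x*25 from two shift-and-adds: 25 = 5 * 5.
inline std::uint64_t mul25(std::uint64_t X) {
  const std::uint64_t T = mul5(X);
  return mul5(T);
}
```

Products such as (3,5,9)*(3,5,9) compose the same way, which is why the order of the two steps mentioned in the test notes does not change the result.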
[mlir] Update comment about `propertiesAttr` (NFC) (llvm#89634)
The comment is misleading because `propertiesAttr` is not actually ignored when the operation isn't unregistered.
Commit e0c2848
Commit ed255ed
Commit 03c8a29
Commit c793f4a
Revert "[mlir][linalg] Add runtime verification for linalg ops" (llvm#89780)
Reverts llvm#89342 due to build failure.
Commit f426be1
[NVPTX] Improve support for rsqrt.approx (llvm#89417)
Complete support for rsqrt.approx with rsqrt.approx.f64 ([PTX ISA 9.7.3.17. Floating Point Instructions: rsqrt.approx.ftz.f64](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-rsqrt-approx-ftz-f64)). Additionally, add support for folding `sqrt` into `rsqrt`, with an optional flag to disable.
Commit df60805
Commit 3197146
[AArch64][GISel] Avoid scalarizing G_IMPLICIT_DEF and G_FREEZE in the Legalizer (llvm#88469)
It does not make sense to scalarize G_FREEZE, as it leads to the generation of pairs of G_UNMERGE_VALUES and G_BUILD_VECTORs which are difficult to optimize, especially when operations like G_TRUNC operate before G_FREEZE but after G_UNMERGE_VALUES. Instead, it is better to legalize G_FREEZE like any other vector type would be, as it gets lowered to a COPY during instruction selection anyways.
This is an issue that was encountered when looking at the TSVC benchmark, where the legalization of G_FREEZE would cause generation of unnecessary MOVs that adversely affected the performance.
Commit 143be6a
[VectorCombine][X86] shuffle-of-binops.ll - adjust no matching operand test to use FDIV
Use of FDIV allows us to show a definite cost improvement with llvm#88899
Commit c45fbfd
[AArch64] Match ZIP and UZP starting from undef elements. (llvm#89578)
In case the first element of a zip/uzp mask is undef, the isZIPMask and isUZPMask functions have a 50% chance of picking the wrong "WhichResult", meaning they don't match a zip/uzp where they could. This patch alters the matching code to first check for the first non-undef element, to try and get WhichResult correct.
Commit cebc960
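The matching idea in the ZIP/UZP commit above, deriving the result index from the first defined lane instead of from lane 0, can be sketched for the zip case as follows. This is a simplified model and not the AArch64 backend code: `matchZip` is a hypothetical function, `-1` stands for an undef lane, and the mask indexes into the concatenation of two input vectors of `NumElts` lanes each.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns 0 if the mask matches zip1, 1 if it matches zip2, and -1 if it
// matches neither. Undef lanes (negative values) match anything.
inline int matchZip(const std::vector<int> &M) {
  const std::size_t NumElts = M.size();
  if (NumElts == 0 || NumElts % 2 != 0)
    return -1;

  // Index that zip1 places at position I: lanes alternate between the
  // first input (I/2) and the second input (I/2 + NumElts).
  auto Zip1Index = [NumElts](std::size_t I) {
    return static_cast<int>(I / 2 + (I % 2 ? NumElts : 0));
  };

  // Skip leading undef lanes so the guess is based on a defined lane.
  std::size_t First = 0;
  while (First < NumElts && M[First] < 0)
    ++First;
  if (First == NumElts)
    return 0; // All lanes undef: either form matches; pick zip1.

  const int WhichResult = (M[First] == Zip1Index(First)) ? 0 : 1;
  // zip2 uses the upper halves, so every index is offset by NumElts / 2.
  const int Offset = WhichResult * static_cast<int>(NumElts / 2);
  for (std::size_t I = 0; I < NumElts; ++I)
    if (M[I] >= 0 && M[I] != Zip1Index(I) + Offset)
      return -1;
  return WhichResult;
}
```

A matcher that looks only at lane 0 would have to guess when that lane is undef; with a four-lane mask like `{-1, 4, 1, 5}`, the first defined lane is enough to identify zip1.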
[NFC][InstrProf] Increment valid profile stat in populateCoverage (llvm#89660)
We increment `NumOfCSPGOFunc` and `NumOfPGOFunc` in `PGOUseFunc::readCounters()` already. We should do the same in `PGOUseFunc::populateCoverage`.
https://github.com/llvm/llvm-project/blob/83bc7b57714dc2f6b33c188f2b95a0025468ba51/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1331
Commit abfb491
[flang][cuda] Remove restriction on device subprogram (llvm#89677)
Newer versions allow `pure`, `elemental` and `recursive` on device subprograms.
Commit 49cb6db
[libc++][ranges] export `std::ranges::range_adaptor_closure` (llvm#89793)
This patch exports the `std::ranges::range_adaptor_closure` class template implemented in llvm#89148 from the C++ Modules file.
Commit 3a9d8cd
[libc++][chrono] Fixes format output of negative values. (llvm#89408)
When trying to express a time before the epoch (e.g. "one nanosecond before 00:01:40 on 1900-01-01") the date would be shown as:
1900-01-01 00:01:39.-00000001
After this patch, that time would be correctly shown as:
1900-01-01 00:01:39.999999999
Commit 579d301
-
[llvm-exegesis] Add support for alderlake (llvm#88967)
This patch adds the PFM counter definitions for Intel alderlake CPUs.
Commit 37e27a4
-
[libc++][CI] Removes clang-tidy references. (llvm#89092)
The clang-tidy selection has been made automatic recently, so this is no longer needed. Thanks to Louis for spotting this.
Commit 9e95951
-
[DebugInfo] Report errors when DWARFUnitHeader::applyIndexEntry fails (llvm#89156)
Motivation: LLDB is able to report errors about these scenarios whereas LLVM's DWARF parser only gives a boolean success/fail. I want to migrate LLDB to using LLVM's DWARFUnitHeader class, but I don't want to lose some of the error reporting, so I'm adding it to the LLVM class first.
Commit 1a8935a
-
[libc++][doc] Updates module build instructions. (llvm#89413)
CMake has landed experimental support for using the Standard modules. This will be part of the CMake 3.30 release. This updates the build instructions to use modules with CMake. The changes have been tested locally.
Co-authored-by: Will Hawkins <whh8b@obs.cr>
Commit 033453a
-
[CodeGen][TII] Allow reassociation on custom operand indices (llvm#88306)
Commit 5fe93b0
-
[flang] Remove hardcoded bits from AddDebugInfo. (llvm#89231)
This PR adds the following options to the AddDebugInfo pass:
1. IsOptimized flag.
2. Level of debug info to generate.
3. Name of the source file.
This enables us to remove the hard-coded values from the code. It also allows us to test the pass with different options. The tests have been modified to take advantage of that. The calling convention flag and producer name have also been improved.
Commit 5f3f9d1
-
[lldb/test] Add basic ld.lld --debug-names tests (llvm#88335)
Test that ld.lld --debug-names (llvm#86508) built per-module index can be consumed by lldb. This has uncovered a bug during the development of the lld feature.
Commit a7e2726
-
Commit 06cc175
-
Commit 1d14034
-
[libc] Generate docs for `setjmp.h` (llvm#89542)
Resolves llvm#88065. Added macros and functions.
Commit 3ae10fd
-
[clang] coroutine: generate valid mangled name in CodeGenFunction::generateAwaitSuspendWrapper (llvm#89731)
Fixes llvm#89723
Commit dc8f6a8
-
[RISCV] Use SHL_ADD in remaining strength reduce cases for MUL (llvm#89789)
The interesting bit is the zext folding. This is the first case where we end up with a profitable fold of shNadd (zext x), y to shNadd.uw x, y. See zext_mul68 from rv64zba.ll. The test differences are cases where we can legally fold (only because there's no one-use check). These are neither profitable nor harmful, but we can't add a one-use check without breaking the zext_mul68 case. Note that XTHeadBa doesn't appear to have the equivalent patterns, so this only shows up in Zba.
Commit 0c032fd
-
[hwasan] Add test for hwasan pass with fixed shadow (llvm#89813)
This test records the current behavior of HWASan, which doesn't utilize the fixed shadow intrinsics of llvm@365bddf. It is intended to be updated in future work ("Optimize outlined memaccess for fixed shadow on Aarch64"; llvm#88544).
Commit 2662bce
-
[libc] Make fenv and math tests preserve fenv_t state (llvm#89658)
This adds a new test fixture class FEnvSafeTest (usable as a base class for other fixtures) that ensures each test doesn't perturb the `fenv_t` state that the next test will start with. It also provides types and methods tests can use to explicitly wrap code under test either to check that it doesn't perturb the state or to save and restore the state around particular test code. All the fenv and math tests are updated to use this so that none can affect another. Expectations that code under test and/or tests themselves don't perturb state can be added later.
Commit 837dab9
-
[libc++][TZDB] Fixes reverse time lookups. (llvm#89502)
Testing with the get_info() returning a local_info revealed some issues in the reverse lookup. This needed an additional quirk. Also, the optimization that skips entries outside the current continuation was wrong: it prevented merging two sys_info objects.
Commit 4e9decf
-
[memprof] Take Schema into account in PortableMemInfoBlock::serializedSize (llvm#89824)
PortableMemInfoBlock::{serialize,deserialize} take Schema into account, allowing us to serialize/deserialize a subset of the fields. However, PortableMemInfoBlock::serializedSize does not. That is, it assumes that all fields are always serialized and deserialized. In other words, if we choose to serialize/deserialize a subset of the fields, serializedSize would claim more storage than we actually need. This patch fixes the problem by teaching serializedSize to take Schema into account. For now, this patch has no effect on the actual indexed MemProf profile because we serialize/deserialize all fields, but that might change in the future. Aside from check-llvm, I tested this patch by verifying that llvm-profdata generates bit-wise identical files for each version for a large raw MemProf file I have.
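The mismatch can be pictured with a small Python model. It is a sketch under assumptions, not the MemProf code: the field names are hypothetical, and every field is assumed to be a 64-bit little-endian integer.

```python
import struct

# Hypothetical field list; the real PortableMemInfoBlock defines its own fields.
ALL_FIELDS = ("alloc_count", "total_size", "total_lifetime")
FIELD_BYTES = 8  # assumption: each field is stored as a 64-bit integer


def serialize(block, schema):
    """Write only the fields named in the schema."""
    return b"".join(struct.pack("<Q", block[name]) for name in schema)


def serialized_size_ignoring_schema():
    """Models the old behaviour: assume every field is always written."""
    return FIELD_BYTES * len(ALL_FIELDS)


def serialized_size(schema):
    """Models the fixed behaviour: count only the fields in the schema."""
    return FIELD_BYTES * len(schema)
```

With a one-field schema the serialized bytes occupy 8 bytes, which agrees with `serialized_size(schema)`, while the schema-agnostic size would claim 24. With the full schema both sizes agree, which matches the note that the patch currently has no effect on real profiles.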
Commit edf733b
-
Commit 6b8d385
-
Commit 859de94
-
[Nomination] New Intel representative for the security group (llvm#89435)
Sergey Malsov has left Intel. I would like to nominate Will Huhn to replace him as an Intel representative in the LLVM security group. Will is a security champion for the Intel compiler team. I believe he will be a valuable addition to the LLVM security group as a second representative from Intel. He has more security-specific expertise than me. I regularly consult with Will about topics the LLVM security group is considering, and it will be useful to have him more directly involved.
Andy Kaylor authored Apr 23, 2024
Commit 5ac744d
-
[clang-tidy][modernize-use-starts-ends-with] Add support for compare() (llvm#89530)
Using `compare` is the next most common roundabout way to express `starts_with` before it was added to the standard. In this case, using `starts_with` is a readability improvement. Extend existing `modernize-use-starts-ends-with` to cover this case.
```
// The following will now be replaced by starts_with().
string.compare(0, strlen("prefix"), "prefix") == 0;
string.compare(0, 6, "prefix") == 0;
string.compare(0, prefix.length(), prefix) == 0;
string.compare(0, prefix.size(), prefix) == 0;
```
Commit ef59069
-
Commit 4182120
-
[Xtensa] Implement base CallConvention. (llvm#83280)
Implement base Calling Convention functionality. Implement stack load/store register operations. Implement call lowering.
Commit 36209d3
-
Revert "Reapply "[Clang][Sema] placement new initializes typedef array with correct size (llvm#83124)" (llvm#89036)"
This reverts commit 74cab54.
Commit e1321fa
-
[RISCV] Split code that tablegen needs out of RISCVISAInfo. (llvm#89684)
This introduces a new file, RISCVISAUtils.cpp and moves the rest of RISCVISAInfo to the TargetParser library. This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
Commit 733a877
-
Commit 0c0c5c4
-
Commit 688c10d
-
[msan] Eliminate non-deterministic behavior in the pass (llvm#89831)
Almost NFC; instrumentation is as correct as it was before. We need InstrumentationList grouped by origin instruction, so we used stable_sort. However, these objects are already grouped, because we never interleave sequences of `insertShadowCheck` for different instructions. The pointer sort had the artifact of depending on allocator behavior, so we could insert checks in a different order. There is no test, as I failed to reproduce this with `opt`. My guess is that a reproducer would need to increase fragmentation in the allocator.
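The source of the non-determinism can be modelled in Python. This is a sketch with made-up data, not the MSan pass: the "addresses" are plain integers standing in for whatever the allocator happened to return in a given run, and the function names are hypothetical.

```python
def order_by_pointer(checks, address_of):
    """Models the old approach: a stable sort keyed on the origin's address."""
    return sorted(checks, key=lambda check: address_of[check[0]])


def order_as_inserted(checks):
    """Models the new approach: keep insertion order, which is already grouped."""
    return list(checks)


# (origin instruction, check) pairs, already grouped by origin on insertion.
checks = [("load_a", "check0"), ("load_a", "check1"), ("load_b", "check0")]

# Two hypothetical runs in which the allocator placed the origins differently.
addresses_run1 = {"load_a": 0x1000, "load_b": 0x2000}
addresses_run2 = {"load_a": 0x3000, "load_b": 0x2000}
```

Sorting by address keeps each group together in both runs, but the order of the groups follows the addresses, so the two runs disagree. Keeping the insertion order gives the same sequence every time.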
Commit 4f4ebee
-
Commit d56f08b
-
[clang][RISCV] Support RVV bfloat16 C intrinsics (llvm#89354)
It follows the interface defined here: riscv-non-isa/rvv-intrinsic-doc#293
Commit 3fa6b9c
-
[lldb] Fix crash in SymbolFileCTF::ParseFunctions (llvm#89845)
Make SymbolFileCTF::ParseFunctions resilient against not being able to resolve the argument or return type of a function. ResolveTypeUID can fail for a variety of reasons so we should always check its result. The type that caused the crash was `_Bool` which we didn't recognize as a basic type. This commit also fixes the underlying issue and adds a test. rdar://126943722
Commit fd4399c
-
Commit 9c4735e
Commits on Apr 24, 2024
-
IRSymTab: Record _GLOBAL_OFFSET_TABLE_ for ELF x86
In ELF, relocatable files generated for x86-32 and some code models of x86-64 (medium, large) may reference the special symbol `_GLOBAL_OFFSET_TABLE_` that is not used in the IR. In an LTO link, if there is no regular relocatable file referencing the special symbol, the linker may not define the symbol and lead to a spurious "undefined symbol" error. Fix llvm#61101: record that `_GLOBAL_OFFSET_TABLE_` is used in the IR symbol table. Note: The `PreservedSymbols` mechanism (https://reviews.llvm.org/D112595) that just sets `FB_used` is not applicable. The `getRuntimeLibcallSymbols` for extracting lazy runtime library symbols is for symbols that are "always" potentially used, but linkers don't have the code model information to make a precise decision. Pull Request: llvm#89463
Commit 99e7350
-
[NFC][MC][AArch64] Do not use else after return in `getRelocType` (llvm#89818)
After llvm#89563, we do not use else after return in code corresponding to `R_AARCH64_AUTH_ABS64` reloc in `getRelocType`. This patch removes use of else after return in other places in `getRelocType`.
Commit 2cbc2e3
-
Commit dc5939d
-
[PowerPC] Add PPC prefix to retglue ISD node. NFC. (llvm#89771)
So that it is aligned with other targets.
Kai Luo authored Apr 24, 2024
Commit d97cdd7
-
[InstCombine] Fix miscompile in negation of select (llvm#89698)
Swapping the operands of a select is not valid if one hand is more poisonous than the other, because the negation zero contains poison elements. Fix this by adding an extra parameter to isKnownNegation() to forbid poison elements. I've implemented this using manual checks to avoid needing four variants for the NeedsNSW/AllowPoison combinations. Maybe there is a better way to do this... Fixes llvm#89669.
Commit a1b1c4a
-
[InstCombine] Fix poison propagation in select of bitwise fold (llvm#89701)
We're replacing the select with the false value here, but it may be more poisonous if m_Not contains poison elements. Fix this by introducing a m_NotForbidPoison matcher and using it here. Fixes llvm#89500.
Commit 7339f7b
-
[RISCV] Remove implication of F extension for XTHeadFMemIdx from RISCVFeatures.td.
There is no implies rule in RISCVISAInfo.cpp so this makes them consistent. Soon RISCVFeatures.td will be used to generate RISCVISAInfo.cpp so it won't be possible to mismatch.
Commit cc73c5c
-
Commit 469c8a0
-
[RISCV] Don't make Zacas or Zabha imply A in RISCVISAInfo.cpp
Zabha and Zacas are both documented as depending on Zaamo. I'm hesitant to make them imply Zaamo instead. So remove the implication and replace with a check that either A or Zaamo is enabled.
Commit d9715c6
-
[InstCombine] Fix symbol conflicts in tests (NFC)
These tests break when regenerated due to symbol conflicts.
Commit aa1e912
-
Commit ba702aa
-
[LIT][NVPTX] Add a few more known ptxas versions (llvm#89761)
This patch adds known ptxas versions up to 12.4, to have tests targeting them. Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
Commit da1e3e8
-
[WebAssembly] Fix uses of -DAG and -NOT in wasm-target-features.c (llvm#89777)
We are currently using `PREFIX-DAG` and `PREFIX-NOT` within a single `PREFIX` test in a mixed way, but `-DAG` and `-NOT` do not work that way. For example:
Result:
```
1
2
3
```
Test file:
```c
// CHECK-DAG: 3
// CHECK-DAG: 1
// CHECK-NOT: 2
```
This does not work. The last line `CHECK-NOT: 2` does not trigger any error, because we've already covered all three lines (1~3) while matching `CHECK-DAG: 3` and `CHECK-DAG: 1`, and FileCheck tries to check the line `CHECK-NOT: 2` _after_ the line `3`.
Actually, we have
```c
// BLEEDING-EDGE-NOT:#define __wasm_reference_types__ 1{{$}}
```
even though reference-types is enabled in 'bleeding-edge' config, and this has not triggered any error.
This section (https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive) explains the interactions between `CHECK-DAG` and `CHECK-NOT`s:
> As a result, the surrounding `CHECK-DAG:` directives cannot be reordered, i.e. all occurrences matching `CHECK-DAG:` before `CHECK-NOT:` must not fall behind occurrences matching `CHECK-DAG:` after `CHECK-NOT:`.

So in order to test the 'include' lists and 'not-include' lists, we have to run the tests twice with different prefixes. This splits `GENERIC` and `BLEEDING-EDGE` tests in two configs (`***-INCLUDE` and `***`) to test them correctly. This also adds some spaces after colons, sorts the feature lists, and adds `1{{$}}` to the `MVP` tests to make them consistent with `GENERIC` and `BLEEDING-EDGE` tests.
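The pitfall can be reproduced with a deliberately simplified Python model. It is not FileCheck: `passes` is a hypothetical helper, patterns are plain substrings, and only one shape is modelled (a `CHECK-DAG` group followed by trailing `CHECK-NOT` lines, where the `CHECK-NOT` region begins after the furthest `CHECK-DAG` match).

```python
def passes(output_lines, dag_patterns, trailing_not_patterns):
    """Simplified model of a CHECK-DAG group followed by CHECK-NOT lines."""
    search_start = 0
    for pattern in dag_patterns:
        hits = [i for i, line in enumerate(output_lines) if pattern in line]
        if not hits:
            return False  # a CHECK-DAG pattern was not found at all
        # The CHECK-NOT region starts only after the furthest DAG match.
        search_start = max(search_start, hits[0] + 1)
    remaining = output_lines[search_start:]
    return not any(p in line for p in trailing_not_patterns for line in remaining)


output = ["1", "2", "3"]
mixed = passes(output, ["3", "1"], ["2"])  # DAG and NOT under one prefix
split = passes(output, [], ["2"])          # NOT checked in its own run
```

In this model the mixed run passes even though `2` is present, because nothing is left to scan after line `3`. Checking the `-NOT` patterns in a separate run scans the whole output and fails as intended, which is the motivation for splitting the prefixes.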
Commit c8c1e4e
-
[WebAssembly] Tidy up wasm-target-features.c (llvm#89778)
This tidies up `wasm-target-features.c` cosmetically:
- Sorts the feature tests alphabetically
- Adds a space after colons
Commit 88b6186
-
Commit b82a4bf
-
[RISCV] Use the store value's VT as the MemoryVT after combining riscv.masked.strided.store (llvm#89874)
According to `RISCVTargetLowering::getTgtMemIntrinsic`, the MemoryVT is the scalar element VT for strided store and the MemoryVT is the same as the store value's VT for unit-stride store. After combining `riscv.masked.strided.store` to `masked.store`, we just use the scalar element VT to construct `masked.store`, which is wrong. With wrong MemoryVT, the DAGCombiner will combine `trunc+masked.store` to truncated `masked.store` because `TLI.canCombineTruncStore` returns true. So, we should use the store value's VT as the MemoryVT. This fixes llvm#89833.
Commit 6493da7
-
[clang] Mark ill-formed partial specialization as invalid (llvm#89536)
Fixes llvm#89374 Solution suggested by @cor3ntin
Commit 805d563
-
[IR] Memory Model Relaxation Annotations (llvm#78569)
Implements the core/target-agnostic components of Memory Model Relaxation Annotations. RFC: https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5
Commit cf328ff
-
[IR] Remove unused variable in Verifier.cpp (NFC)
llvm-project/llvm/lib/IR/Verifier.cpp:4854:14: error: unused variable 'IsLeaf' [-Werror,-Wunused-variable] const auto IsLeaf = [](const Metadata *CurMD) { ^ 1 error generated.
Commit 806db47
-
[RISCV] Remove -riscv-split-regalloc flag (llvm#89715)
Split vector and scalar regalloc has been enabled by default for 5 months now since d0a39e6, and shipped with 18.1.0. I haven't heard of any issues with it so far, so this proposes to remove the flag to reduce the number of configurations we have to support.
Commit ad4a42b
-
Re-apply "[ORC] Unify task dispatch across ExecutionSession..." with more fixes.
This re-applies 6094b3b, which was reverted in e7efd37 (and before that in 1effa19) due to bot failures. The test failures were fixed by having SelfExecutorProcessControl use an InPlaceTaskDispatcher by default, rather than a DynamicThreadPoolTaskDispatcher. This shouldn't be necessary (and indicates a concurrency issue elsewhere), but InPlaceTaskDispatcher is a less surprising default, and better matches the existing behavior (compilation on current thread by default), so the change seems reasonable. I've filed llvm#89870 to investigate the concurrency issue as a follow-up. Coding my way home: 6.25133S 127.94177W
Commit 7da6342
-
[TableGen][GlobalISel] Specialize more MatchTable Opcodes (llvm#89736)
The vast majority of the following (very common) opcodes were always called with identical arguments:
- `GIM_CheckType` for the root
- `GIM_CheckRegBankForClass` for the root
- `GIR_Copy` between the old and new root
- `GIR_ConstrainSelectedInstOperands` on the new root
- `GIR_BuildMI` to create the new root

I added an overloaded version of each opcode specialized for the root instructions. It always saves between 1 and 2 bytes per instance depending on the number of arguments specialized into the opcode. Some of these opcodes had between 5 and 15k occurrences in the AArch64 GlobalISel Match Table.

Additionally, the following opcodes are almost always used in the same sequence:
- `GIR_EraseFromParent 0` + `GIR_Done`
  - `GIR_EraseRootFromParent_Done` has been created to do both. Saves 2 bytes per occurrence.
- `GIR_IsSafeToFold` was *always* called for each InsnID except 0.
  - Changed the opcode to take the number of instructions to check after `MI[0]`

The savings from these are pretty neat. For `AArch64GenGlobalISel.inc`:
- `AArch64InstructionSelector.cpp.o` goes down from 772kb to 704kb (-10% code size)
- Self-reported MatchTable size goes from 420380 bytes to 352426 bytes (~ -17%)

A smaller match table means a faster match table because we spend less time iterating and decoding. I don't have a solid measurement methodology for GlobalISel performance so I don't have precise numbers, but I saw a few % of improvements in a simple testcase.
Commit 9375962
-
Commit 008b7f1
-
Commit 46b011d
-
[ORC] Fix -Wunused-variable in LLJIT.cpp (NFC)
llvm-project/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp:684:8: error: unused variable 'ConcurrentCompilationSettingDefaulted' [-Werror,-Wunused-variable] bool ConcurrentCompilationSettingDefaulted = !SupportConcurrentCompilation; ^ 1 error generated.
Commit 9a8235a
-
Commit 78ebaa2
-
Commit b3ca9c3
-
[ValueTracking] Add support for `trunc nuw/nsw` in isKnownNonZero
With `nsw`/`nuw`, the `trunc` is non-zero if its operand is non-zero. Proofs: https://alive2.llvm.org/ce/z/iujmk6 Closes llvm#89643
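The claim can be checked by brute force on small bit widths. The Python sketch below models `trunc` with the flag semantics described in the LLVM Language Reference (`nuw`: poison if any truncated bit is non-zero; `nsw`: poison if any truncated bit differs from the top bit of the result). Poison is modelled as `None`, and the helper names are hypothetical.

```python
def trunc(value, src_bits, dst_bits, nuw=False, nsw=False):
    """Truncate an unsigned bit pattern; return None where the result is poison."""
    assert 0 <= value < (1 << src_bits) and 0 < dst_bits < src_bits
    result = value & ((1 << dst_bits) - 1)
    dropped = value >> dst_bits  # the bits removed by the truncation
    if nuw and dropped != 0:
        return None
    if nsw:
        sign = (result >> (dst_bits - 1)) & 1
        all_sign = ((1 << (src_bits - dst_bits)) - 1) if sign else 0
        if dropped != all_sign:
            return None
    return result


def nonzero_preserved(src_bits, dst_bits, **flags):
    """True if no non-zero input truncates to a (non-poison) zero."""
    return all(
        trunc(v, src_bits, dst_bits, **flags) != 0
        for v in range(1, 1 << src_bits)
    )
```

For an 8-bit to 4-bit truncation, the property holds with either flag and fails without them (for example, 16 truncates to 0). This is only an exhaustive check on one small width; the linked Alive2 proof is the actual justification.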
Commit b933c84
-
[lldb] Enable support for Markdown documentation pages (llvm#89716)
RST is powerful but usually too powerful for 90% of what we need it for. Markdown is easier to edit and can be previewed easily without building the entire website. This copies what llvm does already, making myst_parser optional if you only want man pages. Previously we had Markdown enabled in 8b95bd3 but that got reverted. That did this in a different way but I've gone with the standard llvm set this time. I intend the first Markdown pages to be the remote protocol extension docs, as they are not in any set format right now.
Commit 62db434
-
[clang][NFC] Remove useless code in ASTWriter
A follow-up to llvm#71709, addressing the static analysis finding reported in https://github.com/llvm/llvm-project/pull/71709/files#r1576846306
Commit 662ef86
-
[ARM][AArch64] autogenerate header file for TargetParser from Target tablegen files (llvm#88378)
Introduce a mechanism to share data between the ARM and AArch64 backends and TargetParser, to reduce duplication of code. This is similar to the current RISC-V implementation. The target tablegen file (in this case `ARM.td` or `AArch64.td`) is processed during building of `TargetParser` to generate the following files in the build tree:
- `build/include/llvm/TargetParser/ARMTargetParserDef.inc`
- `build/include/llvm/TargetParser/AArch64TargetParserDef.inc`

For now, the use of these generated files is limited to files _outside_ of `TargetParser`. The main reason for this is that the modifications to `TargetParser` will require additional data added to the tablegen files, which I want to split into separate PRs.
Commit 71c5964
-
[ORC] Fix bot failure due to 7da6342 (ORC task dispatch unification).
Fixes the failure at https://lab.llvm.org/buildbot/#/builders/131/builds/62928, adds comments about the unused variable, and updates the debugging output. Coding my way home: 6.44615S, 128.16704W
Commit 69703b1
-
[lldb][Docs] Convert GDB protocol extensions doc to Markdown and add to website (llvm#89718)
This document has never been on the website, unlike GDB's protocol docs. It will be useful to have both available online to compare. Markdown is easier to edit and preview in many editors (including Github itself), so I've chosen that over RST. Plus, building the website takes minutes and I lose the will to make nice edits when I have to deal with that. The standard dialect lacks some things, notably multi-line table cells, so I've converted large tables into bullet point lists so that we still get text wrapping. This is a downside but I think the simplicity of Markdown outweighs this. I have applied the plain text markers where I've noticed it and escaped some HTML characters. There may be more changes needed but, it's Markdown, so it's in theory a lot easier for someone to fix it!
Commit 601d0ca
-
[SPIR-V] New validation tests for pointer and primitive types (llvm#89632)
This patch adds new tests mostly checking SPIR-V validation of pointer and primitive types.
Commit c071c1d
-
[RISCV] Separate doLocalPostpass into new pass and move to post vector regalloc (llvm#88295)
This patch splits off part of the work to move vsetvli insertion to post regalloc in llvm#70549. The doLocalPostpass operates outside of RISCVInsertVSETVLI's dataflow, so we can move it to its own pass. We can then move it to post vector regalloc which should be a smaller change.
A couple of things that are different from llvm#70549:
- This manually fixes up the LiveIntervals rather than recomputing it via createAndComputeVirtRegInterval. I'm not sure if there's much of a difference with either.
- For the postpass it's sufficient to just check isUndef() in hasUndefinedMergeOp, i.e. we don't need to look up the def in VNInfo.

Running on llvm-test-suite and SPEC CPU 2017 there aren't any changes in the number of vsetvlis removed. There are some minor scheduling diffs, as well as extra spills in some cases and fewer spills in others (caused by transient vsetvlis existing between RISCVInsertVSETVLI and RISCVCoalesceVSETVLI when vec regalloc happens), but they are minor and should go away once we finish moving the rest of RISCVInsertVSETVLI. We could also potentially turn off this pass for unoptimised builds.
Commit 603ba4c
-
[libc][bazel] Allow configure options to alter all targets (llvm#89251)
The previous state was leading to inconsistencies. Some targets would get the options and some wouldn't. As an example, the `MEMORY_COPTS` definitions would only apply to the `:string_memory_utils` target but not to the `:memcpy` target. This patch makes sure definitions are applied throughout the LLVM libc targets as `local_defines`. This ensures that the preprocessor definitions don't propagate to depending targets outside of LLVM libc, and that all libc targets have consistent preprocessor definitions.
Commit 788d159
-
[AMDGPU] Allow WorkgroupID intrinsics in amdgpu_gfx functions (llvm#89773)
With GFX12 architected SGPRs the workgroup ids are trivially available in any function called from a compute entrypoint.
Commit 4616368
-
[libcxx] [modules] Add _LIBCPP_USING_IF_EXISTS on aligned_alloc (llvm#89827)
This is missing e.g. on Windows. With this change, it's possible to make the libcxx std module work on mingw-w64 (although that requires a few fixes to those headers). In the regular cstdlib header, we have _LIBCPP_USING_IF_EXISTS flagged on every single reexported function (since a9c9183), but the modules seem to only have _LIBCPP_USING_IF_EXISTS set on a few individual functions, so far.
Commit 91526d6
[RISCV] Add test coverage for commutable RVV instructions
This patch adds test coverage for commutable RVV instructions added in llvm#88379. For each kind of instruction, I add two tests (one unmasked and one masked). These tests don't cover all the SEWs/LMULs, as I don't think that is worthwhile: instructions with different SEWs/LMULs are handled identically. As the tests show, we can't eliminate two equal instructions if there is a use of `V0`. This may be fixed in the future. Reviewers: asb, jacquesguan, topperc, lukel97, preames Reviewed By: lukel97 Pull Request: llvm#89889
Commit d149370
[InstCombine] Simplify `(X / C0) * C1 + (X % C0) * C2` to `(X / C0) * (C1 - C2 * C0) + X * C2` (llvm#76285)
Since `DivRemPairPass` runs after `ReassociatePass` in the optimization pipeline, I decided to do this simplification in `InstCombine`. Alive2: https://alive2.llvm.org/ce/z/Jgsiqf Fixes llvm#76128.
Commit 945eeb2
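The InstCombine rewrite above follows from the division identity `X = (X / C0) * C0 + X % C0`: substituting `X % C0 = X - (X / C0) * C0` into the original expression and collecting the `(X / C0)` terms gives the simplified form. The following is a minimal sketch that checks the algebra numerically. The function names are hypothetical, and it uses Python's unbounded integers with floor-style `//` and `%`, so it does not model LLVM's fixed-width wrapping arithmetic or any preconditions the actual patch checks.

```python
def original(x, c0, c1, c2):
    # (X / C0) * C1 + (X % C0) * C2
    return (x // c0) * c1 + (x % c0) * c2


def simplified(x, c0, c1, c2):
    # (X / C0) * (C1 - C2 * C0) + X * C2
    return (x // c0) * (c1 - c2 * c0) + x * c2


def forms_agree(xs, c0, c1, c2):
    # True when both forms give the same value for every x in xs.
    # Relies only on x == (x // c0) * c0 + x % c0, which holds for c0 != 0.
    return all(original(x, c0, c1, c2) == simplified(x, c0, c1, c2) for x in xs)
```

The simplified form needs a single division and no remainder, which is the point of the transformation.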
[ORC] Fix SpeculativeJIT example after 7da6342 (ORC dispatch unificat…
…ion). Fixes the bot failure at https://lab.llvm.org/buildbot/#/builders/272/builds/14788. Coding my way home: 6.48551S, 128.21109W
Commit e400e90
[libclc] Use a response file when building on Windows (llvm#89756)
We've recently seen the libclc llvm-link invocations become so long that they exceed the character limits on certain platforms. Using a 'response file' should solve this by offloading the list of inputs into a separate file, and using special syntax to pass it to llvm-link. Note that neither the response file nor the syntax is specific to Windows, but we restrict it to that platform regardless. We have the option of expanding it to other platforms in the future.
Commit effb2f1
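The response-file idea described in the libclc commit can be sketched as follows. This is an illustration only: the real change lives in libclc's build scripts, not in Python, and the helper name `build_link_command`, the `.rsp` suffix, and the one-path-per-line layout are assumptions. The `@file` spelling is the conventional response-file syntax accepted by LLVM command-line tools; input paths containing spaces would additionally need quoting, which this sketch does not handle.

```python
import os
import tempfile


def build_link_command(tool, inputs, output, use_response_file):
    """Return an argv list, optionally moving the inputs into a response file."""
    if not use_response_file:
        # Short command lines can list every input directly.
        return [tool, "-o", output, *inputs]

    # Write one input path per line to a temporary file (hypothetical layout).
    fd, path = tempfile.mkstemp(suffix=".rsp", text=True)
    with os.fdopen(fd, "w") as rsp:
        rsp.write("\n".join(inputs) + "\n")

    # The tool is then given "@<path>" instead of the full input list.
    return [tool, "-o", output, "@" + path]
```

The resulting command line has a fixed length regardless of how many inputs there are, which is what avoids the platform's character limit.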
Commit 4c3b0a6
[VectorCombine] foldShuffleOfBinops - add support for length changing…
… shuffles (llvm#88899) Refactor to be closer to foldShuffleOfCastops - sibling patch to llvm#88743 that can be used to address some of the issues identified in llvm#88693
Commit 282b56f
Bit width of input/result types in OpSConvert/OpUConvert must not be …
…the same (llvm#89737) This PR fixes issue llvm#88908. The attached test case is updated to check that OpSConvert/OpUConvert is not generated when input and result types are identical.
Commit 89d1255
[SPIR-V] Fix pre-legalizer pass in SPIR-V Backend to support more gMI…
…R opcode inserted by IRTranslator (llvm#89890) When translating global values, the IRTranslator pass can sometimes generate code patterns that require additional effort during pre-legalization. This PR addresses this problem to support the G_PTRTOINT instruction used in the initialization of a GV.
Commit 486ea1e
[flang][OpenMP] fix reduction of arrays with non-default lower bounds (…
…llvm#89611) It turned out that `hlfir::genVariableBox` didn't add lower bounds to the boxes it created. Using a shapeshift instead of only a shape adds the lower bounds information to the thread-local copy of the box. Fixes llvm#89259
Commit 18bf0c3
[flang] de-duplicate CFGConversion pass (llvm#89783)
See RFC at https://discourse.llvm.org/t/rfc-add-an-interface-for-top-level-container-operations
I previously did the same for the AbstractResult pass llvm#88867
Commit ceca523
[ARM] Add ARMTargetDefEmitter to llvm-tblgen source
Missed from llvm#88378, only showed up in the sanitizer builds.
Commit b8e97f0
Commit 3cb660d
Commit e5de95d
[TTI] getArithmeticInstrCost - use std::nullopt to create default empt…
…y `ArrayRef<const Value *> Args` argument. NFC.
Commit 506c84a
[mlir][nvgpu] NVGPU Tutorials (llvm#87065)
I have a tutorial at EuroLLVM 2024 ([Zero to Hero: Programming Nvidia Hopper Tensor Core with MLIR's NVGPU Dialect](https://llvm.swoogo.com/2024eurollvm/session/2086997/zero-to-hero-programming-nvidia-hopper-tensor-core-with-mlir's-nvgpu-dialect)). For that, I implemented tutorial codes in Python. The focus is the nvgpu dialect and how to use its advanced features. I thought it might be useful to upstream this. The tutorial codes are as follows:
- **Ch0.py:** Hello World
- **Ch1.py:** 2D Saxpy
- **Ch2.py:** 2D Saxpy using TMA
- **Ch3.py:** GEMM 128x128x64 using Tensor Core and TMA
- **Ch4.py:** Multistage performant GEMM using Tensor Core and TMA
- **Ch5.py:** Warp Specialized GEMM using Tensor Core and TMA
I might implement one more chapter:
- **Ch6.py:** Warp Specialized Persistent ping-pong GEMM
This PR also introduces the nvdsl class, making IR building in the tutorial easier.
Commit 4d33082
Commit 333aad7
AMDGPU: Remove dead arguments in test and add SGPR variants
Also clean up to avoid the memory noise by using return values in the trivial cases.
Commit a13ff06
Commit 401658c
[LoopUnroll] Add tests for performing load CSE after unrolling.
Precommit tests for llvm#83860.
Commit 01f8da9
Commit c81ec1f
[AArch64][CodeGen] Add patterns for small negative VScale const (llvm…
…#89607) On AArch64, rdvl can accept a negative value, while cntd/cntw/cnth can't. Since we do support VScale with a negative multiplier, we did not limit the negative value and instead took the hit of having the extra patterns, according to PR88108. Also add NoUseScalarIncVL to avoid affecting patterns that work for -mattr=+use-scalar-inc-vl. Fixes llvm#84620
Commit af81d8e
[AMDGPU] Correctly determine the toolchain linker (llvm#89803)
Summary: The AMDGPU toolchain simply took the short name to get the link job instead of using the common utilities that respect options like `-fuse-ld`. Any linker that isn't `ld.lld` will fail, however we should be able to override it.
Commit 62549db
Commit eaa2eac
[DAG] Add getValid*ShiftAmountConstant wrappers without DemandedElts
Simplify callers which don't have their own DemandedElts mask. Noticed while reviewing llvm#88801
Commit 9f2a068
[MLIR][LLVM][Mem2Reg] Extends support for partial stores (llvm#89740)
This commit enhances the LLVM dialect's Mem2Reg interfaces to support partial stores to memory slots. To achieve this support, the `getStored` interface method has to be extended with a parameter of the reaching definition, which is now necessary to produce the resulting value after this store.
Commit 6e9ea6e
[mlir][python] extend LLVM bindings (llvm#89797)
Add bindings for LLVM pointer type.
Commit 79d4d16
Commit d3f6c2c
[clang][ExtractAPI] Fix handling of anonymous TagDecls (llvm#87772)
This changes the handling of anonymous TagDecls to the following rules:
- If the TagDecl is embedded in the declaration for some VarDecl (this is the only possibility for RecordDecls), then pretend the child decls belong to the VarDecl
- If it's an EnumDecl proceed as we did previously, i.e., embed it in the enclosing DeclContext.
Additionally this fixes a few issues with declaration fragments not consistently including "{ ... }" for anonymous TagDecls. To make testing these additions easier this patch fixes some text declaration fragments merging issues and updates tests accordingly. rdar://121436298
Commit 2bcbe40
Commit 93eeca3
[gn] port 71c5964 (-gen-arm-target-def)
Reverts d3f6c2c, since ARMTargetDefEmitter.cpp has to be in llvm-min-tblgen too.
Commit b87b6e2
[Frontend][OpenMP] Implement getLeafOrCompositeConstructs (llvm#89104)
This function will break up a construct into constituent leaf and composite constructs, e.g. if OMPD_c_d_e and OMPD_d_e are composite constructs, then OMPD_a_b_c_d_e will be broken up into the list {OMPD_a, OMPD_b, OMPD_c_d_e}.
Commit d577518
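The decomposition described for `getLeafOrCompositeConstructs` can be illustrated with a small sketch that reproduces the example from the commit message. The function name `split_leaf_or_composite`, the tuple-based representation of composite constructs, and the greedy longest-match strategy are assumptions made for illustration; the actual LLVM function may use a different algorithm and operates on directive enumerators rather than strings.

```python
def split_leaf_or_composite(leaves, composites):
    """Split a list of leaf names into leaf and composite pieces.

    leaves:     ordered leaf names, e.g. ["a", "b", "c", "d", "e"]
    composites: set of tuples of leaf names that form composite constructs
    """
    result = []
    i = 0
    while i < len(leaves):
        # Try the longest run of two or more leaves starting at i first.
        for j in range(len(leaves), i + 1, -1):
            if tuple(leaves[i:j]) in composites:
                result.append("_".join(leaves[i:j]))
                i = j
                break
        else:
            # No composite starts here, so emit the single leaf.
            result.append(leaves[i])
            i += 1
    return result
```

With `composites = {("c", "d", "e"), ("d", "e")}`, calling `split_leaf_or_composite(["a", "b", "c", "d", "e"], composites)` yields `["a", "b", "c_d_e"]`, matching the `{OMPD_a, OMPD_b, OMPD_c_d_e}` result in the commit message.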
Allow ZX_ERR_NO_RESOURCES with MAP_ALLOWNOMEM on Fuchsia (llvm#89767)
This can occur if the virtual address space is (almost) entirely mapped or heavily fragmented.
Commit 9cbf96a
[Clang][AArch64] Extend diagnostics when warning non/streaming about …
…vector size difference (llvm#88380) Add separate messages about passing arguments or returning parameters with scalable types. --------- Co-authored-by: Sander de Smalen <sander.desmalen@arm.com>
Commit bd34bc6
[MLIR][OpenMP] Make omp.wsloop into a loop wrapper (1/5) (llvm#89209)
This patch updates the definition of `omp.wsloop` to enforce the restrictions of a loop wrapper operation. Related tests are updated but this PR on its own will not pass premerge tests. All patches in the stack are needed before it compiles and passes tests.
Commit 07e6c16
[CodeGen] Make the parameter TRI required in some functions. (llvm#85968)
Fixes llvm#82659. There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue llvm#82411. Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all call sites of these functions have been updated to ensure no additional impact. After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
Commit f6d431f
[coro] Tweak comments about CoroAwaitSuspendInst
to reflect that there are three variants.
Commit 4c16b12
[MLIR][OpenMP] Update op verifiers dependent on omp.wsloop (2/5) (llv…
…m#89211) This patch updates verifiers for `omp.ordered`, `omp.ordered.region`, `omp.cancel` and `omp.cancellation_point`, which check for a parent `omp.wsloop`. After transitioning to a loop wrapper-based approach, the expected direct parent will become `omp.loop_nest` instead, so verifiers need to take this into account. This PR on its own will not pass premerge tests. All patches in the stack are needed before it compiles and passes tests.
Commit 1465299
[MLIR][SCF] Update scf.parallel lowering to OpenMP (3/5) (llvm#89212)
This patch changes the `scf.parallel` to `omp.parallel` + `omp.wsloop` lowering pass to introduce a nested `omp.loop_nest` as well, and to follow the new loop wrapper role for `omp.wsloop`. This PR on its own will not pass premerge tests. All patches in the stack are needed before it compiles and passes tests.
Commit 8843d54
[MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (llvm#8…
…9214) This patch introduces minimal changes to the MLIR to LLVM IR translation of `omp.wsloop` to support the loop wrapper approach. There is `omp.loop_nest` related translation code that should be extracted and shared among all loop operations (e.g. `omp.simd`). This would possibly also help in the addition of support for compound constructs later on. This first approach is only intended to keep things running after the transition to loop wrappers and not to add support for other use cases enabled by that transition. This PR on its own will not pass premerge tests. All patches in the stack are needed before it compiles and passes tests.
Commit 2e37f28
[Flang][OpenMP][Lower] Update workshare-loop lowering (5/5) (llvm#89215)
This patch updates lowering from PFT to MLIR of workshare loops to follow the loop wrapper approach. Unit tests impacted by this change are also updated. As the last patch of the stack, this should compile and pass unit tests.
Commit ca4dbc2
[flang] lower SHAPE intrinsic (llvm#89785)
Semantics usually folds SHAPE into an array constructor, but sometimes it cannot (like when the source is a function result that cannot be duplicated in expression analysis). Add lowering handling for SHAPE.
Commit 3328ccf
[CostModel][AArch64] Improve fixed-width vector costs for get.active.…
…lane.mask (llvm#89068) When SVE is available we can lower calls to get.active.lane.mask using the SVE whilelo instruction, however in practice since vXi1 types are not legal for NEON we often end up expanding the predicate into a vector of integers, e.g. v4i1 -> v4i32. This usually happens when we have to keep the predicate live out of the block, for example when the predicate is the incoming value to a PHI node in a tail-folded vector loop. Currently in such cases the intrinsic call has a cost of 1, which is far too low when considering the extra instructions required to expand the predicate. This patch fixes that by basing the cost on the number of lane moves required for expansion. This is required for a follow-on patch that adds the cost of the intrinsic call to the vectorisation cost model, so that we can teach the vectoriser to make better choices.
Commit 96b2e35
[Clang] [NFC] Prevent null pointer dereference in Sema::InstantiateFu…
…nctionDefinition (llvm#89801) In the lambda function within clang::Sema::InstantiateFunctionDefinition, the return value of a function that may return null is now checked before dereferencing to avoid potential null pointer dereference issues which can lead to crashes or undefined behavior in the program.
Commit e58dcf1
[AMDGPU] Add a trap lowering workaround for gfx11 (llvm#85854)
On gfx11 shaders run with PRIV=1, which causes `s_trap 2` to be treated as a nop, which means it isn't a correct lowering for the trap intrinsic. As a workaround, this commit instead lowers the trap intrinsic to instructions that simulate the behavior of s_trap 2. Fixes: SWDEV-438421
Commit a047147
Commit 21ef187
Commit 50082d6
[Mips] Use ANDi in for zero-extend in subword atomic umax/umin for bo…
…th r2 and pre-R2 (llvm#89881) For unsigned max/min, ANDi is available on all ISA revisions and can perform the zero-extension before the slt instruction, which saves one instruction.
Commit e1aa162
Commit a682f52
[clang][RISCV] Remove LMUL=8 scalar input for some vector crypto inst…
…ructions (llvm#89867) Since the requirement is EEW=32, it's impossible that EGW=128 needs LMUL=8.
Commit 418bdb4
Commits on Apr 29, 2024
Commit 3c30af4
Commits on May 3, 2024
Commit b8211e9
Commit e967097