Merge from main with loop wrappers + composite support + liboffload #71

Merged
merged 1,204 commits into from
May 7, 2024
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Apr 22, 2024

  1. [nfc][llvm] Fix a typo in MathExtras.h testing (llvm#89653)

    I made a small typo when writing a test for MathExtras.h, sorry!
    Moxinilian authored Apr 22, 2024
    7c58546
  2. [libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the…

    … tests (llvm#87094)
    
    This also adds a few tests that were missing.
    philnik777 authored Apr 22, 2024
    83bc7b5
  3. 8482dbd
  4. Reapply "[compiler-rt][ctx_instr] Add ctx_profile component" (llvm#…

    …89625)
    
    This reverts commit 8b2ba6a.
    
    The build errors (see below) were likely due to the same issue PR llvm#88074 fixed. Addressed by following that PR.
    
    https://lab.llvm.org/buildbot/#/builders/165/builds/52789
    https://lab.llvm.org/buildbot/#/builders/91/builds/25273
    mtrofin committed Apr 22, 2024
    a3e7a12
  5. [Frontend][OpenMP] Add missing "return" statement after 40137ff

    When responding to review comments, `return {}` was accidentally replaced
    by `std::nullptr` instead of `return std::nullptr`.
    kparzysz committed Apr 22, 2024
    b8ff08d
  6. [RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (l…

    …lvm#89263)
    
    This implements a RISCV specific version of the SHL_ADD node proposed in
    llvm#88791.
    
    If that lands, the infrastructure from this patch should seamlessly
    switch over to the generic DAG node. I'm posting this separately because
    I've run out of useful multiply strength reduction work to do without
    having a way to represent MUL X, 3/5/9 as a single instruction.
    
    The majority of this change is moving two sets of patterns out of
    tablegen and into the post-legalize combine. The major reason for this is
    that I have an upcoming change which needs to reuse the expansion logic,
    but it also helps common up some code between zba and the THeadBa
    variants.
    
    On the test changes, there's a couple major categories:
    * We chose a different lowering for mul x, 25. The new lowering involves
    one fewer register and the same critical path, so this seems like a win.
    * The order of the two multiplies changes in (3,5,9)*(3,5,9) in some
    cases. I don't believe this matters.
    * I'm removing the one use restriction on the multiply. This restriction
    doesn't really make sense to me, and the test changes appear positive.
    preames authored Apr 22, 2024
    5a7c80c
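
As a rough illustration of the strength reduction discussed in the RISCVISD::SHL_ADD commit above, the sketch below models the shift-and-add operation in plain C++ and builds multiplies by 3, 5, 9 and 25 from it. The helper names (`shlAdd`, `mul3`, `mul5`, `mul9`, `mul25`) are hypothetical and are not LLVM APIs, and the 25 = 5 * 5 decomposition is only one possible lowering, not necessarily the one the patch selects.

```cpp
#include <cassert>
#include <cstdint>

// Models a shift-left-and-add: (x << shamt) + y.
// Hypothetical helper for illustration only; this is not an LLVM API.
inline uint64_t shlAdd(uint64_t x, unsigned shamt, uint64_t y) {
  // The Zba instructions sh1add/sh2add/sh3add cover shift amounts 1 to 3.
  assert(shamt >= 1 && shamt <= 3);
  return (x << shamt) + y;
}

// x * (2^k + 1) == (x << k) + x, so each of these is one shift-and-add.
inline uint64_t mul3(uint64_t x) { return shlAdd(x, 1, x); }
inline uint64_t mul5(uint64_t x) { return shlAdd(x, 2, x); }
inline uint64_t mul9(uint64_t x) { return shlAdd(x, 3, x); }

// Products of two factors from {3, 5, 9} take two shift-and-adds.
// Assumed decomposition: 25 = 5 * 5.
inline uint64_t mul25(uint64_t x) { return mul5(mul5(x)); }
```

Because the arithmetic is unsigned, the identity also holds when the result wraps modulo 2^64, which matches the behavior of an ordinary multiply.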
  7. [mlir][test] Reorganize the test dialect (llvm#89424)

    This PR massively reorganizes the Test dialect's source files. It moves
    manually-written op hooks into `TestOpDefs.cpp`, moves format custom
    directive parsers and printers into `TestFormatUtils`, adds missing
    comment blocks, and moves around where generated source files are
    included for types, attributes, enums, etc. into their own source file.
    
    This will hopefully help navigate the test dialect source code, but also
    speeds up compile time of the test dialect by putting generated source
    files into separate compilation units.
    
    This also sets up the test dialect to shard its op definitions, done in
    the next PR.
    Mogball authored Apr 22, 2024
    e95e94a
  8. [Frontend][OpenMP] Add suggested brackets in array initialization

    Fixes -Werror build after 40137ff.
    kparzysz committed Apr 22, 2024
    14e6f63
  9. [flang] Don't emit conversion error for max(a,b, optionalCharacter) (l…

    …lvm#88156)
    
    A recent patch added an error message for whole optional dummy argument
    usage as optional arguments (third or later) to MAX and MIN when those
    names required type conversion, since that conversion only works when
    the optional arguments are present. This check shouldn't care about
    character lengths. Make it so.
    klausler authored Apr 22, 2024
    e8572d0
  10. [flang] Improve error reporting for procedures determined by usage (l…

    …lvm#88184)
    
    When a symbol is known to be a procedure due to its being referenced as
    a function or subroutine, improve the error messages that appear if the
    symbol is also used as an object by attaching the source location of its
    procedural use. Also, for errors spotted in name resolution due to how a
    given symbol has been used, don't unconditionally set the symbol's error
    flag (which is otherwise generally a good idea, to prevent cascades of
    errors), so that more unrelated errors related to usage will appear.
    klausler authored Apr 22, 2024
    cb1b846
  11. [flang] Fix spurious overflow warning folding exponentiation by integer powers (llvm#88188)
    
    The code that folds exponentiation by an integer power can report a
    spurious overflow warning because it calculates one last unnecessary
    square of the base value. 10.**(+/-32) exposes the problem -- the value
    of 10.**64 is calculated but not needed. Rearrange the implementation to
    only calculate squares that are necessary.
    
    Fixes llvm#88151.
    klausler authored Apr 22, 2024
    31505c4
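
The spurious overflow described in the exponentiation commit above can be reproduced with a small square-and-multiply sketch. This is not flang's folding code: `PowResult`, `powNaive`, and `powCareful` are hypothetical names, the sketch uses `float` and non-negative exponents only, and it flags overflow simply by testing for infinity.

```cpp
#include <cassert>
#include <cmath>

// A computed power plus a flag recording whether any intermediate
// value overflowed to infinity. Hypothetical type for illustration.
struct PowResult {
  float value;
  bool overflowed;
};

// Squares the base after every exponent bit, including the last one,
// so it computes one square that is never used.
inline PowResult powNaive(float base, unsigned n) {
  float result = 1.0f;
  bool overflow = false;
  while (n != 0) {
    if (n & 1u) {
      result *= base;
      if (std::isinf(result)) overflow = true;
    }
    n >>= 1;
    base *= base;  // runs even when no exponent bits remain
    if (std::isinf(base)) overflow = true;
  }
  return {result, overflow};
}

// Squares the base only while exponent bits remain to be processed.
inline PowResult powCareful(float base, unsigned n) {
  float result = 1.0f;
  bool overflow = false;
  while (n != 0) {
    if (n & 1u) {
      result *= base;
      if (std::isinf(result)) overflow = true;
    }
    n >>= 1;
    if (n != 0) {
      base *= base;
      if (std::isinf(base)) overflow = true;
    }
  }
  return {result, overflow};
}

// Usage: 10**32 fits in a float, but the unused square 10**64 does not.
inline void demoPow() {
  assert(powNaive(10.0f, 32).overflowed);
  assert(!powCareful(10.0f, 32).overflowed);
}
```

Both variants return the same value; only the overflow report differs, which is the behavior the commit message describes.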
  12. [lldb][Core] Fix pointless if condition (llvm#89650)

    Addresses llvm#85984
    
    Signed-off-by: Troy-Butler <squintik@outlook.com>
    Co-authored-by: Troy-Butler <squintik@outlook.com>
    Troy-Butler and Troy-Butler authored Apr 22, 2024
    2987fca
  13. ce1b678
  14. [test][GWP-ASan] Only add check-gwp_asan when its dependencies are bu…

    …ilt (llvm#89164)
    
    Currently, `check-gwp_asan` is added whether or not its dependencies are
    built. This is wrong and causes a CMake error when scudo is not
    built. This patch includes the target in the dependencies check.
    yingcong-wu authored Apr 22, 2024
    6884c1f
  15. [flang] Fix crash on erroneous program (llvm#88192)

    Constant folding had a CHECK on array subscript rank that should more
    gracefully handle a bad program with a subscript that is a matrix or
    higher rank.
    
    Fixes llvm#88112.
    klausler authored Apr 22, 2024
    138524e
  16. [flang] Fix bogus error on statement function (llvm#89402)

    When a statement function in a nested scope has a name that clashes with
    a name that exists in the host scope, the compiler can handle it
    correctly (with a portability warning)... unless the host scope acquired
    the name via USE association. Fix.
    
    Fixes llvm#88678.
    klausler authored Apr 22, 2024
    59bf49a
  17. [libc] Clean up alternate test framework support (llvm#89659)

    This replaces the old macros LIBC_COPT_TEST_USE_FUCHSIA and
    LIBC_COPT_TEST_USE_PIGWEED with LIBC_COPT_TEST_ZXTEST and
    LIBC_COPT_TEST_GTEST, respectively.  These are really not about
    whether the code is in the Fuchsia build or in the Pigweed build,
    but just about what test framework is being used.  The gtest
    framework can be used in many contexts, and the zxtest framework
    is not always what's used in the Fuchsia build.
    
    The test/UnitTest/Test.h wrapper header now provides the macro
    LIBC_TEST_HAS_MATCHERS() for use in `#if` conditionals on use of
    gmock-style matchers, replacing `#if` conditionals that test the
    framework selection macros directly.
    frobtech authored Apr 22, 2024
    d2be982
  18. [flang] Make proc characterization error conditional for generics (ll…

    …vm#89429)
    
    When the characteristics of a procedure depend on a procedure that
    hasn't yet been defined, the compiler currently emits an unconditional
    error message. This includes the case of a procedure whose
    characteristics depend, perhaps indirectly, on itself. However, in the
    case where the characteristics of a procedure are needed to resolve a
    generic, we should not emit an error for a hitherto undefined procedure
    -- either the call will resolve to another specific procedure, in which
    case the error is spurious, or it won't, and then an error will issue
    anyway.
    
    Fixes llvm#88677.
    klausler authored Apr 22, 2024
    cb26391
  19. [docs] Rewrite cmake LLVM_RAM_PER_*_JOB description (llvm#88570)

    Rewrite `LLVM_PARALLEL_{}_JOBS` and `LLVM_RAM_PER_{}_JOB` documentation.
    urnathan authored Apr 22, 2024
    2f77757
  20. [flang] C_LOC is PURE (llvm#89437)

    The standard defines C_LOC as being PURE (actually SIMPLE now in
    F'2023); characterize it appropriately.
    
    Fixes llvm#88747.
    klausler authored Apr 22, 2024
    fde5e47
  21. [flang] Complete implementation of OUT_OF_RANGE() (llvm#89334)

    The intrinsic function OUT_OF_RANGE() lacks support in lowering and the
    runtime. This patch obviates a need for any such support by implementing
    OUT_OF_RANGE() via rewriting in semantics. This rewriting of
    OUT_OF_RANGE() calls replaces the existing code that folds
    OUT_OF_RANGE() calls with constant arguments.
    
    Some changes and fixes were necessary outside of OUT_OF_RANGE()'s
    folding code (now rewriting code), whose testing exposed some other
    issues worth fixing.
    
    - The common::RealDetails<> template class was recoded in terms of a new
    base class with a constexpr constructor, so that the characteristics
    of the various REAL kinds could be queried dynamically as well. This
    affected some client usage.
    - There were bugs in the code that folds TRANSFER() when the type of X
    or MOLD was REAL(10) -- this is a type that occupies 16 bytes per
    element in execution memory but only 10 bytes (was 12) in the data of
    std::vector<Scalar<>> in a Constant<>.
    - Folds of REAL->REAL conversions weren't preserving infinities.
    klausler authored Apr 22, 2024
    1444e5a
  22. Temporarily remove clang_rt.ctx_profile target

    Trying to address the build failure on the `clang-ve-ninja` bot, which
    appears hard to repro locally. The target isn't needed currently (there
    are unit tests exercising the new functionality). Removing it for now
    to green-ify the build bot.
    mtrofin committed Apr 22, 2024
    579efe0
  23. [GlobalISel] matchSDivByConst should use isNullValue() (llvm#89666)

    It has been using isZeroValue(), which is for floats, not integers.
    AreaZR authored Apr 22, 2024
    5fef5e6
  24. [ORC] Unify task dispatch across ExecutionSession and ExecutorProcess…

    …Control.
    
    Updates ExecutionSession to use the ExecutorProcessControl object's
    TaskDispatcher rather than having a separate dispatch function. This gives the
    TaskDispatcher a global view of all tasks to be executed, and provides a
    single point to wait on for tasks to complete when shutting down the JIT.
    lhames committed Apr 22, 2024
    6094b3b
  25. [flang] Fix build warning (llvm#89686)

    A recent patch had three declared but unused variables in it, triggering
    a warning in some build bots. Remove them.
    klausler authored Apr 22, 2024
    2e2ac6f
  26. Revert "[ORC] Unify task dispatch across ExecutionSession and Executo…

    …rProcessControl."
    
    This reverts commit 6094b3b.
    
    Multiple bots are broken.
    joker-eph committed Apr 22, 2024
    a28557a
  27. [hwasan] Add intrinsics for fixed shadow on Aarch64 (llvm#89319)

    This patch introduces HWASan memaccess intrinsics that assume a fixed
    shadow (with the offset provided by --hwasan-mapping-offset=...), with
    and without short granule support.
    
    The behavior of HWASan is not meaningfully changed by this patch;
    future work ("Optimize outlined memaccess for
    fixed shadow on Aarch64": llvm#88544) will make HWASan use these intrinsics.
    
    We currently only support lowering the LLVM IR intrinsic to AArch64.
    
    The test case is adapted from hwasan-check-memaccess.ll.
    thurstond authored Apr 22, 2024
    365bddf
  28. Update CHECK lines in tests after 14e6f63 added new output causing th…

    …e tests to fail on multiple bots. (llvm#89689)
    
    Update the check lines added in llvm#87247 after 14e6f63 updated the output
    causing the tests to fail.
    
    This should hopefully unbreak the bots failing due to these two tests
    failing.
    dyung authored Apr 22, 2024
    8f54ed2

Commits on Apr 23, 2024

  1. 28cea99
  2. Make createReadOrMaskedRead and isValidMaskedInputVector vector utili…

    …ties (llvm#89119)
    
    Make createReadOrMaskedRead and isValidMaskedInputVector vector utility
    functions so that they are accessible outside of the CU. Needed by the new
    IREE TopK implementation.
    LLITCHEV authored Apr 23, 2024
    30d4f6a
  3. Revert "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into co…

    …mbine (llvm#89263)"
    
    This reverts commit 5a7c80c.  Noticed failures
    with the following command:
    $ llc -mtriple=riscv64 -mattr=+m,+xtheadba -verify-machineinstrs < test/CodeGen/RISCV/rv64zba.ll
    
    I think I know the cause and will likely reland with a fix tomorrow.
    preames committed Apr 23, 2024
    dc3f943
  4. Re-apply "[ORC] Unify task dispatch across ExecutionSession and..." w…

    …ith fix.
    
    This re-applies 6094b3b, which was reverted in a28557a due to broken
    bots. As far as I can tell all failures were due to a missing #include <deque>,
    which has been added in this commit.
    lhames committed Apr 23, 2024
    1effa19
  5. [AIX][TLS][clang] Add -maix-small-local-dynamic-tls clang option (llv…

    …m#88829)
    
    This patch adds the clang portion of an AIX-specific option to inform
    the compiler that it can use a faster access sequence for the local-dynamic
    TLS model (formally named aix-small-local-dynamic-tls).
    
    This patch mainly references Amy's work on small local-exec TLS support.
    orcguru authored Apr 23, 2024
    16efd2a
  6. [lldb][DAP] Fix test failure from llvm#73393 (llvm#89692)

    llvm#73393 introduced a mandatory column field. Update test for that.
    pranavk authored Apr 23, 2024
    aa89c1b
  7. Revert "Re-apply [ORC] Unify task dispatch across ExecutionSession an…

    …d..."
    
    This reverts commit 1effa19 while I investigate the test failure at
    https://lab.llvm.org/buildbot/#/builders/285/builds/888.
    lhames committed Apr 23, 2024
    e7efd37
  8. ff153bd
  9. 28d85e2
  10. [lldb] Replace condition that always evaluates to false (llvm#89685)

    Addresses issue llvm#87243. 
    
    The current code incorrectly checks the validity of ```obj``` twice when
    it should be checking the new ```str_obj``` pointer.
    
    Signed-off-by: Troy-Butler <squintik@outlook.com>
    Co-authored-by: Troy-Butler <squintik@outlook.com>
    Troy-Butler and Troy-Butler authored Apr 23, 2024
    af8445e
  11. [SimplifyQuery] Avoid PatternMatch.h include (NFC)

    Move the one method that uses it out of line. This is primarily to
    reduce the number of files to rebuild when changing PatternMatch.h.
    nikic committed Apr 23, 2024
    f8a19a8
  12. [Support] Fix a warning

    This patch fixes:
    
      third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
      error: comparison of integers of different signs: 'const int' and
      'const unsigned long' [-Werror,-Wsign-compare]
    kazutakahirata committed Apr 23, 2024
    4127a69
  13. 34ee77c
  14. [ADT] Remove StringRef::{startswith,endswith} (llvm#89548)

    These functions have been deprecated since:
    
      commit 5ac1295
      Author: Kazu Hirata <kazu@google.com>
      Date:   Sun Dec 17 15:52:50 2023 -0800
    kazutakahirata authored Apr 23, 2024
    4ec9a66
  15. [RISCV][TableGen] Generate RISCVTargetParserDef.inc from the new RISC…

    …VExtension tblgen information. (llvm#89335)
    
    Instead of using RISCVISAInfo's extension information, use the extension
    found in tblgen after llvm#89326.
        
    We still need to use RISCVISAInfo code to get the sorting rules for the
    ISA string.
        
    The ISA string we generate now is not quite the same extension we had
    before. No implied extensions are included in the generated string unless
    they are explicitly listed in RISCVProcessors.td. This primarily affects
    Zicsr being implied by F, V implying Zve*, and Zvl*b implying a smaller
    Zvl*b. All of these implications should be picked up when the string is
    used by the frontend.
        
    The benefit is that we get a more manageable ISA string for humans to
    deal with.
        
    This is a step towards generating RISCVISAInfo's extension list from
    tblgen.
    topperc authored Apr 23, 2024
    b64e483
  16. [SimplifyCFG] Check alignment when speculating stores

    When speculating a store based on a preceding load/store, we need
    to ensure that the speculated store does not have a higher
    alignment (which might only be guaranteed by the branch condition).
    
    There are various ways in which this could be strengthened (we
    could get or enforce the alignment), but for now just do the
    simple check against the preceding load/store.
    
    Fixes llvm#89672.
    nikic committed Apr 23, 2024
    8838874
  17. [clang][CodeGen][NFC] Make ConstExprEmitter a ConstStmtVisitor (llvm#…

    …89041)
    
    No reason for this to not be one. This gets rid of a few const_casts.
    tbaederr authored Apr 23, 2024
    e5f9de8
  18. [RISCV] Sink some repeated code into parseVTypeToken. NFC (llvm#89694)

    Both calls to parseVTypeToken were preceded by a check for an Identifier
    token and a call to getIdentifier. Sink those into parseVTypeToken
    to reduce repetition.
    topperc authored Apr 23, 2024
    25a391c
  19. [NFC] [Serialization] Use semantical type DeclID instead of raw type …

    …'uint32_t'
    
    This patch uses DeclID throughout the code base instead of the raw
    type 'uint32_t'. Using the raw type 'uint32_t' is problematic if we
    want to change the type of DeclID some day.
    ChuanqiXu9 committed Apr 23, 2024
    07b1177
  20. [FunctionAttrs] Fix incorrect noundef inference with poison attrs (ll…

    …vm#89348)
    
    Currently, when inferring noundef, we only check that the return value
    is not undef/poison. However, we fail to account for the possibility
    that a poison-generating return attribute will convert the value to
    poison, and then violate the noundef attribute, resulting in immediate
    UB.
    
    For the relevant return attributes (align, nonnull and range), check
    whether we can trivially re-prove the relevant property, otherwise do
    not infer noundef.
    
    This fixes the FunctionAttrs side of
    llvm#88026.
    nikic authored Apr 23, 2024
    a2ccd5d
  21. [NFC] Remove unused LocalRedeclarationsInfo from ASTBitcodes.h

    As the title suggests.
    ChuanqiXu9 committed Apr 23, 2024
    02d00ec
  22. [memprof] Omit the key length for the record table (llvm#89527)

    The record table has a constant key length, so we don't need to
    serialize or deserialize it for every key-data pair.  Omitting the key
    length saves 0.06% of the indexed MemProf file size.
    
    Note that it's OK to change the format because Version2 is still under
    development.
    kazutakahirata authored Apr 23, 2024
    b28f4d4
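
To make the saving in the memprof commit above concrete, here is a minimal sketch of a key-data table written with and without a per-entry key length. The layout, the 64-bit field widths, and every name (`Entry`, `serializeWithKeyLen`, `serializeFixedKey`, `deserializeFixedKey`) are assumptions for illustration; this is not the actual MemProf on-disk format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;
// Fixed-width 64-bit key with variable-length data (illustrative types).
using Entry = std::pair<uint64_t, std::string>;

// Appends a 64-bit value in little-endian byte order.
inline void writeU64(Bytes &out, uint64_t v) {
  for (int i = 0; i < 8; ++i)
    out.push_back(static_cast<uint8_t>(v >> (8 * i)));
}

// Reads a little-endian 64-bit value and advances pos past it.
inline uint64_t readU64(const Bytes &in, std::size_t &pos) {
  assert(pos + 8 <= in.size());
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v |= static_cast<uint64_t>(in[pos + i]) << (8 * i);
  pos += 8;
  return v;
}

// Layout A: [key length][data length][key][data] for every entry.
inline Bytes serializeWithKeyLen(const std::vector<Entry> &entries) {
  Bytes out;
  for (const Entry &e : entries) {
    writeU64(out, sizeof(uint64_t));  // always 8, so it carries no information
    writeU64(out, e.second.size());
    writeU64(out, e.first);
    out.insert(out.end(), e.second.begin(), e.second.end());
  }
  return out;
}

// Layout B: [data length][key][data]; the key length is implied.
inline Bytes serializeFixedKey(const std::vector<Entry> &entries) {
  Bytes out;
  for (const Entry &e : entries) {
    writeU64(out, e.second.size());
    writeU64(out, e.first);
    out.insert(out.end(), e.second.begin(), e.second.end());
  }
  return out;
}

// Reads layout B back; the reader already knows each key is 8 bytes.
inline std::vector<Entry> deserializeFixedKey(const Bytes &in) {
  std::vector<Entry> entries;
  std::size_t pos = 0;
  while (pos < in.size()) {
    const uint64_t dataLen = readU64(in, pos);
    const uint64_t key = readU64(in, pos);
    assert(pos + dataLen <= in.size());
    const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
    const auto last = first + static_cast<std::ptrdiff_t>(dataLen);
    entries.emplace_back(key, std::string(first, last));
    pos += static_cast<std::size_t>(dataLen);
  }
  return entries;
}
```

In this sketch, layout B is smaller than layout A by exactly one length field per entry, and the round trip through `deserializeFixedKey` still recovers every key and its data.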
  23. [MLIR] Harmonize the behavior of the folding API functions (llvm#88508)

    This commit changes `OpBuilder::tryFold` to behave more similarly to
    `Operation::fold`. Concretely, this ensures that even an in-place fold
    returns `success`.
    This is necessary to fix a bug in the dialect conversion that occurred
    when an in-place folding made an operation legal. The dialect conversion
    infrastructure did not check if the result of an in-place folding
    legalized the operation and just went ahead and tried to apply patterns
    anyway.
    
    The added test contains a simplified version of a breakage we observed
    downstream.
    Dinistro authored Apr 23, 2024
    4513050
  24. Reapply "[clang][dataflow] Model conditional operator correctly." wit…

    …h fixes (llvm#89596)
    
    I reverted llvm#89213 because it was
    causing buildbots to fail with assertion failures.
    
    Embarrassingly, it turns out I had been running tests locally in
    `Release` mode, i.e. with `assert()` compiled away.
    
    This PR re-lands llvm#89213 with fixes for the failing assertions.
    martinboehme authored Apr 23, 2024
    9ba6961
  25. [mlir][linalg] Add patterns to convert matmul to transposed variants (l…

    …lvm#89075)
    
    This adds patterns to convert from the Linalg matmul and batch_matmul
    ops to the transposed variants. By default the LHS matrix is transposed.
    
    Our work enabling a lowering path from linalg.matmul to ArmSME has
    revealed the current lowering results in non-contiguous memory accesses
    for the A matrix and very poor performance.
    
    These patterns provide a simple option to fix this.
    c-rhodes authored Apr 23, 2024
    7922534
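
One way to see why transposing the LHS helps, as the linalg commit above describes, is to write the matmul as a sum of outer products, where each step needs one column of A. The sketch below assumes row-major storage and such an outer-product loop order; `transpose`, `matmul`, and `matmulTransposeA` are hypothetical helpers, not MLIR APIs or the lowering itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<float>;  // row-major storage

// Returns the transpose of a rows x cols matrix.
inline Matrix transpose(const Matrix &a, std::size_t rows, std::size_t cols) {
  assert(a.size() == rows * cols);
  Matrix t(a.size());
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      t[c * rows + r] = a[r * cols + c];
  return t;
}

// C (MxN) = A (MxK) * B (KxN), accumulated as K outer products.
// For a fixed k, the column A[:, k] is read with stride K.
inline Matrix matmul(const Matrix &a, const Matrix &b, std::size_t M,
                     std::size_t K, std::size_t N) {
  assert(a.size() == M * K && b.size() == K * N);
  Matrix c(M * N, 0.0f);
  for (std::size_t k = 0; k < K; ++k)
    for (std::size_t m = 0; m < M; ++m)
      for (std::size_t n = 0; n < N; ++n)
        c[m * N + n] += a[m * K + k] * b[k * N + n];
  return c;
}

// Same product, but the LHS is supplied already transposed (KxM).
// For a fixed k, the former column of A is now the contiguous row at[k, :].
inline Matrix matmulTransposeA(const Matrix &at, const Matrix &b,
                               std::size_t M, std::size_t K, std::size_t N) {
  assert(at.size() == K * M && b.size() == K * N);
  Matrix c(M * N, 0.0f);
  for (std::size_t k = 0; k < K; ++k)
    for (std::size_t m = 0; m < M; ++m)
      for (std::size_t n = 0; n < N; ++n)
        c[m * N + n] += at[k * M + m] * b[k * N + n];
  return c;
}
```

Both functions compute the same result; the only difference is whether the LHS elements consumed at each `k` step are strided or adjacent in memory.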
  26. [NFC] [Serialization] Remove unused readVisibleDeclContextStorage fro…

    …m ASTRecordReader.h
    
    As the title suggests.
    ChuanqiXu9 committed Apr 23, 2024
    87a2159
  27. [NFC] Rename hlsl semantics to hlsl annotations (llvm#89309)

    The attribute name "HLSLSemantics" is confusing, because semantics
    aren't always the annotations that are applied to specific variables. The
    name for this attribute needs to be less specific. This PR changes the
    attribute name from HLSLSemantic to HLSLAnnotation, and changes the
    associated function and variable names to support this conceptual
    change.
    The HLSLAnnotation attribute will never be output in ast-dump due to it
    being parsed for the attribute that it represents. There is no
    functional change, so there are no accompanying tests.
    bob80905 authored Apr 23, 2024
    eaab97a
  28. 561b3de
  29. [clang] Set correct FPOptions if attribute 'optnone' presents (llvm#8…

    …5605)
    
    Attribute `optnone` must turn off all optimizations including fast-math
    ones. However, AST nodes in an `optnone` function still carried fast-math
    flags. This change fixes the FP options before the function body is
    parsed.
    spavloff authored Apr 23, 2024
    a046242
  30. [flang] handle intrinsic interfaces in FunctionRef::GetType (llvm#89583)

    User functions may be declared with an interface that is a specific
    intrinsic. In such a case, there is no result type available from the
    procedure symbol (at least without using evaluate::Probe), and
    FunctionRef::GetType() returned nullopt. This caused lowering to crash.
    The result type of specific intrinsic procedures is always a lengthless
    intrinsic type, so it is fully defined in the template argument of
    FunctionRef. Use it.
    jeanPerier authored Apr 23, 2024
    35159c2
  31. [GlobalISel] Expand IRTranslator docs. NFC (llvm#89186)

    Add some more details about how calls are lowered and what APIs are
    available.
    rovka authored Apr 23, 2024
    3ea9ed4
  32. EmitC: Add emitc.global and emitc.get_global (llvm#145) (llvm#88701)

    This adds
    - `emitc.global` and `emitc.get_global` ops to model global variables
    similar to how `memref.global` and `memref.get_global` work.
    - translation of those ops to C++
    - lowering of `memref.global` and `memref.get_global` into those ops
    
    ---------
    
    Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
    mgehre-amd and simon-camp authored Apr 23, 2024
    6548465
  33. [clang][ExtractAPI] Serialize platform specific unavailable attribute…

    … in symbol graphs (llvm#89277)
    
    rdar://125622225
    daniel-grumberg authored Apr 23, 2024
    05c1447
  34. [analyzer] Fix performance of getTaintedSymbolsImpl() (llvm#89606)

    Previously the function
    ```
    std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State,
                                                        const MemRegion *Reg,
                                                        TaintTagType K,
                                                        bool returnFirstOnly)
    ```
    (one of the 4 overloaded variants under this name) was handling element
    regions in a highly inefficient manner: it performed the "also examine
    the super-region" step twice. (Once in the branch for element regions,
    and once in the more general branch for all `SubRegion`s -- note that
    `ElementRegion` is a subclass of `SubRegion`.)
    
    As pointer arithmetic produces `ElementRegion`s, it's not too difficult
    to get a chain of N nested element regions where this inefficient
    recursion would produce 2^N calls.
    
    This commit is essentially NFC, apart from the performance improvements
    and the removal of (probably irrelevant) duplicate entries from the
    return value of `getTaintedSymbols()` calls.
    
    Fixes llvm#89045
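    
    The blow-up described above can be illustrated with a small counting model. This is only a sketch, not the analyzer's code: `callsWithDuplicateStep` and `callsWithSingleStep` are hypothetical names, and a chain of nested element regions is modeled by nothing more than its depth.
    
    ```cpp
    #include <cstdint>
    
    // Hypothetical model, not the analyzer's implementation: `Depth` is the
    // number of nested element regions stacked on top of a base region.
    
    // Old behaviour: the "also examine the super-region" step runs twice per
    // level, so every level doubles the work below it.
    constexpr std::uint64_t callsWithDuplicateStep(unsigned Depth) {
      if (Depth == 0)
        return 1; // base region: one call, no super-region left to examine
      return 1 + 2 * callsWithDuplicateStep(Depth - 1);
    }
    
    // Fixed behaviour: the super-region is examined once per level.
    constexpr std::uint64_t callsWithSingleStep(unsigned Depth) {
      if (Depth == 0)
        return 1;
      return 1 + callsWithSingleStep(Depth - 1);
    }
    ```
    
    In this model a chain of depth 10 costs 2047 calls with the duplicated step and 11 calls without it, which matches the exponential-versus-linear behaviour the commit message describes.
    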
    NagyDonat authored Apr 23, 2024
    ce763bf
  35. [LV] Add additional cost model tests with inductions and truncates.

    Add test coverage for additional cases not covered by current tests with
    multiple inductions and truncates.
    fhahn committed Apr 23, 2024
    55fc5eb
  36. [DWARF] Add option to add linkage_names to call_origin declaration re…

    …fs (llvm#89640)
    
    If -mllvm -add-linkage-names-to-external-call-origins is true then add
    DW_AT_linkage_name attributes to DW_TAG_subprogram DIEs referenced by
    DW_AT_call_origin attributes that would otherwise be omitted.
    
    A debugger may use DW_AT_call_origin attributes to determine whether any
    frames in a callstack are missing due to optimisations (e.g. tail calls).
    
    For example, say a() calls b() tail-calls c(), and you stop in your debugger
    in c():
    
    The callstack looks like this:
        c()
        a()
    
    Looking "up" from c(), call site information can be found in a(). This includes
    a DW_AT_call_origin referencing b()'s subprogram DIE, which means the call at
    this call site was to b(), not c() where we are currently stopped. This
    indicates b()'s frame has been lost due to optimisation (or is misleading due
    to ICF).
    
    This patch makes it easier for a debugger to check whether the referenced
    DIE describes the target function or not, for example by comparing the referenced
    function name to the current frame.
    
    There's already an option to apply DW_AT_linkage_name in a targeted manner:
    -dwarf-linkage-names=Abstract, which limits adding DW_AT_linkage_names to
    abstract subprogram DIEs (this is default for SCE tuning).
    
    The new flag shouldn't affect non-SCE-tuned behaviour whether it is enabled
    or not because the non-SCE-tuned behaviour is to always add linkage names to
    subprogram DIEs.
    OCHyams authored Apr 23, 2024
    0e44ffe
  37. [PAC][MC][AArch64] Fix error message for AUTH_ABS64 reloc with ILP32 (l…

    …lvm#89563)
    
    The `LP64 eqv:` should say that the equivalent is `AUTH_ABS64` rather
    than `ABS64` when trying to emit an AUTH absolute reloc with ILP32.
    kovdan01 authored Apr 23, 2024
    da57609
  38. [WebAssembly] Enable multivalue return when multivalue ABI is used (l…

    …lvm#88492)
    
    Multivalue feature of WebAssembly has been standardized for several
    years now. I think it makes sense to be able to enable it in the feature
    section by default for our clang/llvm-produced binaries so that the
    multivalue feature can be used as necessary when necessary within our
    toolchain and also when running other optimizers (e.g. wasm-opt) after
    the LLVM code generation.
    
    But some WebAssembly toolchains, such as Emscripten, do not provide both
    multivalue-returning and not-multivalue-returning versions of libraries.
    Also allowing the uses of multivalue in the features section does not
    necessarily mean we generate them whenever we can to the fullest, which
    is a different code generation / optimization option.
    
    So this makes the lowering of multivalue returns conditional on the use
    of 'experimental-mv' target ABI. This ABI is turned off by default and
    turned on by passing `-Xclang -target-abi -Xclang experimental-mv` to
    `clang`, or `-target-abi experimental-mv` to `clang -cc1` or `llc`.
    
    But the purpose of this PR is not tying the multivalue lowering to this
    specific 'experimental-mv'. 'experimental-mv' is just one multivalue ABI
    we currently have, and it is still experimental, meaning it is not very
    well optimized or tuned for performance. (e.g. it does not have the
    limitation of the max number of multivalue-lowered values, which can be
    detrimental to performance.) We may change the name of this ABI, or
    improve it, or add a new multivalue ABI in the future. Also I heard that
    WASI is planning to add their multivalue ABI soon. So the plan is,
    whenever any one of multivalue ABIs is enabled, we enable the lowering
    of multivalue returns in the backend. We currently have only
    'experimental-mv' in the repo so we only check for that in this PR.
    
    Related past discussions:
     llvm#82714
    WebAssembly/tool-conventions#223 (comment)
    aheejin authored Apr 23, 2024
    c921ac7
  39. [NFC] [Serialization] Turn type alias LocalDeclID into class

    Previously, the LocalDeclID and GlobalDeclID are defined as:
    
    ```
    using LocalDeclID = DeclID;
    using GlobalDeclID = DeclID;
    ```
    
    This is concerning because LocalDeclID and GlobalDeclID can be
    mixed up without the compiler noticing. There is also a FIXME saying
    this.
    
    This patch tries to turn LocalDeclID into a class to improve the type
    safety here.
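    
    The type-safety argument can be sketched as follows. Every name here (`DeclIDValue`, `LocalAlias`, `GlobalAlias`, `LocalID`, `GlobalID`, `toGlobal`) is hypothetical and chosen for illustration; the actual classes in Clang may look quite different.
    
    ```cpp
    #include <cstdint>
    #include <type_traits>
    
    using DeclIDValue = std::uint32_t; // assumed underlying representation
    
    // With plain aliases both names denote the same type, so a local ID can be
    // passed where a global ID is expected without any diagnostic.
    using LocalAlias = DeclIDValue;
    using GlobalAlias = DeclIDValue;
    
    // With distinct classes, mixing the two kinds of ID is a compile error.
    class LocalID {
    public:
      constexpr explicit LocalID(DeclIDValue V) : Value(V) {}
      constexpr DeclIDValue get() const { return Value; }
    
    private:
      DeclIDValue Value;
    };
    
    class GlobalID {
    public:
      constexpr explicit GlobalID(DeclIDValue V) : Value(V) {}
      constexpr DeclIDValue get() const { return Value; }
    
    private:
      DeclIDValue Value;
    };
    
    // Hypothetical conversion: the only way to go from local to global is an
    // explicit, named translation step.
    constexpr GlobalID toGlobal(LocalID L, DeclIDValue BaseOffset) {
      return GlobalID(L.get() + BaseOffset);
    }
    ```
    
    Under these assumptions, `std::is_same<LocalAlias, GlobalAlias>::value` is true while `std::is_convertible<LocalID, GlobalID>::value` is false, so accidental mixing is rejected at compile time.
    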
    ChuanqiXu9 committed Apr 23, 2024
    b8e3b2a
  40. [WebAssembly] Make RefTypeMem2Local recognize target-features (llvm#8…

    …8916)
    
    Currently we check `Subtarget->hasReferenceTypes()` to decide whether to
    run `RefTypeMem2Local` pass:
    
    https://github.com/llvm/llvm-project/blob/6133878227efc30355c02c2f089e06ce58231a3d/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp#L491-L495
    
    This works fine when `-mattr=+reference-types` is given in the command
    line (of `llc` or of `wasm-ld` in case of LTO). This also works fine if
    the backend is called by Clang, because Clang's feature set will be
    passed to the backend when creating a `TargetMachine`:
    https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/clang/lib/CodeGen/BackendUtil.cpp#L549-L550
    https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/clang/lib/CodeGen/BackendUtil.cpp#L561-L562
    
    But if the backend compilation is called by `llc`, a `TargetMachine` is
    created here:
    
    https://github.com/llvm/llvm-project/blob/bf1ad1d267b1f911cb9846403d2c3d3250a40870/llvm/tools/llc/llc.cpp#L554-L555
    And if the backend is called by `wasm-ld`'s LTO, a `TargetMachine` is
    created here:
    
    https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/llvm/lib/LTO/LTOBackend.cpp#L513
    At this point, in both places, the created `TargetMachine` only has
    access to target features given by the command line with `-mattr=` and
    doesn't have access to bitcode functions' `target-features` attribute.
    
    We later gather the target features used by functions and store that
    info in the `TargetMachine` in `CoalesceFeaturesAndStripAtomics`,
    https://github.com/llvm/llvm-project/blob/ac791888bbbe58651e597cf7a4b2276424b77a92/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp#L202-L206
    but this runs in the pass pipeline driven by the pass manager, so this
    has not run by the time we check `Subtarget->hasReferenceTypes()` in
    `WebAssemblyPassConfig::addISelPrepare`. So currently `RefTypeMem2Local`
    would not run on those functions with
    `"target-features"="+reference-types"` attributes if the backend is
    called by `llc` or `wasm-ld`.
    
    So this makes `RefTypeMem2Local` pass run unconditionally, and checks
    `target-features` function attribute to decide whether to run the pass on
    each function. This allows the pass to run with `wasm-ld` + LTO and
    `llc`, even if `-mattr=+reference-types` is not explicitly given in the
    command line again, as long as `+reference-types` is in the function's
    `target-features` attribute.
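    
    A per-function check of this kind boils down to looking for one entry in the comma-separated `target-features` string. The helper below is a standalone sketch under that assumption; `hasFeature` is a hypothetical name and is not the code used by the pass.
    
    ```cpp
    #include <string_view>
    
    // Hypothetical helper: returns true if the comma-separated feature string
    // (e.g. "+mutable-globals,+reference-types") contains exactly `Wanted`.
    constexpr bool hasFeature(std::string_view Features, std::string_view Wanted) {
      while (!Features.empty()) {
        const auto Comma = Features.find(',');
        // substr clamps the length, so npos yields the rest of the string.
        const std::string_view Entry = Features.substr(0, Comma);
        if (Entry == Wanted)
          return true;
        if (Comma == std::string_view::npos)
          break; // last entry examined
        Features.remove_prefix(Comma + 1);
      }
      return false;
    }
    ```
    
    With this sketch, `hasFeature("+mutable-globals,+reference-types", "+reference-types")` is true, while a string that only contains `-reference-types` does not match.
    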
    
    This also covers the case we give the target features by the command
    line like `llc -mattr=+reference-types` and not in the bitcode
    function's attribute, because attributes given in the command line will
    be stored in the function's attributes anyway:
    
    https://github.com/llvm/llvm-project/blob/bd28889732e14ac6baca686c3ec99a82fc9cd89d/llvm/lib/CodeGen/CommandFlags.cpp#L673-L674
    https://github.com/llvm/llvm-project/blob/bd28889732e14ac6baca686c3ec99a82fc9cd89d/llvm/lib/CodeGen/CommandFlags.cpp#L732-L733
    
    With this PR,
    - `lto0.test_externref_emjs`,
    - `thinlto0.test_externref_emjs`,
    - `lto0.test_externref_emjs_dynlink`, and
    - `thinlto0.test_externref_emjs_dynlink`
    
    pass. These currently fail but don't get checked in the CI. I think they
    used to pass but started to fail after llvm#83196, because we used to run
    mem2reg even with `-O0` before that.
    (`ltoN` (N > 0) tests are not affected because they run mem2reg anyway
    so they don't need `RefTypeMem2Local`)
    aheejin authored Apr 23, 2024
    a22ffe5
  41. [mlir][bazel] drop unnecessary rule

    llvm#75960 added a bazel rule for generating enums for the async dialects, but there are no enums defined, and no cmake rule for that. Delete this rule.
    ftynse authored Apr 23, 2024
    d5093aa
  42. Revert "[mlir][linalg] Enable fuse consumer" (llvm#89722)

    Reverts llvm#85528. This was committed without tests,
    despite reviewers requesting tests to be added. The post-commit
    discussion leans towards revert, which would be consistent with the
    policy.
    ftynse authored Apr 23, 2024
    f220c35
  43. Revert b28f4d4 "[memprof] Omit the key length for the record table (l…

    …lvm#89527)"
    
    Breaks on EXPENSIVE_CHECKS builds which still use the static ReadKeyDataLength implementation in several locations
    RKSimon committed Apr 23, 2024
    20cb2ed
  44. [lldb/test] Rename a function

    I misunderstood what the function was looking up.
    labath committed Apr 23, 2024
    dbcfb43
  45. a68ea36
  46. [flang][OpenMP] Support reduction of allocatable variables (llvm#88392)

    Both arrays and trivial scalars are supported. Both cases must use
    by-ref reductions because both are boxed.
    
    My understanding of the standards is that OpenMP says that this should
    follow the rules of the intrinsic reduction operators in Fortran, and
    Fortran says that unallocated allocatable variables can only be
    referenced to allocate them or to test whether they are already allocated.
    Therefore we do not need a null pointer check in the combiner region.
    tblah authored Apr 23, 2024
    8cc34fa
  47. [bazel] Add a bazel flag to enable building MLIR with CUDA support (l…

    …lvm#88856)
    
    This makes it possible to specify
    `--@llvm-project//mlir:enable_cuda=true` on the bazel command line and
    get a build that includes NVIDIA GPU support in MLIR.
    apaszke authored Apr 23, 2024
    bc72048
  48. 719112c
  49. [mlir][linalg] Move transpose_matmul to targeted transform op (llvm#8…

    …9717)
    
    More targeted than a blanket "apply everywhere" pattern. Follow up to
    llvm#89075 to address @ftynse's feedback.
    c-rhodes authored Apr 23, 2024
    be1c72d
  50. [NFC] [Serialization] Turn type alias GlobalDeclID into a class

    Successor of b8e3b2a. This patch also converts the type
    alias GlobalDeclID to a class to improve the readability and type
    safety.
    ChuanqiXu9 committed Apr 23, 2024
    b467c6b
  51. [mlir][aarch64] Remove LIT config for lli (llvm#89545)

    This change will only affect MLIR integration tests to be run on
    AArch64. When originally introduced, these tests would run with `lli`.
    Those tests have since been updated to use `mlir-cpu-runner` instead, see
    e.g.:
    
      * https://reviews.llvm.org/D155405
      * https://reviews.llvm.org/D146917
    
    This patch removes all the leftover `lli` configuration in LIT that's
    currently not needed (and is unlikely to be needed any time soon).
    banach-space authored Apr 23, 2024
    132bf4a
  52. [VectorCombine][X86] Add test showing foldShuffleOfShuffles folding s…

    …huffles that would be better separate
    
    On AVX+ targets a broadcast load can be treated as free.
    RKSimon committed Apr 23, 2024
    b4c6607
  53. [CostModel][X86] Add costs test coverage for broadcast loads

    Broadcast shuffles can be free if fed from a one-use load.
    RKSimon committed Apr 23, 2024
    a9e8730
  54. [CostModel][X86] Broadcast shuffles can be free if they are from a on…

    …e-use load
    
    AVX1+ can handle 32/64-bit broadcast loads, AVX2+ can handle all broadcast loads (we should be able to improve isLegalBroadcastLoad to handle more of this type matching).
    RKSimon committed Apr 23, 2024
    f89f670
  55. [LLVM][CodeGen][AArch64] Simplify lowering for predicate inserts. (ll…

    …vm#89072)
    
    The original code has an invalid use of UZP1 because the result vector
    type does not match its input vector types. Rather than insert extra nop
    casts I figure it would be better to use CONCAT_VECTORS because that's
    the operation we're performing.
    
    NOTE: This is a step to enable more asserts in verifyTargetSDNode.
    paulwalker-arm authored Apr 23, 2024
    34caafe
  56. RenameIndependentSubregs: Add missing sub-range for new IMPLICIT_DEFs (

    …llvm#89050)
    
    Existing sub-ranges are correctly updated when a new IMPLICIT_DEF is
    added, but the sub-range for the IMPLICIT_DEF itself is missing.
    Because of the missing sub-range in the live intervals for the IMPLICIT_DEF,
    the register allocator does not know that the IMPLICIT_DEF rewrites its
    virtual sub-registers and can end up assigning overlapping physical
    registers to them.
    This results in instructions that defined sub-registers overwritten by
    the IMPLICIT_DEF being deleted, because they now appear dead.
    petar-avramovic authored Apr 23, 2024
    d610a51
  57. [LLVM][CodeGen][SVE] rev(whilelo(a,b)) -> whilehi(b,a). (llvm#88294)

    Add similar isel patterns for lt, gt and hi comparison types.
    paulwalker-arm authored Apr 23, 2024
    a9689c6
  58. [VPlan] Skip extending ICmp results in truncateToMinimalBitwidth.

    Results of icmp don't need extending after truncating their operands, as
    the result will always be i1. Skip them during extending.
    
    Fixes llvm#79742
    Fixes llvm#85185
    fhahn committed Apr 23, 2024
    17fb3e8
  59. [VectorCombine] foldShuffleOfShuffles - add missing arguments to getS…

    …huffleCost calls.
    
    Ensure the getShuffleCost arguments/instruction args are populated - minor extension to llvm#88743 to help improve shuffle costs for certain corner cases (e.g. shuffles of loads)
    RKSimon committed Apr 23, 2024
    7f4f237
  60. 8a631d7
  61. bac5d8e
  62. [clang-tidy] Avoid overflow when dumping unsigned integer values (llv…

    …m#85060)
    
    Some options take the maximum unsigned integer value as default, but
    they are being dumped to a string as integers. This makes -dump-config
    write invalid '-1' values for these options. This change fixes this
    issue by using utostr if the option is unsigned.
    
    Fixes llvm#60217
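    
    The invalid '-1' comes from routing an unsigned maximum through a signed integer before printing. The sketch below only models that effect: `dumpedAsSigned` and `dumpedAsUnsigned` are hypothetical names, and the actual fix uses `utostr` as stated above.
    
    ```cpp
    #include <limits>
    
    // Hypothetical model of the old behaviour: the unsigned option value is
    // converted to a signed int before being turned into a string.
    constexpr long long dumpedAsSigned(unsigned Value) {
      return static_cast<int>(Value); // wraps on two's-complement targets
    }
    
    // Hypothetical model of the fixed behaviour: the value stays unsigned and
    // is only widened, so the printed number is the real option value.
    constexpr long long dumpedAsUnsigned(unsigned Value) {
      return static_cast<long long>(Value);
    }
    ```
    
    For `std::numeric_limits<unsigned>::max()`, the first function yields -1 on the usual two's-complement targets, while the second yields the large positive value the option actually holds.
    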
    ealcdan authored Apr 23, 2024
    c52b18d
  63. Make default initialization explicit

    Coverity (a static analysis tool) reported that the emitted 'Features'
    variable inside emitComputeAvailableFeatures in TableGen might be
    uninitialized.
    Silence this warning by adding brackets for the default initialization.
    Adapt test cases to take additional brackets into account.
    Martin Wehking authored and ldrumm committed Apr 23, 2024
    b817451
  64. [InstCombine] Fold fcmp into select (llvm#86482)

    This patch simplifies `fcmp (select Cond, C1, C2), C3` patterns in
    ceres:
    Alive2: https://alive2.llvm.org/ce/z/fWh_sD
    ```
    define i1 @src(double %x) {
      %cmp1 = fcmp ord double %x, 0.000000e+00
      %sel = select i1 %cmp1, double 0xFFFFFFFFFFFFFFFF, double 0.000000e+00
      %cmp2 = fcmp oeq double %sel, 0.000000e+00
      ret i1 %cmp2
    }
    
    define i1 @tgt(double %x) {
      %cmp1 = fcmp uno double %x, 0.000000e+00
      ret i1 %cmp1
    }
    
    ```
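    
    The equivalence of `@src` and `@tgt` can also be mirrored in plain C++ as a sanity check. This is a sketch, not part of the patch: `isOrdered`, `src` and `tgt` are illustrative names, and the constant `0xFFFFFFFFFFFFFFFF` is modeled by a quiet NaN, since that bit pattern encodes a NaN.
    
    ```cpp
    #include <limits>
    
    // fcmp ord %x, 0.0: true when %x is not NaN (0.0 is never NaN).
    constexpr bool isOrdered(double X) { return X == X; }
    
    // Mirrors @src: select a NaN when %x is ordered, 0.0 otherwise, then
    // compare the selected value with 0.0 using an ordered-equal comparison.
    constexpr bool src(double X) {
      const bool Cmp1 = isOrdered(X);
      const double Sel = Cmp1 ? std::numeric_limits<double>::quiet_NaN() : 0.0;
      return Sel == 0.0; // fcmp oeq: false whenever Sel is NaN
    }
    
    // Mirrors @tgt: fcmp uno %x, 0.0 is true exactly when %x is NaN.
    constexpr bool tgt(double X) { return X != X; }
    ```
    
    Both functions return false for any ordinary number and true for a NaN input, which is the behaviour the Alive2 proof establishes for the IR.
    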
    dtcxzyw authored Apr 23, 2024
    9fb7a73
  65. Pre-commit reproducer for argument copy elision related bug

    Adding test case related to
      llvm#89060
    
    It shows that after argument copy elision the scheduler may reorder
    a load of the input argument and a store to the same fixed stack
    entry (the fixed stack entry that is reused for the local variable).
    bjope committed Apr 23, 2024
    56ed3dd
  66. [SelectionDAG] Mark frame index as "aliased" at argument copy elision (l…

    …lvm#89712)
    
    This is a fix for miscompiles reported in
      llvm#89060
    
    After argument copy elision the IR value for the eliminated alloca
    aliases the fixed stack object. This patch makes sure
    that we mark the fixed stack object as being aliased with IR values,
    so that schedulers, for example, do not reorder accesses to
    the fixed stack object. This could otherwise happen when there is a
    mix of MemOperands referring to the shared fixed stack slot via both
    the IR value for the elided alloca and a fixed stack pseudo
    source value (as would be the case when lowering the arguments).
    bjope authored Apr 23, 2024
    d8b253b
  67. [Flang][OpenMP] Add restriction about subobjects to firstprivate and … (

    llvm#89608)
    
    …lastprivate
    
    OpenMP 5.2 standard (Section 5.3) defines privatization for list items.
    Section 3.2.1 in the standard defines list items to exclude variables
    that are part of other variables.
    
    This patch adds the restriction to firstprivate and lastprivate; it was
    previously added for private.
    
    Fixes llvm#67227
    
    Note: The specific checks that are added here are explicitly called out
    in OpenMP 4.0
    (https://www.openmp.org/wp-content/uploads/OpenMP4.0.0.pdf) Sections
    2.14.3.4 and 2.14.3.5 but in later standards have become implicit
    through other definitions.
    kiranchandramohan authored Apr 23, 2024
    0661af8
  68. [DAGCombiner] Pre-commit test case for miscompile bug in combineShift…

    …OfShiftedLogic
    
    DAGCombiner is trying to fold shl over binops, and in the process
    combining it with another shl. However it needs to be more careful
    to ensure that the sum of the shift counts fits in the type used
    for the shift amount.
    For example, X86 is using i8 as shift amount type. So we need to
    make sure that the sum of the shift amounts isn't greater than 255.
    
    Fix will be applied in a later commit. This only pre-commits the
    test case to show that we currently get the wrong result.
    
    Bug was found when testing the C23 BitInt feature.
    bjope committed Apr 23, 2024
    5fd9bbd
  69. [DAGCombiner] Fix miscompile bug in combineShiftOfShiftedLogic (llvm#…

    …89616)
    
    Ensure that the sum of the shift amounts does not overflow the
    shift amount type when combining shifts in combineShiftOfShiftedLogic.
    
    Solves a miscompile bug found when testing the C23 BitInt feature.
    
    Targets like X86 that only use an i8 for shift amounts after
    legalization seem to be extra susceptible to bugs like this, as it
    isn't legal to shift more than 255 steps.
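    
    The overflow in question can be shown with a small standalone check. This is a sketch, not the DAGCombiner code: `combinedShiftAmountFits` and `wrappedI8Sum` are hypothetical helpers, and the shift-amount type is modeled as an unsigned integer of a given bit width.
    
    ```cpp
    #include <cstdint>
    
    // Hypothetical guard: true when C1 + C2 is representable in an unsigned
    // shift-amount type that is ShiftAmtBits wide.
    constexpr bool combinedShiftAmountFits(std::uint64_t C1, std::uint64_t C2,
                                           unsigned ShiftAmtBits) {
      const std::uint64_t MaxAmt = ShiftAmtBits >= 64
                                       ? ~std::uint64_t{0}
                                       : (std::uint64_t{1} << ShiftAmtBits) - 1;
      // Phrased as a subtraction so that the check itself cannot wrap.
      return C1 <= MaxAmt && C2 <= MaxAmt - C1;
    }
    
    // What an unguarded combine would compute in an 8-bit shift-amount type:
    // the sum silently wraps modulo 256.
    constexpr std::uint8_t wrappedI8Sum(std::uint8_t C1, std::uint8_t C2) {
      return static_cast<std::uint8_t>(C1 + C2);
    }
    ```
    
    For shift amounts 200 and 100 with an i8 shift-amount type, the guard returns false, and the unguarded 8-bit sum would be 44 rather than 300.
    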
    bjope authored Apr 23, 2024
    f9b419b
  70. [X86] getTargetShuffleMask - update to take a SDValue instead of a SD…

    …Node. NFC.
    
    Also just get the value type from the SDValue instead of passing it separately.
    RKSimon committed Apr 23, 2024
    304dfe1
  71. [Clang][Parser] Don't always destroy template annotations at the end …

    …of a declaration (llvm#89494)
    
    Since
    [6163aa9](llvm@6163aa9#diff-3a7ef0bff7d2b73b4100de636f09ea68b72eda191b39c8091a6a1765d917c1a2),
    we have introduced an optimization that almost always destroys
    TemplateIdAnnotations at the end of a function declaration. This doesn't
    always work properly: a lambda within a default template argument could
    also result in such deallocation and hence a use-after-free bug while
    building a type constraint on the template parameter.
    
    This patch adds another flag to the parser to tell apart cases when we
    shouldn't do such cleanups eagerly. A bit complicated as it is, this retains
    the optimization on a highly templated function with lots of generic lambdas.
    
    Note the test doesn't always trigger a conspicuous bug/crash even with a
    debug build. But a sanitizer build can detect them, I believe.
    
    Fixes llvm#67235
    Fixes llvm#89127
    zyn0217 authored Apr 23, 2024
    8ab3caf
  72. [VPlan] Ignore incoming values with constant false mask. (llvm#89384)

    Ignore incoming values with constant false masks when trying to simplify
    VPBlendRecipes.
    
    As a follow-on optimization, we should also be able to drop all incoming
    values with false masks by creating a new VPBlendRecipe with those
    operands dropped.
    
    PR: llvm#89384
    fhahn authored Apr 23, 2024
    dadf6f2
  73. AtomicExpand: Emit or with constant on RHS

    This will save later code from commuting it.
    arsenm committed Apr 23, 2024
    31af5e9
  74. [Frontend][OpenMP] Add functions for checking construct type (llvm#87258

    )
    
    Implement helper functions to identify leaf, composite, and combined
    constructs.
    kparzysz authored Apr 23, 2024
    70d3ddb
  75. [libc++] Add some private headers to libcxx.imp (llvm#89568)

    llvm#78295 dropped private headers
    in top level directory from libcxx.imp.
    
    This PR re-adds them to libcxx.imp.
    atetubou authored Apr 23, 2024
    b926f75
  76. [RemoveDIs][MLIR] Don't process debug records in the LLVM-IR translat…

    …or (llvm#89735)
    
    We are almost ready to enable the use of debug records everywhere in
    LLVM by default; part of the prep-work for this means ensuring that
    every tool supports them. Every tool in the `llvm/` project supports
    them, front-ends that use the `DIBuilder` will support them, and as far
    as I can tell, the only other tool in the LLVM repo that needs to
    support them but doesn't is `mlir-translate`. This patch trivially
    unblocks them by converting from debug records to debug intrinsics
    before translating a module.
    SLTozer authored Apr 23, 2024
    670ac23
  77. [gn build] Port 70d3ddb

    llvmgnsyncbot committed Apr 23, 2024
    a9e3fbf
  78. [AMDGPU] Fix GFX12 encoding of s_wait_event export_ready (llvm#89622)

    As well as flipping the sense of the bit, GFX12 moved it from bit 0 to
    bit 1 in the encoded simm16 operand.
    jayfoad authored Apr 23, 2024
    e0a763c
  79. [SLP]Fix PR89635: do not try to vectorize single-gather alternate node.

    No need to try to vectorize single gather/buildvector with alternate
    opcode graph, it is not profitable. In other cases, need to use last
    instruction for inserting the vectorized code.
    alexey-bataev committed Apr 23, 2024
    b4a0fd4
  80. 282ab54
  81. Reapply "[Clang][Sema] placement new initializes typedef array with c…

    …orrect size (llvm#83124)" (llvm#89036)
    
    When placement-new-ing a local variable whose type is an array of trivial
    type, the generated code now calls 'memset' with the correct size of the array.
    Previously it generated the wrong size (the square of the typedef array size + size).
    
    The cause: given typedef TYPE TArray[8]; TArray x; the type of the declarator is
    TArray[8]. In SemaExprCXX.cpp::BuildCXXNew we check whether it is a
    typedef of constant size and then get the original type, which works
    fine for non-dependent cases.
    For templates, however, we go through TreeTransform.h:TransformCXXNewExpr,
    where we check the allocated type again. It is TArray[8] and stays
    that way, so ArraySize=(TArray[8] type, alloc TArray[8*type]), hence the
    squared size allocation.
    
    ArraySize gets calculated earlier in TreeTransform.h, so the
    if(!ArraySize) condition was failing.
    Fix: I changed that condition to if(ArraySize).
    Fixes llvm#41441
    
    ---------
    
    Co-authored-by: erichkeane <ekeane@nvidia.com>
    mahtohappy and erichkeane authored Apr 23, 2024
    74cab54
  82. [SystemZ][z/OS] Make z/OS personality function known (llvm#89679)

    This change adds the z/OS personality function to the list of known EH
    personality functions. It enables removing of the EH data/labels if the
    personality function is not invoked.
    redstar authored Apr 23, 2024
    d5022d9
  83. adb0126
  84. [libc++][ranges] P2387R3: Pipe support for user-defined range adaptors (

    llvm#89148)
    
    This patch finalizes the std::ranges::range_adaptor_closure
    class template from https://wg21.link/P2387R3.
    
      // [range.adaptor.object], range adaptor objects
      template<class D>
        requires is_class_v<D> && same_as<D, remove_cv_t<D>>
      class range_adaptor_closure { };
    
    The current implementation of __range_adaptor_closure was introduced
    in ee44dd8 and has served as the
    foundation for the range adaptors in libc++ for a while. This patch
    keeps its implementation, with the exception of the following changes:
    
    - __range_adaptor_closure now includes the missing constraints
      `is_class_v<D> && same_as<D, remove_cv_t<D>>` to restrict the 
      type of class that can inherit from it. (https://eel.is/c++draft/ranges.syn)
    - The operator| of __range_adaptor_closure no longer requires its
      first argument to model viewable_range. (https://eel.is/c++draft/range.adaptor.object#1)
    - The _RangeAdaptorClosure concept is refined to exclude cases where
      T models range or where T has base classes of type range_adaptor_closure<U>
      for another type U. (https://eel.is/c++draft/range.adaptor.object#2)
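    
    The constraint quoted above, `is_class_v<D> && same_as<D, remove_cv_t<D>>`, can be explored without a C++23 library by restating it with C++17 type traits. This is only a sketch: `satisfiesClosureConstraints` and `MyClosure` are hypothetical names, and `std::is_same_v` stands in for the `same_as` concept.
    
    ```cpp
    #include <type_traits>
    
    // Hypothetical restatement of the constraint on range_adaptor_closure<D>:
    // D must be a class type and must not be cv-qualified.
    template <class D>
    constexpr bool satisfiesClosureConstraints =
        std::is_class_v<D> && std::is_same_v<D, std::remove_cv_t<D>>;
    
    // Hypothetical user-defined adaptor type used to probe the constraint.
    struct MyClosure {};
    ```
    
    Under this restatement `MyClosure` is accepted, while `const MyClosure` (cv-qualified) and `int` (not a class type) are rejected.
    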
    xiaoyang-sde authored Apr 23, 2024
    c108653
  85. [mlir][linalg] Add runtime verification for linalg ops (llvm#89342)

    This commit implements runtime verification for LinalgStructuredOps
    using the existing `RuntimeVerifiableOpInterface`. The verification
    checks that the runtime sizes of the operands match the runtime sizes
    inferred by composing the loop ranges with the op's indexing maps.
    ryan-holt-1 authored Apr 23, 2024
    8317d36
  86. clang/win: Add a flag to disable default-linking of compiler-rt libraries (llvm#89642)
    
    For ASan, users already manually have to pass in the path to the lib,
    and for other libraries they have to pass in the path to the libpath.
    
    With LLVM's unreliable name of the lib (due to
    LLVM_ENABLE_PER_TARGET_RUNTIME_DIR confusion and whatnot), it's useful
    to be able to opt in to just explicitly passing the paths to the libs
    everywhere.
    
    Follow-up of sorts to https://reviews.llvm.org/D65543, and to llvm#87866.
    nico authored Apr 23, 2024
    1d7086e
  87. Reapply "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (llvm#89263)"
    
    Changes since original commit:
    * Rebase over improved test coverage for theadba
    * Revert change to use TargetConstant as it appears to prevent the uimm2
      clause from matching in the XTheadBa patterns.
    * Fix an order of operands bug in the THeadBa pattern visible in the new
      test coverage.
    
    Original commit message follows:
    
    This implements a RISCV specific version of the SHL_ADD node proposed in
    llvm#88791.
    
    If that lands, the infrastructure from this patch should seamlessly
    switch over to the generic DAG node. I'm posting this separately because
    I've run out of useful multiply strength reduction work to do without
    having a way to represent MUL X, 3/5/9 as a single instruction.
    
    The majority of this change is moving two sets of patterns out of
    tablegen and into the post-legalize combine. The major reason for this is
    that I have an upcoming change which needs to reuse the expansion logic,
    but it also helps common up some code between zba and the THeadBa
    variants.
    
    On the test changes, there are a couple of major categories:
    * We chose a different lowering for mul x, 25. The new lowering involves
      one fewer register and the same critical path, so this seems like a win.
    * The order of the two multiplies changes in (3,5,9)*(3,5,9) in some
      cases. I don't believe this matters.
    * I'm removing the one use restriction on the multiply. This restriction
      doesn't really make sense to me, and the test changes appear positive.
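
The shift-and-add strength reduction described above can be sketched in plain C++. This is only an illustration of the arithmetic, assuming the Zba `shNadd` semantics `(rs1 << N) + rs2`. The helper names are hypothetical, and the `mul25` decomposition is one possible lowering, not necessarily the one the patch selects.

```cpp
#include <cstdint>

// Models shNadd: (x << n) + y, for n in {1, 2, 3}.
constexpr std::uint64_t shl_add(std::uint64_t x, unsigned n, std::uint64_t y) {
  return (x << n) + y;
}

// MUL X, 3/5/9 each become a single shift-add.
constexpr std::uint64_t mul3(std::uint64_t x) { return shl_add(x, 1, x); }
constexpr std::uint64_t mul5(std::uint64_t x) { return shl_add(x, 2, x); }
constexpr std::uint64_t mul9(std::uint64_t x) { return shl_add(x, 3, x); }

// 25 = 5 * 5, so two chained shift-adds suffice (illustrative lowering).
constexpr std::uint64_t mul25(std::uint64_t x) { return mul5(mul5(x)); }
```

Because the arithmetic is unsigned, the results wrap modulo 2^64 exactly as a hardware multiply would.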
    preames committed Apr 23, 2024
    03760ad
  88. [mlir] Update comment about propertiesAttr (NFC) (llvm#89634)

    The comment is misleading because `propertiesAttr` is not actually
    ignored when the operation is registered.
    Mogball authored Apr 23, 2024
    e0c2848
  89. ed255ed
  90. 03c8a29
  91. c793f4a
  92. f426be1
  93. [NVPTX] Improve support for rsqrt.approx (llvm#89417)

    Complete support for rsqrt.approx with rsqrt.approx.f64 ([PTX ISA
    9.7.3.17. Floating Point Instructions:
    rsqrt.approx.ftz.f64](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-rsqrt-approx-ftz-f64)).
    Additionally, add support for folding `sqrt` into `rsqrt`, with an
    optional flag to disable.
    AlexMaclean authored Apr 23, 2024
    df60805
  94. 3197146
  95. [AArch64][GISel] Avoid scalarizing G_IMPLICIT_DEF and G_FREEZE in the Legalizer (llvm#88469)
    
    It does not make sense to scalarize G_FREEZE as it leads to the generation
    of pairs of G_UNMERGE_VALUES and G_BUILD_VECTORs which are difficult to
    optimize especially when operations like G_TRUNC operate before G_FREEZE
    but after G_UNMERGE_VALUES.
    
    Instead, it is better to legalize G_FREEZE like any other vector type
    would be, as it gets lowered to a COPY during instruction selection
    anyway.
    
    This is an issue that was encountered when looking at the TSVC
    benchmark, where the legalization of G_FREEZE would cause generation of
    unnecessary MOVs that adversely affected the performance.
    dc03-work authored Apr 23, 2024
    143be6a
  96. [VectorCombine][X86] shuffle-of-binops.ll - adjust no matching operand test to use FDIV
    
    Use of FDIV allows us to show a definite cost improvement with llvm#88899
    RKSimon committed Apr 23, 2024
    c45fbfd
  97. [AArch64] Match ZIP and UZP starting from undef elements. (llvm#89578)

    In case the first element of a zip/uzp mask is undef, the isZIPMask and
    isUZPMask functions have a 50% chance of picking the wrong
    "WhichResult", meaning they don't match a zip/uzp where they could. This
    patch alters the matching code to first check for the first non-undef
    element, to try and get WhichResult correct.
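
A minimal sketch of the idea, assuming the usual zip mask layout for N lanes (zip1 is `<0, N, 1, N+1, ...>`, zip2 is `<N/2, N+N/2, ...>`) and `-1` for undef lanes. The function name `matchZipMask` is hypothetical and this is not the code from the patch. It picks `WhichResult` from the first defined lane instead of from `M[0]`.

```cpp
#include <vector>

// Returns true if M is a zip mask (undef lanes are -1) and sets
// WhichResult to 0 for zip1 or 1 for zip2.
bool matchZipMask(const std::vector<int> &M, unsigned &WhichResult) {
  const unsigned N = static_cast<unsigned>(M.size());
  if (N == 0 || N % 2 != 0)
    return false;

  // Lane that a zip of kind `Which` reads at position I.
  auto Expected = [N](unsigned Which, unsigned I) {
    return static_cast<int>(Which * (N / 2) + I / 2 + (I % 2 ? N : 0));
  };

  // Decide zip1 vs zip2 from the first defined lane, not from M[0].
  unsigned First = 0;
  while (First < N && M[First] < 0)
    ++First;
  if (First == N) {
    WhichResult = 0; // Fully undef: either form matches, pick zip1.
    return true;
  }
  if (M[First] == Expected(0, First))
    WhichResult = 0;
  else if (M[First] == Expected(1, First))
    WhichResult = 1;
  else
    return false;

  // Every remaining defined lane must agree with the chosen form.
  for (unsigned I = First; I < N; ++I)
    if (M[I] >= 0 && M[I] != Expected(WhichResult, I))
      return false;
  return true;
}
```

With an 8-lane mask whose first element is undef, such as `<-1, 12, 5, 13, 6, 14, 7, 15>`, a check based only on `M[0]` has to guess, while the first defined lane (12 at position 1) identifies zip2 unambiguously.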
    davemgreen authored Apr 23, 2024
    cebc960
  98. [NFC][InstrProf] Increment valid profile stat in populateCoverage (llvm#89660)
    
    We increment `NumOfCSPGOFunc` and `NumOfPGOFunc` in
    `PGOUseFunc::readCounters()` already. We should do the same in
    `PGOUseFunc::populateCoverage`.
    
    
    https://github.com/llvm/llvm-project/blob/83bc7b57714dc2f6b33c188f2b95a0025468ba51/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp#L1331
    ellishg authored Apr 23, 2024
    abfb491
  99. [flang][cuda] Remove restriction on device subprogram (llvm#89677)

    Newer versions allow `pure`, `elemental` and `recursive` on device
    subprograms.
    clementval authored Apr 23, 2024
    49cb6db
  100. [libc++][ranges] export std::ranges::range_adaptor_closure (llvm#89793)
    
    This patch exports the `std::ranges::range_adaptor_closure` class
    template implemented in llvm#89148 from the C++ Modules file.
    xiaoyang-sde authored Apr 23, 2024
    3a9d8cd
  101. [libc++][chrono] Fixes format output of negative values. (llvm#89408)

    When trying to express a time before the epoch (e.g. "one nanosecond
    before 00:01:40 on 1900-01-01")
    the date would be shown as:
    
      1900-01-01 00:01:39.-00000001
    
    After this patch, that time would be correctly shown as:
    
      1900-01-01 00:01:39.999999999
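
The arithmetic behind the fix can be sketched as follows. This is not libc++'s implementation. `SplitTime` and `splitFloor` are hypothetical names, and the sketch only shows why the whole-second part has to be rounded toward negative infinity so that the fractional part stays non-negative.

```cpp
#include <chrono>
#include <cstdint>

struct SplitTime {
  std::int64_t seconds;     // whole seconds, rounded toward negative infinity
  std::int64_t nanoseconds; // always in [0, 999999999]
};

SplitTime splitFloor(std::chrono::nanoseconds t) {
  using namespace std::chrono;
  // duration_cast truncates toward zero, which is wrong for negative values.
  seconds whole = duration_cast<seconds>(t);
  if (whole > t)
    whole -= seconds{1}; // step down to the floor
  const nanoseconds frac = t - whole;
  return {static_cast<std::int64_t>(whole.count()),
          static_cast<std::int64_t>(frac.count())};
}
```

For an input of -1 ns, truncation alone would give 0 s and a remainder of -1 ns, which matches the `.-00000001` style of output quoted above; flooring gives -1 s and 999999999 ns.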
    mordante authored Apr 23, 2024
    579d301
  102. [llvm-exegesis] Add support for alderlake (llvm#88967)

    This patch adds the PFM counter definitions for Intel alderlake CPUs.
    boomanaiden154 authored Apr 23, 2024
    37e27a4
  103. [libc++][CI] Removes clang-tidy references. (llvm#89092)

    The clang-tidy selection has been made automatic recently so this is no
    longer needed.
    
    Thanks to Louis for spotting this.
    mordante authored Apr 23, 2024
    9e95951
  104. [DebugInfo] Report errors when DWARFUnitHeader::applyIndexEntry fails (llvm#89156)
    
    Motivation: LLDB is able to report errors about these scenarios whereas
    LLVM's DWARF parser only gives a boolean success/fail. I want to migrate
    LLDB to using LLVM's DWARFUnitHeader class, but I don't want to lose
    some of the error reporting, so I'm adding it to the LLVM class first.
    bulbazord authored Apr 23, 2024
    1a8935a
  105. [libc++][doc] Updates module build instructions. (llvm#89413)

    CMake has landed experimental support for using the Standard modules.
    This will be part of the CMake 3.30 release. This updates the build
    instructions to use modules with CMake.
    
    The changes have been tested locally.
    
    ---------
    
    Co-authored-by: Will Hawkins <whh8b@obs.cr>
    mordante and hawkinsw authored Apr 23, 2024
    033453a
  106. [CodeGen][TII] Allow reassociation on custom operand indices (llvm#88306)
    
    This opens up a door for reusing reassociation optimizations on
    target-specific binary operations with non-standard operand list.
    
    This is effectively an NFC.
    mshockwave authored Apr 23, 2024
    5fe93b0
  107. [flang] Remove hardcoded bits from AddDebugInfo. (llvm#89231)

    This PR adds the following options to the AddDebugInfo pass.
    
    1. IsOptimized flag.
    2. Level of debug info to generate.
    3. Name of the source file
    
    This enables us to remove the hard coded values from the code. It also
    allows us to test the pass with different options. The tests have been
    modified to take advantage of that.
    
    The calling convention flag and producer name have also been improved.
    abidh authored Apr 23, 2024
    5f3f9d1
  108. [lldb/test] Add basic ld.lld --debug-names tests (llvm#88335)

    Test that ld.lld --debug-names (llvm#86508) built per-module index can be
    consumed by lldb. This has uncovered a bug during the development of the
    lld feature.
    MaskRay authored Apr 23, 2024
    a7e2726
  109. 06cc175
  110. 1d14034
  111. [libc] Generate docs for setjmp.h (llvm#89542)

    Resolves llvm#88065
    
    Added macros and functions.
    Rajveer100 authored Apr 23, 2024
    3ae10fd
  112. [clang] coroutine: generate valid mangled name in CodeGenFunction::generateAwaitSuspendWrapper (llvm#89731)
    
    Fixes llvm#89723
    hokein authored Apr 23, 2024
    dc8f6a8
  113. [RISCV] Use SHL_ADD in remaining strength reduce cases for MUL (llvm#89789)
    
    The interesting bit is the zext folding. This is the first case where we
    end up with a profitable fold of shNadd (zext x), y to shNadd.uw x, y.
    See zext_mul68 from rv64zba.ll.
    
    The test differences are cases where we can legally fold (only because
    there's no one-use check). These are not profitable or harmful, but we
    can't add a one-use check without breaking the zext_mul68 case.
    
    Note that XTHeadBa doesn't appear to have the equivalent patterns so
    this only shows up in Zba.
    preames authored Apr 23, 2024
    0c032fd
  114. [hwasan] Add test for hwasan pass with fixed shadow (llvm#89813)

    This test records the current behavior of HWASan, which doesn't utilize
    the fixed shadow intrinsics of
    llvm@365bddf
    
    It is intended to be updated in future work ("Optimize outlined
    memaccess for fixed shadow on Aarch64";
    llvm#88544)
    thurstond authored Apr 23, 2024
    2662bce
  115. [libc] Make fenv and math tests preserve fenv_t state (llvm#89658)

    This adds a new test fixture class FEnvSafeTest (usable as a base
    class for other fixtures) that ensures each test doesn't perturb
    the `fenv_t` state that the next test will start with.  It also
    provides types and methods tests can use to explicitly wrap code
    under test either to check that it doesn't perturb the state or
    to save and restore the state around particular test code.
    
    All the fenv and math tests are updated to use this so that none
    can affect another.  Expectations that code under test and/or
    tests themselves don't perturb state can be added later.
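
The save-and-restore half of this idea can be sketched with a small RAII guard built on the standard `<cfenv>` functions. `FEnvGuard` is a hypothetical name used for illustration only. It is not the `FEnvSafeTest` fixture from the patch, which also offers ways to check that code leaves the state unchanged.

```cpp
#include <cfenv>

// Saves the floating-point environment on construction and restores it on
// destruction, so code run in between cannot leak rounding-mode or
// exception-flag changes to whatever runs next.
class FEnvGuard {
public:
  FEnvGuard() { std::fegetenv(&saved_); }
  ~FEnvGuard() { std::fesetenv(&saved_); }
  FEnvGuard(const FEnvGuard &) = delete;
  FEnvGuard &operator=(const FEnvGuard &) = delete;

private:
  std::fenv_t saved_;
};
```

A test body would create one guard at the top of the scope. Any `fesetround` or `feraiseexcept` effects inside that scope are undone when the guard goes out of scope.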
    frobtech authored Apr 23, 2024
    837dab9
  116. [libc++][TZDB] Fixes reverse time lookups. (llvm#89502)

    Testing with the get_info() returning a local_info revealed some issues
    in the reverse lookup. This needed an additional quirk. Also the
    skipping when not in the current continuation optimization was wrong. It
    prevented merging two sys_info objects.
    mordante authored Apr 23, 2024
    4e9decf
  117. [memprof] Take Schema into account in PortableMemInfoBlock::serializedSize (llvm#89824)
    
    PortableMemInfoBlock::{serialize,deserialize} take Schema into
    account, allowing us to serialize/deserialize a subset of the fields.
    However, PortableMemInfoBlock::serializedSize does not.  That is, it
    assumes that all fields are always serialized and deserialized.  In
    other words, if we choose to serialize/deserialize a subset of the
    fields, serializedSize would claim more storage than we actually need.
    
    This patch fixes the problem by teaching serializedSize to take Schema
    into account.  For now, this patch has no effect on the actual indexed
    MemProf profile because we serialize/deserialize all fields, but that
    might change in the future.
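
The size mismatch can be illustrated with a toy model. Everything here is an assumption made for the sketch: the field names, the three-field layout, and the choice of one 64-bit word per serialized field are hypothetical and are not taken from the MemProf sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical field tags. A Schema lists the fields that get serialized.
enum class Field { AllocCount, TotalSize, TotalLifetime };
using Schema = std::vector<Field>;

constexpr std::size_t kNumFields = 3;

// Old behavior in this model: always assumes every field is written.
std::size_t serializedSizeAllFields() {
  return kNumFields * sizeof(std::uint64_t);
}

// Schema-aware behavior: only the listed fields take up space.
std::size_t serializedSize(const Schema &schema) {
  return schema.size() * sizeof(std::uint64_t);
}
```

With a full schema both functions agree, which mirrors the note above that the patch changes nothing for current profiles. With a one-field schema the old computation would claim 24 bytes where 8 are needed.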
    
    Aside from check-llvm, I tested this patch by verifying that
    llvm-profdata generates bit-wise identical files for each version for
    a large raw MemProf file I have.
    kazutakahirata authored Apr 23, 2024
    edf733b
  118. 6b8d385
  119. 859de94
  120. [Nomination] New Intel representative for the security group (llvm#89435)
    
    Sergey Malsov has left Intel. I would like to nominate Will Huhn to replace him as an Intel representative in the LLVM security group. Will is a security champion for the Intel compiler team. I believe he will be a valuable addition to the LLVM security group as a second representative from Intel. He has more security-specific expertise than me. I regularly consult with Will about topics the LLVM security group is considering, and it will be useful to have him more directly involved.
    Andy Kaylor authored Apr 23, 2024
    5ac744d
  121. [clang-tidy][modernize-use-starts-ends-with] Add support for compare() (llvm#89530)
    
    Using `compare` is the next most common roundabout way to express
    `starts_with` before it was added to the standard. In this case, using
    `starts_with` is a readability improvement. Extend existing
    `modernize-use-starts-ends-with` to cover this case.
    
    ```
    // The following will now be replaced by starts_with().
    string.compare(0, strlen("prefix"), "prefix") == 0;
    string.compare(0, 6, "prefix") == 0;
    string.compare(0, prefix.length(), prefix) == 0;
    string.compare(0, prefix.size(), prefix) == 0;
    ```
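
As a rough illustration of why the rewrite is safe, the two forms below answer the same question. The helper names are hypothetical. The `starts_with` branch is only compiled when the standard library advertises it (C++20), and otherwise the sketch falls back to the `compare` idiom.

```cpp
#include <string>

// The pre-C++20 idiom that the check now recognizes.
bool hasPrefixCompare(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// The form the check suggests, where the library provides it.
bool hasPrefixModern(const std::string &s, const std::string &prefix) {
#if defined(__cpp_lib_starts_ends_with)
  return s.starts_with(prefix);
#else
  return hasPrefixCompare(s, prefix);
#endif
}
```

When `s` is shorter than `prefix`, `compare` clamps the substring to the end of `s`, so the comparison is non-zero and both helpers return false.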
    nicovank authored Apr 23, 2024
    ef59069
  122. 4182120
  123. [Xtensa] Implement base CallConvention. (llvm#83280)

    Implement base Calling Convention functionality. 
    Implement stack load/store register operations.
    Implement call lowering.
    andreisfr authored Apr 23, 2024
    36209d3
  124. Revert "Reapply "[Clang][Sema] placement new initializes typedef array with correct size (llvm#83124)" (llvm#89036)"
    
    This reverts commit 74cab54.
    pranavk committed Apr 23, 2024
    e1321fa
  125. [RISCV] Split code that tablegen needs out of RISCVISAInfo. (llvm#89684)

    This introduces a new file, RISCVISAUtils.cpp and moves the rest of
    RISCVISAInfo to the TargetParser library.
    
    This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
    topperc authored Apr 23, 2024
    733a877
  126. [gn build] Port 733a877

    llvmgnsyncbot committed Apr 23, 2024
    0c0c5c4
  127. 688c10d
  128. [msan] Eliminate non-deterministic behavior in the pass (llvm#89831)

    Almost NFC, instrumentation is as correct as it was before.
    
    We need InstrumentationList grouped by origin instruction,
    so we used stable_sort. However, these objects are already grouped
    because we never interleave sequences of `insertShadowCheck`
    for different instructions.
    
    The pointer sort had the artifact that it was dependent on allocator behavior,
    so we could insert checks in a different order.
    
    There is no test, as I failed to reproduce this with `opt`. My guess
    is that for reproducer we need to increase fragmentation in the
    allocator.
    vitalybuka authored Apr 23, 2024
    4f4ebee
  129. d56f08b
  130. 3fa6b9c
  131. [lldb] Fix crash in SymbolFileCTF::ParseFunctions (llvm#89845)

    Make SymbolFileCTF::ParseFunctions resilient against not being able to
    resolve the argument or return type of a function. ResolveTypeUID can
    fail for a variety of reasons so we should always check its result.
    
    The type that caused the crash was `_Bool` which we didn't recognize 
    as a basic type. This commit also fixes the underlying issue and adds
    a test.
    
    rdar://126943722
    JDevlieghere authored Apr 23, 2024
    fd4399c
  132. [gn build] Port d56f08b

    llvmgnsyncbot committed Apr 23, 2024
    9c4735e

Commits on Apr 24, 2024

  1. IRSymTab: Record _GLOBAL_OFFSET_TABLE_ for ELF x86

    In ELF, relocatable files generated for x86-32 and some code models of
    x86-64 (medium, large) may reference the special symbol
    `_GLOBAL_OFFSET_TABLE_` that is not used in the IR. In an LTO link, if
    there is no regular relocatable file referencing the special symbol, the
    linker may not define the symbol and lead to a spurious "undefined
    symbol" error.
    
    Fix llvm#61101: record that `_GLOBAL_OFFSET_TABLE_` is used in the IR symbol
    table.
    
    Note: The `PreservedSymbols` mechanism
    (https://reviews.llvm.org/D112595) that just sets `FB_used` is not
    applicable.
    The `getRuntimeLibcallSymbols` for extracting lazy runtime library
    symbols is for symbols that are "always" potentially used, but linkers
    don't have the code model information to make a precise decision.
    
    Pull Request: llvm#89463
    MaskRay authored Apr 24, 2024
    99e7350
  2. [NFC][MC][AArch64] Do not use else after return in getRelocType (llvm#89818)
    
    After llvm#89563, we do not use else after return in code corresponding to
    `R_AARCH64_AUTH_ABS64` reloc in `getRelocType`. This patch removes use
    of else after return in other places in `getRelocType`.
    kovdan01 authored Apr 24, 2024
    2cbc2e3
  3. dc5939d
  4. [PowerPC] Add PPC prefix to retglue ISD node. NFC. (llvm#89771)

    So that it is aligned with other targets.
    Kai Luo authored Apr 24, 2024
    d97cdd7
  5. [InstCombine] Fix miscompile in negation of select (llvm#89698)

    Swapping the operands of a select is not valid if one hand is more
    poisonous than the other, because the negation zero contains poison
    elements.
    
    Fix this by adding an extra parameter to isKnownNegation() to forbid
    poison elements.
    
    I've implemented this using manual checks to avoid needing four variants
    for the NeedsNSW/AllowPoison combinations. Maybe there is a better way
    to do this...
    
    Fixes llvm#89669.
    nikic authored Apr 24, 2024
    a1b1c4a
  6. [InstCombine] Fix poison propagation in select of bitwise fold (llvm#89701)
    
    We're replacing the select with the false value here, but it may be more
    poisonous if m_Not contains poison elements. Fix this by introducing a
    m_NotForbidPoison matcher and using it here.
    
    Fixes llvm#89500.
    nikic authored Apr 24, 2024
    7339f7b
  7. [RISCV] Remove implication of F extension for XTHeadFMemIdx from RISCVFeatures.td.
    
    There is no implies rule in RISCVISAInfo.cpp so this makes them
    consistent.
    
    Soon RISCVFeatures.td will be used to generate RISCVISAInfo.cpp so
    it won't be possible to mismatch.
    topperc committed Apr 24, 2024
    cc73c5c
  8. 469c8a0
  9. [RISCV] Don't make Zacas or Zabha imply A in RISCVISAInfo.cpp

    Zabha and Zacas are both documented as depending on Zaamo. I'm
    hesitant to make them imply Zaamo instead.
    
    So remove the implication and replace with a check that either
    A or Zaamo is enabled.
    topperc committed Apr 24, 2024
    d9715c6
  10. [InstCombine] Fix symbol conflicts in tests (NFC)

    These tests break when regenerated due to symbol conflicts.
    nikic committed Apr 24, 2024
    aa1e912
  11. ba702aa
  12. [LIT][NVPTX] Add a few more known ptxas versions (llvm#89761)

    This patch adds known ptxas versions up to 12.4,
    to have tests targeting them.
    
    Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
    durga4github authored Apr 24, 2024
    da1e3e8
  13. [WebAssembly] Fix uses of -DAG and -NOT in wasm-target-features.c (llvm#89777)
    
    We are currently using `PREFIX-DAG` and `PREFIX-NOT` within a single
    `PREFIX` test in a mixed way, but `-DAG` and `-NOT` do not work that
    way. For example:
    
    Result:
    ```
    1
    2
    3
    ```
    
    Test file:
    ```c
    // CHECK-DAG: 3
    // CHECK-DAG: 1
    // CHECK-NOT: 2
    ```
    
    This does not work. The last line `CHECK-NOT: 2` does not trigger any
    error, because we've already covered all three lines (1~3) while
    matching `CHECK-DAG: 3` and `CHECK-DAG: 1`, and FileCheck tries to check
    the line `CHECK-NOT: 2` _after_ the line `3`.
    
    Actually, we have
    ```c
    // BLEEDING-EDGE-NOT:#define __wasm_reference_types__ 1{{$}}
    ```
    even though reference-types is enabled in 'bleeding-edge' config, and
    this has not triggered any error.
    
    This section
    (https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive)
    explains the interactions between `CHECK-DAG` and `CHECK-NOT`s:
    > As a result, the surrounding `CHECK-DAG:` directives cannot be
    reordered, i.e. all occurrences matching `CHECK-DAG:` before
    `CHECK-NOT:` must not fall behind occurrences matching `CHECK-DAG:`
    after `CHECK-NOT:`.
    
    So in order to test the 'include' lists and 'not-include' lists, we have
    to run the tests twice with different prefixes. This splits `GENERIC`
    and `BLEEDING-EDGE` tests in two configs (`***-INCLUDE` and `***`) to
    test them correctly.
    
    This also adds some spaces after colons, sorts the feature lists, and
    adds `1{{$}}` to the `MVP` tests to make them consistent with `GENERIC`
    and `BLEEDING-EDGE` tests.
    aheejin authored Apr 24, 2024
    c8c1e4e
  14. [WebAssembly] Tidy up wasm-target-features.c (llvm#89778)

    This tidies up `wasm-target-features.c` cosmetically:
    - Sorts the feature tests alphabetically
    - Adds a space after colons
    aheejin authored Apr 24, 2024
    88b6186
  15. b82a4bf
  16. [RISCV] Use the store value's VT as the MemoryVT after combining riscv.masked.strided.store (llvm#89874)
    
    According to `RISCVTargetLowering::getTgtMemIntrinsic`, the MemoryVT
    is the scalar element VT for strided store and the MemoryVT is the
    same as the store value's VT for unit-stride store.
    
    After combining `riscv.masked.strided.store` to `masked.store`, we
    just use the scalar element VT to construct `masked.store`, which is
    wrong.
    
    With wrong MemoryVT, the DAGCombiner will combine `trunc+masked.store`
    to truncated `masked.store` because `TLI.canCombineTruncStore` returns
    true.
    
    So, we should use the store value's VT as the MemoryVT.
    
    This fixes llvm#89833.
    wangpc-pp authored Apr 24, 2024
    6493da7
  17. 805d563
  18. [IR] Memory Model Relaxation Annotations (llvm#78569)

    Implements the core/target-agnostic components of Memory Model
    Relaxation Annotations.
    
    RFC:
    https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5
    Pierre-vh authored Apr 24, 2024
    cf328ff
  19. [IR] Remove unused variable in Verifier.cpp (NFC)

    llvm-project/llvm/lib/IR/Verifier.cpp:4854:14:
    error: unused variable 'IsLeaf' [-Werror,-Wunused-variable]
      const auto IsLeaf = [](const Metadata *CurMD) {
                 ^
    1 error generated.
    DamonFool committed Apr 24, 2024
    806db47
  20. [RISCV] Remove -riscv-split-regalloc flag (llvm#89715)

    Split vector and scalar regalloc has been enabled by default for 5
    months now since d0a39e6, and shipped
    with 18.1.0. I haven't heard of any issues with it so far, so this
    proposes to remove the flag to reduce the number of configurations we
    have to support.
    lukel97 authored Apr 24, 2024
    ad4a42b
  21. Re-apply "[ORC] Unify task dispatch across ExecutionSession..." with more fixes.
    
    This re-applies 6094b3b, which was reverted in e7efd37 (and before that
    in 1effa19) due to bot failures.
    
    The test failures were fixed by having SelfExecutorProcessControl use an
    InPlaceTaskDispatcher by default, rather than a DynamicThreadPoolTaskDispatcher.
    This shouldn't be necessary (and indicates a concurrency issue elsewhere), but
    InPlaceTaskDispatcher is a less surprising default, and better matches the
    existing behavior (compilation on current thread by default), so the change
    seems reasonable. I've filed llvm#89870
    to investigate the concurrency issue as a follow-up.
    
    Coding my way home: 6.25133S 127.94177W
    lhames committed Apr 24, 2024
    7da6342
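The difference between the two dispatch strategies named in this commit can be sketched in Python. This is only an analogy under stated assumptions: the class names echo the commit message, but `ThreadPerTaskDispatcher`, `dispatch`, `shutdown`, and `run_and_record` are hypothetical and do not reproduce the ORC C++ API.

```python
import threading


class InPlaceTaskDispatcher:
    """Runs each task immediately on the thread that dispatches it."""

    def dispatch(self, task):
        task()

    def shutdown(self):
        pass  # nothing is outstanding: every task has already run


class ThreadPerTaskDispatcher:
    """Hypothetical simplification of a thread-based dispatcher:
    every task runs on a freshly started thread."""

    def __init__(self):
        self._threads = []
        self._lock = threading.Lock()

    def dispatch(self, task):
        thread = threading.Thread(target=task)
        with self._lock:
            self._threads.append(thread)
        thread.start()

    def shutdown(self):
        # Wait for every task that was dispatched so far.
        with self._lock:
            threads = list(self._threads)
        for thread in threads:
            thread.join()


def run_and_record(dispatcher):
    """Dispatch one task and return the ID of the thread it ran on."""
    seen = []
    dispatcher.dispatch(lambda: seen.append(threading.get_ident()))
    dispatcher.shutdown()
    return seen[0]
```

With the in-place dispatcher the task runs on the caller's thread, which matches the "compilation on current thread by default" behavior the message describes; the thread-based variant runs it elsewhere.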
  22. [TableGen][GlobalISel] Specialize more MatchTable Opcodes (llvm#89736)

    The vast majority of the following (very common) opcodes were always
    called with identical arguments:
    
    - `GIM_CheckType` for the root
    - `GIM_CheckRegBankForClass` for the root
    - `GIR_Copy` between the old and new root
    - `GIR_ConstrainSelectedInstOperands` on the new root
    - `GIR_BuildMI` to create the new root
    
    I added an overloaded version of each opcode specialized for the root
    instructions. It always saves between 1 and 2 bytes per instance
    depending on the number of arguments specialized into the opcode. Some
    of these opcodes had between 5 and 15k occurrences in the AArch64
    GlobalISel Match Table.
    
    Additionally, the following opcodes are almost always used in the same
    sequence:
    
    - `GIR_EraseFromParent 0` + `GIR_Done`
      - `GIR_EraseRootFromParent_Done` has been created to do both. Saves 2
        bytes per occurrence.
    - `GIR_IsSafeToFold` was *always* called for each InsnID except 0.
      - Changed the opcode to take the number of instructions to check after
        `MI[0]`
    
    The savings from these are pretty neat. For `AArch64GenGlobalISel.inc`:
    - `AArch64InstructionSelector.cpp.o` goes down from 772kb to 704kb (-10%
    code size)
    - Self-reported MatchTable size goes from 420380 bytes to 352426 bytes
    (~ -17%)
    
    A smaller match table means a faster match table because we spend less
    time iterating and decoding.
    I don't have a solid measurement methodology for GlobalISel performance
    so I don't have precise numbers but I saw a few % of improvements in a
    simple testcase.
    Pierre-vh authored Apr 24, 2024
    9375962
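The size saving from specializing opcodes for the root instruction can be illustrated with a toy encoding. This is a minimal sketch, assuming one byte per opcode and one byte per argument; the `GIM_Root...` names and the tuple layout are hypothetical and are not taken from the generated match table.

```python
# Hypothetical mapping from a generic opcode to a root-specialized one.
GENERIC_TO_ROOT = {
    "GIM_CheckType": "GIM_RootCheckType",
    "GIM_CheckRegBankForClass": "GIM_RootCheckRegBankForClass",
}


def specialize(table):
    """Fold a leading instruction ID of 0 (the root) into the opcode itself."""
    out = []
    for opcode, *args in table:
        if opcode in GENERIC_TO_ROOT and args and args[0] == 0:
            # The instruction ID is implied by the opcode, so drop it.
            out.append((GENERIC_TO_ROOT[opcode], *args[1:]))
        else:
            out.append((opcode, *args))
    return out


def encoded_size(table):
    """Toy encoding: one byte for the opcode plus one byte per argument."""
    return sum(len(entry) for entry in table)


# Entries are (opcode, instruction ID, operand index, type or bank ID).
table = [
    ("GIM_CheckType", 0, 1, 7),
    ("GIM_CheckRegBankForClass", 0, 0, 3),
    ("GIM_CheckType", 1, 0, 7),  # not the root, so left unchanged
]
```

In this toy table each specialized entry is one byte shorter, so `encoded_size` drops from 12 to 10; the real table uses its own encoding, which is why the message quotes 1 to 2 bytes per instance.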
  23. 008b7f1
  24. [gn build] Port cf328ff

    llvmgnsyncbot committed Apr 24, 2024
    46b011d
  25. [ORC] Fix -Wunused-variable in LLJIT.cpp (NFC)

    llvm-project/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp:684:8:
    error: unused variable 'ConcurrentCompilationSettingDefaulted' [-Werror,-Wunused-variable]
      bool ConcurrentCompilationSettingDefaulted = !SupportConcurrentCompilation;
           ^
    1 error generated.
    DamonFool committed Apr 24, 2024
    9a8235a
  26. 78ebaa2
  27. b3ca9c3
  28. [ValueTracking] Add support for trunc nuw/nsw in isKnownNonZero

    With `nsw`/`nuw`, the `trunc` is non-zero if its operand is non-zero.
    
    Proofs: https://alive2.llvm.org/ce/z/iujmk6
    
    Closes llvm#89643
    goldsteinn committed Apr 24, 2024
    b933c84
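The claim that a `trunc` with `nuw` or `nsw` is non-zero whenever its operand is non-zero can be checked exhaustively at a small width. This is a sketch that models an i8 to i4 truncation in Python; the helper names are hypothetical, and the flag conditions follow the usual reading that `nuw` requires zero-extension of the result to give back the operand and `nsw` requires sign-extension to do so.

```python
SRC_BITS, DST_BITS = 8, 4


def trunc(x):
    """Keep only the low DST_BITS bits of x."""
    return x & ((1 << DST_BITS) - 1)


def to_signed(x, bits):
    """Interpret a bits-wide unsigned pattern as a two's complement value."""
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def is_nuw(x):
    # No unsigned wrap: zero-extending the result gives back x.
    return trunc(x) == x


def is_nsw(x):
    # No signed wrap: sign-extending the result gives back x.
    return to_signed(trunc(x), DST_BITS) == to_signed(x, SRC_BITS)


def counterexamples():
    """Non-zero operands that satisfy a flag but still truncate to zero."""
    return [
        x
        for x in range(1, 1 << SRC_BITS)
        if (is_nuw(x) or is_nsw(x)) and trunc(x) == 0
    ]
```

`counterexamples()` comes back empty at this width, while an unflagged case such as 16 still truncates to zero, which is why the flags are needed.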
  29. [lldb] Enable support for Markdown documentation pages (llvm#89716)

    RST is powerful but usually too powerful for 90% of what we need it for.
    Markdown is easier to edit and can be previewed easily without building
    the entire website.
    
    This copies what llvm does already, making myst_parser optional if you
    only want man pages.
    
    Previously we had Markdown enabled in
    8b95bd3 but that got reverted. That did
    this in a different way but I've gone with the standard llvm set this
    time.
    
    I intend the first Markdown pages to be the remote protocol extension
    docs, as they are not in any set format right now.
    DavidSpickett authored Apr 24, 2024
    62db434
  30. [clang][NFC] Remove useless code in ASTWriter

    A follow-up to llvm#71709, addressing the static analysis finding reported in https://github.com/llvm/llvm-project/pull/71709/files#r1576846306
    Endilll committed Apr 24, 2024
    662ef86
  31. [ARM][AArch64] autogenerate header file for TargetParser from Target …

    …tablegen files (llvm#88378)
    
    Introduce a mechanism to share data between the ARM and AArch64 backends and
    TargetParser, to reduce duplication of code. This is similar to the current
    RISC-V implementation.
    
    The target tablegen file (in this case `ARM.td` or `AArch64.td`) is
    processed during building of `TargetParser` to generate the following
    files in the build tree:
     - `build/include/llvm/TargetParser/ARMTargetParserDef.inc`
     - `build/include/llvm/TargetParser/AArch64TargetParserDef.inc`
    
    For now, the use of these generated files is limited to files _outside_
    of `TargetParser`. The main reason for this is that the modifications to
    `TargetParser` will require additional data added to the tablegen files,
    which I want to split into separate PRs.
    tmatheson-arm authored Apr 24, 2024
    71c5964
  32. [ORC] Fix bot failure due to 7da6342 (ORC task dispatch unification).

    Fixes the failure at https://lab.llvm.org/buildbot/#/builders/131/builds/62928,
    adds comments about the unused variable, and updates the debugging output.
    
    Coding my way home: 6.44615S, 128.16704W
    lhames committed Apr 24, 2024
    69703b1
  33. [lldb][Docs] Convert GDB protocol extensions doc to Markdown and add …

    …to website (llvm#89718)
    
    This document has never been on the website, unlike GDB's protocol docs.
    It will be useful to have both available online to compare.
    
    Markdown is easier to edit and preview in many editors (including Github
    itself), so I've chosen that over RST. Plus, building the website takes
    minutes and I lose the will to make nice edits when I have to deal with
    that.
    
    The standard dialect lacks some things, notably multi-line table cells,
    so I've converted large tables into bullet point lists
    so that we still get text wrapping. This is a downside but I think the
    simplicity of Markdown outweighs this.
    
    I have applied the plain text markers where I've noticed it and escaped
    some HTML characters. There may be more changes needed but, it's
    Markdown, so it's in theory a lot easier for someone to fix it!
    DavidSpickett authored Apr 24, 2024
    601d0ca
  34. [SPIR-V] New validation tests for pointer and primitive types (llvm#8…

    …9632)
    
    This patch adds new tests mostly checking SPIR-V validation of pointer
    and primitive types.
    michalpaszkowski authored Apr 24, 2024
    c071c1d
  35. [RISCV] Separate doLocalPostpass into new pass and move to post vecto…

    …r regalloc (llvm#88295)
    
    This patch splits off part of the work to move vsetvli insertion to post
    regalloc in llvm#70549.
    
    The doLocalPostpass operates outside of RISCVInsertVSETVLI's dataflow,
    so we can move it to its own pass. We can then move it to post vector
    regalloc which should be a smaller change.
    
    A couple of things that are different from llvm#70549:
    
    - This manually fixes up the LiveIntervals rather than recomputing it
    via createAndComputeVirtRegInterval. I'm not sure if there's much of a
    difference with either.
    - For the postpass it's sufficient to just check isUndef() in
    hasUndefinedMergeOp, i.e. we don't need to look up the def in VNInfo.
    
    Running on llvm-test-suite and SPEC CPU 2017 there aren't any changes in
    the number of vsetvlis removed. There are some minor scheduling diffs as
    well as extra spills and fewer spills in some cases (caused by transient
    vsetvlis existing between RISCVInsertVSETVLI and RISCVCoalesceVSETVLI
    when vec regalloc happens), but they are minor and should go away once
    we finish moving the rest of RISCVInsertVSETVLI.
    
    We could also potentially turn off this pass for unoptimised builds.
    lukel97 authored Apr 24, 2024
    603ba4c
  36. [libc][bazel] Allow configure options to alter all targets (llvm#89251)

    The previous state was leading to inconsistencies. Some targets would
    get the options and some wouldn't. As an example, the `MEMORY_COPTS`
    definitions would only apply to the `:string_memory_utils` target but
    not to the `:memcpy` target. This patch makes sure definitions are
    applied throughout the LLVM libc targets as `local_defines`. This
    ensures that the preprocessor definitions don't propagate to depending
    targets outside of LLVM libc, and that all libc targets have consistent
    preprocessor definitions.
    gchatelet authored Apr 24, 2024
    788d159
  37. [AMDGPU] Allow WorkgroupID intrinsics in amdgpu_gfx functions (llvm#8…

    …9773)
    
    With GFX12 architected SGPRs the workgroup ids are trivially available
    in any function called from a compute entrypoint.
    jayfoad authored Apr 24, 2024
    4616368
  38. [libcxx] [modules] Add _LIBCPP_USING_IF_EXISTS on aligned_alloc (llvm…

    …#89827)
    
    This is missing e.g. on Windows. With this change, it's possible to make
    the libcxx std module work on mingw-w64 (although that requires a few
    fixes to those headers).
    
    In the regular cstdlib header, we have _LIBCPP_USING_IF_EXISTS flagged
    on every single reexported function (since
    a9c9183), but the modules seem to only
    have _LIBCPP_USING_IF_EXISTS set on a few individual functions, so far.
    mstorsjo authored Apr 24, 2024
    91526d6
  39. [RISCV] Add test coverage for commutable RVV instructions

    This patch adds test coverage for commutable RVV instructions
    added in llvm#88379.
    
    For each kind of instruction, I add two tests (one for unmasked and
    one for masked). These tests don't cover all the SEWs/LMULs, as I
    don't think that is worthwhile: instructions with different
    SEWs/LMULs are handled the same way.
    
    As the tests show, we can't eliminate two equal instructions if
    there is a use of `V0`. This may be fixed in the future.
    
    Reviewers: asb, jacquesguan, topperc, lukel97, preames
    
    Reviewed By: lukel97
    
    Pull Request: llvm#89889
    wangpc-pp authored Apr 24, 2024
    d149370
  40. [InstCombine] Simplify (X / C0) * C1 + (X % C0) * C2 to `(X / C0) *…

    … (C1 - C2 * C0) + X * C2` (llvm#76285)
    
    Since `DivRemPairPass` runs after `ReassociatePass` in the optimization
    pipeline, I decided to do this simplification in `InstCombine`.
    
    Alive2: https://alive2.llvm.org/ce/z/Jgsiqf
    Fixes llvm#76128.
    dtcxzyw authored Apr 24, 2024
    945eeb2
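The rewrite in this commit follows from `X % C0 == X - (X / C0) * C0`: substituting that into the original expression and collecting the `X / C0` terms gives `(X / C0) * (C1 - C2 * C0) + X * C2`. The Python sketch below checks the identity exhaustively for unsigned 8-bit values with wrapping arithmetic; the function names and the 8-bit width are assumptions made for illustration, and the Alive2 link in the message remains the actual proof.

```python
MASK = 0xFF  # model 8-bit wrapping arithmetic


def before(x, c0, c1, c2):
    """Original form: (X / C0) * C1 + (X % C0) * C2."""
    return ((x // c0) * c1 + (x % c0) * c2) & MASK


def after(x, c0, c1, c2):
    """Simplified form: (X / C0) * (C1 - C2 * C0) + X * C2."""
    return ((x // c0) * (c1 - c2 * c0) + x * c2) & MASK


def all_equal(c0, c1, c2):
    """True if both forms agree for every unsigned 8-bit X."""
    return all(before(x, c0, c1, c2) == after(x, c0, c1, c2) for x in range(256))
```

For example, with `X = 47`, `C0 = 10`, `C1 = 3`, `C2 = 7`, both forms evaluate to 61: `4 * 3 + 7 * 7` on one side and `4 * (3 - 70) + 47 * 7` on the other.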
  41. [ORC] Fix SpeculativeJIT example after 7da6342 (ORC dispatch unificat…

    …ion).
    
    Fixes the bot failure at
    https://lab.llvm.org/buildbot/#/builders/272/builds/14788.
    
    Coding my way home: 6.48551S, 128.21109W
    lhames committed Apr 24, 2024
    e400e90
  42. [libclc] Use a response file when building on Windows (llvm#89756)

    We've recently seen the libclc llvm-link invocations become so long that
    they exceed the character limits on certain platforms.
    
    Using a 'response file' should solve this by offloading the list of
    inputs into a separate file, and using special syntax to pass it to
    llvm-link. Note that neither the response file nor the syntax is
    specific to Windows but we restrict it to that platform regardless. We
    have the option of expanding it to other platforms in the future.
    frasercrmck authored Apr 24, 2024
    effb2f1
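The response-file idea can be sketched as follows. This is an illustration only, not the libclc CMake logic: the helper name, the file name `link_inputs.rsp`, and the assumption that input paths contain no spaces or characters needing quoting are all hypothetical. The `@file` form is the usual LLVM tool convention for passing a response file.

```python
import os
import tempfile


def build_link_command(inputs, output, rsp_dir):
    """Write the input list to a response file and return a short command line.

    Assumes the paths contain no whitespace or characters that need quoting.
    """
    rsp_path = os.path.join(rsp_dir, "link_inputs.rsp")
    with open(rsp_path, "w") as rsp:
        rsp.write("\n".join(inputs) + "\n")
    # The tool expands "@<file>" into the arguments listed in that file.
    return ["llvm-link", "-o", output, "@" + rsp_path], rsp_path


def demo():
    """Build a command for 5000 hypothetical inputs and report what was produced."""
    inputs = [f"obj/file_{i}.bc" for i in range(5000)]
    with tempfile.TemporaryDirectory() as tmp:
        command, rsp_path = build_link_command(inputs, "out.bc", tmp)
        with open(rsp_path) as rsp:
            listed = rsp.read().split()
    return command, listed, inputs
```

However many inputs there are, the command line itself stays four arguments long, which is what keeps it under the platform's character limit.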
  43. 4c3b0a6
  44. [VectorCombine] foldShuffleOfBinops - add support for length changing…

    … shuffles (llvm#88899)
    
    Refactor to be closer to foldShuffleOfCastops - sibling patch to llvm#88743 that can be used to address some of the issues identified in llvm#88693
    RKSimon authored Apr 24, 2024
    282b56f
  45. Bit width of input/result types in OpSConvert/OpUConvert must not be …

    …the same (llvm#89737)
    
    This PR fixes the issue
    llvm#88908
    Attached test case is updated to check that OpSConvert/OpUConvert is not
    generated when input and result types are identical.
    VyacheslavLevytskyy authored Apr 24, 2024
    89d1255
  46. [SPIR-V] Fix pre-legalizer pass in SPIR-V Backend to support more gMI…

    …R opcode inserted by IRTranslator (llvm#89890)
    
    When translating global values, the IRTranslator pass can sometimes
    generate code patterns that require additional effort during
    pre-legalization. This PR addresses this problem to support the
    G_PTRTOINT instruction used in the initialization of a GV.
    VyacheslavLevytskyy authored Apr 24, 2024
    486ea1e
  47. [flang][OpenMP] fix reduction of arrays with non-default lower bounds (

    …llvm#89611)
    
    It turned out that `hlfir::genVariableBox` didn't add lower bounds to
    the boxes it created. Using a shapeshift instead of only a shape adds
    the lower bounds information to the thread-local copy of the box.
    
    Fixes llvm#89259
    tblah authored Apr 24, 2024
    18bf0c3
  48. ceca523
  49. [ARM] Add ARMTargetDefEmitter to llvm-tblgen source

    Missed from llvm#88378, only showed up in the sanitizer builds.
    tmatheson-arm committed Apr 24, 2024
    b8e97f0
  50. 3cb660d
  51. [RISCV] bitreverse-shift.ll - fix typo

    Noticed in llvm#89897
    RKSimon committed Apr 24, 2024
    e5de95d
  52. [TTI] getArithmeticInstrCost - use std::nullopt to create default empt…

    …y `ArrayRef<const Value *> Args` argument. NFC.
    RKSimon committed Apr 24, 2024
    506c84a
  53. [mlir][nvgpu] NVGPU Tutorials (llvm#87065)

    I have a tutorial at EuroLLVM 2024 ([Zero to Hero: Programming Nvidia
    Hopper Tensor Core with MLIR's NVGPU
    Dialect](https://llvm.swoogo.com/2024eurollvm/session/2086997/zero-to-hero-programming-nvidia-hopper-tensor-core-with-mlir's-nvgpu-dialect)).
    For that, I implemented tutorial codes in Python. The focus is the nvgpu
    dialect and how to use its advanced features. I thought it might be
    useful to upstream this.
    
    The tutorial codes are as follows:
    - **Ch0.py:** Hello World
    - **Ch1.py:** 2D Saxpy
    - **Ch2.py:** 2D Saxpy using TMA
    - **Ch3.py:** GEMM 128x128x64 using Tensor Core and TMA 
    - **Ch4.py:** Multistage performant GEMM using Tensor Core and TMA
    - **Ch5.py:** Warp Specialized GEMM using Tensor Core and TMA
    
    I might implement one more chapter:
    
    - **Ch6.py:** Warp Specialized Persistent ping-pong GEMM
    
    This PR also introduces the nvdsl class, making IR building in the
    tutorial easier.
    grypp authored Apr 24, 2024
    4d33082
  54. 333aad7
  55. AMDGPU: Remove dead arguments in test and add SGPR variants

    Also clean up to avoid the memory noise by using return values
    in the trivial cases.
    arsenm committed Apr 24, 2024
    a13ff06
  56. 401658c
  57. 01f8da9
  58. c81ec1f
  59. [AArch64][CodeGen] Add patterns for small negative VScale const (llvm…

    …#89607)
    
    On AArch64, rdvl can accept a negative value, while cntd/cntw/cnth can't.
    Since we do support VScale with a negative multiplier, we do not restrict
    negative values and instead take the hit of having the extra patterns, per PR88108.
    Also add NoUseScalarIncVL to avoid affecting the patterns that work with -mattr=+use-scalar-inc-vl.
        
    Fix llvm#84620
    vfdff authored Apr 24, 2024
    af81d8e
  60. [AMDGPU] Correctly determine the toolchain linker (llvm#89803)

    Summary:
    The AMDGPU toolchain simply took the short name to get the link job
    instead of using the common utilities that respect options like
    `-fuse-ld`. Any linker that isn't `ld.lld` will fail, however we should
    be able to override it.
    jhuber6 authored Apr 24, 2024
    62549db
  61. eaa2eac
  62. [DAG] Add getValid*ShiftAmountConstant wrappers without DemandedElts

    Simplify callers which don't have their own DemandedElts mask.
    
    Noticed while reviewing llvm#88801
    RKSimon committed Apr 24, 2024
    9f2a068
  63. [MLIR][LLVM][Mem2Reg] Extends support for partial stores (llvm#89740)

    This commit enhances the LLVM dialect's Mem2Reg interfaces to support
    partial stores to memory slots. To achieve this support, the `getStored`
    interface method has to be extended with a parameter of the reaching
    definition, which is now necessary to produce the resulting value after
    this store.
    Dinistro authored Apr 24, 2024
    6e9ea6e
  64. [mlir][python] extend LLVM bindings (llvm#89797)

    Add bindings for LLVM pointer type.
    makslevental authored Apr 24, 2024
    79d4d16
  65. [gn] port b8e97f0

    nico committed Apr 24, 2024
    d3f6c2c
  66. [clang][ExtractAPI] Fix handling of anonymous TagDecls (llvm#87772)

    This changes the handling of anonymous TagDecls to the following rules:
    - If the TagDecl is embedded in the declaration for some VarDecl (this
    is the only possibility for RecordDecls), then pretend the child decls
    belong to the VarDecl
    - If it's an EnumDecl proceed as we did previously, i.e., embed it in
    the enclosing DeclContext.
    
    Additionally this fixes a few issues with declaration fragments not
    consistently including "{ ... }" for anonymous TagDecls. To make testing
    these additions easier this patch fixes some text declaration fragments
    merging issues and updates tests accordingly.
    
    rdar://121436298
    daniel-grumberg authored Apr 24, 2024
    2bcbe40
  67. LangRef: fix broken link

    zmodem committed Apr 24, 2024
    93eeca3
  68. [gn] port 71c5964 (-gen-arm-target-def)

    Reverts d3f6c2c, since ARMTargetDefEmitter.cpp has to be in
    llvm-min-tblgen too.
    nico committed Apr 24, 2024
    b87b6e2
  69. [Frontend][OpenMP] Implement getLeafOrCompositeConstructs (llvm#89104)

    This function will break up a construct into constituent leaf and
    composite constructs, e.g. if OMPD_c_d_e and OMPD_d_e are composite
    constructs, then OMPD_a_b_c_d_e will be broken up into the list {OMPD_a,
    OMPD_b, OMPD_c_d_e}.
    kparzysz authored Apr 24, 2024
    d577518
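The example from this commit message can be reproduced with a small longest-match sketch. This is not the LLVM implementation: the function name, the representation of constructs as lists of leaf names, and the longest-match rule are assumptions chosen so that the message's example comes out as described.

```python
def split_leaf_or_composite(leaves, composites):
    """Break a list of leaf names into leaves and known composite runs.

    `composites` is a set of tuples of leaf names (hypothetical representation).
    """
    result = []
    i = 0
    while i < len(leaves):
        # Prefer the longest run of two or more leaves starting at i
        # that forms a known composite construct.
        for j in range(len(leaves), i + 1, -1):
            if tuple(leaves[i:j]) in composites:
                result.append("OMPD_" + "_".join(leaves[i:j]))
                i = j
                break
        else:
            # No composite starts here, so emit the single leaf.
            result.append("OMPD_" + leaves[i])
            i += 1
    return result


composites = {("c", "d", "e"), ("d", "e")}
parts = split_leaf_or_composite(["a", "b", "c", "d", "e"], composites)
```

Here `parts` is `["OMPD_a", "OMPD_b", "OMPD_c_d_e"]`: even though `d_e` is itself composite, the longer `c_d_e` is matched first.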
  70. Allow ZX_ERR_NO_RESOURCES with MAP_ALLOWNOMEM on Fuchsia (llvm#89767)

    This can occur if the virtual address space is (almost) entirely
    mapped or heavily fragmented.
    fabio-d authored Apr 24, 2024
    9cbf96a
  71. [Clang][AArch64] Extend diagnostics when warning non/streaming about …

    …vector size difference (llvm#88380)
    
    Add separate messages about passing arguments or returning parameters
    with scalable types.
    
    ---------
    
    Co-authored-by: Sander de Smalen <sander.desmalen@arm.com>
    dtemirbulatov and sdesmalen-arm authored Apr 24, 2024
    bd34bc6
  72. [MLIR][OpenMP] Make omp.wsloop into a loop wrapper (1/5) (llvm#89209)

    This patch updates the definition of `omp.wsloop` to enforce the
    restrictions of a loop wrapper operation.
    
    Related tests are updated, but this PR on its own will not pass premerge
    tests. All patches in the stack are needed before it compiles and
    passes tests.
    skatrak authored Apr 24, 2024
    07e6c16
  73. [CodeGen] Make the parameter TRI required in some functions. (llvm#85968

    )
    
    Fixes llvm#82659
    
    There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue llvm#82411.
    
    Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
    
    After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
    simonzgx authored Apr 24, 2024
    f6d431f
  74. [coro] Tweak comments about CoroAwaitSuspendInst

    to reflect that there are three variants.
    zmodem committed Apr 24, 2024
    4c16b12
  75. [MLIR][OpenMP] Update op verifiers dependent on omp.wsloop (2/5) (llv…

    …m#89211)
    
    This patch updates verifiers for `omp.ordered`, `omp.ordered.region`,
    `omp.cancel` and `omp.cancellation_point`, which check for a parent
    `omp.wsloop`.
    
    After transitioning to a loop wrapper-based approach, the expected
    direct parent will become `omp.loop_nest` instead, so verifiers need to
    take this into account.
    
    This PR on its own will not pass premerge tests. All patches in the
    stack are needed before it compiles and passes tests.
    skatrak authored Apr 24, 2024
    1465299
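The parent check described in this commit can be modelled with a toy operation tree. This is a sketch under stated assumptions: the `Op` class and `enclosing_wsloop` helper are hypothetical stand-ins for the MLIR verifier code, and only the nesting named in the message (`omp.wsloop` wrapping `omp.loop_nest`) is modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Op:
    """Toy stand-in for an operation: a name and a link to its parent."""
    name: str
    parent: Optional["Op"] = None


def enclosing_wsloop(op):
    """Return the omp.wsloop wrapping the omp.loop_nest that holds `op`, else None."""
    nest = op.parent
    # With loop wrappers, the direct parent is expected to be omp.loop_nest.
    if nest is None or nest.name != "omp.loop_nest":
        return None
    wrapper = nest.parent
    if wrapper is None or wrapper.name != "omp.wsloop":
        return None
    return wrapper


wsloop = Op("omp.wsloop")
nest = Op("omp.loop_nest", parent=wsloop)
ordered = Op("omp.ordered", parent=nest)
```

`enclosing_wsloop(ordered)` finds `wsloop` through the intermediate `omp.loop_nest`, whereas an op placed directly inside `omp.wsloop` no longer satisfies the check.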
  76. [MLIR][SCF] Update scf.parallel lowering to OpenMP (3/5) (llvm#89212)

    This patch makes changes to the `scf.parallel` to `omp.parallel` +
    `omp.wsloop` lowering pass in order to introduce a nested
    `omp.loop_nest` as well, and to follow the new loop wrapper role for
    `omp.wsloop`.
    
    This PR on its own will not pass premerge tests. All patches in the
    stack are needed before it compiles and passes tests.
    skatrak authored Apr 24, 2024
    8843d54
  77. [MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (llvm#8…

    …9214)
    
    This patch introduces minimal changes to the MLIR to LLVM IR translation
    of `omp.wsloop` to support the loop wrapper approach.
    
    There is `omp.loop_nest` related translation code that should be
    extracted and shared among all loop operations (e.g. `omp.simd`). This
    would possibly also help in the addition of support for compound
    constructs later on. This first approach is only intended to keep things
    running after the transition to loop wrappers and not to add support for
    other use cases enabled by that transition.
    
    This PR on its own will not pass premerge tests. All patches in the
    stack are needed before it compiles and passes tests.
    skatrak authored Apr 24, 2024
    2e37f28
  78. [Flang][OpenMP][Lower] Update workshare-loop lowering (5/5) (llvm#89215)

    This patch updates lowering from PFT to MLIR of workshare loops to
    follow the loop wrapper approach. Unit tests impacted by this change are
    also updated.
    
    As the last patch of the stack, this should compile and pass unit tests.
    skatrak authored Apr 24, 2024
    ca4dbc2
  79. [flang] lower SHAPE intrinsic (llvm#89785)

    Semantics usually fold SHAPE into an array constructor, but sometimes it
    cannot (like when the source is a function result that cannot be
    duplicated in expression analysis). Add lowering handling for shape.
    jeanPerier authored Apr 24, 2024
    3328ccf
  80. [CostModel][AArch64] Improve fixed-width vector costs for get.active.…

    …lane.mask (llvm#89068)
    
    When SVE is available we can lower calls to get.active.lane.mask using
    the SVE whilelo instruction, however in practice since vXi1 types are
    not legal for NEON we often end up expanding the predicate into a vector
    of integers, e.g. v4i1 -> v4i32. This usually happens when we have to
    keep the predicate live out of the block, for example when the predicate
    is the incoming value to a PHI node in a tail-folded vector loop.
    Currently in such cases the intrinsic call has a cost of 1, which is far
    too low when considering the extra instructions required to expand the
    predicate. This patch fixes that by basing the cost on the number of
    lane moves required for expansion. This is required for a follow-on
    patch that adds the cost of the intrinsic call to the vectorisation cost
    model, so that we can teach the vectoriser to make better choices.
    david-arm authored Apr 24, 2024
    96b2e35
  81. [Clang] [NFC] Prevent null pointer dereference in Sema::InstantiateFu…

    …nctionDefinition (llvm#89801)
    
    In the lambda function within
    clang::Sema::InstantiateFunctionDefinition, the return value of a
    function that may return null is now checked before dereferencing to
    avoid potential null pointer dereference issues which can lead to
    crashes or undefined behavior in the program.
    smanna12 authored Apr 24, 2024
    e58dcf1
  82. [AMDGPU] Add a trap lowering workaround for gfx11 (llvm#85854)

    On gfx11 shaders run with PRIV=1, which causes `s_trap 2` to be treated
    as a nop, which means it isn't a correct lowering for the trap
    intrinsic. As a workaround, this commit instead lowers the trap
    intrinsic to instructions that simulate the behavior of s_trap 2.
    
    Fixes: SWDEV-438421
    epilk authored Apr 24, 2024
    a047147
  83. 21ef187
  84. 50082d6
  85. [Mips] Use ANDi in for zero-extend in subword atomic umax/umin for bo…

    …th r2 and pre-R2 (llvm#89881)
    
    For unsigned max/min, ANDi is available on all ISA revisions and can
    perform the zero-extension before the slt instruction, which saves
    one instruction.
    yingopq authored Apr 24, 2024
    e1aa162
  86. a682f52
  87. [clang][RISCV] Remove LMUL=8 scalar input for some vector crypto inst…

    …ructions (llvm#89867)
    
    Since the requirement is EEW=32, it's impossible that EGW=128
    needs LMUL=8.
    4vtomat authored Apr 24, 2024
    418bdb4

Commits on Apr 29, 2024

  1. 3c30af4

Commits on May 3, 2024

  1. Post-merge fixes

    skatrak committed May 3, 2024
    b8211e9
  2. e967097