LLVM and SPIRV-LLVM-Translator pulldown (WW06) #12661

Merged
merged 678 commits into from
Feb 13, 2024


sys-ce-bb
Contributor

zyn0217 and others added 30 commits February 6, 2024 09:59
…instantiation if possible (#80594)

Before the constraint substitution, we employ
`getTemplateInstantiationArgs`, which in turn attempts to inspect
`TemplateArgument`s from the function template. For parameter packs from
their parent contexts, we used to extract the arguments from the
specialization type, which could result in non-canonical argument
types e.g. `PackExpansionType`.

This may break the contract that, during a tree transformation, in
`TreeTransform::TryExpandParameterPacks`, the corresponding
`TemplateArgument`s for an `UnexpandedParameterPack` are expected to be
of `Pack` kinds if we're expanding template parameters.

Fixes llvm/llvm-project#72557.
llvm-project/mlir/test/CAPI/sparse_tensor.c:50:42:
error: format specifies type 'unsigned long' but the argument has type 'MlirSparseTensorLevelType' (aka 'unsigned long long') [-Werror,-Wformat]
   50 |     fprintf(stderr, "level_type: %lu\n", lvlTypes[l]);
      |                                  ~~~     ^~~~~~~~~~~
      |                                  %llu
1 error generated.
This patch adjusts the build process for building the toolchain for the
CI container to perform more rigorous perf-training for PGO,
particularly building the entirety of LLVM as that is what showed the
best results while benchmarking. This patch also splits the job into two
stages to avoid timeouts due to the large increase in buildtime. There
are a couple other hacks added in here to make things work that we can
do away with eventually once we're able to run jobs like this on more
powerful self-hosted runners.
debugserver on arm64 devices can manage both Byte Address Select
watchpoints (1-8 bytes) and MASK watchpoints (8 bytes-2 gigabytes). This
adds a SupportedWatchpointTypes key to the QSupported response from
debugserver with a list of these, so lldb can take full advantage of
them when creating larger regions with a single hardware watchpoint.

Also add documentation for this, and two other lldb extensions, to the
lldb-gdb-remote.txt documentation.

Re-enable TestLargeWatchpoint.py on Darwin systems when testing with the
in-tree built debugserver. I can remove the "in-tree built debugserver"
in the future when this new key is handled by an Xcode debugserver.
llvm-project/mlir/test/CAPI/sparse_tensor.c:50:43:
error: format specifies type 'unsigned long long' but the argument has type 'MlirSparseTensorLevelType' (aka 'unsigned long') [-Werror,-Wformat]
    fprintf(stderr, "level_type: %llu\n", lvlTypes[l]);
                                 ~~~~     ^~~~~~~~~~~
                                 %lu
1 error generated.
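
The two diagnostics above disagree because the same 64-bit type is `unsigned long long` on one platform and `unsigned long` on another, so neither `%lu` nor `%llu` is portable. One common way around this, sketched below, is the `PRIu64` macro from `<cinttypes>`. The `LevelType` alias and `formatLevelType` helper are hypothetical names for illustration, and the sketch assumes `MlirSparseTensorLevelType` is a typedef for `uint64_t`. It is not necessarily the fix that landed upstream.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for MlirSparseTensorLevelType (assumed to be uint64_t).
using LevelType = std::uint64_t;

// Formats a level type without hard-coding a length modifier: PRIu64 expands
// to the correct conversion specifier for uint64_t on the current platform.
std::string formatLevelType(LevelType Lt) {
  char Buf[64];
  int N = std::snprintf(Buf, sizeof(Buf), "level_type: %" PRIu64, Lt);
  // A negative result is an encoding error; N >= sizeof(Buf) means truncation.
  assert(N > 0 && static_cast<std::size_t>(N) < sizeof(Buf));
  return std::string(Buf);
}
```

An alternative with the same effect is to cast the argument to `unsigned long long` and keep `%llu`.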
…0725)

This is part of
llvm/llvm-project@66347e5

The regression in downstream projects is about transfer_read patterns,
which needs more investigation. Add the support for transfer_write for
now.
…Registry.def (#80779)

This matches the optimization pipeline's PassRegistry.def.

I ran into a bug where CONSTRUCTOR wasn't always being used (in
PassBuilder::registerMachineFunctionAnalyses()).

Make DUMMY_* just accept a pass name, there's no point in having proper
constructors if the generated dummy class has a templated constructor
accepting arbitrary arguments.

Remove unused getPassNameFromLegacyName() as it was using this but for
no purpose.

Remove DUMMY_MACHINE_FUNCTION_ANALYSIS, we can just add those as we port
them.

This for some reason exposed missing mock calls in existing unittests.
new

```C
while (_Generic(x, //
           long: x)(x) > x) {
}
while (_Generic(x, //
           long: x)(x)) {
}
```

old

```C
while (_Generic(x, //
       long: x)(x) > x) {
}
while (_Generic(x, //
    long: x)(x)) {
}
```

In the first case above, the second line previously aligned to the open
parenthesis.  The 4 spaces did not get added by the fallback line near
the end of getNewLineColumn because there was already some indentation.
Now the spaces get added explicitly.

In the second case above, without the fake parentheses, the second line
did not respect the outer parentheses, because the LastSpace field did
not get set without the fake parentheses.  Now the indentation of the
outer level is used instead.
…#80199)

This patch removes a couple of redundant buffer copies in emitTable for
setting up calls to decodeULEB128. Instead, provide the Table.data
buffer directly to the calls, since decodeULEB128 does its own buffer
overflow checking.

Factor out 7 explicit loops to emit ULEB128 bytes into emitULEB128. Also
factor out 4 copies of 24-bit numtoskip emission into emitNumToSkip.

The functionality is already covered by existing unit tests and by
virtue of most of the in-tree back-ends exercising the decoder emitter.
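
For readers unfamiliar with the encoding, here is a minimal sketch of ULEB128 emission and bounds-checked decoding. `emitULEB128Sketch` and `decodeULEB128Sketch` are hypothetical helpers written for this illustration. They do not reproduce the signatures of LLVM's `decodeULEB128` or of the `emitULEB128` helper mentioned in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends the ULEB128 encoding of Value to Out: 7 payload bits per byte,
// least-significant group first, high bit set on every byte except the last.
void emitULEB128Sketch(std::uint64_t Value, std::vector<std::uint8_t> &Out) {
  do {
    std::uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Decodes one ULEB128 value starting at Data[Pos] and advances Pos past the
// bytes it consumed. Returns false if the input ends before the final byte
// or if the value does not fit in 64 bits.
bool decodeULEB128Sketch(const std::vector<std::uint8_t> &Data,
                         std::size_t &Pos, std::uint64_t &Result) {
  assert(Pos <= Data.size() && "start offset out of range");
  Result = 0;
  for (unsigned Shift = 0; Shift < 64; Shift += 7) {
    if (Pos >= Data.size())
      return false; // ran off the end of the buffer
    std::uint8_t Byte = Data[Pos++];
    std::uint64_t Payload = Byte & 0x7f;
    if (Shift == 63 && Payload > 1)
      return false; // only one bit of room is left at shift 63
    Result |= Payload << Shift;
    if ((Byte & 0x80) == 0)
      return true; // last byte of this value
  }
  return false; // more than ten bytes: overflows 64 bits
}
```

Because the decoder checks the buffer bounds itself, a caller can hand it the table buffer directly instead of copying bytes into a scratch array first, which is the kind of redundancy the patch describes removing.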
…… (#80589)

…DOW_SCALE

MEM_GRANULARITY represents the size of the memory block mapped to a
single shadow entry, and SHADOW_SCALE represents the scale of the shadow
mapping, so the size of a single shadow entry can be computed as
(MEM_GRANULARITY >> SHADOW_SCALE).

This patch replaces the hardcoded SHADOW_ENTRY_SIZE with
(MEM_GRANULARITY >> SHADOW_SCALE).
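
The relationship can be sketched as follows. The constant values (64 and 3) and the `shadowOffset` mapping are assumptions chosen for illustration and are not taken from the patch. The point is only that deriving the entry size from the other two constants keeps all three consistent.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative values only (assumptions, not read from the patched sources).
constexpr std::uint64_t kMemGranularity = 64; // bytes of memory per shadow entry
constexpr unsigned kShadowScale = 3;          // shift applied by the shadow mapping

// Size in bytes of the shadow entry that covers one granule of memory.
std::uint64_t shadowEntrySize(std::uint64_t MemGranularity,
                              unsigned ShadowScale) {
  assert(MemGranularity != 0 &&
         (MemGranularity & (MemGranularity - 1)) == 0 &&
         "granularity is assumed to be a power of two");
  return MemGranularity >> ShadowScale;
}

// Offset of the shadow entry for Addr relative to the start of shadow memory.
// Assumed mapping: round Addr down to its granule, then shift by the scale.
std::uint64_t shadowOffset(std::uint64_t Addr, std::uint64_t MemGranularity,
                           unsigned ShadowScale) {
  return (Addr & ~(MemGranularity - 1)) >> ShadowScale;
}
```

With these values each entry is 8 bytes, and the shadow offsets of two adjacent granules differ by exactly that amount, so a hardcoded entry size would silently go stale if either constant changed.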
… (#80590)

Following the discussion in
https://discourse.llvm.org/t/symboltable-and-symbol-parent-child-relationship/75446,
we should enforce that a symbol's immediate parent is a symbol table.

I changed some tests to pass the verification. In most cases, we can
wrap the func with a module, change the func to another op with regions
(e.g. scf.if), or change the expected error message.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
This adds the `emitc.declare_func` operation, which emits the
declaration of an `emitc.func` at a specific location.
This patch adds a script to automatically query the number of running
jobs and print them to the terminal as this functionality isn't
available through the GitHub UI (unless you are a GitHub administrator).
Under fast-math flags it's possible to convert `sqrt(exp(X))` into
`exp(X * 0.5)`. I suppose that this transformation is always profitable.
This is similar to an optimization that already exists in GCC.
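
A small numeric sketch of why the rewrite needs fast-math: the two forms agree closely for moderate inputs, but `exp(X)` can overflow to infinity where `exp(X * 0.5)` is still finite, so the results are not identical under strict IEEE semantics. The helper names below are hypothetical and only illustrate the arithmetic. They are not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// The expression before the rewrite.
double sqrtOfExp(double X) { return std::sqrt(std::exp(X)); }

// The expression after the rewrite.
double expOfHalf(double X) { return std::exp(X * 0.5); }

// Relative comparison, since the two forms can differ in the last few bits.
bool nearlyEqual(double A, double B, double RelTol) {
  assert(RelTol > 0.0);
  return std::fabs(A - B) <= RelTol * std::fmax(std::fabs(A), std::fabs(B));
}
```

For `X = 3.0` the two functions agree to within rounding. For `X = 1000.0`, `sqrtOfExp` returns infinity because `exp(1000.0)` overflows, while `expOfHalf` returns a finite value.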
Resolves #27008, #39735, #53013, #63619.

Hello, this PR adds the MainIncludeChar option to clang-format, which
selects the include syntax considered when searching for the
main header: quotes (`#include "foo.hpp"`, the default), brackets
(`#include <foo.hpp>`), or both.

The lack of support for brackets has been reported many times, see the
linked issues, so I am pretty sure there is a need for it :)

A short note about why I did not implement a regex approach as discussed
in #53013: while a regex would have allowed many extra ways to describe
the main header, the bug descriptions listed above suggest a very simple
need: support brackets for the main header. This PR answers this need
in a quite simple way, with a very simple style option. IMHO the feature
space covered by the regex (again, for which there is no demand :)) can
be implemented later, in addition to the proposed option.

The PR also includes tests for the option with and without grouped
includes.
…kage as per standard. (#79246)

Adding extern "C" to all the entry point functions to make sure that
these functions are not mangled.
…y (#80170)

When unrolling the reduction dimension of something like a matmul for
SME, you can end up with transposed reads of illegal types, like so:

```mlir
%illegalRead = vector.transfer_read %memref[%a, %b]
                : memref<?x?xf32>, vector<[8]x4xf32>
%legalType = vector.transpose %illegalRead, [1, 0]
                : vector<[8]x4xf32> to vector<4x[8]xf32>
```

Here the `vector<[8]x4xf32>` is an illegal type: there's no way to lower
a scalable vector of fixed vectors. However, as the final type
`vector<4x[8]xf32>` is legal, we can instead lift the transpose to
memory (producing a strided memref), and eliminate all the illegal
types. This is shown below.

```mlir
%readSubview = memref.subview %memref[%a, %b] [%c8_vscale, %c4] [%c1, %c1]
                : memref<?x?xf32> to memref<?x?xf32>
%transpose = memref.transpose %readSubview (d0, d1) -> (d1, d0)
                : memref<?x?xf32> to memref<?x?xf32>
%legalType = vector.transfer_read %transpose[%c0, %c0]
                : memref<?x?xf32>, vector<4x[8]xf32>
```
Having libc_errno outside of the namespace causes versioning issues when
trying to link the tests against LLVM-libc. Most of this patch is just
moving libc_errno inside the namespace in tests. This isn't necessary in
the function implementations since those are already inside the
namespace.
Removed target-triple in target-independent test case to fix failing test caused by llvm/llvm-project#67725.
A simple enough op pass so we can test standard instrumentations in
future.
optimizeTan has been renamed to optimizeTrigInversionPairs as a result.

Sadly, it is not mathematically true that all inverse pairs fold to x.
For example, asin(sin(x)) does not fold to x if x is over 2pi.
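
The counterexample is easy to check numerically. The sketch below uses hypothetical helper names. It relies only on the standard fact that `asin` returns values in [-pi/2, pi/2], so `asin(sin(x))` can equal `x` only inside that interval.

```cpp
#include <cassert>
#include <cmath>

// The inverse pair in question.
double asinOfSin(double X) { return std::asin(std::sin(X)); }

// True when asin(sin(X)) reproduces X to within Tol.
bool foldsToIdentity(double X, double Tol) {
  assert(Tol > 0.0);
  return std::fabs(asinOfSin(X) - X) <= Tol;
}
```

`foldsToIdentity(0.5, 1e-12)` holds, but for `X = 7.0` (which is above 2pi) the result is `7.0 - 2pi`, roughly 0.7168, so folding the pair to `x` would change the value.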
Spec: #11301

More accurately, this PR adds support for the named subgroup related features of SPV_INTEL_subgroup_requirements to support implementation of sycl_ext_named_sub_group_sizes (also see #12335). The features related to subgroup lane mapping are not added yet.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@43acfef
@sys-ce-bb sys-ce-bb added the disable-lint Skip linter check step and proceed with build jobs label Feb 8, 2024
@jsji jsji closed this Feb 8, 2024
@jsji jsji reopened this Feb 8, 2024
@jsji jsji self-assigned this Feb 8, 2024
@jsji jsji closed this Feb 9, 2024
@jsji jsji reopened this Feb 9, 2024
@jsji jsji marked this pull request as ready for review February 9, 2024 18:00
@jsji jsji requested review from a team and bader as code owners February 9, 2024 18:00
@jsji
Contributor

jsji commented Feb 9, 2024

This is ready for merge. @bader @intel/llvm-gatekeepers

@jsji
Contributor

jsji commented Feb 13, 2024

@bader @intel/llvm-gatekeepers Can we get this merged? Thanks.

@bader
Contributor

bader commented Feb 13, 2024

/merge

@bb-sycl
Contributor

bb-sycl commented Feb 13, 2024

Tue 13 Feb 2024 03:16:19 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Contributor

bb-sycl commented Feb 13, 2024

Tue 13 Feb 2024 03:20:49 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

@bb-sycl bb-sycl merged commit 4478292 into sycl Feb 13, 2024
15 of 16 checks passed
Labels
disable-lint Skip linter check step and proceed with build jobs