Fix sycl-web lit tests #16996
Closed
… tables This leverages the sharded structure of the builtins to make it easy to directly tablegen most of the AArch64 and ARM builtins while still using X-macros for a few edge cases. It also extracts common prefixes as part of that. This makes the string tables for these targets dramatically smaller. This is especially important as the SVE builtins represent (by far) the largest string table and largest builtin table across all the targets in Clang.
…and info This moves the main builtins and several targets to use nice generated string tables and info structures rather than X-macros. Even without obvious prefixes factored out, the resulting tables are significantly smaller and much cheaper to compile without all the X-macro overhead. This leaves the X-macros in place for atomic builtins which have a wide range of uses that don't seem reasonable to fold into TableGen. As future work, these should move to their own file (whether as X-macros or just generated patterns) so the AST headers don't have to include all the data for other builtins.
This requires adding support to the general builtins emission for producing prefixed builtin infos separately from un-prefixed which is a bit crufty. But we don't currently have any good way of having a more refined model than a single hard-coded prefix string per TableGen emission. Something more powerful and/or elegant is possible, but this is a fairly minimal first step that at least allows factoring out the builtin prefix for something like X86.
This target's builtins have an especially long prefix and so we get over 2x reduction in string table size required with this change.
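To make the size argument concrete, here is a rough Python model of why factoring out a long common prefix shrinks a string table. It is only an illustration, not the tables the TableGen backend emits: the helper names, the NUL-terminated layout, and the four sample builtin names are assumptions chosen for the example.

```python
# Conceptual model only: not the layout Clang's generated tables use.

def string_table_size(names):
    """Bytes needed to store the names back to back, each NUL-terminated."""
    return sum(len(name) + 1 for name in names)


def factor_prefix(names, prefix):
    """Split names into (suffixes of prefixed names, names lacking the prefix)."""
    prefixed = [n[len(prefix):] for n in names if n.startswith(prefix)]
    unprefixed = [n for n in names if not n.startswith(prefix)]
    return prefixed, unprefixed


def factored_table_size(names, prefix):
    """Size when the prefix is stored once and prefixed names keep only suffixes."""
    prefixed, unprefixed = factor_prefix(names, prefix)
    return (string_table_size([prefix])
            + string_table_size(prefixed)
            + string_table_size(unprefixed))


# Illustrative sample; a real target has thousands of entries.
PREFIX = "__builtin_ia32_"
SAMPLE = [
    "__builtin_ia32_paddb",
    "__builtin_ia32_paddw",
    "__builtin_ia32_pxor",
    "__rdtsc",
]

if __name__ == "__main__":
    print("flat:", string_table_size(SAMPLE))
    print("factored:", factored_table_size(SAMPLE, PREFIX))
```

The longer the shared prefix and the more names that carry it, the larger the saving, which matches the observation above about targets with an especially long prefix.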
… (#125564) Add Rocdl support for the following GFX950 instructions:
- CVT_SCALE_PK_FP8_F32
- CVT_SCALE_PK_BF8_F32
- CVT_SCALE_SR_FP8_F32
- CVT_SCALE_SR_BF8_F32
- CVT_SCALE_PK_F32_FP8
- CVT_SCALE_PK_F32_BF8
- CVT_SCALE_F32_FP8
- CVT_SCALE_F32_BF8
… reduction to scalar (#125288) This generalizes handleVectorReduceIntrinsic to allow intrinsics where the return type is not the same as the fields. This patch then applies the generalized handleVectorReduceIntrinsic to support the following Arm NEON add reduction to scalar intrinsics: llvm.aarch64.neon.{faddv, saddv, uaddv}. Updates the tests from llvm/llvm-project#125271
This adds handleVectorReduceWithStarterIntrinsic() (similar to handleVectorReduceIntrinsic but for intrinsics with an additional starting parameter) and uses it to handle Intrinsic::vector_reduce_f{add,mul}. Updates the tests from llvm/llvm-project#125597
This patch fixes: clang/lib/Frontend/CompilerInvocation.cpp:3854:16: error: enumeration value 'Ver20' not handled in switch [-Werror,-Wswitch]
Now that we store the command in the CommandReturnObject (#125132) we can check the command in the print callback.
…571) An LValueToRValue cast shouldn't be ignored, so bail out of the visitor if we encounter one.
Teach InterleavedAccessPass to recognize the following patterns:
- vp.store an interleaved scalable vector
- Deinterleaving a scalable vector loaded from vp.load

Upon recognizing these patterns, IA will collect the interleaved / deinterleaved operands and delegate them over to their respective newly-added TLI hooks. For RISC-V, these patterns are lowered into segmented loads/stores.

Right now we only recognize power-of-two (de)interleave cases, in which (de)interleave4/8 are synthesized from a tree of (de)interleave2.
---------
Co-authored-by: Nikolay Panchenko <[email protected]>
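The "tree of (de)interleave2" shape can be modelled on plain Python lists. This is a sketch of the data movement only, not the IR the pass matches; the function names are made up for the example. Note that the second level of the tree yields the fields in the order 0, 2, 1, 3, so they have to be reordered to recover fields 0 to 3.

```python
def deinterleave2(v):
    """Split a vector into its even-indexed and odd-indexed elements."""
    return v[0::2], v[1::2]


def interleave2(a, b):
    """Inverse of deinterleave2: alternate the elements of a and b."""
    out = []
    for x, y in zip(a, b):
        out += [x, y]
    return out


def deinterleave4(v):
    """Factor-4 deinterleave built from a two-level tree of deinterleave2."""
    evens, odds = deinterleave2(v)   # evens hold fields 0 and 2, odds hold 1 and 3
    f0, f2 = deinterleave2(evens)
    f1, f3 = deinterleave2(odds)
    return f0, f1, f2, f3


def interleave4(f0, f1, f2, f3):
    """Factor-4 interleave built from a two-level tree of interleave2."""
    return interleave2(interleave2(f0, f2), interleave2(f1, f3))


if __name__ == "__main__":
    fields = deinterleave4(list(range(8)))
    print(fields)
    print(interleave4(*fields))
```

A factor-8 version would add one more level in the same way, which is why only power-of-two factors fall out of this construction.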
This PR adds support for UB constant materialization (i.e., generating `ub::PoisonOp`) to `VectorDialect::materializeConstant`. This was the reason why the vector folders generating poison didn't work.
Resource keys have the problem that you can’t parse them from mlir assembly if they have special or non-printable characters, but nothing prevents you from specifying such a key when you create e.g. a DenseResourceElementsAttr, and it works fine in other ways, including bytecode emission and parsing.

This PR solves the parsing by quoting and escaping keys with special or non-printable characters in mlir assembly, in the same way as symbols, e.g.:
```
module attributes {
  fst = dense_resource<resource_fst> : tensor<2xf16>,
  snd = dense_resource<"resource\09snd"> : tensor<2xf16>
} {}

{-#
  dialect_resources: {
    builtin: {
      resource_fst: "0x0200000001000200",
      "resource\09snd": "0x0200000008000900"
    }
  }
#-}
```
By not quoting keys without special or non-printable characters, the change is effectively backwards compatible.

The change is tested by:
1. adding a test with a dense resource handle key with special characters to `dense-resource-elements-attr.mlir`
2. adding special and unprintable characters to some resource keys in the existing lit tests `pretty-resources-print.mlir` and `mlir/test/Bytecode/resources.mlir`
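The quoting rule described above can be sketched as a small Python function. This is a model of the idea, not MLIR's printer: the function name, the bare-identifier pattern, and the exact escape set (`\\` for a backslash, `\HH` uppercase hex for a double quote and any non-printable or non-ASCII byte) are assumptions made for the example.

```python
import re

# Assumed shape of a key that can be printed without quotes.
_BARE_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_$.]*")


def format_resource_key(key: str) -> str:
    """Return the key unchanged if it is a bare identifier, else quoted and escaped."""
    if _BARE_KEY.fullmatch(key):
        return key  # unquoted, so previously valid assembly prints the same way
    out = []
    for byte in key.encode("utf-8"):
        ch = chr(byte)
        if ch == "\\":
            out.append("\\\\")
        elif 0x20 <= byte < 0x7F and ch != '"':
            out.append(ch)                 # printable ASCII is kept as-is
        else:
            out.append(f"\\{byte:02X}")    # e.g. a tab becomes \09
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    print(format_resource_key("resource_fst"))
    print(format_resource_key("resource\tsnd"))
```

Leaving bare keys unquoted is what keeps the change backwards compatible: only keys that could not be parsed before take the new quoted form.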
… constructs (#125750) Previous patch was too restrictive and didn't take into account cuf kernels and openacc compute constructs as being device context.
…romCmp`. NFCI. (#125666) I believe it is unused since we always convert it into `V == Mask ^ C`. Code coverage: https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/Analysis/ValueTracking.cpp.html#L706
If a variant part has a 128-bit discriminator, then DwarfUnit::constructTypeDIE will assert. This patch fixes the problem by allowing any size of integer to be used here. This is largely accomplished by moving part of DwarfUnit::addConstantValue to a new method. Fixes #119655
Motivating case: https://github.com/llvm/llvm-project/blob/64927af52a3bedf2b20d6cdd98bb47d9bba630f9/llvm/lib/Analysis/ValueTracking.cpp#L8600-L8602 It is translated into `xor (X & 2) != 0, (Y & 2) != 0`. Alive2: https://alive2.llvm.org/ce/z/dJehZ8
The functions now use VPBuilder to insert recipes and the VPBB argument is unused. Clean it up.
…san (#125763) Previously this test was entirely disabled under asan, but not hwasan. Instead of disabling the test, make the test compatible with both asan and hwasan by disabling sanitizers only on the subroutine that does the stack-smashing.
…739) Fixes: #58555
Adds codegen support for fence.acquire and fence.release, a script and generated tests for all possible legal fences, and cleans up some tablegen rules.
After the changes in 89001d1, the container pushes failed, because it was attempting to push the same container twice. This fixes the sed expression used to push the :latest alias for each container.
…#125729) Currently handled (suboptimally) by handleUnknownInstruction:
- llvm.aarch64.neon.fmaxv (Floating-point Maximum (vector))
- llvm.aarch64.neon.fminv
- llvm.aarch64.neon.fmaxnmv (Floating-point Maximum Number across Vector)
- llvm.aarch64.neon.fminnmv

(not to be mistaken for llvm.aarch64.neon.f{max,min}, which are correctly handled by `maybeHandleSimpleNomemIntrinsic`)

Forked from llvm/test/CodeGen/AArch64/arm64-fminv.ll
Co-authored-by: Carlo Cabrera <[email protected]>
…12792) New proposed function `clang-format-vc-diff`. It is the same as calling `clang-format-region` on all diffs between the content of a buffer-file and the content of the file at git revision HEAD. If the current buffer is saved, this is essentially the same thing as `git-clang-format -f {filename}`. The motivation is that many projects (LLVM included) both have code that is non-compliant with their clang-format style and disallow unrelated format diffs in PRs. This means users can't just run `clang-format-buffer` on the buffer they are working on, and need to go through all the regions by hand to get them formatted. This is both an error-prone and annoying workflow.
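The underlying idea, formatting only the regions that differ from HEAD, can be sketched outside Emacs as follows. This is not the Emacs Lisp implementation from the commit; it assumes a zero-context unified diff (for example from `git diff -U0 HEAD -- <file>`) as input, and the helper names are hypothetical. It turns hunk headers into the `--lines=<start>:<end>` arguments that clang-format accepts.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,2 +11,3 @@".
_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@", re.MULTILINE)


def changed_line_ranges(diff_text):
    """Return (first, last) 1-based line ranges added or changed in the new file."""
    ranges = []
    for m in _HUNK.finditer(diff_text):
        start = int(m.group(1))
        count = int(m.group(2)) if m.group(2) is not None else 1
        if count == 0:
            continue  # pure deletion: nothing left in the new file to format
        ranges.append((start, start + count - 1))
    return ranges


def clang_format_line_args(ranges):
    """Build one --lines argument per changed range."""
    return [f"--lines={first}:{last}" for first, last in ranges]


if __name__ == "__main__":
    sample = "@@ -3 +3,2 @@\n-old\n+new1\n+new2\n@@ -10,2 +11,0 @@\n-a\n-b\n"
    print(clang_format_line_args(changed_line_ranges(sample)))
```

Restricting formatting to these ranges is what keeps unrelated, already non-compliant code out of the resulting diff.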
Currently handled (suboptimally) by handleUnknownInstruction:
- llvm.aarch64.neon.saddlv
- llvm.aarch64.neon.uaddlv

Forked from llvm/test/CodeGen/AArch64/arm64-vaddlv.ll
I thought I had added tests together with llvm/llvm-project#125276, but they are still in my sandbox. These are the tests that were meant for this PR.
This patch implements the instruction cost for the vp.splice intrinsic. To support type-based queries for LV, add a constant index when querying `getShuffleCost()`. We get the same cost no matter what `index` is, because it only changes the cost from that of `vslide.vx` to that of `vslide.vi`, and the cost of `vslide.vx` is the same as `vslide.vi` in the current RISCV implementation.
Better to use TTI::getScalarizationOverhead instead of TTI::getVectorInstrCost to correctly calculate the costs of buildvectors/extracts. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm/llvm-project#125725
CONFLICT (content): Merge conflict in llvm/include/llvm/IR/Intrinsics.h
CONFLICT (content): Merge conflict in llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp
…19730) Even in cases where handles are supported, references are still preferable for performance. This is because a ref uses one less register and can avoid the handle-creating code associated with taking the address of a tex/surf/sampler.
Fix conflicts and update failing lit tests.