dfki-jugr/llvm-project 0

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.

dfki-jugr/mlir 0

"Multi-Level Intermediate Representation" Compiler Infrastructure

dfki-jugr/tensorflow 0

An Open Source Machine Learning Framework for Everyone

push event llvm/llvm-project

Julian Gross

commit sha addc27bc437d2fb1f31d88294b227ac32be63cc5

Changed wrong ROCDL instructions in GPU lowering. Summary: In the scope of the lowering phase from GPU to ROCDL, the instructions for the conversion patterns seem to be wrong. According to https://github.com/ROCm-Developer-Tools/HIP/blob/master/include/hip/hcc_detail/math_fwd.h the instructions need two underscores in the beginning instead of one. Reviewers: nicolasvasilache, herhut, rriddle Reviewed By: herhut, rriddle Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73535

push time in 2 months

push event llvm/llvm-project

Julian Gross

commit sha 664d2f5bad3eeef5e7cd59492937d1c34feb8642

Add tanh lowering from Standard dialect to NVVM and ROCDL. Summary: The tanh lowering from Standard dialect to NVVM and ROCDL was not working. The conversion patterns are inserted in the lowering files. The test cases for the lowerings were added in the test files. Reviewers: nicolasvasilache, ftynse, herhut Reviewed By: ftynse, herhut Subscribers: merge_guards_bot, ftynse, jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73471

Julian Gross

commit sha 88d6f18225e130b64939205e4c9ee4bfd7bb261d

[mlir] fixed invalid LLVM intrinsics in LLVMOPs.td and llvmir-intrinsics.mlir. Summary: The intrinsic operation added multiple type annotations to the llvm intrinsic operations, but only one is needed. The related tests in llvmir-intrinsics.mlir checked the wrong number and are adjusted as well. Reviewers: nicolasvasilache, ftynse Reviewed By: ftynse Subscribers: merge_guards_bot, ftynse, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73470

push time in 2 months

issue comment google/llvm-premerge-checks

Updated revision didn't trigger a new build

@ChristianKuehnel Yes, you are right. We have uploaded our patch via the provided web interface.

joker-eph

comment created time in 3 months

push event llvm/llvm-project

Julian Gross

commit sha 202ab273e6eca134b69882f100c666fcd3affbcf

[mlir] Added missing GPU lowering ops. Summary: This diff adds missing GPU lowering ops to MLIR. Reviewers: herhut, pifon2a, ftynse Tags: #pre-merge_beta_testing, #llvm Differential Revision: https://reviews.llvm.org/D72439

push time in 3 months

delete branch llvm/llvm-project

delete branch : arcpatch-D72439_1

delete time in 3 months

create branch llvm/llvm-project

branch : arcpatch-D72439_1

created branch time in 3 months

push event dfki-jugr/llvm-project

Aaron Ballman

commit sha 55a51e1c79a21080289ba88d5eac4bbe54ec4272

Disallow an empty string literal in an asm label An empty string literal in an asm label does not make a whole lot of sense. GCC does not diagnose such a construct, but it also generates code that cannot be assembled by gas should two symbols have an empty asm label within the same TU. This does not affect an asm statement with an empty string literal, which is still a useful construct.

Anna Welker

commit sha 346f6b54bd1237a9a5a2d9bb1e424b57dc178998

[ARM][MVE] Enable masked gathers from vector of pointers Adds a pass to the ARM backend that takes a v4i32 gather and transforms it into a call to MVE's masked gather intrinsics. Differential Revision: https://reviews.llvm.org/D71743

LLVM GN Syncbot

commit sha 26ac7923e7df982081e726bb2856fadb35d6d35d

[gn build] Port 346f6b54bd1

Aaron Ballman

commit sha 7a77ad144694ced7b553c644bcbcbfffac2b5fe1

Fixing a formatting nit; NFC

Qiu Chaofan

commit sha b2c2fe72197267af90b4b6a187ab6163f806ce00

[NFC] Move InPQueue into arguments of releaseNode This patch moves `InPQueue` into function arguments instead of template arguments of `releaseNode`, which is a cleaner approach. Differential Revision: https://reviews.llvm.org/D72125

Bevin Hansson

commit sha 8e2b44f7e0641d3776021163ee6a77089cca9cdc

[Intrinsic] Add fixed point division intrinsics. Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division: llvm.sdiv.fix.* llvm.udiv.fix.* These intrinsics perform scaled division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang. Patch by: ebevhan Reviewers: bjope, leonardchan, efriedma, craig.topper Reviewed By: craig.topper Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70007
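The "scaled division" described above can be modelled on scalars as follows. This is only a sketch under assumptions: the function name `sdiv_fix`, the 32-bit operand width, and truncation toward zero are choices made here for illustration; the commit message does not state how the intrinsics round or what happens on overflow.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of signed fixed-point division (hypothetical helper, not the
// LLVM implementation). Both operands carry `scale` fractional bits. The
// dividend is widened and pre-scaled so that the quotient keeps `scale`
// fractional bits as well.
std::int32_t sdiv_fix(std::int32_t a, std::int32_t b, unsigned scale) {
  assert(b != 0 && "division by zero is not modelled");
  assert(scale < 32 && "scale must be smaller than the operand width");
  // Multiply instead of shifting so negative dividends are well defined.
  std::int64_t wide = static_cast<std::int64_t>(a) * (std::int64_t{1} << scale);
  // Assumption: the quotient fits in 32 bits; C++ division truncates toward zero.
  return static_cast<std::int32_t>(wide / b);
}
```

With 16 fractional bits, 3.0 is encoded as 196608 and 2.0 as 131072, and `sdiv_fix(196608, 131072, 16)` yields 98304, the encoding of 1.5.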

Andi-Bogdan Postelnicu

commit sha ba129c7d0f5c7c32398ad708c88e14cb06a339ad

[clang-tidy] Disable match on `if constexpr` statements in template instantiation for `readability-misleading-indentation` check. Summary: Fixes `readability-misleading-indentation` for `if constexpr`. This is very similar to D71980. Reviewers: alexfh Subscribers: xazax.hun, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72333

Simon Tatham

commit sha 3100480925df10960c1e0a077dd9875037d3fe29

[ARM,MVE] Intrinsics for partial-overwrite imm shifts. This batch of intrinsics covers two sets of immediate shift instructions, which have in common that they only overwrite part of their output register and so they need an extra input giving its previous value. The VSLI and VSRI instructions shift each lane of the input vector left or right just as if they were normal immediate VSHL/VSHR, but then they only overwrite the output bits that correspond to actual shifted bits of the input. So VSLI will leave the low n bits of each output lane unchanged, and VSRI the same with the top n bits. The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input vector of 2n-bit integers, shift each lane right by a constant, and then narrow the shifted result to only n bits. So they only overwrite half of the n-bit lanes in the output register, and the B/T suffix indicates whether it's the bottom or top half of each 2n-bit lane. I've implemented the whole of the latter family using a single IR intrinsic `vshrn`, which takes a lot of i32 parameters indicating which instruction it expands to (by specifying signedness of the input and output types, whether it saturates and/or rounds, etc). Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72328
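The partial-overwrite behaviour of VSLI and VSRI described above can be illustrated with a scalar model of a single 32-bit lane. This is a sketch, not the intrinsics themselves: the names `vsli_lane` and `vsri_lane`, the fixed lane width, and the restricted shift ranges are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// One 32-bit lane of a shift-left-and-insert: the shifted source overwrites
// bits n..31 of the destination, and the low n bits of `dst` are kept.
std::uint32_t vsli_lane(std::uint32_t dst, std::uint32_t src, unsigned n) {
  assert(n < 32 && "sketch only handles shift counts 0..31");
  // Mask of the low n bits; n == 0 is special-cased to avoid a 32-bit shift.
  std::uint32_t keep = (n == 0) ? 0u : (0xFFFFFFFFu >> (32 - n));
  return (src << n) | (dst & keep);
}

// One 32-bit lane of a shift-right-and-insert: the shifted source overwrites
// the low 32-n bits, and the top n bits of `dst` are kept.
std::uint32_t vsri_lane(std::uint32_t dst, std::uint32_t src, unsigned n) {
  assert(n >= 1 && n < 32 && "sketch only handles shift counts 1..31");
  std::uint32_t written = 0xFFFFFFFFu >> n; // bits that receive shifted data
  return (src >> n) | (dst & ~written);
}
```

For example, `vsli_lane(0xAAAAAAAA, 0x000000FF, 8)` gives `0x0000FFAA`: the low byte of the previous destination value survives, which is why the intrinsics need that value as an extra input.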

Simon Tatham

commit sha dac7b23cc3efbb4ccb6a9ea101f367f866f334e2

[ARM,MVE] Intrinsics for variable shift instructions. This batch of intrinsics fills in all the shift instructions that take a variable shift distance in a register, instead of an immediate. Some of these instructions take a single shift distance in a scalar register and apply it to all lanes; others take a vector of per-lane distances. These instructions are all basically one family, varying in whether they saturate out-of-range values, and whether they round when bits are shifted off the bottom. I've implemented them at the IR level by a much smaller family of IR intrinsics, which take flag parameters to indicate saturating and/or rounding (along with the usual one to specify signed/unsigned integers). An oddity is that all of them are //left// shift instructions – but if you pass a negative shift count, they'll shift right. So the vector shift distances are always vectors of //signed// integers, regardless of whether you're considering the other input vector to be of signed or unsigned. Also, even the simplest `vshlq` instruction in this family (neither saturating nor rounding) has to be implemented as an IR intrinsic, because the ordinary LLVM IR `shl` operation would consider an out-of-range shift count to be undefined behavior. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72329
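The "left shift that shifts right for negative counts" described above can be sketched for one unsigned 32-bit lane, in the plain variant that neither saturates nor rounds. The name `vshl_lane_u32` is hypothetical, and returning 0 for out-of-range counts is an assumption of this sketch; the commit message only says that such counts must not be undefined behavior.

```cpp
#include <cstdint>

// Scalar model of a variable shift on one unsigned 32-bit lane. The shift
// distance is signed: positive counts shift left, negative counts shift right.
std::uint32_t vshl_lane_u32(std::uint32_t value, int shift) {
  // Assumption: counts of 32 or more in either direction shift every bit out.
  if (shift >= 32 || shift <= -32)
    return 0;
  if (shift >= 0)
    return value << shift;
  return value >> (-shift);
}
```

Here `vshl_lane_u32(1, 4)` is 16 and `vshl_lane_u32(16, -4)` is 1, whereas a plain C++ or LLVM IR shift by 32 or more has no defined result.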

Kazu Hirata

commit sha ead815924e6ebeaf02c31c37ebf7a560b5fdf67b

[JumpThreading] Thread jumps through two basic blocks

Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like:

    bb3:
      %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ]
      %tobool = icmp eq i32 %cond, 0
      br i1 %tobool, label %bb4, label ...

    bb4:
      %cmp = icmp eq i32* %var, null
      br i1 %cmp, label bb5, label bb6

by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have:

    bb3:
      %var = phi i32* [ @a, %bb2 ]
      %tobool = icmp eq i32 %cond, 0
      br i1 %tobool, label %bb4, label ...

    bb3.dup:
      %var = phi i32* [ null, %bb1 ]
      %tobool = icmp eq i32 %cond, 0
      br i1 %tobool, label %bb4, label ...

    bb4:
      %cmp = icmp eq i32* %var, null
      br i1 %cmp, label bb5, label bb6

Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247

Alexey Bataev

commit sha c74a8adda3bc4fc5714aef14cdcfda944d3038a0

[OPENMP]Allow comma in combiner expression. Use ParseExpression() instead of ParseAssignmentExpression() to allow commas in combiner expressions.

Sanjay Patel

commit sha 780ba1f22b53116918cf12decccaed7ba2292bd5

[DAGCombiner] clean up extract-of-concat fold; NFC This hopes to improve readability and adds an assert. The functional change noted by the TODO comment is proposed in: D72361

Sanjay Patel

commit sha 5dfd52398f5c1b67024106febdc68e6b12f8ad37

[InstCombine] Adding testcase for Z / (1.0 / Y) => (Y * Z); NFC The added testcase shows the current transformation for the operation Z / (1.0 / Y), which remains unchanged. This will be updated to align with the transformed code (Y * Z) with D72319. The existing transformation Z / (X / Y) => (Y * Z) / X is not handling this case as there are multiple uses for (1.0 / Y) in this testcase. Patch by: @raghesh (Raghesh Aloor) Differential Revision: https://reviews.llvm.org/D72388
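The two forms named in the commit title can be compared directly on scalars. This is an illustrative sketch with hypothetical function names, not the InstCombine code. The forms are algebraically equal, but in floating point they can differ in the last bits when 1.0 / Y is not exactly representable; the conditions under which the compiler applies the rewrite are not stated in the commit message.

```cpp
#include <cassert>

// Original form: divide Z by the reciprocal of Y.
double divideByReciprocal(double y, double z) {
  assert(y != 0.0 && "the reciprocal 1.0 / y must be finite in this sketch");
  return z / (1.0 / y);
}

// Rewritten form: a single multiplication.
double multiplyDirectly(double y, double z) { return y * z; }
```

For Y = 4.0 and Z = 3.0 both functions return 12.0, since 1.0 / 4.0 is exactly representable.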

Simon Pilgrim

commit sha 108279948de31eba4f212b2a4715030b9d471c9e

[SelectionDAG] Use llvm::Optional<APInt> for FoldValue. Use llvm::Optional<APInt> instead of std::pair<APInt, bool> with the bool second being used to report success/failure of fold.
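The change of return type can be sketched with standard library types. The assumptions here: `std::optional` (C++17) stands in for `llvm::Optional`, `std::uint64_t` stands in for `APInt`, and the two `foldDiv*` functions are hypothetical examples rather than the actual `FoldValue` code.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Before: the bool in the pair reports whether the fold succeeded, and the
// first member holds a dummy value on failure.
std::pair<std::uint64_t, bool> foldDivPair(std::uint64_t lhs, std::uint64_t rhs) {
  if (rhs == 0)
    return {0, false};
  return {lhs / rhs, true};
}

// After: an empty optional reports failure, so no dummy value exists.
std::optional<std::uint64_t> foldDivOptional(std::uint64_t lhs, std::uint64_t rhs) {
  if (rhs == 0)
    return std::nullopt;
  return lhs / rhs;
}
```

A caller of the optional version has to check for a value before using it, whereas the pair version lets the dummy value be read by accident.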

Michael Liao

commit sha 07a569a0539a12700401b8f4221af0a58f28a654

[amdgpu] Remove unused header. NFC.

Jonas Devlieghere

commit sha bbbbf8a1065e9420e3cc7c958897683e84023075

[lldb/CMake] Use LLDB's autodetection logic for libxml2 Libxml2 is already an optional dependency. It should use the same infrastructure as the other dependencies. Differential revision: https://reviews.llvm.org/D72290

Simon Pilgrim

commit sha 5936717fa6537812257990143e2384bb78486ef9

Fix "pointer is null" static analyzer warning. NFCI. Use castAs<> instead of getAs<> since we know that the pointer will be valid (and is dereferenced immediately below).

Simon Pilgrim

commit sha 19bfb6d8df6c23c8c8d19af9221d12bf08244b51

Fix "pointer is null" static analyzer warning. NFCI. Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately below in the getSignature call).

Simon Pilgrim

commit sha 46e2f89364ce24a06953d08c78218fb5548a9fa3

[MC] writeFragment - assert MCFragment::FT_Fill length is legal. Silence (clang/MSVC) static analyzer warnings that the fragment data may either write out of bounds of the local array or reference uninitialized data.

Fangrui Song

commit sha 96e2376d02f0840e82b96314108660ecabe63c7f

[ELF] Don't special case weak symbols for pie with no shared objects D59275 added the following clause to Symbol::includeInDynsym() if (isUndefWeak() && Config->Pie && SharedFiles.empty()) return false; D59549 explored the possibility to generalize it for -no-pie. GNU ld's rules are architecture dependent and partly controlled by -z {,no-}dynamic-undefined-weak. Our attempts to mimic its rules are actually half-baked and don't provide perceivable benefits (it can save a few more weak undefined symbols in .dynsym in a -static-pie executable). Let's just delete the rule for simplicity. We will expect cosmetic inconsistencies with ld.bfd in certain -static-pie scenarios. This permits a simplification in D71795. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D71794

push time in 3 months

create branch dfki-jugr/llvm-project

branch : lower_gpu

created branch time in 3 months

fork dfki-jugr/llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.

http://llvm.org

fork in 3 months

push event dfki-jugr/mlir

Christian Sigg

commit sha a68a4bd9f1fbd8f9cf663f55b3ecb7460b2f043b

Fix maskAndClamp in gpu.all_reduce. The clamp value determines the returned predicate. Previously, the clamp value was fixed to 31 and the predicate was therefore always true. This is incorrect for partial warp reductions, but went unnoticed because the returned values happened to be zero (but it could be anything). PiperOrigin-RevId: 285343160

Prakalp Srivastava

commit sha 13ab7525176fe42ec943d5cad072de6ee1d22617

Add a type range for the XLA HLO dialect. PiperOrigin-RevId: 285437835

River Riddle

commit sha cc05a3a33136a2ce56e69bb86f49f7c65a47b118

Try to fold operations in DialectConversion when trying to legalize. This change allows for DialectConversion to attempt folding as a mechanism to legalize illegal operations. This also expands folding support in OpBuilder::createOrFold to generate new constants when folding, and also enables it to work in the context of a PatternRewriter. PiperOrigin-RevId: 285448440

Nicolas Vasilache

commit sha 4a817036212351f58e5bfd6298ba4c32b4a1ab93

Add a layer of EDSC for linalg.GenericOp This will be evolved into a simple programming model for custom ops and custom layers in followup CLs. This CL also deletes the obsolete tablegen's reference-impl.td that was using EDSCs. PiperOrigin-RevId: 285459545

Jing Pu

commit sha 4287da9a41a97b6a94eb1e9b06eafc97f9a53681

Skip generating C++ for "DeclareOpInterfaceMethods" in op interface gen. This is needed for calling the generator on a .td file that contains both OpInterface definitions and op definitions with DeclareOpInterfaceMethods<...> Traits. PiperOrigin-RevId: 285465784

River Riddle

commit sha 55aabdfafb12703ee6ce6cec1cb22729fc21224a

Refactor various canonicalization patterns as in-place folds. This is more efficient, and allows for these to fire in more situations: e.g. createOrFold, DialectConversion, etc. PiperOrigin-RevId: 285476837

Nicolas Vasilache

commit sha 60b6edd8c1bf0c3da3dcd91b72a0d7ee4cd89a6d

Apply a level of sugaring to the linalg.generic EDSC - NFC Make the declarative C++ builder API simpler to use so we can start chaining these ops together. PiperOrigin-RevId: 285496266

Nicolas Vasilache

commit sha 0f79da8aaebd8b9605ee28762299b49b21630147

Reconcile struct and class for NestedPatternMatchers - NFC This removes a warning and fixes a potential ABI issue on Windows. PiperOrigin-RevId: 285502010

Smit Hinsu

commit sha dd74ee168d6966e412f294a6c9899c7475f0cb54

Add verifyCompatibleShape function overload with shapes PiperOrigin-RevId: 285574334

Uday Bondhugula

commit sha f33f6e16ec430da0989d5e3a287bed1d6d55f73f

Splat op doc - fix misformat / update tablegen op desc. comment - bring op description comment in sync with the doc - fix misformat in doc Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes #317 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/317 from bondhugula:quickfix 7fcd945b318c973b2488b702874c87526855c8ef PiperOrigin-RevId: 285574527

Tres Popp

commit sha 6a43e0f22293aee41639e2931fdfa36046057761

Remove LLVM dependency on mlir::Module and instead check Traits. PiperOrigin-RevId: 285724678

Alex Zinenko

commit sha 7c64c7d41b2426267663d25f9943b08dbe5f69c0

Make memref promotion during std->LLVM lowering the default calling convention During the conversion from the standard dialect to the LLVM dialect, memref-typed arguments are promoted from registers to memory and passed into functions by pointer. This had been introduced into the lowering to work around the absence of calling convention modeling in MLIR to enable better interoperability with LLVM IR generated from C, and has been exercised for several months. Make this promotion the default calling convention when converting to the LLVM dialect. This adds the documentation, simplifies the code and makes the conversion consistent across function operations and function types used in other places, e.g. in high-order functions or attributes, which would not follow the same rule previously. PiperOrigin-RevId: 285751280

Andy Davis

commit sha cd95258ba4f30ea1cf4c2ec053bcfd005797a90f

Adds ExtractSlicesOp to the VectorOps dialect. ExtractSlicesOp extracts slices of its vector operand and with a specified tiling scheme. This operation centralizes the tiling scheme around a single op, which simplifies vector op unrolling and subsequent pattern rewrite transformations. PiperOrigin-RevId: 285761129

Aart Bik

commit sha a712c277f40a4c066add825de75d0ac103b11812

[VectorOps] Add [insert/extract]element definition together with lowering to LLVM

Similar to insert/extract vector instructions but (1) work on 1-D vectors only (2) allow for a dynamic index

    %c3 = constant 3 : index
    %0 = vector.insertelement %arg0, %arg1[%c3 : index] : vector<4xf32>
    %1 = vector.extractelement %arg0[%c3 : index] : vector<4xf32>

PiperOrigin-RevId: 285792205

Mehdi Amini

commit sha 27b00f3a5f8f97a5c13afa8ef61834a75b268903

Remove unused variable (fix warning) NFC PiperOrigin-RevId: 285799680

Jose Ignacio Gomez

commit sha 944cb2afca72daa3c4ff7ce597556633e9952a91

[Linalg] Expose subview promotion as a declarative pattern This PR targets issue #295. It exposes the already existing subview promotion pass as a declarative pattern. Change-Id: If901ebef9fb53fcd0b12ecc536f6b174ce320b92 Closes #315 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/315 from tetuante:issue295 8e5f268b6d85f31015c33505329dbd7a4db97ac5 PiperOrigin-RevId: 285801463

Alex Zinenko

commit sha 4fddec734dd1126dbe6cc63f6338058e46eb1330

Make "LowerToCFG" an operation pass The conversion from the Loops dialect to the Standard dialect, also known as loop-to-cfg lowering, has been historically a function pass. It can be required on non-Standard function Ops, in particular the recently introduced GPU functions. Make the conversion an operation pass instead of a function pass. PiperOrigin-RevId: 285814560

River Riddle

commit sha 294a21e1625dea54a65437549187c3244e33f598

Insert signature-converted blocks into a region with a parent operation. This keeps the IR valid and consistent as it is expected that each block should have a valid parent region/operation. Previously, converted blocks were kept floating without a valid parent region. PiperOrigin-RevId: 285821687

Alex Zinenko

commit sha 6271fd115db58b3c0dc789fbbef4414b52c66e31

Plug gpu.func into the GPU lowering pipelines This updates the lowering pipelines from the GPU dialect to lower-level dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation instead of a standard function annotated with an attribute. In particular, the kernel outlining is updated to produce gpu.func instead of std.func and the individual conversions are updated to consume gpu.funcs and disallow standard funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved in the generic syntax, but can also be used with the custom syntax on gpu.funcs. The special kind of function for GPU allows one to use additional features such as memory attribution. PiperOrigin-RevId: 285822272

Andy Davis

commit sha dc09a6f2f5e7cc6eec032a4823ea74fc4ced5bf0

Add InsertSlicesOp to the VectorOps dialect. PiperOrigin-RevId: 285830394

push time in 3 months
