
tensorflow/mlir 1640

"Multi-Level Intermediate Representation" Compiler Infrastructure

iml130/mlir-emitc 2

MLIR EmitC dialect

jpienaar/mlir-grammar 1

Copy of the grammar file from MLIR repo

joker-eph/mlir 0

"Multi-Level Intermediate Representation" Compiler Infrastructure

jpienaar/charted 0

Visualization toolkit for Dart language

jpienaar/chroma 0

A general purpose syntax highlighter in pure Go

issue closed llvm/mlir-www

Generated heading for pass descriptions

These seem missing

[image]

closed time in 2 days

jpienaar

issue comment llvm/mlir-www

Generated heading for pass descriptions

Resolved in next update (was due to missing/incorrect entries in passes.md & incorrect doc generation flag).

jpienaar

comment created time in 2 days

push event llvm/llvm-project

Jacques Pienaar

commit sha 584d91925eb13515305ccb974f492ca9cf1f99d4

[mlir] Fix capitalization typo
Was testing on case insensitive config :-/

view details

push time in 2 days

push event llvm/mlir-www

Jacques Pienaar

commit sha a8cedd839ea1cb4a45d82db461730030ded22a93

Avoid non-posix sed -i

view details

push time in 2 days

push event llvm/llvm-project

Jacques Pienaar

commit sha 57b871f8eca5029d244c9777f27d13f2a5ef9ab2

[mlir] Updates to generate dialect rather than op docs

view details

push time in 2 days

push event llvm/llvm-project

Jacques Pienaar

commit sha 93628ea9d162434ef78f51d152e7feac2c4095ef

[mlir] Fix passes.md's naming & add missing

view details

push time in 2 days

push event llvm/llvm-project

Jacques Pienaar

commit sha 80deb1e106a8c3c5ba31ef0bb4d7651acb6e6b69

[mlir][ods] Custom builder with no params
Incorrect generation of custom build method without any params.

view details

Jacques Pienaar

commit sha 501d7e07e31d8f79160324e683e4931403f469d5

[mlir] Remove unneeded OpBuilder params. NFC.
These are now automatically prepended.

view details

push time in 5 days

push event llvm/llvm-project

Jacques Pienaar

commit sha 2a6db92ca97da946307b559e63c6ac75caf4bbd6

[mlir][ods] Make OpBuilder and OperationState optional
The OpBuilder is required to start with OpBuilder and OperationState, so remove the need for the user to specify it. To make it simpler to update callers, retain the legacy behavior for now and skip injecting OpBuilder/OperationState when params start with OpBuilder. Related to bug 47442.
Differential Revision: https://reviews.llvm.org/D88050
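For illustration, a minimal sketch of the effect described in the commit (FooOp and the operand name are hypothetical, not from the patch): the generated C++ builder keeps the same signature, but the ODS declaration no longer needs to spell out the leading parameters.

#include "mlir/IR/OpDefinition.h"

using namespace mlir;

// Sketch of what ODS generates for a builder declared with only `Value input`:
// the OpBuilder and OperationState parameters are prepended automatically, so
// the generated declaration is still of this form.
class FooOp : public Op<FooOp> {
public:
  static void build(OpBuilder &builder, OperationState &state, Value input);
};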

view details

push time in 6 days

push event llvm/llvm-project

Jacques Pienaar

commit sha 3a799deed72963d124cc9ab8141fb32976cfc846

[mlir] Add tutorial index.md pages
Sets the content for the section entry pages Hugo side.
Differential Revision: https://reviews.llvm.org/D87969

view details

push time in 7 days

push event jpienaar/llvm-project

Jacques Pienaar

commit sha 618a10af193df2b3d6f7085d0ca98e8527ac698e

[mlir] Add tutorial index.md pages
Sets the content for the section entry pages Hugo side.
Differential Revision: https://reviews.llvm.org/D87969

view details

push time in 7 days

push event llvm/mlir-www

Jacques Pienaar

commit sha 5ec0d548fb9d5358d80f1cafbe323a44a5c73375

Revert sed option

view details

push time in 9 days

push event jpienaar/llvm-project

Jacques Pienaar

commit sha 5c58b86780aafbef0b05e4d522f9be69ba022f9f

[mlir] List SCF passes in passes list

view details

push time in 9 days

push event jpienaar/llvm-project

Jacques Pienaar

commit sha 263ca92f83942030e35d41143b87370a877787be

[mlir] Remove include of non-existent doc

view details

push time in 9 days

push event llvm/mlir-www

Jacques Pienaar

commit sha 0ec04f90545e616a7babd753c51bdcf85f03db69

Pass empty in-place option
Required to work with BSD sed too. Also change commit user.

view details

push time in 9 days

issue opened llvm/mlir-www

Generated heading for pass descriptions

These seem missing

[image]

created time in 19 days


Pull request review comment llvm/mlir-www

FAQ: "Registered, loaded, dependent: what's up with Dialects management

 weight: 10
 2) Tensors can be dynamically shaped, unranked, or have 0 dimensions ; but Vectors can't be.
 3) You can have a memref (a buffer in memory) containing Vectors but you can't have a memref of a tensor type.
 4) The set of allowed element types is different: the Tensor type isn't limited while Vector is limited to float and integer types.
+
+## Registered, loaded, dependent: what's up with Dialects management?
+
+Before creating an Operation, a Type, or an Attribute, the associated Dialect
+must be already *loaded* in the `MLIRContext`. For example the Toy tutorial
+explicitly loads the Toy Dialect before emitting the Toy IR from the AST.
+
+The process of loading a Dialect in the context is not thread-safe, which forces
+all involved Dialects to be loaded before the multi-threaded pass manager starts
+the execution. To keep the system modular and layered, invoking a pass pipeline
+should never require to pre-load some dialects explicitly. This is achieved by
+requiring every pass to declare a list of *dependent* Dialects: these are
+Dialects for which an entity (Operation, Type, or Attribute) can be created by
+the pass, other than for Dialects that would already be in the input.
+For example, a `convertLinalgToLoops` pass would declare the `SCF` Dialect as
+dependent, but does not need to declare `Linalg`.
+
+Finally, dialects can be *registered* with the context. The sole purpose of the
+registration is to make these dialects available for the textual parser used by
+tools like `mlir-opt` or `mlir-translate`. A compiler frontend emitting the IR
+programmatically and invoking a pass pipeline would never need to register any

Couldn't additional checking logic be avoided if I could say "hey I've already registered all that is needed, no need to look at all the dependent dialects or attempt to load them"? Or is it sufficiently cheap that it isn't an issue?
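(For readers landing here: a minimal sketch of the mechanism under discussion, assuming the MLIR pass hooks of this era; the pass name and include path are illustrative assumptions.)

#include "mlir/Dialect/SCF/SCF.h"  // include path is an assumption
#include "mlir/Pass/Pass.h"

using namespace mlir;

// A lowering pass that may create scf ops not present in its input must
// declare the SCF dialect as dependent, so it is loaded up front before the
// multi-threaded pass manager starts running.
struct ConvertToLoopsSketch
    : public PassWrapper<ConvertToLoopsSketch, FunctionPass> {
  void getDependentDialects(DialectRegistry &registry) const override {
    registry.insert<scf::SCFDialect>();
  }
  void runOnFunction() override { /* rewrite patterns would go here */ }
};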

joker-eph

comment created time in 21 days

Pull request review comment llvm/mlir-www

FAQ: "Registered, loaded, dependent: what's up with Dialects management

 weight: 10
 2) Tensors can be dynamically shaped, unranked, or have 0 dimensions ; but Vectors can't be.
 3) You can have a memref (a buffer in memory) containing Vectors but you can't have a memref of a tensor type.
 4) The set of allowed element types is different: the Tensor type isn't limited while Vector is limited to float and integer types.
+
+## Registered, loaded, dependent: what's up with Dialects management?
+
+Before creating an Operation, a Type, or an Attribute, the associated Dialect
+must be already *loaded* in the `MLIRContext`. For example the Toy tutorial
+explicitly loads the Toy Dialect before emitting the Toy IR from the AST.
+
+The process of loading a Dialect in the context is not thread-safe, which forces
+all involved Dialects to be loaded before the multi-threaded pass manager starts
+the execution. To keep the system modular and layered, invoking a pass pipeline
+should never require to pre-load some dialects explicitly. This is achieved by

s/to pre-load some/pre-loading/

joker-eph

comment created time in 21 days

Pull request review comment llvm/mlir-www

FAQ: "Registered, loaded, dependent: what's up with Dialects management

 weight: 10
 2) Tensors can be dynamically shaped, unranked, or have 0 dimensions ; but Vectors can't be.
 3) You can have a memref (a buffer in memory) containing Vectors but you can't have a memref of a tensor type.
 4) The set of allowed element types is different: the Tensor type isn't limited while Vector is limited to float and integer types.
+
+## Registered, loaded, dependent: what's up with Dialects management?
+
+Before creating an Operation, a Type, or an Attribute, the associated Dialect
+must be already *loaded* in the `MLIRContext`. For example the Toy tutorial
+explicitly loads the Toy Dialect before emitting the Toy IR from the AST.
+
+The process of loading a Dialect in the context is not thread-safe, which forces
+all involved Dialects to be loaded before the multi-threaded pass manager starts
+the execution. To keep the system modular and layered, invoking a pass pipeline
+should never require to pre-load some dialects explicitly. This is achieved by
+requiring every pass to declare a list of *dependent* Dialects: these are
+Dialects for which an entity (Operation, Type, or Attribute) can be created by
+the pass, other than for Dialects that would already be in the input.

But you could already have SCF in the input; this seems like a fuzzy requirement here. So this is saying that if you have Foo.BarOp in your input/source patterns then you don't need to mark Foo as dependent. But couldn't Foo have been injected via a "support undeclared dialects" input, so that Foo is not registered, and then creating Foo.BarOp internally to the pass would have failed? It seems the current setup requires that the dialect of any op that may be created by the pass needs to be registered.

joker-eph

comment created time in 21 days


started jacktasia/dumb-jump

started time in 25 days

Pull request review comment llvm/mlir-www

First entry in the FAQ: Vectors vs Tensors?

 menu: "main" weight: 10 --- -TODO+## What is the difference between the Tensor and Vector types?++1) Conceptual: vectors are meant to and occur in lower level dialects - often where you expect hardware to have registers of that size. Tensors model higher-level "closer to the source" abstract representation. This is reflected in the abstraction modeled by the operations from the [`vector` dialect](https://mlir.llvm.org/docs/Dialects/Vector/), while Tensors would be more naturally present in the operations of the [`linalg dialect](https://mlir.llvm.org/docs/Dialects/Linalg/).+2) Tensors can be dynamically shaped, unranked, or have 0 dimensions ; but Vectors can be.+3) You can have a memref (a buffer in memory) containing Vectors but you can't have a memref of a tensor type.+4) The set of allowed element types is different: the Tensor type isn't limited while Vector is limited to float and integer types.

Part of this also goes with tensors being abstract values that need not have any in-memory backing or fixed-size elements. You can have a tensor of Python programs, as it is just an abstract value.

joker-eph

comment created time in a month

Pull request review comment llvm/mlir-www

First entry in the FAQ: Vectors vs Tensors?

 menu: "main" weight: 10 --- -TODO+## What is the difference between the Tensor and Vector types?++1) Conceptual: vectors are meant to and occur in lower level dialects - often where you expect hardware to have registers of that size. Tensors model higher-level "closer to the source" abstract representation. This is reflected in the abstraction modeled by the operations from the [`vector` dialect](https://mlir.llvm.org/docs/Dialects/Vector/), while Tensors would be more naturally present in the operations of the [`linalg dialect](https://mlir.llvm.org/docs/Dialects/Linalg/).

"to and occur" ?

joker-eph

comment created time in a month


issue comment tensorflow/mlir

installation steps

The Python warning is harmless; we need to fix the detection for it. Now if you want to use the Python frontend, I'm not sure what the best approach is, and it is still very experimental.

aruncgowda

comment created time in a month

push event jpienaar/onnx-mlir

Jacques Pienaar

commit sha 341e3f64b22a132ca9a6d129fe0e1a950f12e46a

Update README.md
Basic MLIR syntax highlighting is supported on GitHub.

view details

push time in a month

push event jpienaar/onnx-mlir

Jacques Pienaar

commit sha 90a7145d0f19dfd8c3fc753e5e7e852bfb945074

Indicate source of snippets
Basic MLIR syntax highlighting is supported on GitHub.

view details

push time in a month

push event iml130/mlir-emitc

Jacques Pienaar

commit sha 96fe1982bbc3be3ee680b51a8478821309b86730

Add conversion to SCF for Simple matcher of while corresponding to scf.for (the same simple structure was used in both tests so easy starting point) and with generating an undefined dialect op (this needs a change upstream in core). Need to clean up and add tests for it and then upstream to mlir-hlo repo and then remove here.

emitc-opt --convert-mhlo-while-to-scf /testCorrectGroundTruthWithHMC_canon_inline.mlir -allow-unregistered-dialect

Not the prettiest and requires building HLO dialect until parts are upstreamed.

view details

push time in a month

fork jpienaar/onnx-mlir

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

fork in a month

push event iml130/mlir-emitc

Jacques Pienaar

commit sha 08c4e4e6e3ccd25dbab02c0c3f9591bba8c03cc7

Inline and canonicalize test case again

view details

push time in 2 months

push event iml130/mlir-emitc

Jacques Pienaar

commit sha 1a888422c4ef1b2ccdf5341a25c6577d27588a21

Add canonicalized version of test

view details

push time in 2 months

push event iml130/mlir-emitc

Jacques Pienaar

commit sha 5bea2775699ac2c7d04ca3d5f98ea997e6697a0e

Change test to generic form

view details

push time in 2 months

push event iml130/mlir-emitc

Jacques Pienaar

commit sha c2b08ccaafb269e8bbab222a02e46a4d6dd3e394

Add basic float printing
* Print float attributes using scientific notation
* Also removed deprecated command line flag from tests

view details

push time in 2 months

push event iml130/mlir-emitc

Jacques Pienaar

commit sha c385f9a9d829c78f8d5c378886c4ebca64b27e5c

Update dialect registration
Post https://reviews.llvm.org/D85495 the dialect registration changed.

view details

push time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 29429d1a443a51d0e1ac4ef4033a2bcc95909ba3

[drr] Add $_loc special directive for NativeCodeCall
Allows propagating the location to ops created via NativeCodeCall.
Differential Revision: https://reviews.llvm.org/D85704

view details

push time in 2 months

Pull request review comment tensorflow/tensorflow

[MLIR:LITE] Verify unpack op

 void FakeQuantOp::getCanonicalizationPatterns(OwningRewritePatternList &results,
 
 // TODO(b/133486129): Implement shape inference for unpack
-static LogicalResult Verify(UnpackOp op) {
-  // TODO(antiagainst): Implement other checks as in
-  // tensorflow/lite/kernels/unpack.cc
+LogicalResult UnpackOp::inferReturnTypes(
+    MLIRContext *context, Optional<Location> location, ValueRange operands,
+    DictionaryAttr attributes, RegionRange regions,
+    SmallVectorImpl<Type> &inferredReturnTypes) {
+  auto num = attributes.get("num");

I should have mentioned op adaptors here (e.g., you can do UnpackOp::Adaptor op(operands, attributes); and then use its verify method and named accessors). I need to fix up another one too, so just leave "TODO(jpienaar): Use Adaptor instead" here.
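(A rough sketch of the suggestion, based only on the comment above; the adaptor accessor names are assumptions:)

// Sketch: the ODS-generated adaptor wraps the raw operands/attributes and
// provides named accessors plus a verify method, replacing raw lookups.
UnpackOp::Adaptor adaptor(operands, attributes);
if (failed(adaptor.verify(location.getValueOr(UnknownLoc::get(context)))))
  return failure();
// Then e.g. adaptor.num() / adaptor.axis() instead of attributes.get("num").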

WindQAQ

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

[MLIR:LITE] Verify unpack op

 void FakeQuantOp::getCanonicalizationPatterns(OwningRewritePatternList &results,
 
 // TODO(b/133486129): Implement shape inference for unpack
 
 static LogicalResult Verify(UnpackOp op) {
-  // TODO(antiagainst): Implement other checks as in
-  // tensorflow/lite/kernels/unpack.cc
+  if (op.getOperation()->getNumOperands() != 1)
+    return op.emitOpError("input count shoule be equal to 1");
 
   if (op.getOperation()->getNumResults() != op.num())
     return op.emitOpError("output count should match 'num' attribute");
 
+  auto input_type = op.input().getType().dyn_cast<ShapedType>();
+  if (!input_type.hasRank()) {
+    // If input has unknown rank, skip the checks.
+    return success();
+  }
+
+  if (input_type.getNumElements() <= 0)
+    return op.emitOpError("number of elements in input shoule be larger than 0");
+
+  const int64_t rank = input_type.getRank();
+  if (rank <= 0)
+    return op.emitOpError("input should be of rank larger than 0");
+
+  int64_t axis_value = op.axis().getSExtValue();
+  if (axis_value < 0)
+    axis_value += rank;
+  if (axis_value < 0 || axis_value >= rank)
+    return op.emitOpError()
+            << "op attribute 'axis' should be in range [-rank, rank), "
+            << "got rank = " << rank
+            << ", and axis = " << op.axis().getSExtValue();
+
+  llvm::SmallVector<int64_t, 4> output_shape;

Good question. The number of outputs is always known (https://github.com/tensorflow/tensorflow/blob/acbfab2b0191ec7a845eb967b39f1cb08d3a3c3a/tensorflow/core/ops/array_ops.cc#L445) but their shapes may not be (well, would not be in the unranked case). I believe we keep that attribute still, else we would not be able to :-)

WindQAQ

comment created time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 4514a3cfa4765ec77e8ccf56e84ae124ea5afa63

[mlir][shape] Fix description copy pasta

view details

push time in 2 months

push event iml130/mlir-emitc

Jacques Pienaar

commit sha 821b6e93b36e98c9e2697ca1da1268af2a962f24

Create testCorrectGroundTruthWithHMC.mlir
Test case with MHLO ops. This needs an update to the MHLO dialect that is pending review, and this is the "raw" output post conversion (e.g., no canonicalization or DCE has been applied, and some terrible names :)). This is from TFP's InferenceGymTestCaseTest.testCorrectGroundTruthWithHMC test case using the existing default lowerings.

view details

push time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 4b211b94d71386d249e2004c817a9bb659634c2b

[mlir][drr] Make error easier to understand
Changes error from
  error: referencing unbound symbol ''
to
  error: raw string not supported as argument

view details

push time in 2 months

Pull request review comment tensorflow/tensorflow

[MLIR:LITE] Verify unpack op

 void FakeQuantOp::getCanonicalizationPatterns(OwningRewritePatternList &results,
 
 // TODO(b/133486129): Implement shape inference for unpack
 
 static LogicalResult Verify(UnpackOp op) {
-  // TODO(antiagainst): Implement other checks as in
-  // tensorflow/lite/kernels/unpack.cc
+  if (op.getOperation()->getNumOperands() != 1)
+    return op.emitOpError("input count shoule be equal to 1");
 
   if (op.getOperation()->getNumResults() != op.num())
     return op.emitOpError("output count should match 'num' attribute");
 
+  auto input_type = op.input().getType().dyn_cast<ShapedType>();
+  if (!input_type.hasRank()) {
+    // If input has unknown rank, skip the checks.
+    return success();
+  }
+
+  if (input_type.getNumElements() <= 0)
+    return op.emitOpError("number of elements in input shoule be larger than 0");
+
+  const int64_t rank = input_type.getRank();
+  if (rank <= 0)
+    return op.emitOpError("input should be of rank larger than 0");
+
+  int64_t axis_value = op.axis().getSExtValue();
+  if (axis_value < 0)
+    axis_value += rank;
+  if (axis_value < 0 || axis_value >= rank)
+    return op.emitOpError()
+            << "op attribute 'axis' should be in range [-rank, rank), "
+            << "got rank = " << rank
+            << ", and axis = " << op.axis().getSExtValue();
+
+  llvm::SmallVector<int64_t, 4> output_shape;

Could you extract these into a type op interface's build method? (https://github.com/tensorflow/tensorflow/blob/001ec7efbed18e9581e859513c5acc76e5aabbe9/tensorflow/compiler/mlir/tensorflow/ir/tf_ops_a_m.cc#L905) It results in generating a more convenient build method and then also verifies the return type (https://github.com/tensorflow/tensorflow/commit/7779f4bbe0f50a77084561fe4308e27f45ec7a2b is an example, but has other parts there).

If you can't get that to work, just add a TODO for me here.
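(Sketch of what the suggestion buys, assuming the op implements the type inference interface; `builder`, `loc`, `input`, `num`, and `axis` are assumed from surrounding rewrite code:)

// Sketch: with type inference implemented, the generated convenience builder
// needs no result types -- they are inferred from the operand type and the
// `num`/`axis` attributes, and verification can then compare declared vs.
// inferred result types, subsuming the hand-written shape checks above.
auto unpack = builder.create<TF::UnpackOp>(loc, input, num, axis);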

WindQAQ

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

[TF:MLIR] Improve parallelism of tf.AddN

 Type InferExpandDimsType(Type ty, int64_t axis, Builder *builder) {
 
 // Lowers AddN op to a sequence of AddV2 ops to accumulate operands.
 //
+// Note that to improve the parallelism, the operands are split
+// into two halves, and are accumulated first.
+//
+// Example:
+//
 //   %result = "tf.AddN"(%0, %1, %2)
 //
 // is lowered to:
 //
-//   %sum_0 = "tf.AddV2"(%0, %1)
-//   %result = "tf.AddV2"(%sum_0, %2)
+//   %sum_right = "tf.AddV2"(%1, %2)
+//   %result = "tf.AddV2"(%0, %sum_right)
+//
+// Or

s/Or/While/ ? (I read the or to be a clause on "is lowered to" instead of separate sentence, and this made this less likely to misread)

WindQAQ

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

[TF:MLIR] Improve parallelism of tf.AddN

 class LowerAddNOp : public OpRewritePattern<TF::AddNOp> {
     // support variant type so variant types require special handling.
     if (getElementTypeOrSelf(op.getType()).isa<VariantType>()) return failure();
 
-    // TODO(hinsu): Improve parallelism by splitting operands in two halves and
-    // accumulating them first.
-    Value result = *op.inputs().begin();
-    for (Value operand : llvm::drop_begin(op.inputs(), 1)) {
-      result = rewriter.create<TF::AddV2Op>(op.getLoc(), result, operand);
+    auto begin = op.inputs().begin();
+    // Return the only operand directly.
+    if (op.N() == 1) {
+      rewriter.replaceOp(op, *begin);
+      return success();
     }
 
+    // Helper functor to accumulate from `begin` to `end` (exclusive).
+    auto accumulate_add = [&rewriter, &op] (auto begin, auto end) -> Value {
+      Value result = *begin;
+      ++begin;
+      for (auto operand = begin; operand != end; ++operand) {
+        result = rewriter.create<TF::AddV2Op>(op.getLoc(), result, *operand);
+      }
+      return result;
+    };
+
+    // Accumulate range `[begin, half)` and `[half, end)`,
+    // and add the results of two halves.

Why two halves vs a tree of additions? (meaning this is an improvement but if the goal is additional parallelism one could go further)
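(A sketch of the balanced-tree variant the question alludes to, reusing `rewriter` and `op` from the pattern above; not part of the actual PR. Needs <functional> for std::function:)

// Recursively split the operands and add the two halves, yielding a balanced
// binary tree of AddV2 ops: depth O(log n) rather than O(n).
std::function<Value(ArrayRef<Value>)> buildTree =
    [&](ArrayRef<Value> vals) -> Value {
      if (vals.size() == 1) return vals.front();
      size_t half = vals.size() / 2;
      Value lhs = buildTree(vals.take_front(half));
      Value rhs = buildTree(vals.drop_front(half));
      return rewriter.create<TF::AddV2Op>(op.getLoc(), lhs, rhs);
    };
Value result = buildTree(llvm::to_vector<4>(op.inputs()));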

WindQAQ

comment created time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 86a78546b97950dfacd44ab77f17f4ce055d16e5

[mlir] Add shape.with_shape op
This is an operation that returns a new ValueShape with a different shape. Useful for composing shape function calls and reusing existing shape transfer functions. Just adding the op in this change.
Differential Revision: https://reviews.llvm.org/D84217

view details

push time in 2 months

issue comment tensorflow/tensorflow

tf.nn.depthwise_conv2d with rank=1 kernels (separable filters)

Is the proposal here that DepthwiseConv2dNative (for different dimensions) has an additional attribute to say that it is separable, in which case it gets a rank-1 kernel? And allows backends to decide whether they want to implement that as a single call with a 2D kernel or 2 calls of DepthwiseConv2dNative with said rank-1 kernel? The rationale being that matching these two calls would be too difficult/fragile?

So it isn't a case where one would have

%5 = tf.SomeKernelGeneration()
%10 = DepthwiseConv2dNative(..., %5)
%20 = DepthwiseConv2dNative(%10 ..., %5)

and need to find cases where you have one depthwise conv feeding into the next and they both share the same kernel, and if so one can convert to the other call form?

bhack

comment created time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 595d214f47e484ffe517a4294d3ac042d6c7d25d

[mlir][shape] Further operand and result type generalization
Previous changes generalized some of the operands and results. Complete a larger group of those to simplify progressive lowering. Also update some of the declarative asm form due to generalization. Tried to keep it mostly mechanical.

view details

push time in 2 months

PR opened google/iree

[shapex] Handle case where shape.shape_of feeds into from_extent_tensor

Pre-fetching for upstream change where shape.shape_of can produce a ranked tensor of index, rather than just a shape, and post canonicalization one has a shape.shape_of feeding into a shapex.from_extent_tensor directly. The lowering of shape_of to get_ranked_shape then leads to an invalid ranked_tensor as input to from_extent_tensor, e.g.,

%63 = "shapex.get_ranked_shape"(%55) : (tensor<?x10xf32>) -> !shapex.ranked_shape<[?,10]>
%64 = "shapex.from_extent_tensor"(%63) : (!shapex.ranked_shape<[?,10]>) -> !shapex.ranked_shape<[?,?]>

Avoid this by removing the from_extent_tensor in these cases and propagating the value from get_ranked_shape directly.

+16 -2

0 comment

1 changed file

pr created time in 2 months

create branch jpienaar/iree

branch : getextent

created branch time in 2 months

push event llvm/llvm-project

Frederik Gossen

commit sha 07f227c0eb8c5628842e7f7aa30001b24b8aede9

[MLIR][Shape] Allow `num_elements` to operate on extent tensors
Re-landing with the dependent change landed and the error condition relaxed. Beyond the change to the error condition, exactly https://reviews.llvm.org/D84445.

view details

push time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 5142448a5e2aeeffefb3aabdb48f19033025bc09

[MLIR][Shape] Refactor verification
Based on https://reviews.llvm.org/D84439 but less restrictive; otherwise we don't allow shape_of to produce a ranked output and don't allow for iterative refinement here. We can consider making it more restrictive later.

view details

push time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 7bfecd773968668b17fddf3865b1d611325942a8

Revert "[MLIR][Shape] Allow `num_elements` to operate on extent tensors" This reverts commit 55ced04d6bc13fd0f9396a0cfc393b44378d8784. Forgot to submit depend change first.

view details

push time in 2 months

push event llvm/llvm-project

Frederik Gossen

commit sha 55ced04d6bc13fd0f9396a0cfc393b44378d8784

[MLIR][Shape] Allow `num_elements` to operate on extent tensors
Differential Revision: https://reviews.llvm.org/D84445

view details

push time in 2 months

push event jpienaar/iree

Jacques Pienaar

commit sha f72bb3c32806964c6cb3bf8fcd5226a79004361d

Remove trailing whitespace
Editing via github itself was perhaps no time saver ...

view details

push time in 2 months

push event jpienaar/iree

Jacques Pienaar

commit sha be8405f0db9c6d2001efc3435aac27352f99d3d4

Fix formatting

view details

push time in 2 months

PR opened google/iree

Report error message for type mismatch

Previously this could result in failed verification without further info.

+3 -1

0 comment

1 changed file

pr created time in 2 months

push event jpienaar/iree

Jacques Pienaar

commit sha 71d454e4e22a8cab9432fdc7d057485745be2d9e

Report error message for type mismatch
Previously this could result in failed verification without further info.

view details

push time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha dfa267a61c2b797b8fe9c345ee94742d496b39c6

[mlir][shape] Fix missing dependency

view details

push time in 2 months

Pull request review comment yycdavid/tamago

Support more ops, nasrnn, resnext50, varying input dimensions, weight handling, docker

+use crate::{input::*, model::*};
+use egg::*;
+
+fn resnet_block(
+    graph: &mut GraphConverter,
+    mut input: Id,
+    strides: (i32, i32),
+    out_channels: i32,
+    input_dim_1: i32,
+) -> Id {
+    let w1 = graph.new_weight(vec![out_channels, input_dim_1, 1, 1]);
+    let t = graph.conv2d(input, w1, 1, 1, PSAME, ACTRELU);
+    let w2 = graph.new_weight(vec![out_channels, out_channels, 3, 3]);
+    let t = graph.conv2d(t, w2, strides.0, strides.1, PSAME, ACTRELU);
+    let w3 = graph.new_weight(vec![out_channels * 4, out_channels, 1, 1]);
+    let t = graph.conv2d(t, w3, 1, 1, PSAME, ACTNONE);
+    if (strides.0 > 1) || (input_dim_1 != out_channels * 4) {
+        let w4 = graph.new_weight(vec![out_channels * 4, input_dim_1, 1, 1]);
+        input = graph.conv2d(input, w4, strides.0, strides.1, PSAME, ACTRELU);
+    }
+    let t = graph.add(input, t);
+    let t = graph.mul(input, t);
+    graph.relu(t)
+}
+
+/// Gets the RecExpr of a resnet50 model
+pub fn get_benchnet() -> RecExpr<Mdl> {

I don't know benchnet, is it just resnet50?

yycdavid

comment created time in 2 months

Pull request review comment yycdavid/tamago

Support more ops, nasrnn, resnext50, varying input dimensions, weight handling, docker

 impl GraphConverter {
         let padding_id = self.add_or_get_val(padding);
         let activation_id = self.add_or_get_val(activation);
 
-        let conv_node = Mdl::Conv2d([
+        let new_node = Mdl::Conv2d([
             stride_h_id,
             stride_w_id,
             padding_id,
             activation_id,
             inpt,
             wght,
         ]);
-        self.rec_expr.add(conv_node)
+        self.rec_expr.add(new_node)
     }
 
     pub fn relu(&mut self, inpt: Id) -> Id {
-        let relu_node = Mdl::Relu(inpt);
-        self.rec_expr.add(relu_node)
+        let new_node = Mdl::Relu(inpt);
+        self.rec_expr.add(new_node)
+    }
+
+    pub fn tanh(&mut self, inpt: Id) -> Id {
+        let new_node = Mdl::Tanh(inpt);
+        self.rec_expr.add(new_node)
+    }
+
+    pub fn sigmoid(&mut self, inpt: Id) -> Id {
+        let new_node = Mdl::Sigmoid(inpt);
+        self.rec_expr.add(new_node)
     }
 
     pub fn add(&mut self, inpt_1: Id, inpt_2: Id) -> Id {
-        let add_node = Mdl::Ewadd([inpt_1, inpt_2]);
-        self.rec_expr.add(add_node)
+        let new_node = Mdl::Ewadd([inpt_1, inpt_2]);
+        self.rec_expr.add(new_node)
+    }
+
+    pub fn matmul(&mut self, inpt_1: Id, inpt_2: Id) -> Id {
+        let activation = ACTNONE;
+        let act_id = self.add_or_get_val(activation);
+
+        let new_node = Mdl::Matmul([act_id, inpt_1, inpt_2]);
+        self.rec_expr.add(new_node)
+    }
+
+    pub fn mul(&mut self, inpt_1: Id, inpt_2: Id) -> Id {
+        let new_node = Mdl::Ewmul([inpt_1, inpt_2]);
+        self.rec_expr.add(new_node)
+    }
+
+    pub fn concat(&mut self, axis: i32, ndim: i32, inpt_1: Id, inpt_2: Id) -> Id {
+        // Only support concat of 2 inputs for now
+        // To support more, pass in a slice and create more concat nodes here

Why does one need to pass ndim here? (at least ndim to me sounds like rank of inpt_1 and inpt_2)

yycdavid

comment created time in 2 months

Pull request review comment yycdavid/tamago

Interface for input graph specification

 pub fn rules_from_str(rs: Vec<&str>) -> Vec<Rewrite<Mdl, TensorAnalysis>> {
 
 /// Struct for passing results in the recursive function check_pat
 ///
-/// Similar as ValTnsr for TensorAnalysis, but with tnsr being the object 
-/// rather than pointer, to make memory working correctly with recursive 
+/// Similar as ValTnsr for TensorAnalysis, but with tnsr being the object
+/// rather than pointer, to make memory working correctly with recursive

How does recursion come into play here?

yycdavid

comment created time in 2 months

Pull request review comment yycdavid/tamago

Interface for input graph specification

+use crate::model::*;
+use egg::*;
+use std::collections::HashMap;
+
+/// Struct for converting a model specified using our Rust interface to RecExpr
+///
+/// The RecExpr is growed on the fly when member functions are called. Uses a
+/// Hashmap to store the map of scalar nodes to their indices into the RexExpr to
+/// avoid replication.
+#[derive(Default)]
+pub struct GraphConverter {
+    rec_expr: RecExpr<Mdl>,
+    scalar_map: HashMap<i32, Id>,
+    name_gen: NameGen,
+}
+
+/// The APIs of GraphConverter are (intended to) match TASO's so that we can easily
+/// constructing TASO graphs using this class
+impl GraphConverter {
+    /// Gets the RexExpr after graph is constructed
+    pub fn rec_expr(self) -> RecExpr<Mdl> {
+        self.rec_expr
+    }
+
+    /// Takes in the parameters for the new input, construct the node in RexExpr,
+    /// return the Id (index) of this input node in the RecExpr. This is the
+    /// pattern for all these op functions.
+    pub fn new_input(&mut self, name: &str, dim1: i32, dim2: i32, dim3: i32, dim4: i32) -> Id {

Why 4 dims? Would an array also work here, so that you could iterate over it?

yycdavid

comment created time in 2 months

Pull request review comment yycdavid/tamago

Interface for input graph specification

-use tamago::{parse::*, verify::*};
-use std::time::{Duration, Instant};
-use tamago::model::*;
-use tamago::rewrites::*;
-use tamago::optimize::*;
+use clap::{App, Arg};
 use egg::*;
 use std::env::*;
 use std::fs::*;
 use std::time::*;
+use std::time::{Duration, Instant};
+use tamago::model::*;
+use tamago::optimize::*;
+use tamago::resnet50;
+use tamago::rewrites::*;
+use tamago::{parse::*, verify::*};
 
 fn main() {
-    //prove_taso_rules();
-    optimize();
-    //convert_rw_rules();
-    //test();
+    // Parse arguments
+    let matches = App::new("Tamago")
+        .arg(
+            Arg::with_name("mode")
+                .short("m")
+                .long("mode")
+                .takes_value(true)
+                .help("Mode to run, can be verify, optimize, test, convert"),
+        )
+        .arg(
+            Arg::with_name("model")
+                .short("d")
+                .long("model")
+                .takes_value(true)
+                .help("Specify a pre-defined model to optimize"),
+        )
+        .arg(
+            Arg::with_name("rules")
+                .short("r")
+                .long("rules")
+                .takes_value(true)
+                .help("Provide a file with rewrite rules"),
+        )
+        .arg(
+            Arg::with_name("model_file")
+                .short("f")
+                .long("model_file")
+                .takes_value(true)
+                .help("Provide a file with the input model"),
+        )
+        .get_matches();
+
+    let run_mode = matches.value_of("mode").unwrap_or("optimize");
+    println!("Running mode is: {}", run_mode);
+
+    match run_mode {
+        "optimize" => optimize(matches),
+        "verify" => prove_taso_rules(matches),
+        "test" => test(matches),
+        "convert" => convert_rw_rules(matches),
+        _ => panic!("Running mode not supported"),
+    }
 }
 
-fn convert_rw_rules() {
+fn convert_rw_rules(matches: clap::ArgMatches) {
     env_logger::init();
-    let file = args().nth(1).expect("Pls supply taso rules file.");
+
+    let file = matches
+        .value_of("rules")
+        .expect("Pls supply taso rules file.");
     let taso_rules = read_to_string(file).expect("Something went wrong reading the file");
 
     let converted = parse_and_convert(&taso_rules);
 
     write("converted.txt", converted).expect("Unable to write file");
 }
 
-
-fn test() {
+fn test(matches: clap::ArgMatches) {
     env_logger::init();
-    let file = args().nth(1).expect("Pls supply example graph file.");
-    let input_graph = read_to_string(file).expect("Something went wrong reading the file");
-    let start = input_graph.parse().unwrap();
 
-    let runner = Runner::<Mdl, TensorAnalysis, ()>::default()
-        .with_expr(&start);
-
-    runner.egraph.dot().to_svg("target/start.svg").unwrap();
+    let start = resnet50::get_resnet50();
 
-    println!("  Nodes: {}", runner.egraph.total_size());
+    let runner_start = Runner::<Mdl, TensorAnalysis, ()>::default().with_expr(&start);
+    println!("Runner complete!");
+    runner_start
+        .egraph
+        .dot()
+        .to_svg("target/start.svg")

Should generating the dot/svg perhaps be behind a flag? (at least for most graphs I've worked with this can take some time :) ). Could the egraph be serialized so that one could draw it later?

yycdavid

comment created time in 2 months

Pull request review comment yycdavid/tamago

Interface for input graph specification

+use crate::model::*;
+use egg::*;
+use std::collections::HashMap;
+
+/// Struct for converting a model specified using our Rust interface to RecExpr
+///
+/// The RecExpr is growed on the fly when member functions are called. Uses a
+/// Hashmap to store the map of scalar nodes to their indices into the RexExpr to
+/// avoid replication.
+#[derive(Default)]
+pub struct GraphConverter {
+    rec_expr: RecExpr<Mdl>,
+    scalar_map: HashMap<i32, Id>,
+    name_gen: NameGen,
+}
+
+/// The APIs of GraphConverter are (intended to) match TASO's so that we can easily

Nit: s/so that we can easily constructing/so that we can easily construct/ ?

yycdavid

comment created time in 2 months

Pull request review comment yycdavid/tamago

Interface for input graph specification

+use crate::model::*;
+use egg::*;
+use std::collections::HashMap;
+
+/// Struct for converting a model specified using our Rust interface to RecExpr
+///
+/// The RecExpr is growed on the fly when member functions are called. Uses a
+/// Hashmap to store the map of scalar nodes to their indices into the RexExpr to
+/// avoid replication.
+#[derive(Default)]
+pub struct GraphConverter {
+    rec_expr: RecExpr<Mdl>,

Might be good to document these here - also are all of these public? E.g., is there expectation that name_gen will be called outside the struct?

yycdavid

comment created time in 2 months

issue closed tensorflow/tensorflow

tf_ops.cc takes 79s to compile / 4300+ lines long

This is with TensorFlow at commit 80768cb23a3a4314c52af0b48a6bcf23ca541e19. The file tensorflow/compiler/mlir/tensorflow/ir/tf_ops.cc is about 4300 lines long and takes nearly 79s by itself to compile on a fast workstation (Intel Skylake-based Core i7 8700K 3.70GHz) with a typical bazel config below. It'll be great to split this file into two.

bazel build --linkopt="-fuse-ld=lld" -j 11    //tensorflow/compiler/mlir:tf-opt
gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)
On a Fedora Core 31 x86-64 Linux, Intel Core i7 8700K 3.70 GHz, 32 GB DDR4 RAM.
INFO: Analyzed target //tensorflow/compiler/mlir:tf-opt (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //tensorflow/compiler/mlir:tf-opt up-to-date:
  bazel-bin/tensorflow/compiler/mlir/tf-opt
INFO: Elapsed time: 79.099s, Critical Path: 78.93s
INFO: 2 processes: 2 local.
INFO: Build completed successfully, 3 total actions

To reproduce, please change tf_ops.cc and rebuild tf-opt as shown above. The linkopt shouldn't make a difference here.

closed time in 2 months

bondhugula

issue comment tensorflow/tensorflow

tf_ops.cc takes 79s to compile / 4300+ lines long

We are working on that ;-) But the good thing is I could split it again (we'll hit the point of diminishing returns quickly, but splitting again is much easier than the first split) and check.

I'll close for now as speeding up can be a bit unbounded as a task.

bondhugula

comment created time in 2 months

issue comment tensorflow/tensorflow

tf_ops.cc takes 79s to compile / 4300+ lines long

@bondhugula could you see if the recent change has improved this?

bondhugula

comment created time in 2 months

create branch iml130/mlir-emitc

branch : cgo

created branch time in 2 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 3ae43a580eeacede5b9be715d2539e87030fe1ca

[ods] Enable getting forward decls
Summary: Currently forward decls are included with all the op classes. But there are cases (say when splitting up headers) where one wants the forward decls but not all the classes. Add an option to enable this. This does not change any current behavior (some further refactoring is probably due here).
Differential Revision: https://reviews.llvm.org/D83727

view details

push time in 2 months


create branch jpienaar/llvm-project

branch : sprint

created branch time in 3 months

fork jpienaar/llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github pull requests at this moment. Please submit your patches at http://reviews.llvm.org.

http://llvm.org

fork in 3 months

Pull request review comment yycdavid/tamago

Test commit for pull request

 todo.txt
 manual_rules.txt
 test_g.txt
-test_rules.txt
\ No newline at end of file
+test_rules.txt
+

Is whitespace needed here?

yycdavid

comment created time in 3 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 2a19672af5d58d9ee9f8d6276b57cb584d295eb6

[mlir] Change ODS to have include and exclude regex
This makes it easier to have a "remainder" include rule. And also makes it easier to read the command line flag.

view details

push time in 3 months

issue comment tensorflow/tensorflow

Build failure - missing dependency declarations LLVM

Maybe, but if the same header is included from different files built with different C++ macros enabled, then you would end up with different includes. You'd need to ensure that any header or build rule that is optionally included due to a macro (which one can do with conditional rules) is always included or not for all compilation paths, else bazel is correct in flagging it as a missing dependency.

I don't know why this happens for count though. But unless one wrote the build files expecting this and has a CI for it, it may or may not work.

bondhugula

comment created time in 3 months

pull request comment tensorflow/tensorflow

[MLIR] Add constant folder for xla_hlo.broadcast_in_dim op

I don't think Mehdi used specific options vs just universally turning on assertions. Even if he did, the build environments are quite different and what works on one need not translate to another (e.g., it would not be considered supported/tested). Building more directed tests/tf-opt and using it to explore different tests would be the cases of largest reuse.

bondhugula

comment created time in 3 months

issue closed tensorflow/tensorflow

Question about how tensorflow mlir library works ?

I was wondering how the tensorflow mlir library is used to lower code from the tf dialect to an executable for a GPU. Which dialects are involved in this conversion? Are there multiple choices, and are there any advantages to each of them? (And does this have anything to do with the gpu dialect specified in the MLIR documentation: https://mlir.llvm.org/docs/Dialects/GPU/)

closed time in 3 months

AkhilJ99

issue comment tensorflow/tensorflow

Build failure - missing dependency declarations LLVM

Hey Uday,

I think in this case the workaround is to check the git history and check out a commit immediately before an LLVM integrate. We are working on fixing these transient errors.

Thanks - but unfortunately, the build time effort (given the amount of rebuilding necessary when switching commits) would make this exercise prohibitive for those without infinite build infrastructure! :-)

I don't see how Mihai's suggestion increases effort: instead of syncing to an arbitrary change, sync to a specific one. That doesn't change how much you build, just when. And the suggestion is for a known stabler time until we do a bit more refactoring of the export process.

Either one doesn't sync as often and you don't incur rebuilds, or you do and you do. Now normally one syncs as there is a reason (e.g., changes in upstream projects), but in those cases one will naturally incur build times (e.g., you are pulling new code). One can also change the workspace to a local_repository for even more flexibility (e.g., not constrained by what revs are chosen for a given project), and that could enable more reuse even as you sync files; then you have the option of which ones to sync (you can just update ~n files in one directory rather than doing a full rev bump).

Currently it is ~eventual consistency and we are working to make it more atomic.

As to this report: you are using an unsupported flag (wrt TF builds) and that is what is causing this error. Building as normal:

$ git checkout -b includes 8b87c1a09bf156ca9a42d9f72fad07da62100318
$ CC=clang CXX=clang++ bazel build --linkopt="-fuse-ld=lld" //tensorflow/compiler/mlir/xla/tests:all

works for me.

bondhugula

comment created time in 3 months

issue comment tensorflow/tensorflow

Question about how tensorflow mlir library works ?

Hey,

You will get more responses on mlir@tensorflow.org as more folks monitor it and the folks working on the GPU dialect MLIR core side are active there too. Would you mind re-asking there instead if you want more info?

In short/from a high level, there are multiple dialects involved depending on the lowering path (e.g., Affine, LinAlg, Shape, GPU, Standard, HLO, LLVM are along some of those paths), but the exact dialects depend on the lowering path and there are multiple, both as initial staging while ideas develop and the technology matures, and because in the limit there will still be multiple paths (e.g., if one knew at the TF dialect level already that the best way to lower an op is to a library call, then one might lower directly - although there are cases where the optimal choice for an op in isolation is not the optimal for the model as a whole, so it's not "trivial" knowledge at work). The flexibility and ability to incorporate different code-generation strategies at different levels is important to best target the evolving HW space.

Best,

Jacques

AkhilJ99

comment created time in 3 months

push event llvm/llvm-project

Jacques Pienaar

commit sha 71b9d89df78f25f373b6352c0f0c1e3a539634d0

[ods] Update Operator to record Arg->[Attr|Operand]Index mapping
Also fixed a bug in the type interface generator where operands and attributes are interleaved.
Differential Revision: https://reviews.llvm.org/D82819

view details

push time in 3 months
