Smit Hinsu (smit-hinsu): Working on @tensorflow at @google.

tensorflow/lingvo 2102

Lingvo

smit-hinsu/distribtued_file_system 4

This repository contains all the lab assignments for the MIT distributed systems course (6.824). Link: http://pdos.csail.mit.edu/6.824-2012/labs/index.html

smit-hinsu/stanford-tensorflow-tutorials 1

This repository contains code examples for Stanford's course TensorFlow for Deep Learning Research.

smit-hinsu/tensor2tensor 1

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

smit-hinsu/cs231n.github.io 0

Public facing notes page

smit-hinsu/gce-scripts 0

Scripts to deploy VMs and TensorFlow on Google Cloud.

smit-hinsu/models 0

Models built with TensorFlow

smit-hinsu/subpar 0

Subpar is a utility for creating self-contained python executables. It is designed to work well with Bazel.

smit-hinsu/tensorboard 0

TensorFlow's Visualization Toolkit

smit-hinsu/tensorflow 0

An Open Source Machine Learning Framework for Everyone


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Add TF to TF lowering patterns to legalize_tf pass for TFLite

 Type InferExpandDimsType(Type ty, int64_t axis, Builder *builder) {
 //   %sum2 = "tf.AddV2"(%sum0, %sum1)
 //   %result = "tf.AddV2"(%sum2, %4)
 //
-class LowerAddNOp : public OpRewritePattern<TF::AddNOp> {
+class LowerAddNOp : public RewritePattern {
  public:
-  explicit LowerAddNOp(MLIRContext *context)
-      : OpRewritePattern<TF::AddNOp>(context) {}
+  explicit LowerAddNOp(MLIRContext *context)
+      : RewritePattern("tf.AddN", {"tf.AddV2"}, 1, context) {}

nit: use getOperationName() here
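As a rough sketch (not the actual patch), the constructor from the diff above could look like this once the hard-coded strings are replaced; it assumes both op classes expose getOperationName(), as ODS-generated TF ops do:

  explicit LowerAddNOp(MLIRContext *context)
      // Same constructor arguments as in the diff, minus the string literals.
      : RewritePattern(TF::AddNOp::getOperationName(),
                       {TF::AddV2Op::getOperationName()}, 1, context) {}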

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Add TF to TF lowering patterns to legalize_tf pass for TFLite

 Type InferExpandDimsType(Type ty, int64_t axis, Builder *builder) {
 //   %sum2 = "tf.AddV2"(%sum0, %sum1)
 //   %result = "tf.AddV2"(%sum2, %4)
 //
-class LowerAddNOp : public OpRewritePattern<TF::AddNOp> {
+class LowerAddNOp : public RewritePattern {
  public:
-  explicit LowerAddNOp(MLIRContext *context)
-      : OpRewritePattern<TF::AddNOp>(context) {}
+  explicit LowerAddNOp(MLIRContext *context)
+      : RewritePattern("tf.AddN", {"tf.AddV2"}, 1, context) {}

-  LogicalResult matchAndRewrite(TF::AddNOp op,
+  LogicalResult matchAndRewrite(Operation *op,
                                 PatternRewriter &rewriter) const override {
+    auto addn_op = dyn_cast<TF::AddNOp>(op);

Use cast instead of dyn_cast here.
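For illustration only, the difference the comment is pointing at: cast asserts that the operation has the expected type, while dyn_cast is for matches that may legitimately fail and returns null on a mismatch.

    // The pattern is registered on tf.AddN only, so the type is guaranteed
    // here; cast<> documents that assumption and asserts in debug builds.
    auto addn_op = cast<TF::AddNOp>(op);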

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertFusedBatchNorm : public OpRewritePattern<TF::FusedBatchNormOp> {
                              tf_fused_batch_norm_op.getAttrs());
     Operation *tf_fused_batch_norm_op_v3 = rewriter.createOperation(new_state);

-    rewriter.replaceOp(tf_fused_batch_norm_op,
-                       tf_fused_batch_norm_op_v3->getResults());

We can still use replaceOp by using take_front or drop_back method on getResults().
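A hedged sketch of that suggestion, reusing the variable names from the diff above (the exact call site in the PR may differ):

    // The V3 op has one extra result (reserve_space_3); dropping it keeps the
    // result counts of the old and new ops in sync for replaceOp.
    rewriter.replaceOp(tf_fused_batch_norm_op,
                       tf_fused_batch_norm_op_v3->getResults().drop_back());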

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertTFBroadcastTo : public RewritePattern {
   }
 };

+struct ConvertFusedBatchNorm : public RewritePattern {
+  explicit ConvertFusedBatchNorm(MLIRContext *context)
+      : RewritePattern(TF::FusedBatchNormOp::getOperationName(), 1, context) {}
+
+  LogicalResult matchAndRewrite(Operation *op,
+                                PatternRewriter &rewriter) const override {
+    auto tf_fused_batch_norm_op = cast<TF::FusedBatchNormOp>(op);
+
+    // FusedBatchNormV3 expects a 5th output reserve_space_3,
+    // but the output is unused; it doesn't matter what we pass there.
+    rewriter.replaceOpWithNewOp<TF::FusedBatchNormV3Op>(

This is still an issue.

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertTFBroadcastTo : public RewritePattern {
   }
 };

+struct ConvertFusedBatchNorm : public OpRewritePattern<TF::FusedBatchNormOp> {
+  explicit ConvertFusedBatchNorm(MLIRContext *context)
+      : OpRewritePattern<TF::FusedBatchNormOp>(context) {}
+
+  LogicalResult matchAndRewrite(TF::FusedBatchNormOp tf_fused_batch_norm_op,
+                                PatternRewriter &rewriter) const override {
+    const auto &old_result_types = tf_fused_batch_norm_op.getResultTypes();

we should be able to do something like,

auto new_result_types = llvm::to_vector<6>(op.getResultTypes());

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertTFBroadcastTo : public RewritePattern {
   }
 };

+struct ConvertFusedBatchNorm : public RewritePattern {
+  explicit ConvertFusedBatchNorm(MLIRContext *context)
+      : RewritePattern(TF::FusedBatchNormOp::getOperationName(), 1, context) {}
+
+  LogicalResult matchAndRewrite(Operation *op,
+                                PatternRewriter &rewriter) const override {
+    auto tf_fused_batch_norm_op = cast<TF::FusedBatchNormOp>(op);
+
+    // FusedBatchNormV3 expects a 5th output reserve_space_3,
+    // but the output is unused; it doesn't matter what we pass there.
+    rewriter.replaceOpWithNewOp<TF::FusedBatchNormV3Op>(

This will still be an issue in replaceOp. Could you try running the test in debug mode, which would trigger the assert?

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Add TF to TF lowering patterns to legalize_tf pass for TFLite

 func @addN(%arg0: tensor<2x3xi32>, %arg1: tensor<2x3xi32>, %arg2: tensor<2x3xi32
   return %0 : tensor<2x3xi32>

 // CHECK-LABEL: addN
-// CHECK:  "tfl.add_n"(%arg0, %arg1, %arg2) : (tensor<2x3xi32>, tensor<2x3xi32>, tensor<2x3xi32>) -> tensor<2x3xi32>
+// CHECK:  %0 = tfl.add %arg0, %arg1 {fused_activation_function = "NONE"} : tensor<2x3xi32>

This is an unintended effect. DialectConversion should have preferred the direct tf.AddN to tfl.add_n mapping, but something went wrong.

See the comment in computeLegalizationGraphBenefit. https://mlir.llvm.org/doxygen/DialectConversion_8cpp_source.html

Could you debug why the result is getting affected?

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Add complex type reciprocal pattern to TF to TF lowerings

 limitations under the License.

 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SmallVector.h"
-#include "mlir/IR/Attributes.h"  // from @llvm-project
-#include "mlir/IR/Diagnostics.h"  // from @llvm-project
-#include "mlir/IR/MLIRContext.h"  // from @llvm-project
-#include "mlir/IR/PatternMatch.h"  // from @llvm-project
+#include "mlir/IR/Attributes.h"     // from @llvm-project

Let's revert this indentation change so the comments stay consistent with other files. I guess this is getting auto-formatted, and having that formatting would be nice, but it's better to avoid the inconsistency.

ahmedsabie

comment created time in a month

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Add complex type reciprocal pattern to TF to TF lowerings

 def : Pat<(TF_PadOp TensorOf<[AnySignlessInteger, AnyFloat]>:$input, $paddings),

 // Reciprocal op patterns.
 //===----------------------------------------------------------------------===//

-// TODO(hinsu): Support complex and unsigned input types.

Yes, that is correct. We can remove the qualification of $x now that we handle all the supported types.

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertTFBroadcastTo : public RewritePattern {
   }
 };

+struct ConvertFusedBatchNorm : public RewritePattern {
+  explicit ConvertFusedBatchNorm(MLIRContext *context)
+      : RewritePattern(TF::FusedBatchNormOp::getOperationName(), 1, context) {}
+
+  LogicalResult matchAndRewrite(Operation *op,
+                                PatternRewriter &rewriter) const override {
+    auto tf_fused_batch_norm_op = cast<TF::FusedBatchNormOp>(op);
+
+    // FusedBatchNormV3 expects a 5th output reserve_space_3,
+    // but the output is unused; it doesn't matter what we pass there.
+    rewriter.replaceOpWithNewOp<TF::FusedBatchNormV3Op>(

We can use OpBuilder::createOperation to forward all operands and attributes instead of explicitly listing them, which could be error-prone.
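Roughly what that could look like, as a sketch only; new_result_types is an assumed local holding the V3 result types, not a name from the PR:

    // Forward the original operands and attributes generically instead of
    // spelling out each one by name.
    OperationState new_state(op->getLoc(),
                             TF::FusedBatchNormV3Op::getOperationName(),
                             op->getOperands(), new_result_types,
                             op->getAttrs());
    Operation *tf_fused_batch_norm_op_v3 = rewriter.createOperation(new_state);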

ahmedsabie

comment created time in a month

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertTFBroadcastTo : public RewritePattern {
   }
 };

+struct ConvertFusedBatchNorm : public RewritePattern {
+  explicit ConvertFusedBatchNorm(MLIRContext *context)
+      : RewritePattern(TF::FusedBatchNormOp::getOperationName(), 1, context) {}
+
+  LogicalResult matchAndRewrite(Operation *op,
+                                PatternRewriter &rewriter) const override {
+    auto tf_fused_batch_norm_op = cast<TF::FusedBatchNormOp>(op);
+
+    // FusedBatchNormV3 expects a 5th output reserve_space_3,
+    // but the output is unused; it doesn't matter what we pass there.
+    rewriter.replaceOpWithNewOp<TF::FusedBatchNormV3Op>(
+        op, tf_fused_batch_norm_op.y().getType(),
+        tf_fused_batch_norm_op.batch_mean().getType(),
+        tf_fused_batch_norm_op.batch_variance().getType(),
+        tf_fused_batch_norm_op.reserve_space_1().getType(),
+        tf_fused_batch_norm_op.reserve_space_2().getType(),
+        /*reserve_space_3=*/tf_fused_batch_norm_op.reserve_space_2().getType(),

Use an unranked tensor of float type instead of the type of the second-to-last result. The shape inference in TensorFlow doesn't infer a shape for the last result. See FusedBatchNormV3Shape in the TensorFlow codebase.
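One possible way to spell that out, as a sketch (it assumes an f32 element type is acceptable for the placeholder result):

    // reserve_space_3 gets an unranked float tensor type because TensorFlow's
    // FusedBatchNormV3Shape does not infer a shape for the last result.
    Type reserve_space_3_type = UnrankedTensorType::get(rewriter.getF32Type());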

ahmedsabie

comment created time in a month

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 limitations under the License.
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Debug.h"
-#include "mlir/Analysis/LoopAnalysis.h"  // from @llvm-project
+#include "mlir/Analysis/LoopAnalysis.h"           // from @llvm-project

Let's revert this for consistency with other files.

ahmedsabie

comment created time in a month

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertTFBroadcastTo : public RewritePattern {
   }
 };

+struct ConvertFusedBatchNorm : public RewritePattern {

Derive from the new OpRewritePattern class that avoids the use of the cast.

ahmedsabie

comment created time in a month

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Convert FusedBatchNorm to FusedBatchNormV3 in prepare TFLite pass

 struct ConvertTFBroadcastTo : public RewritePattern {
   }
 };

+struct ConvertFusedBatchNorm : public RewritePattern {
+  explicit ConvertFusedBatchNorm(MLIRContext *context)
+      : RewritePattern(TF::FusedBatchNormOp::getOperationName(), 1, context) {}
+
+  LogicalResult matchAndRewrite(Operation *op,
+                                PatternRewriter &rewriter) const override {
+    auto tf_fused_batch_norm_op = cast<TF::FusedBatchNormOp>(op);
+
+    // FusedBatchNormV3 expects a 5th output reserve_space_3,
+    // but the output is unused; it doesn't matter what we pass there.
+    rewriter.replaceOpWithNewOp<TF::FusedBatchNormV3Op>(

replaceOpWithNewOp requires the new op to have the same number of results as the old op so how does the test pass?

ahmedsabie

comment created time in a month


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Legalize tf.diag with tf2xla

 func @bessel_i1e(%arg0: tensor<3xf16>, %arg1: tensor<3xf32>, %arg2: tensor<3xf64
   return %0, %1, %2 : tensor<3xf16>, tensor<3xf32>, tensor<3xf64>
 }

+// CHECK-LABEL: diag
+func @diag(%arg0: tensor<2xf32>) -> tensor<2x2xf32> {
+  // CHECK: %[[ZERO:.*]]  = mhlo.constant dense<0.000000e+00> : tensor<2x2xf32>
+  // CHECK: %[[IOTA:.*]]  = "mhlo.iota"() {iota_dimension = 0 : i64} : () -> tensor<2xi32>
+  // CHECK: %[[BROADCAST1:.*]] = "mhlo.broadcast_in_dim"(%[[IOTA]]) {broadcast_dimensions = dense<1> : tensor<1xi64>} : (tensor<2xi32>) -> tensor<2x2xi32>
+  // CHECK: %[[BROADCAST0:.*]] = "mhlo.broadcast_in_dim"(%[[IOTA]]) {broadcast_dimensions = dense<0> : tensor<1xi64>} : (tensor<2xi32>) -> tensor<2x2xi32>
+  // CHECK: %[[EQ:.*]] = "mhlo.compare"(%[[BROADCAST1]], %[[BROADCAST0]]) {comparison_direction = "EQ"} : (tensor<2x2xi32>, tensor<2x2xi32>) -> tensor<2x2xi1>
+  // CHECK: %[[BROADCAST2:.*]] = "mhlo.broadcast_in_dim"(%[[ARG0]]) {broadcast_dimensions = dense<1> : tensor<1xi64>} : (tensor<2xf32>) -> tensor<2x2xf32>

ARG0 is not defined. This commit doesn't implement the lowering itself, so a CHECK-NOT making sure the tf.Diag op is not present after the pass is enough.

WindQAQ

comment created time in a month

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Legalize tf.diag with tf2xla

 def AssertAllEqual(self, result, expected, rtol, atol):
     """Tests that result and expeted are exactly equal."""
     self.assertAllEqual(result, expected)

-  @test_util.disable_mlir_bridge(

This test has other failures as well. Keep this test disabled and change the description to "Handle complex element types in DiagPart op lowering"

WindQAQ

comment created time in a month

Pull request comment on tensorflow/tensorflow

[TF:MLIR] Legalize tf.diag with tf2xla

There isn't any objective standard for describing the complexity of a legalization pattern. It could be the number of ops generated, the lines of code for the pattern and its test, etc.

If you could implement rewrite patterns with reasonable effort natively without using the fallback kernel, that should be preferred.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial static input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,   return success(); } -// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all+namespace {+// Canonicalization pattern for ShapeNOp that don't have all // static input shapes. Replacing output values corresponding to static input // types may enable optimizations in users of the values.+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;+  LogicalResult matchAndRewrite(ShapeNOp op,+                                PatternRewriter &rewriter) const override {+    // ShapeNOp::fold handles this case.+    if (op.getNumOperands() == 0) return success();+    int width = op.getType(0)+                    .cast<ShapedType>()+                    .getElementType()+                    .getIntOrFloatBitWidth();++    SmallVector<Value, 4> results(op.getNumOperands());+    SmallVector<int64_t, 4> dynamic_index;+    SmallVector<Value, 4> dynamic_input;+    SmallVector<Type, 4> result_types;+    for (auto e : llvm::enumerate(op.getOperands())) {+      if (Attribute result = ConvertShapeToAttr(e.value().getType(), width)) {+        results[e.index()] =+            rewriter.create<TF::ConstOp>(op.getLoc(), result);+      } else {+        dynamic_index.push_back(e.index());+        dynamic_input.push_back(e.value());+        result_types.push_back(op.getType(e.index()));+      }+    }++    if (dynamic_input.size() == op.getNumOperands()) {+      // Cannot canonicalize ShapeN if all inputs are dynamic.+      return failure();+    }++    // Create a ShapeOp when there is only one dynamic input.+    // Or create a ShapeNOp when there are two or more dynamic inputs.+    if (dynamic_input.size() == 1) {

Having a separate pattern for the ShapeN to Shape conversion, to prefer the Shape op, is useful but not necessary. I don't see a big advantage in conditioning it here.

WindQAQ

comment created time in 2 months


Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial static input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,   return success(); } -// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all+namespace {+// Canonicalization pattern for ShapeNOp that don't have all // static input shapes. Replacing output values corresponding to static input // types may enable optimizations in users of the values.+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;+  LogicalResult matchAndRewrite(ShapeNOp op,+                                PatternRewriter &rewriter) const override {+    // ShapeNOp::fold handles this case.+    if (op.getNumOperands() == 0) return success();+    int width = op.getType(0)+                    .cast<ShapedType>()+                    .getElementType()+                    .getIntOrFloatBitWidth();++    SmallVector<Value, 4> results(op.getNumOperands());+    SmallVector<int64_t, 4> dynamic_index;

nit: rename to dynamic_indices and dynamic_inputs

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial static input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,   return success(); } -// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all+namespace {+// Canonicalization pattern for ShapeNOp that don't have all // static input shapes. Replacing output values corresponding to static input // types may enable optimizations in users of the values.+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;+  LogicalResult matchAndRewrite(ShapeNOp op,+                                PatternRewriter &rewriter) const override {+    // ShapeNOp::fold handles this case.+    if (op.getNumOperands() == 0) return success();+    int width = op.getType(0)+                    .cast<ShapedType>()+                    .getElementType()+                    .getIntOrFloatBitWidth();++    SmallVector<Value, 4> results(op.getNumOperands());+    SmallVector<int64_t, 4> dynamic_index;+    SmallVector<Value, 4> dynamic_input;+    SmallVector<Type, 4> result_types;+    for (auto e : llvm::enumerate(op.getOperands())) {+      if (Attribute result = ConvertShapeToAttr(e.value().getType(), width)) {+        results[e.index()] =+            rewriter.create<TF::ConstOp>(op.getLoc(), result);+      } else {+        dynamic_index.push_back(e.index());+        dynamic_input.push_back(e.value());+        result_types.push_back(op.getType(e.index()));+      }+    }++    if (dynamic_input.size() == op.getNumOperands()) {+      // Cannot canonicalize ShapeN if all inputs are dynamic.+      return failure();+    }++    if (!dynamic_input.empty()) {

Now that we handle all static shaped operands case here, we can drop the folder.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial static input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,
   return success();
 }

-// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all
+namespace {
+// Canonicalization pattern for ShapeNOp that don't have all
 // static input shapes. Replacing output values corresponding to static input
 // types may enable optimizations in users of the values.
+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {
+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;
+  LogicalResult matchAndRewrite(ShapeNOp op,
+                                PatternRewriter &rewriter) const override {
+    // ShapeNOp::fold handles this case.
+    if (op.getNumOperands() == 0) return success();
+    int width = op.getType(0)

The getElementTypeOrSelf helper function will help shorten this.
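For reference, a sketch of the shortened form; it assumes getElementTypeOrSelf from mlir/IR/TypeUtilities.h is already available in this file:

    // getElementTypeOrSelf unwraps the shaped type, replacing the chained
    // cast<ShapedType>().getElementType() calls.
    int width = getElementTypeOrSelf(op.getType(0)).getIntOrFloatBitWidth();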

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial static input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,   return success(); } -// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all+namespace {+// Canonicalization pattern for ShapeNOp that don't have all // static input shapes. Replacing output values corresponding to static input // types may enable optimizations in users of the values.+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;+  LogicalResult matchAndRewrite(ShapeNOp op,+                                PatternRewriter &rewriter) const override {+    if (op.getNumOperands() == 0) return success();+    int width = op.getType(0)+                    .cast<ShapedType>()+                    .getElementType()+                    .getIntOrFloatBitWidth();+    BoolAttr use32Bit = BoolAttr::get(width == 32, op.getContext());++    SmallVector<Value, 4> results;+    for (Value input : op.getOperands()) {+      Value shape;+      if (OpFoldResult result = ConvertShapeToAttr(input.getType(), width)) {+        shape =+            rewriter.create<TF::ConstOp>(op.getLoc(), result.get<Attribute>());+      } else {+        shape = rewriter.create<TF::ShapeOp>(op.getLoc(), input, use32Bit);

Yes, that's a good idea.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,   return success(); } -// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all+namespace {+// Canonicalization pattern for ShapeNOp that don't have all // static input shapes. Replacing output values corresponding to static input // types may enable optimizations in users of the values.+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;+  LogicalResult matchAndRewrite(ShapeNOp op,+                                PatternRewriter &rewriter) const override {+    if (op.getNumOperands() == 0) return success();+    int width = op.getType(0)+                    .cast<ShapedType>()+                    .getElementType()+                    .getIntOrFloatBitWidth();+    BoolAttr use32Bit = BoolAttr::get(width == 32, op.getContext());++    SmallVector<Value, 4> results;+    for (Value input : op.getOperands()) {+      Value shape;+      if (OpFoldResult result = ConvertShapeToAttr(input.getType(), width)) {+        shape =+            rewriter.create<TF::ConstOp>(op.getLoc(), result.get<Attribute>());+      } else {+        shape = rewriter.create<TF::ShapeOp>(op.getLoc(), input, use32Bit);

Instead of creating individual Shape ops, we should create a single ShapeN op with all the dynamic inputs.
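A rough sketch of the suggestion; dynamic_inputs and dynamic_result_types are assumed locals (collected while walking the operands), not names from the PR:

    // One tf.ShapeN over every operand whose shape is still unknown, instead
    // of one tf.Shape per dynamic operand.
    auto shape_n = rewriter.create<TF::ShapeNOp>(op.getLoc(),
                                                 dynamic_result_types,
                                                 dynamic_inputs);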

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,   return success(); } -// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all+namespace {+// Canonicalization pattern for ShapeNOp that don't have all // static input shapes. Replacing output values corresponding to static input // types may enable optimizations in users of the values.+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;+  LogicalResult matchAndRewrite(ShapeNOp op,+                                PatternRewriter &rewriter) const override {+    if (op.getNumOperands() == 0) return success();+    int width = op.getType(0)+                    .cast<ShapedType>()+                    .getElementType()+                    .getIntOrFloatBitWidth();+    BoolAttr use32Bit = BoolAttr::get(width == 32, op.getContext());++    SmallVector<Value, 4> results;+    for (Value input : op.getOperands()) {+      Value shape;+      if (OpFoldResult result = ConvertShapeToAttr(input.getType(), width)) {

ConvertShapeToAttr return type is Attribute.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial input shape

 func @testEmptybf16() -> (tensor<5xbf16>) { }  // CHECK-LABEL: func @testShapeN-func @testShapeN(%arg0: tensor<f32>, %arg1: tensor<1x32x32x16xf32>, %arg2: tensor<*xf32>) -> (tensor<0xi64>, tensor<4xi64>, tensor<4xi64>, tensor<?xi64>) {+func @testShapeN(%arg0: tensor<f32>, %arg1: tensor<1x32x32x16xf32>, %arg2: tensor<*xf32>, %arg3: tensor<1x32x32xf32>) -> (tensor<0xi64>, tensor<4xi64>, tensor<4xi64>, tensor<?xi64>, tensor<3xi64>) { -  // CHECK: "tf.Const"() {value = dense<> : tensor<0xi64>-  // CHECK: "tf.Const"() {value = dense<[1, 32, 32, 16]> : tensor<4xi64>}+  // CHECK: %[[SHAPE0:.*]] = "tf.Const"() {value = dense<> : tensor<0xi64>}+  // CHECK: %[[SHAPE1:.*]] = "tf.Const"() {value = dense<[1, 32, 32, 16]> : tensor<4xi64>}   %0:2 = "tf.ShapeN"(%arg0, %arg1) : (tensor<f32>, tensor<1x32x32x16xf32>) -> (tensor<0xi64>, tensor<4xi64>) -  // CHECK: tf.ShapeN-  %1:2 = "tf.ShapeN"(%arg1, %arg2) : (tensor<1x32x32x16xf32>, tensor<*xf32>) -> (tensor<4xi64>, tensor<?xi64>)+  // CHECK: %[[SHAPE3:.*]] = "tf.Const"() {value = dense<[1, 32, 32]> : tensor<3xi64>}

nit: this is a bit convoluted now. SHAPE3 comes before SHAPE2, and the SHAPE1 ops are de-duped, which is not related to our canonicalizer. Let's split this op out into a separate function.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Canonicalize ShapeNOp with partial input shape

 LogicalResult ShapeNOp::fold(ArrayRef<Attribute> operands,
   return success();
 }

-// TODO(hinsu): Add canonicalization pattern for ShapeN ops that don't have all
+namespace {
+// Canonicalization pattern for ShapeNOp that don't have all
 // static input shapes. Replacing output values corresponding to static input
 // types may enable optimizations in users of the values.
+class ShapeNPartialStaticInputShape : public OpRewritePattern<ShapeNOp> {
+  using OpRewritePattern<ShapeNOp>::OpRewritePattern;
+  LogicalResult matchAndRewrite(ShapeNOp op,
+                                PatternRewriter &rewriter) const override {
+    if (op.getNumOperands() == 0) return success();

Add a comment that folder handles this case.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Improve parallelism of tf.AddN

 class LowerAddNOp : public OpRewritePattern<TF::AddNOp> {
     // support variant type so variant types require special handling.
     if (getElementTypeOrSelf(op.getType()).isa<VariantType>()) return failure();

-    auto begin = op.inputs().begin();
-    // Return the only operand directly.
-    if (op.N() == 1) {
-      rewriter.replaceOp(op, *begin);
-      return success();
+    llvm::SmallVector<Value, 4> operands(op.inputs().begin(),
+                                         op.inputs().end());
+
+    int64_t n = operands.size();
+    // Keep doing tree-based reduction when there are more than two operands.
+    while (n >= 2) {
+      int64_t j = 0;
+      for (int64_t i = 0; i < n; i += 2, ++j) {
+        // Add two adjacent operands if applicable.
+        operands[j] = (i + 1 < n)

nit: we can use operands[i/2] and n = (n+1)/2.
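A sketch of the indexing the nit suggests; add_pair stands in as a hypothetical helper that emits one tf.AddV2 for a pair of values and is not part of the actual patch:

    // The sum of pair (i, i + 1) lands at index i / 2, so the separate j
    // counter goes away, and each sweep leaves ceil(n / 2) operands.
    while (n > 1) {
      for (int64_t i = 0; i < n; i += 2)
        operands[i / 2] =
            (i + 1 < n) ? add_pair(operands[i], operands[i + 1]) : operands[i];
      n = (n + 1) / 2;
    }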

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Improve parallelism of tf.AddN

 Type InferExpandDimsType(Type ty, int64_t axis, Builder *builder) {

 // Lowers AddN op to a sequence of AddV2 ops to accumulate operands.
 //
-// Note that to improve the parallelism, the operands are split
-// into two halves, and are accumulated first.
+// Note that to improve the parallelism, AddN op uses tree-based reduction.
+// For example, the tf.AddN([0, 1, 2, 3, 4]) behaves as follows:
+//
+//                 0     1     2     3     4
+//                 |     |     |     |     |
+//                 -------     -------     |
+//                    |           |        |
+//                    1           5        |

nit: 1 is used twice.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

Utilize TensorFormat

 void BatchToSpaceOp::getCanonicalizationPatterns(
 //   are not unknown.
 //
 static LogicalResult Verify(BiasAddOp op) {
-  StringRef format = op.data_format();
-  if (format == "NHWC") {
+  std::string data_format = op.data_format().str();

optional nit: To avoid materializing the string here for FormatFromString, let's change the function's input type to string_view. Then we can convert from StringRef to string_view directly.
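A small sketch of that conversion, assuming FormatFromString is changed to take absl::string_view as the comment proposes:

  // View the StringRef's bytes directly; no std::string is materialized.
  llvm::StringRef data_format = op.data_format();
  absl::string_view format_view(data_format.data(), data_format.size());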

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Verify TransposeOp

 static LogicalResult Verify(TransposeOp op) {
       const int64_t y_dim = y_type.getDimSize(y_idx);
       const int64_t x_idx = e.value().getSExtValue();
       const int64_t x_dim = x_type.getDimSize(x_idx);
-      if (y_dim == ShapedType::kDynamicSize || x_dim == ShapedType::kDynamicSize) {
-        continue;
-      }
-      if (y_dim != x_dim) {
+      if (y_dim != ShapedType::kDynamicSize && x_dim != ShapedType::kDynamicSize && y_dim != x_dim) {

This seems to exceed the 80-character limit; break it into two lines. https://google.github.io/styleguide/cppguide.html#Line_Length

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Verify TransposeOp

 void ToBoolOp::getCanonicalizationPatterns(OwningRewritePatternList &results, //===----------------------------------------------------------------------===//  static LogicalResult Verify(TransposeOp op) {-  // TODO(hinsu): Verify using a custom verifier that,-  // * Transpose permutation is 1-D of size equal to the rank of the first-  //   input, if the shapes are partially known. Requires use of a more-  //   restrictive type than TF_Tensor.-  // * Result shape dimensions are possible based on the input shape.+  auto perm_type = op.perm().getType().dyn_cast<RankedTensorType>();+  auto x_type = op.x().getType().dyn_cast<RankedTensorType>();+  auto y_type = op.y().getType().dyn_cast<RankedTensorType>();++  if (!perm_type) {+    return success();+  }++  if (perm_type.getRank() != 1) {

Use perm_type && perm_type.getRank() != 1 so that we can still reject the case of mismatching ranks of x and y.
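Reading the suggestion as a guard change, roughly (a sketch of just this condition, not the full verifier):

  // Only complain about perm's rank when perm is ranked; an unranked perm
  // falls through so the x/y rank comparison below still runs.
  if (perm_type && perm_type.getRank() != 1) {
    return op.emitOpError()
           << "expected perm to be a 1-D Tensor, got perm of rank "
           << perm_type.getRank();
  }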

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Verify TransposeOp

 void ToBoolOp::getCanonicalizationPatterns(OwningRewritePatternList &results, //===----------------------------------------------------------------------===//  static LogicalResult Verify(TransposeOp op) {-  // TODO(hinsu): Verify using a custom verifier that,-  // * Transpose permutation is 1-D of size equal to the rank of the first-  //   input, if the shapes are partially known. Requires use of a more-  //   restrictive type than TF_Tensor.-  // * Result shape dimensions are possible based on the input shape.+  auto perm_type = op.perm().getType().dyn_cast<RankedTensorType>();+  auto x_type = op.x().getType().dyn_cast<RankedTensorType>();+  auto y_type = op.y().getType().dyn_cast<RankedTensorType>();++  if (!perm_type) {+    return success();+  }++  if (perm_type.getRank() != 1) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor, got perm of rank "+           << perm_type.getRank();+  }++  if (x_type && y_type && x_type.getRank() != y_type.getRank()) {+    return op.emitOpError()+           << "x should be of the same rank with y, got "+           << "x of rank " << x_type.getRank() << ", and y of rank "+           << y_type.getRank();+  }++  if (!x_type || !y_type || !perm_type.hasStaticShape()) {+    return success();+  }++  if (x_type.getRank() != perm_type.getNumElements()) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor of size "+           << "equal to the rank of x, got perm of size "+           << perm_type.getNumElements() << ", and x of rank "+           << x_type.getRank();+  }++  DenseIntElementsAttr attr_perm;+  if (matchPattern(op.perm(), m_Constant(&attr_perm))) {+    // y.shape[i] should be equal to x.shape[perm[i]]+    // for i = [0, 1, ..., rank(x) - 1]+    for (auto e : llvm::enumerate(attr_perm)) {+      const int64_t y_idx = e.index();+      const int64_t y_dim = y_type.getDimSize(y_idx);+      const int64_t x_idx = e.value().getSExtValue();+      const int64_t x_dim = x_type.getDimSize(x_idx);+      if (y_dim == ShapedType::kDynamicSize || x_dim == ShapedType::kDynamicSize) {

It would be good to have a new test for this. Up to you, but we can combine this condition with the next one to keep it simple.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Verify TransposeOp

 void ToBoolOp::getCanonicalizationPatterns(OwningRewritePatternList &results, //===----------------------------------------------------------------------===//  static LogicalResult Verify(TransposeOp op) {-  // TODO(hinsu): Verify using a custom verifier that,-  // * Transpose permutation is 1-D of size equal to the rank of the first-  //   input, if the shapes are partially known. Requires use of a more-  //   restrictive type than TF_Tensor.-  // * Result shape dimensions are possible based on the input shape.+  auto perm_type = op.perm().getType().dyn_cast<RankedTensorType>();+  if (!perm_type) {+    return success();+  }++  if (perm_type.getRank() != 1) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor, got perm of rank "+           << perm_type.getRank();+  }++  if (!perm_type.hasStaticShape()) {+    return success();+  }++  auto x_type = op.x().getType().dyn_cast<RankedTensorType>();+  if (!x_type) {+    return success();+  }++  const int64_t x_rank = x_type.getRank();+  if (x_rank != perm_type.getNumElements()) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor of size "+           << "equal to the rank of x, got perm of size "+           << perm_type.getNumElements() << ", and x of rank " << x_rank;+  }++  auto y_type = op.y().getType().dyn_cast<RankedTensorType>();+  if (!y_type) {+    return success();+  }++  const int64_t y_rank = y_type.getRank();+  if (x_rank != y_rank) {+    return op.emitOpError()+           << "x should be of the same rank with y, got "+           << "x of rank " << x_rank << ", and y of rank " << y_rank;+  }++  DenseIntElementsAttr attr_perm;+  if (matchPattern(op.perm(), m_Constant(&attr_perm))) {+    // y.shape[i] should be equal to x.shape[perm[i]]+    // for i = [0, 1, ..., rank(x) - 1]+    for (auto e : llvm::enumerate(attr_perm)) {+      const int64_t y_idx = e.index();+      const int64_t y_dim = y_type.getDimSize(y_idx);+      const int64_t x_idx = e.value().getSExtValue();+      const int64_t x_dim = x_type.getDimSize(x_idx);+      if (y_dim != x_dim) {

The example you gave is legal. During compile-time verification, we only report errors in cases that are already known to be illegal and can never be valid at runtime.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Verify TransposeOp

 void ToBoolOp::getCanonicalizationPatterns(OwningRewritePatternList &results, //===----------------------------------------------------------------------===//  static LogicalResult Verify(TransposeOp op) {-  // TODO(hinsu): Verify using a custom verifier that,-  // * Transpose permutation is 1-D of size equal to the rank of the first-  //   input, if the shapes are partially known. Requires use of a more-  //   restrictive type than TF_Tensor.-  // * Result shape dimensions are possible based on the input shape.+  auto perm_type = op.perm().getType().dyn_cast<RankedTensorType>();+  if (!perm_type) {+    return success();+  }++  if (perm_type.getRank() != 1) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor, got perm of rank "+           << perm_type.getRank();+  }++  if (!perm_type.hasStaticShape()) {

We can still verify that the ranks of x and y match.

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Verify TransposeOp

 void ToBoolOp::getCanonicalizationPatterns(OwningRewritePatternList &results, //===----------------------------------------------------------------------===//  static LogicalResult Verify(TransposeOp op) {-  // TODO(hinsu): Verify using a custom verifier that,-  // * Transpose permutation is 1-D of size equal to the rank of the first-  //   input, if the shapes are partially known. Requires use of a more-  //   restrictive type than TF_Tensor.-  // * Result shape dimensions are possible based on the input shape.+  auto perm_type = op.perm().getType().dyn_cast<RankedTensorType>();+  if (!perm_type) {+    return success();+  }++  if (perm_type.getRank() != 1) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor, got perm of rank "+           << perm_type.getRank();+  }++  if (!perm_type.hasStaticShape()) {+    return success();+  }++  auto x_type = op.x().getType().dyn_cast<RankedTensorType>();+  if (!x_type) {+    return success();+  }++  const int64_t x_rank = x_type.getRank();+  if (x_rank != perm_type.getNumElements()) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor of size "+           << "equal to the rank of x, got perm of size "+           << perm_type.getNumElements() << ", and x of rank " << x_rank;+  }++  auto y_type = op.y().getType().dyn_cast<RankedTensorType>();+  if (!y_type) {+    return success();+  }++  const int64_t y_rank = y_type.getRank();+  if (x_rank != y_rank) {+    return op.emitOpError()+           << "x should be of the same rank with y, got "+           << "x of rank " << x_rank << ", and y of rank " << y_rank;+  }++  DenseIntElementsAttr attr_perm;+  if (matchPattern(op.perm(), m_Constant(&attr_perm))) {+    // y.shape[i] should be equal to x.shape[perm[i]]+    // for i = [0, 1, ..., rank(x) - 1]+    for (auto e : llvm::enumerate(attr_perm)) {+      const int64_t y_idx = e.index();+      const int64_t y_dim = y_type.getDimSize(y_idx);+      const int64_t x_idx = e.value().getSExtValue();+      const int64_t x_dim = x_type.getDimSize(x_idx);+      if (y_dim != x_dim) {+        return op.emitOpError()+               << "y.shape[" << y_idx << "] = " << y_dim

nit: The following error is not easy to read: y.shape[0] = 3 != x.shape[perm[2]] = 4

This can be rephrased as: requires y.shape[0] (3) to be equal to x.shape[perm[2]] (4).

WindQAQ

comment created time in 2 months

Pull request review comment on tensorflow/tensorflow

[TF:MLIR] Verify TransposeOp

 void ToBoolOp::getCanonicalizationPatterns(OwningRewritePatternList &results, //===----------------------------------------------------------------------===//  static LogicalResult Verify(TransposeOp op) {-  // TODO(hinsu): Verify using a custom verifier that,-  // * Transpose permutation is 1-D of size equal to the rank of the first-  //   input, if the shapes are partially known. Requires use of a more-  //   restrictive type than TF_Tensor.-  // * Result shape dimensions are possible based on the input shape.+  auto perm_type = op.perm().getType().dyn_cast<RankedTensorType>();+  if (!perm_type) {+    return success();+  }++  if (perm_type.getRank() != 1) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor, got perm of rank "+           << perm_type.getRank();+  }++  if (!perm_type.hasStaticShape()) {+    return success();+  }++  auto x_type = op.x().getType().dyn_cast<RankedTensorType>();+  if (!x_type) {+    return success();+  }++  const int64_t x_rank = x_type.getRank();+  if (x_rank != perm_type.getNumElements()) {+    return op.emitOpError()+           << "expected perm to be a 1-D Tensor of size "+           << "equal to the rank of x, got perm of size "+           << perm_type.getNumElements() << ", and x of rank " << x_rank;+  }++  auto y_type = op.y().getType().dyn_cast<RankedTensorType>();+  if (!y_type) {+    return success();+  }++  const int64_t y_rank = y_type.getRank();+  if (x_rank != y_rank) {+    return op.emitOpError()+           << "x should be of the same rank with y, got "+           << "x of rank " << x_rank << ", and y of rank " << y_rank;+  }++  DenseIntElementsAttr attr_perm;+  if (matchPattern(op.perm(), m_Constant(&attr_perm))) {+    // y.shape[i] should be equal to x.shape[perm[i]]+    // for i = [0, 1, ..., rank(x) - 1]+    for (auto e : llvm::enumerate(attr_perm)) {+      const int64_t y_idx = e.index();+      const int64_t y_dim = y_type.getDimSize(y_idx);+      const int64_t x_idx = e.value().getSExtValue();+      const int64_t x_dim = x_type.getDimSize(x_idx);+      if (y_dim != x_dim) {

We should allow cases where one of x_dim and y_dim is dynamic, as shapes are allowed to be only partially known.

WindQAQ

comment created time in 2 months

Issue closed: tensorflow/tensorflow

error: 'tfl.concatenation'

ConverterError                            Traceback (most recent call last)
<ipython-input-162-bce3c984534c> in <module>
      6 )
      7 converter.optimizations = [tf.lite.Optimize.DEFAULT]
----> 8 tflite_model = converter.convert()
      9 with tf.io.gfile.GFile('model.tflite', 'wb') as f:
     10   f.write(tflite_model)

~/anaconda3/lib/python3.7/site-packages/tensorflow/lite/python/lite.py in convert(self)
   1082           input_tensors=self._input_tensors,
   1083           output_tensors=self._output_tensors,
-> 1084           **converter_kwargs)
   1085     else:
   1086       result = _toco_convert_graph_def(

~/anaconda3/lib/python3.7/site-packages/tensorflow/lite/python/convert.py in toco_convert_impl(input_data, input_tensors, output_tensors, enable_mlir_converter, *args, **kwargs)
    494       input_data.SerializeToString(),
    495       debug_info_str=debug_info_str,
--> 496       enable_mlir_converter=enable_mlir_converter)
    497   return data
    498 

~/anaconda3/lib/python3.7/site-packages/tensorflow/lite/python/convert.py in toco_convert_protos(model_flags_str, toco_flags_str, input_data_str, debug_info_str, enable_mlir_converter)
    225       stdout = _try_convert_to_unicode(stdout)
    226       stderr = _try_convert_to_unicode(stderr)
--> 227       raise ConverterError("See console for info.\n%s\n%s\n" % (stdout, stderr))
    228   finally:
    229     # Must manually cleanup files.

ConverterError: See console for info.
loc("Concat_165"): error: 'tfl.concatenation' op dimension size of dimension #3 of operand #0 must be equal to dimension size of dimension #3 of output, expected 40, got 20
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/bin/toco_from_protos", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/lite/toco/python/toco_from_protos.py", line 93, in main
    app.run(main=execute, argv=[sys.argv[0]] + unparsed)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/lite/toco/python/toco_from_protos.py", line 56, in execute
    enable_mlir_converter)
Exception: <unknown>:0: error: loc("Concat_165"): 'tfl.concatenation' op dimension size of dimension #3 of operand #0 must be equal to dimension size of dimension #3 of output, expected 40, got 20

I am converting a frozen graph (.pb) to tflite and encountered this error. I have no idea why this is happening. Please help me. I am using the code below for the conversion.

import tensorflow as tf
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file = 'weights/yolov5s_v1.pb', 
    input_arrays = ['images'],
    output_arrays = ['output','463','482'] 
)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with tf.io.gfile.GFile('model.tflite', 'wb') as f:
  f.write(tflite_model)

closed time in 3 months

karanjakhar

Issue comment on tensorflow/tensorflow

error: 'tfl.concatenation'

I looked into this and found that the input model has invalid shapes.

Two operands of the concat operation don't have compatible shapes. Operands should have the same dimensions except along the axis dimension. Here, the dimensions are [1, 256, 40, 20] and [1, 256, 40, 40].

The first operand is coming from this function. https://gist.github.com/smit-hinsu/95150cb40a523f0d892fe32d1cd03009

Let us know if this doesn't help you fix the dimensions in the model and you need further assistance. If you can generate some sample data, try running the model with a sample input before conversion to identify the source of the issue.

karanjakhar

comment created time in 3 months
