Andrew Audibert aaudiber Tensorflow United States

aaudiber/alluxio 1

Memory-Centric Virtual Distributed Storage System

aaudiber/2nd-semester-introduction-to-computer-science-principles 0

A 2nd semester follow-up to the TEALS Intro CS course

aaudiber/algorithms 0

interesting algorithms

aaudiber/alluxio-extensions 0

Alluxio Extensions

aaudiber/alluxio-test-client 0

Basic Alluxio client to help with testing

aaudiber/atomix 0

A reactive framework for building fault-tolerant distributed systems for the JVM

aaudiber/clahub 0

Easy contributor license agreements for your GitHub projects.

aaudiber/community 0

Stores documents used by the TensorFlow developer community

aaudiber/copycat 0

A novel implementation of the Raft consensus algorithm

aaudiber/DefinitelyTyped 0

The repository for high quality TypeScript type definitions.

issue comment tensorflow/tensorflow

Custom dataset op encounters refcount error

I looked through the code and nothing stuck out that would cause the refcount issue. When you build with the tensorflow codebase, are you building from latest, or from branch 1.15? It's possible that there were some changes to reference counting since 1.15.

zhuzilin

comment created time in 7 hours

issue comment tensorflow/tensorflow

tf.data.Dataset doesn't handle namedtuples properly

@AdrienCorenflos As far as I can tell, element_spec does work correctly for named tuples:

import tensorflow as tf
import collections

Point = collections.namedtuple('Point', ['x', 'y'])
dataset = tf.data.Dataset.from_tensor_slices(Point([1, 2, 3], [4, 5, 6]))

print(dataset.element_spec)
Point(x=TensorSpec(shape=(), dtype=tf.int32, name=None), y=TensorSpec(shape=(), dtype=tf.int32, name=None))

One easy mistake to make is passing a list of tuples or named tuples to from_tensor_slices. from_tensor_slices expects its input to be a structure of tensors, and will coerce Python lists (along with their contents) into tensors. For example, [(1, 2), (3, 4)] is seen as a 2D tensor of integers, equivalent to [[1, 2], [3, 4]]. The same applies to [Point(1, 2), Point(3, 4)], so calling tf.data.Dataset.from_tensor_slices([Point(1, 2), Point(3, 4)]) can make it look like named tuples aren't being respected properly.

I think this behavior is pretty unintuitive (it looked like a bug at first to me too). However, we can't change the behavior without breaking backwards compatibility, so I think the action item here is to improve the documentation to make it clear that the input is treated as a structure of Tensors, not a list of dataset elements.
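A minimal sketch of the distinction, in plain Python with no TensorFlow calls: transposing a list of named tuples (rows) into a single named tuple of per-field lists (columns) gives the structure-of-tensors layout that from_tensor_slices expects. The helper name to_columns is hypothetical and only illustrates the reshaping.

```python
import collections

Point = collections.namedtuple('Point', ['x', 'y'])


def to_columns(rows):
    """Turn a list of named tuples into one named tuple of per-field lists."""
    row_type = type(rows[0])
    # zip(*rows) groups all first fields together, then all second fields, ...
    return row_type(*(list(column) for column in zip(*rows)))


rows = [Point(1, 2), Point(3, 4)]
columns = to_columns(rows)
print(columns)  # Point(x=[1, 3], y=[2, 4])
```

Passing columns rather than rows to from_tensor_slices should then produce elements whose spec is a Point, matching the element_spec output shown earlier in this comment.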

AdrienCorenflos

comment created time in 4 days

fork aaudiber/ecosystem

Integration of TensorFlow with other open-source frameworks

fork in 4 days

issue comment tensorflow/tensorflow

Using tf.data.Dataset has big overhead

Thanks @Flamefire.

This is a difficult case for tf.data.Dataset because there isn't any preprocessing. tf.data.Dataset usually does preprocessing on the CPU, then transfers the data to the GPU afterward. The tf.data.Dataset example is slower because it is copying the tensors from GPU memory to CPU memory and back each time, while the non-Dataset example starts with the tensors on the GPU and doesn't need to move them at all since there isn't any preprocessing.

Ideally we could use tf.data.experimental.prefetch_to_device to prefetch to the GPU and recover the performance, but there is currently an outstanding bug with prefetch_to_device. Once that gets fixed, the performance should be almost identical when using prefetch_to_device.

Flamefire

comment created time in 21 days

issue comment tensorflow/tensorflow

Problem with read and get batch from 2d array tfrecords dataset

If your data consists of complex numbers, you can use the tf.complex64 TensorFlow dtype.

kaen2891

comment created time in 21 days

issue comment tensorflow/tensorflow

Using tf.data.Dataset has big overhead

@Flamefire The tf.data.Dataset example is slicing a 4D tensor into a 3D tensor (which requires copying the data every step), while the non-Dataset code starts with 3D tensors and therefore doesn't need to copy. To compare apples to apples here, you should define the Dataset data with

import tensorflow as tf
x = tf.random.uniform([32, 224, 224, 3])
y = tf.random.uniform([32, 1], minval=0, maxval=999, dtype=tf.int64)
dataset = tf.data.Dataset.from_tensors((x, y))
Flamefire

comment created time in 24 days

Pull request review comment tensorflow/tensorflow

Add cardinality calculation for Dataset.unbatch() when possible

 class UnbatchDatasetOp : public UnaryDatasetOpKernel {
     explicit Dataset(OpKernelContext* ctx, DatasetBase* input)
         : DatasetBase(DatasetContext(ctx)), input_(input) {
       input_->Ref();
+      known_batch_size_ = -1;
       for (const PartialTensorShape& shape : input->output_shapes()) {
         if (!shape.unknown_rank()) {
+          if (known_batch_size_ < 0) {
+            if (shape.dim_size(0) >= 0) {

Combine the two if clauses:

if (known_batch_size_ < 0 && shape.dim_size(0) >= 0) {
  ...
}
yongtang

comment created time in a month

Pull request review comment tensorflow/tensorflow

Add cardinality calculation for Dataset.unbatch() when possible

 def _test_combinations():
        lambda: dataset_ops.Dataset.range(5).filter(lambda _: True).take(2),
        cardinality.UNKNOWN),
       ("Take4", lambda: dataset_ops.Dataset.range(5).repeat().take(2), 2),
+      ("Unbatch1",
+       lambda: dataset_ops.Dataset.range(5).batch(2, drop_remainder=True).unbatch(), 4),
+      ("Unbatch2",
+       lambda: dataset_ops.Dataset.range(5).batch(2, drop_remainder=False).unbatch(), cardinality.UNKNOWN),
+      ("Unbatch3",
+       lambda: dataset_ops.Dataset.range(5).batch(2, drop_remainder=True).filter(lambda _: True).unbatch(),
+       cardinality.UNKNOWN),
+      ("Unbatch4", lambda: dataset_ops.Dataset.range(5).batch(2, drop_remainder=True).repeat().unbatch(),

Add test with 2 components, where only the second component's batch size is known:

lambda: dataset_ops.Dataset.zip((
    dataset_ops.Dataset.range(4).batch(2, drop_remainder=False),
    dataset_ops.Dataset.range(5).batch(2, drop_remainder=True))).unbatch()
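
The reasoning behind this test case can be sketched in plain Python. This is not the kernel code from the PR: shapes are modelled as tuples with None for an unknown dimension (and None for unknown rank), the INFINITE and UNKNOWN sentinels are local to the sketch, and the rule that an infinite input stays infinite is an assumption made here for illustration.

```python
INFINITE = -1  # local sentinel for this sketch
UNKNOWN = -2   # local sentinel for this sketch


def infer_batch_size(shapes):
    """Return the first statically known leading dimension, or -1 if none."""
    batch_size = -1
    for shape in shapes:
        if shape is None or len(shape) == 0:
            continue  # unknown rank or scalar: no leading dimension to read
        # Single combined condition, as suggested in the review.
        if batch_size < 0 and shape[0] is not None and shape[0] >= 0:
            batch_size = shape[0]
    return batch_size


def unbatch_cardinality(input_cardinality, batch_size):
    """Number of elements after unbatching, when it can be computed."""
    if input_cardinality == INFINITE:
        return INFINITE
    if input_cardinality == UNKNOWN or batch_size < 0:
        return UNKNOWN
    return input_cardinality * batch_size


# Two components: the first has an unknown batch dimension, the second is 2.
shapes = [(None,), (2,)]
batch_size = infer_batch_size(shapes)
print(batch_size)  # 2
print(unbatch_cardinality(2, batch_size))  # 4
```

Under these assumptions the zipped dataset has 2 batches, so the unbatched cardinality would be expected to be 4 even though the first component's batch size is not statically known.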
yongtang

comment created time in a month

Pull request review comment tensorflow/tensorflow

Add cardinality calculation for Dataset.unbatch() when possible

 class UnbatchDatasetOp : public UnaryDatasetOpKernel {
     const DatasetBase* const input_;
     std::vector<PartialTensorShape> shapes_;
+    int64 known_batch_size_;

Can we call this just batch_size_? Then add a comment that it may or may not be known, with -1 representing unknown.

yongtang

comment created time in a month

issue comment tensorflow/tensorflow

Problem with read and get batch from 2d array tfrecords dataset

It looks like you are using FixedLenFeature to parse your features, but the features are sequences, not scalars, so you need to use FixedLenSequenceFeature instead.

kaen2891

comment created time in a month

issue comment tensorflow/tensorflow

Problem with read and get batch from 2d array tfrecords dataset

Please read https://www.tensorflow.org/tutorials/load_data/tfrecord#reading_a_tfrecord_file to understand how to read TFRecords with tf.data. Is there something specific in there that doesn't make sense to you?

kaen2891

comment created time in a month

issue comment tensorflow/tensorflow

Problem with read and get batch from 2d array tfrecords dataset

After parsing, you can call Dataset.batch(batch_size) to put the examples in batches. Did the link I shared above help?

kaen2891

comment created time in a month

issue comment tensorflow/tensorflow

tf.data.Dataset.from_tensor_slices: ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type list), worked on 2.0.0-beta1

The problem is that from_tensor_slices needs to convert its input into a Tensor, but the given input contains variable-length numpy lists, which cannot be converted into tensors (tensors must be rectangular). You can get the same error message by running

import numpy as np
import tensorflow as tf
a = np.array([[1, 2, 3], [4, 5]], dtype=object)
print(tf.convert_to_tensor(a))

This error appears to occur even in tensorflow 2.0.0-beta1, so it doesn't look like a regression.

To make this work, you need to pad the dataframe's lists so that they are the same length.
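
A minimal sketch of that padding step, in plain Python. The helper name pad_rows and the choice of 0 as the padding value are assumptions for illustration; pick a padding value that cannot be confused with real data.

```python
def pad_rows(rows, pad_value=0):
    """Right-pad every row with pad_value so all rows share one length."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [pad_value] * (width - len(row)) for row in rows]


ragged = [[1, 2, 3], [4, 5]]
print(pad_rows(ragged))  # [[1, 2, 3], [4, 5, 0]]
```

Once every row has the same length, the nested list is rectangular and can be converted into a tensor.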

NiBurhe

comment created time in a month

issue comment tensorflow/tensorflow

Problem with read and get batch from 2d array tfrecords dataset

@kaen2891, tf.train.Example.FromString parses a single example, not a batch of examples. See https://www.tensorflow.org/tutorials/load_data/tfrecord#reading_a_tfrecord_file for how to read TFRecord files with tf.data.

kaen2891

comment created time in a month

issue comment tensorflow/tensorflow

Buggy behaviour of dataset API

Hi @csxeba,

This is working as intended. Datasets can be much larger than the memory of a single machine, so Dataset objects act like blueprints for how to produce data (instead of trying to hold the entire dataset at once). Datasets provide a streaming API for consuming data through an iterator. If you want each iterator created on a shuffled dataset to produce elements in the same order, use the reshuffle_each_iteration argument to Dataset.shuffle:

index = index.shuffle(buffer_size=len(self.index), reshuffle_each_iteration=False)
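
To illustrate the blueprint idea, here is a toy model in plain Python. It is not how tf.data implements shuffling: the class ToyShuffledDataset is hypothetical and only mimics the visible behavior, where each new iterator either draws a fresh order or reuses the first one.

```python
import random


class ToyShuffledDataset:
    """Toy stand-in for a shuffled dataset; not TensorFlow code."""

    def __init__(self, elements, seed=None, reshuffle_each_iteration=True):
        self._elements = list(elements)
        self._rng = random.Random(seed)
        self._reshuffle = reshuffle_each_iteration
        self._fixed_order = None

    def __iter__(self):
        # Each call builds a new iterator from the blueprint.
        if self._reshuffle or self._fixed_order is None:
            order = self._elements[:]
            self._rng.shuffle(order)
            if not self._reshuffle:
                self._fixed_order = order  # remember the first order
        else:
            order = self._fixed_order
        return iter(order)


dataset = ToyShuffledDataset(range(10), seed=0, reshuffle_each_iteration=False)
first_pass = list(dataset)
second_pass = list(dataset)
print(first_pass == second_pass)  # True
```

With reshuffle_each_iteration left at True, the two passes in this toy would generally come out in different orders.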
csxeba

comment created time in a month

pull request comment tensorflow/community

Add C++ style guide.

@martinwicke PTAL

aaudiber

comment created time in a month

pull request comment tensorflow/community

Add C++ style guide.

Would it help to remove the explanations of reasoning and simply state "Tensorflow does not use StatusOr"?

aaudiber

comment created time in a month

PR opened tensorflow/community

Add C++ style guide.

This guide is a starting point for documenting Tensorflow C++ style, especially where it differs from general Google C++ style.

The goal of this PR is to reflect style rules as they are today, so that we can be on the same page about what the style rules are, and so that we have a basis from which to discuss potential changes to the style rules. This PR itself is not the right place to discuss changing the style rules (though if the content here does not reflect the current reality, by all means please comment!).

+64 -0

0 comment

1 changed file

pr created time in a month

create branch aaudiber/community

branch : cpp-style

created branch time in a month

pull request comment tensorflow/tensorflow

update dataset_ops.py

@NeerajBhadani Sorry I didn't notice this at first: multi-line doctest statements require "..." on each continuation line. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/data/ops/dataset_ops.py#L612-L614 for an example.
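
A small self-contained illustration of the continuation rule, using only the standard doctest module. The function total and the helper run_docstring are hypothetical names for this sketch and are unrelated to dataset_ops.py.

```python
import doctest


def total(values):
    """Return the sum of values.

    The statement below spans three lines, so the second and third
    lines start with "..." instead of ">>>".

    >>> total([1,
    ...        2,
    ...        3])
    6
    """
    return sum(values)


def run_docstring(func):
    """Run the doctest examples in func's docstring and return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


results = run_docstring(total)
print(results.failed)  # 0
```

Without the "..." prefixes, doctest treats the later lines as expected output rather than as part of the statement, so the example would fail.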

NeerajBhadani

comment created time in a month

pull request comment tensorflow/tensorflow

update dataset_ops.py

@NeerajBhadani It appears that the line is now too long, can you shorten the example or wrap the line?

FAIL: Found 1 non-whitelisted pylint errors: tensorflow/python/data/ops/dataset_ops.py:1659: [C0301(line-too-long), ] Line too long (87/80)

NeerajBhadani

comment created time in 2 months

pull request comment tensorflow/tensorflow

Add support for any Tensor type describable by TensorSpec to tf.data.Dataset.from_generator

@lithuak The test has been updated, and I've confirmed that it now fails with this PR. It seems that the issue is in handling generators that produce tuples. Can you take a look?

lithuak

comment created time in 2 months

pull request comment tensorflow/tensorflow

Add support for attr classes in Dataset

@AdrienCorenflos you can find the detailed error message under the "Target Log" tab. It looks like the cause of the failures is "ModuleNotFoundError: No module named 'attr'". This is because the attr module may not always be available (TF is still in the process of dropping python2 within Google). You can fix this issue by guarding your attr imports with

try:
  import attr  # pylint:disable=g-import-not-at-top
except ImportError:
  attr = None

and then checking whether attr is None before using it. See function_test.py for an example of doing this: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/eager/function_test.py
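
A minimal sketch of the "check before use" step, assuming a hypothetical helper named is_attrs_instance. It relies only on attr.has from the attrs package and falls back to False when the package is not installed.

```python
try:
  import attr  # pylint:disable=g-import-not-at-top
except ImportError:
  attr = None


def is_attrs_instance(value):
  """Return True only if attr is importable and value is an attrs instance."""
  if attr is None:
    return False  # attrs is unavailable, so nothing can be an attrs class
  return attr.has(type(value))


print(is_attrs_instance(object()))  # False
```

Code paths that handle attr classes can call a guard like this first and skip the attr-specific branch when it returns False.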

AdrienCorenflos

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

Add support for attr classes in Dataset

 def testOptionalDatasetSpec(self):
         optional_ops.Optional.from_value(37.0),
         optional_ops.OptionalSpec(tensor_spec.TensorSpec([], dtypes.float32)))
 
+  @combinations.generate(test_base.default_test_combinations())
+  def testAttrClassDatasetSpec(self):
+
+    @attr.s
+    class AttrClass:
+      x = attr.ib()
+
+    self._testDatasetSpec(
+        AttrClass(x=constant_op.constant(0)),
+        AttrClass(x=tensor_spec.TensorSpec([], dtypes.int32)))
+
+

nit: extra newline

AdrienCorenflos

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

Add support for attr classes in Dataset

 def testObjectProxy(self):
     self.assertEqual(structure.type_spec_from_value(nt_type(1, 2)),
                      proxied_spec)
 
+  def testAttrClassProxy(self):
+    @attr.s
+    class AttrClass:
+        x = attr.ib()
+        y = attr.ib()
+
+    elem = AttrClass(x=constant_op.constant(1.), y=constant_op.constant(2.))
+    proxied_elem = wrapt.ObjectProxy(elem)
+    proxied_spec = structure.type_spec_from_value(proxied_elem)
+    self.assertEqual(structure.type_spec_from_value(attr.evolve(elem)),
+                     proxied_spec)
+
+

nit: extra newline

AdrienCorenflos

comment created time in 2 months

pull request comment tensorflow/tensorflow

Add support for any Tensor type describable by TensorSpec to tf.data.Dataset.from_generator

@lithuak I'm also unable to reproduce the test failure. I will communicate with the test writer to get a test that reproduces the issue.

lithuak

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

Add support for attr classes in Dataset

 def testCustomMapping(self):
     self.assertIsInstance(spec, CustomMap)
     self.assertEqual(spec["foo"], tensor_spec.TensorSpec([], dtypes.float32))
 
+  def testAttrClass(self):
+    @attr.s
+    class AttrClass:
+        x = attr.ib()
+        y = attr.ib()
+
+    elem = AttrClass(x=constant_op.constant(1.), y=constant_op.constant(2.))
+    spec = structure.type_spec_from_value(elem)
+    self.assertIsInstance(spec, AttrClass)
+    self.assertEqual(spec.x, tensor_spec.TensorSpec([], dtypes.float32))
+

Please add a test for an ObjectProxy of an attr.s class, similar to the test for ObjectProxy of namedtuple below.

AdrienCorenflos

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

Add support for attr classes in Dataset

 def testCustomMapping(self):
     self.assertIsInstance(spec, CustomMap)
     self.assertEqual(spec["foo"], tensor_spec.TensorSpec([], dtypes.float32))
 
+  def testAttrClass(self):
+    @attr.s
+    class AttrClass:
+        x = attr.ib()
+        y = attr.ib()
+
+    elem = AttrClass(x=constant_op.constant(1.), y=constant_op.constant(2.))
+    spec = structure.type_spec_from_value(elem)
+    self.assertIsInstance(spec, AttrClass)
+    self.assertEqual(spec.x, tensor_spec.TensorSpec([], dtypes.float32))

Also check spec.y

AdrienCorenflos

comment created time in 2 months

issue comment tensorflow/tensorflow

[tf.data] _pywrap_server_lib.so breaks nightly packages

Thank you @byronyi for investigating this. I just put out a CL to move the usage of tf_python_pybind_extension under //tensorflow/python. I will validate the fix and merge it as soon as possible.

byronyi

comment created time in 2 months

pull request comment tensorflow/tensorflow

Add Dataset_Ops C++ API for building dataflow graph

I appreciate the idea of switching to the C++ API, since it definitely gives some nice benefits, but since there are also significant downsides, I agree we should close this PR.

feihugis

comment created time in 2 months

pull request comment tensorflow/tensorflow

Add Dataset_Ops C++ API for building dataflow graph

@feihugis I don't know of any plans to make the C++ API compatible across versions - right now only the C API is covered (https://www.tensorflow.org/guide/versions#what_is_covered).

I believe the reason for setting Dataset API op defs to HIDDEN is that we don't want users to depend on the op def API, and instead stick to the Python API. In general the quality of our Python API documentation is higher than op documentation, since there is very little interest in the op-level documentation.

feihugis

comment created time in 2 months

pull request comment tensorflow/tensorflow

Add Dataset_Ops C++ API for building dataflow graph

@feihugis Here are the pros/cons I see for using a GraphDef dynamically generated using the C++ API vs using a static GraphDef generated using the Python API:

Pros for C++ API:

  • Dynamically generated GraphDefs are always at the latest version. Static graphdefs could become outdated in a new major version.
  • It's easier to make small changes to a dynamically generated GraphDef.

Cons for C++ API:

  • Internal C++ API is not guaranteed to stay the same, so the test could break due to C++ API changes.
  • Writing tests with C++ API requires understanding many internal details of tf.data, such as the presence of the "RetVal" node, how to build the function library, how to set the right attrs for each node.
  • Readers of the test are more familiar with Python API than C++ API.

Tensorflow guarantees that GraphDefs generated using non-experimental APIs in TF 2.x can be executed through all of TF 2.x as well as 3.x. This mitigates the concern that a static file will become outdated. Given this, I would prefer to stick to the public Python API for generating test GraphDefs, so that we don't need to update the test when the less-stable C++ API changes.

feihugis

comment created time in 2 months

issue comment tensorflow/tensorflow

Tensorflow 2.1 Error “when finalizing GeneratorDataset iterator” - a memory leak?

@Tuxius Is it possible to reproduce the issue using fake data? If you can provide a minimal, self-contained repro, that will help a lot in finding the root cause.

Tuxius

comment created time in 3 months

Pull request review comment tensorflow/tensorflow

Refactor DirectedInterleaveDatasetOp

+/* Copyright 2020 The TensorFlow Authors. All Rights Reserved.+Licensed under the Apache License, Version 2.0 (the "License");+you may not use this file except in compliance with the License.+You may obtain a copy of the License at+    http://www.apache.org/licenses/LICENSE-2.0+Unless required by applicable law or agreed to in writing, software+distributed under the License is distributed on an "AS IS" BASIS,+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.+See the License for the specific language governing permissions and+limitations under the License.+==============================================================================*/+#include "tensorflow/core/kernels/data/experimental/directed_interleave_dataset_op.h"++#include "tensorflow/core/kernels/data/dataset_test_base.h"++namespace tensorflow {+namespace data {+namespace experimental {+namespace {++constexpr char kNodeName[] = "directed_interleave_dataset";++class DirectedInterleaveDatasetParams : public DatasetParams {+ public:+  template <typename S, typename T>+  DirectedInterleaveDatasetParams(S selector_input_dataset_params,+                                  std::vector<T> input_dataset_params_vec,+                                  DataTypeVector output_dtypes,+                                  std::vector<PartialTensorShape> output_shapes,+                                  int num_input_datasets, string node_name)+      : DatasetParams(std::move(output_dtypes), std::move(output_shapes),+                      std::move(node_name)),+        num_input_datasets_(num_input_datasets) {+    input_dataset_params_.push_back(+        absl::make_unique<S>(selector_input_dataset_params));+    for (auto input_dataset_params : input_dataset_params_vec) {+      input_dataset_params_.push_back(+          absl::make_unique<T>(input_dataset_params));+    }++    iterator_prefix_ = name_utils::IteratorPrefix(+        input_dataset_params_vec[0].dataset_type(),+        
input_dataset_params_vec[0].iterator_prefix());+  }++  std::vector<Tensor> GetInputTensors() const override { return {}; }++  Status GetInputNames(std::vector<string>* input_names) const override {+    input_names->clear();+    input_names->emplace_back(+        DirectedInterleaveDatasetOp::kSelectorInputDataset);+    for (int i = 0; i < num_input_datasets_; ++i) {+      input_names->emplace_back(absl::StrCat(+          DirectedInterleaveDatasetOp::kDataInputDatasets, "_", i));+    }+    return Status::OK();+  }++  Status GetAttributes(AttributeVector* attr_vector) const override {+    attr_vector->clear();+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kOutputTypes,+                              output_dtypes_);+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kOutputShapes,+                              output_shapes_);+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kNumDatasets,+                              num_input_datasets_);+    return Status::OK();+  }++  string dataset_type() const override {+    return DirectedInterleaveDatasetOp::kDatasetType;+  }++ private:+  int32 num_input_datasets_;+};++class DirectedInterleaveDatasetOpTest : public DatasetOpsTestBase {};++// Test case 1: normal case+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams1() {+  auto selector_input_dataset_params = TensorSliceDatasetParams(+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},+      /*node_name=*/"tensor_slice");+  return DirectedInterleaveDatasetParams(+      selector_input_dataset_params,+      /*input_dataset_params_vec=*/+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),+                                      RangeDatasetParams(10, 13, 1)},+      /*output_dtypes=*/{DT_INT64, DT_INT64},+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},+      /*num_input_datasets=*/2,+      /*node_name=*/kNodeName);+}++// Test case 2: select an exhausted 
input+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams2() {+  auto selector_input_dataset_params = TensorSliceDatasetParams(+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},+      /*node_name=*/"tensor_slice");+  return DirectedInterleaveDatasetParams(+      selector_input_dataset_params,+      /*input_dataset_params_vec=*/+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 2, 1),+                                      RangeDatasetParams(10, 13, 1)},+      /*output_dtypes=*/{DT_INT64, DT_INT64},+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},+      /*num_input_datasets=*/2,+      /*node_name=*/kNodeName);+}++DirectedInterleaveDatasetParams InvalidSelectorOuputDataType() {+  auto selector_input_dataset_params = TensorSliceDatasetParams(+      /*components=*/{CreateTensor<int32>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},+      /*node_name=*/"tensor_slice");+  return DirectedInterleaveDatasetParams(+      selector_input_dataset_params,+      /*input_dataset_params_vec=*/+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),+                                      RangeDatasetParams(10, 13, 1)},+      /*output_dtypes=*/{DT_INT64, DT_INT64},+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},+      /*num_input_datasets=*/2,+      /*node_name=*/kNodeName);+}++DirectedInterleaveDatasetParams InvalidSelectorOuputShape() {+  auto selector_input_dataset_params = TensorSliceDatasetParams(+      /*components=*/{CreateTensor<int64>(TensorShape{6, 1},+                                          {0, 1, 0, 1, 0, 1})},+      /*node_name=*/"tensor_slice");+  return DirectedInterleaveDatasetParams(+      selector_input_dataset_params,+      /*input_dataset_params_vec=*/+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),+                                      RangeDatasetParams(10, 13, 1)},+      /*output_dtypes=*/{DT_INT64, DT_INT64},+      
/*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},+      /*num_input_datasets=*/2,+      /*node_name=*/kNodeName);+}++DirectedInterleaveDatasetParams InvalidSelectorValues() {+  auto selector_input_dataset_params = TensorSliceDatasetParams(+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {2, 1, 0, 1, 0, 1})},+      /*node_name=*/"tensor_slice");+  return DirectedInterleaveDatasetParams(+      selector_input_dataset_params,+      /*input_dataset_params_vec=*/+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),+                                      RangeDatasetParams(10, 13, 1)},+      /*output_dtypes=*/{DT_INT64, DT_INT64},+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},+      /*num_input_datasets=*/2,+      /*node_name=*/kNodeName);+}++DirectedInterleaveDatasetParams InvalidInputDatasetsDataType() {+  auto selector_input_dataset_params = TensorSliceDatasetParams(+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},+      /*node_name=*/"tensor_slice");+  return DirectedInterleaveDatasetParams(+      selector_input_dataset_params,+      /*input_dataset_params_vec=*/+      std::vector<RangeDatasetParams>{+          RangeDatasetParams(0, 3, 1, {DT_INT32}),+          RangeDatasetParams(10, 13, 1, {DT_INT64})},+      /*output_dtypes=*/{DT_INT64, DT_INT64},+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},+      /*num_input_datasets=*/2,+      /*node_name=*/kNodeName);+}++std::vector<GetNextTestCase<DirectedInterleaveDatasetParams>>+GetNextTestCases() {+  return {{/*dataset_params=*/DirectedInterleaveDatasetParams1(),+           /*expected_outputs=*/{CreateTensors<int64>(+               TensorShape({}), {{0}, {10}, {1}, {11}, {2}, {12}})}},+          {/*dataset_params=*/DirectedInterleaveDatasetParams2(),+           /*expected_outputs=*/{CreateTensors<int64>(+               TensorShape({}), {{0}, {10}, {1}, {11}, {12}})}}};

Exactly, basically a mismatch between those two parameters

feihugis

comment created time in 3 months

Pull request review comment tensorflow/tensorflow

Refactor DirectedInterleaveDatasetOp

+/* Copyright 2020 The TensorFlow Authors. All Rights Reserved.+Licensed under the Apache License, Version 2.0 (the "License");+you may not use this file except in compliance with the License.+You may obtain a copy of the License at+    http://www.apache.org/licenses/LICENSE-2.0+Unless required by applicable law or agreed to in writing, software+distributed under the License is distributed on an "AS IS" BASIS,+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.+See the License for the specific language governing permissions and+limitations under the License.+==============================================================================*/+#include "tensorflow/core/kernels/data/experimental/directed_interleave_dataset_op.h"++#include "tensorflow/core/kernels/data/dataset_test_base.h"++namespace tensorflow {+namespace data {+namespace experimental {+namespace {++constexpr char kNodeName[] = "directed_interleave_dataset";++class DirectedInterleaveDatasetParams : public DatasetParams {+ public:+  template <typename S, typename T>+  DirectedInterleaveDatasetParams(S selector_input_dataset_params,+                                  std::vector<T> input_dataset_params_vec,+                                  DataTypeVector output_dtypes,+                                  std::vector<PartialTensorShape> output_shapes,+                                  int num_input_datasets, string node_name)+      : DatasetParams(std::move(output_dtypes), std::move(output_shapes),+                      std::move(node_name)),+        num_input_datasets_(num_input_datasets) {+    input_dataset_params_.push_back(+        absl::make_unique<S>(selector_input_dataset_params));+    for (auto input_dataset_params : input_dataset_params_vec) {+      input_dataset_params_.push_back(+          absl::make_unique<T>(input_dataset_params));+    }++    iterator_prefix_ = name_utils::IteratorPrefix(+        input_dataset_params_vec[0].dataset_type(),+        
input_dataset_params_vec[0].iterator_prefix());
+  }
+
+  std::vector<Tensor> GetInputTensors() const override { return {}; }
+
+  Status GetInputNames(std::vector<string>* input_names) const override {
+    input_names->clear();
+    input_names->emplace_back(
+        DirectedInterleaveDatasetOp::kSelectorInputDataset);
+    for (int i = 0; i < num_input_datasets_; ++i) {
+      input_names->emplace_back(absl::StrCat(
+          DirectedInterleaveDatasetOp::kDataInputDatasets, "_", i));
+    }
+    return Status::OK();
+  }
+
+  Status GetAttributes(AttributeVector* attr_vector) const override {
+    attr_vector->clear();
+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kOutputTypes,
+                              output_dtypes_);
+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kOutputShapes,
+                              output_shapes_);
+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kNumDatasets,
+                              num_input_datasets_);
+    return Status::OK();
+  }
+
+  string dataset_type() const override {
+    return DirectedInterleaveDatasetOp::kDatasetType;
+  }
+
+ private:
+  int32 num_input_datasets_;
+};
+
+class DirectedInterleaveDatasetOpTest : public DatasetOpsTestBase {};
+
+// Test case 1: normal case
+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams1() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+// Test case 2: select an exhausted input
+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams2() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 2, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidSelectorOuputDataType() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int32>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidSelectorOuputShape() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6, 1},
+                                          {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidSelectorValues() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {2, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidInputDatasetsDataType() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{
+          RangeDatasetParams(0, 3, 1, {DT_INT32}),
+          RangeDatasetParams(10, 13, 1, {DT_INT64})},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+std::vector<GetNextTestCase<DirectedInterleaveDatasetParams>>
+GetNextTestCases() {
+  return {{/*dataset_params=*/DirectedInterleaveDatasetParams1(),
+           /*expected_outputs=*/{CreateTensors<int64>(
+               TensorShape({}), {{0}, {10}, {1}, {11}, {2}, {12}})}},
+          {/*dataset_params=*/DirectedInterleaveDatasetParams2(),
+           /*expected_outputs=*/{CreateTensors<int64>(
+               TensorShape({}), {{0}, {10}, {1}, {11}, {12}})}}};

If it isn't too much trouble, we could verify that trying to use 0 data input datasets produces a reasonable error message, as opposed to e.g. segfaulting.

feihugis

comment created time in 3 months

Pull request review comment tensorflow/tensorflow

Refactor DirectedInterleaveDatasetOp

+/* Copyright 2020 The TensorFlow Authors. All Rights Reserved.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+    http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+#include "tensorflow/core/kernels/data/experimental/directed_interleave_dataset_op.h"
+
+#include "tensorflow/core/kernels/data/dataset_test_base.h"
+
+namespace tensorflow {
+namespace data {
+namespace experimental {
+namespace {
+
+constexpr char kNodeName[] = "directed_interleave_dataset";
+
+class DirectedInterleaveDatasetParams : public DatasetParams {
+ public:
+  template <typename S, typename T>
+  DirectedInterleaveDatasetParams(S selector_input_dataset_params,
+                                  std::vector<T> input_dataset_params_vec,
+                                  DataTypeVector output_dtypes,
+                                  std::vector<PartialTensorShape> output_shapes,
+                                  int num_input_datasets, string node_name)
+      : DatasetParams(std::move(output_dtypes), std::move(output_shapes),
+                      std::move(node_name)),
+        num_input_datasets_(num_input_datasets) {
+    input_dataset_params_.push_back(
+        absl::make_unique<S>(selector_input_dataset_params));
+    for (auto input_dataset_params : input_dataset_params_vec) {
+      input_dataset_params_.push_back(
+          absl::make_unique<T>(input_dataset_params));
+    }
+
+    iterator_prefix_ = name_utils::IteratorPrefix(
+        input_dataset_params_vec[0].dataset_type(),
+        input_dataset_params_vec[0].iterator_prefix());
+  }
+
+  std::vector<Tensor> GetInputTensors() const override { return {}; }
+
+  Status GetInputNames(std::vector<string>* input_names) const override {
+    input_names->clear();
+    input_names->emplace_back(
+        DirectedInterleaveDatasetOp::kSelectorInputDataset);
+    for (int i = 0; i < num_input_datasets_; ++i) {
+      input_names->emplace_back(absl::StrCat(
+          DirectedInterleaveDatasetOp::kDataInputDatasets, "_", i));
+    }
+    return Status::OK();
+  }
+
+  Status GetAttributes(AttributeVector* attr_vector) const override {
+    attr_vector->clear();
+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kOutputTypes,
+                              output_dtypes_);
+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kOutputShapes,
+                              output_shapes_);
+    attr_vector->emplace_back(DirectedInterleaveDatasetOp::kNumDatasets,
+                              num_input_datasets_);
+    return Status::OK();
+  }
+
+  string dataset_type() const override {
+    return DirectedInterleaveDatasetOp::kDatasetType;
+  }
+
+ private:
+  int32 num_input_datasets_;
+};
+
+class DirectedInterleaveDatasetOpTest : public DatasetOpsTestBase {};
+
+// Test case 1: normal case
+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams1() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+// Test case 2: select an exhausted input
+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams2() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 2, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidSelectorOuputDataType() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int32>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidSelectorOuputShape() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6, 1},
+                                          {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidSelectorValues() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {2, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{RangeDatasetParams(0, 3, 1),
+                                      RangeDatasetParams(10, 13, 1)},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+DirectedInterleaveDatasetParams InvalidInputDatasetsDataType() {
+  auto selector_input_dataset_params = TensorSliceDatasetParams(
+      /*components=*/{CreateTensor<int64>(TensorShape{6}, {0, 1, 0, 1, 0, 1})},
+      /*node_name=*/"tensor_slice");
+  return DirectedInterleaveDatasetParams(
+      selector_input_dataset_params,
+      /*input_dataset_params_vec=*/
+      std::vector<RangeDatasetParams>{
+          RangeDatasetParams(0, 3, 1, {DT_INT32}),
+          RangeDatasetParams(10, 13, 1, {DT_INT64})},
+      /*output_dtypes=*/{DT_INT64, DT_INT64},
+      /*output_shapes=*/{PartialTensorShape({}), PartialTensorShape({})},
+      /*num_input_datasets=*/2,
+      /*node_name=*/kNodeName);
+}
+
+std::vector<GetNextTestCase<DirectedInterleaveDatasetParams>>
+GetNextTestCases() {
+  return {{/*dataset_params=*/DirectedInterleaveDatasetParams1(),
+           /*expected_outputs=*/{CreateTensors<int64>(
+               TensorShape({}), {{0}, {10}, {1}, {11}, {2}, {12}})}},
+          {/*dataset_params=*/DirectedInterleaveDatasetParams2(),
+           /*expected_outputs=*/{CreateTensors<int64>(
+               TensorShape({}), {{0}, {10}, {1}, {11}, {12}})}}};

Some other cases worth covering:

  • Interleave with 0 data input datasets
  • Interleave with 1 data input dataset
  • Interleave where num_input_datasets is set too low or too high
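
To make the suggested cases concrete, here is a minimal pure-Python model of directed interleave. It is a sketch, not TensorFlow's implementation: the function name, the `num_input_datasets` argument, and the choice of `ValueError` are assumptions made for illustration. The skip-on-exhaustion rule is inferred from the expected outputs of the two GetNext test cases in the file under review.

```python
from typing import Iterable, List, Sequence


def directed_interleave(selector: Iterable[int],
                        data_inputs: Sequence[Iterable[int]],
                        num_input_datasets: int) -> List[int]:
    """Pure-Python model of directed interleave; not TensorFlow's implementation."""
    # Assumed validation: reject zero inputs and a mismatched declared count
    # up front, so the failure is an error message rather than a crash.
    if num_input_datasets < 1 or not data_inputs:
        raise ValueError("expected at least one data input dataset")
    if num_input_datasets != len(data_inputs):
        raise ValueError(
            f"num_input_datasets={num_input_datasets} but "
            f"{len(data_inputs)} data inputs were provided")

    iterators = [iter(d) for d in data_inputs]
    outputs: List[int] = []
    for index in selector:
        if not 0 <= index < len(iterators):
            raise ValueError(f"selector value {index} is out of range")
        try:
            outputs.append(next(iterators[index]))
        except StopIteration:
            # The selected input is exhausted: emit nothing for this step.
            continue
    return outputs


# The two GetNext cases from the test file under review.
print(directed_interleave([0, 1, 0, 1, 0, 1], [range(0, 3), range(10, 13)], 2))
print(directed_interleave([0, 1, 0, 1, 0, 1], [range(0, 2), range(10, 13)], 2))
# A single data input.
print(directed_interleave([0, 0, 0], [range(5, 7)], 1))
```

In this model, zero inputs and a too-low or too-high `num_input_datasets` all fail with a readable message. Whether the real op should reject them the same way is a question for the PR to answer.
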
feihugis

comment created time in 3 months

Pull request review comment tensorflow/tensorflow

Refactor DirectedInterleaveDatasetOp

+// Test case 1: normal case
+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams1() {

Maybe call this NormalParams or AlternateInputsParams?

feihugis

comment created time in 3 months

Pull request review comment tensorflow/tensorflow

Refactor DirectedInterleaveDatasetOp

+// Test case 2: select an exhausted input
+DirectedInterleaveDatasetParams DirectedInterleaveDatasetParams2() {

If we call this SelectExhaustedInputParams, the comment isn't needed.

feihugis

comment created time in 3 months

issue comment tensorflow/tensorflow

Hang on out of memory error

I don't think the issue is with ParallelMapIterator. It was moved between 2.0.0 and 2.1.0, but it has always waited for outstanding calls to finish during destruction: https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/core/kernels/data/parallel_map_iterator.cc#L70-L79

From the stack trace

#6 0x00007fff462e9cd4 in tensorflow::condition_variable::wait
#7 0x00007fff4114079c in tensorflow::data::InstantiatedCapturedFunction::RunWithBorrowedArgs
#8 0x00007fff40d88d3c in tensorflow::data::GeneratorDatasetOp::Dataset::Iterator::GetNextInternal

it looks like we're getting stuck here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/data/captured_function.cc#L717-L721. We call lib_->Run to invoke the Python function, and its completion callback is supposed to call Notify() when the Python function finishes, whether or not it succeeds. For some reason that callback never seems to fire. It isn't clear whether that's because the Python function itself never completes, or because lib_->Run fails to call Notify on some error-handling code path. If I could reproduce this, I would add logging to see what happens inside lib_->Run.
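
The invariant in question can be sketched in plain Python. This is an illustration of the pattern only, not TensorFlow's code: `run_async` and `run_and_wait` are hypothetical names, `run_async` stands in for an asynchronous runner such as lib_->Run, and the `MemoryError` is a simulated failure. The point is that the completion callback fires on every path, so the waiter cannot be stranded by an error.

```python
import threading
from typing import Callable, Optional


def run_async(fn: Callable[[], object],
              done: Callable[[Optional[BaseException]], None]) -> None:
    """Hypothetical async runner: calls `done` exactly once, on any outcome."""
    def worker() -> None:
        error: Optional[BaseException] = None
        try:
            fn()
        except Exception as e:
            # Record the failure so it reaches the waiter instead of vanishing.
            error = e
        finally:
            # Runs on success and on failure; skipping this on an error path
            # is the kind of bug that would leave the waiter blocked forever.
            done(error)

    threading.Thread(target=worker, daemon=True).start()


def run_and_wait(fn: Callable[[], object],
                 timeout_s: float = 5.0) -> Optional[BaseException]:
    """Blocks until the callback fires; the timeout only turns a hang into an error."""
    finished = threading.Event()
    result = {}

    def done(error: Optional[BaseException]) -> None:
        result["error"] = error
        finished.set()

    run_async(fn, done)
    if not finished.wait(timeout_s):
        raise TimeoutError("completion callback never fired")
    return result["error"]


def simulated_oom() -> None:
    raise MemoryError("simulated out-of-memory failure")


print(run_and_wait(lambda: None))
print(repr(run_and_wait(simulated_oom)))
```

Logging at the equivalent of the `except` and `finally` points is one way to tell the two hypotheses apart: if neither message appears, the function never returned; if only the first appears, the notification was skipped.
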

smatzek

comment created time in 3 months
