tf.data.Dataset doesn't handle namedtuples properly

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • TensorFlow installed from (source or binary): Colab notebook
  • TensorFlow version (use command below): v2.2.0-rc4-0-g70087ab4f4
  • Python version: 3.6.9

Describe the current behavior

Datasets handle namedtuples inconsistently: element_spec does not reflect the input type.

Describe the expected behavior

element_spec should reflect the structured input.

Standalone code to reproduce the issue

https://colab.research.google.com/drive/1w3vHiI2mItjL1uv8c8RzMEqATPI9v7u3?usp=sharing

tensorflow/tensorflow

Answer from aaudiber

@AdrienCorenflos As far as I can tell, element_spec does work correctly for named tuples:

import tensorflow as tf
import collections

Point = collections.namedtuple('Point', ['x', 'y'])
dataset = tf.data.Dataset.from_tensor_slices(Point([1, 2, 3], [4, 5, 6]))

print(dataset.element_spec)
# Output:
# Point(x=TensorSpec(shape=(), dtype=tf.int32, name=None), y=TensorSpec(shape=(), dtype=tf.int32, name=None))

One easy mistake is passing a list of tuples or named tuples to from_tensor_slices. from_tensor_slices expects its input to be a structure of tensors, and it coerces Python lists (along with their contents) into tensors. For example, [(1, 2), (3, 4)] is seen as a 2-D tensor of integers, equivalent to [[1, 2], [3, 4]], and the same applies to [Point(1, 2), Point(3, 4)]. So if you call tf.data.Dataset.from_tensor_slices([Point(1, 2), Point(3, 4)]), it can look as though named tuples aren't being respected: each dataset element is a plain tensor of shape (2,) rather than a Point.
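To make the difference concrete, here is a minimal sketch that contrasts the two input shapes. The variable names (points, as_list, columns, as_struct) are illustrative only, and the transposition with zip is just one possible way to turn a list of Point records into a single Point of columns; it is not something from_tensor_slices does for you.

```python
import collections

import tensorflow as tf

Point = collections.namedtuple("Point", ["x", "y"])

# A list of records, as one might naturally write it.
points = [Point(1, 2), Point(3, 4)]

# Passing the list directly: the whole list is coerced into one 2-D tensor,
# so every dataset element is a plain tensor and the Point type is lost.
as_list = tf.data.Dataset.from_tensor_slices(points)
print(as_list.element_spec)

# Transposing into a single Point whose fields are lists (one per column).
# The fields must be lists, not tuples: tuples would be treated as nested
# structure rather than converted to tensors.
columns = Point(*(list(col) for col in zip(*points)))
as_struct = tf.data.Dataset.from_tensor_slices(columns)
print(as_struct.element_spec)

# Each element of as_struct is a Point of scalar tensors.
for element in as_struct:
    print(element.x.numpy(), element.y.numpy())
```

With the first form, element_spec is a single TensorSpec of shape (2,); with the second, it is a Point of two scalar TensorSpecs, which matches the structured input the original report expected.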

I think this behavior is pretty unintuitive (it looked like a bug at first to me too). However, we can't change the behavior without breaking backwards compatibility, so I think the action item here is to improve the documentation to make it clear that the input is treated as a structure of Tensors, not a list of dataset elements.
