tf.data.Dataset doesn't handle namedtuples properly

System information

• Have I written custom code (as opposed to using a stock example script provided in TensorFlow): true
• TensorFlow installed from (source or binary): Colab notebook
• TensorFlow version (use command below): v2.2.0-rc4-0-g70087ab4f4
• Python version: 3.6.9

Describe the current behavior: Handling of namedtuples by datasets is inconsistent: `element_spec` does not reflect the input type.

Describe the expected behavior: The `element_spec` should reflect the structured input.

Standalone code to reproduce the issue: https://colab.research.google.com/drive/1w3vHiI2mItjL1uv8c8RzMEqATPI9v7u3?usp=sharing


@AdrienCorenflos As far as I can tell, `element_spec` does work correctly for named tuples:

``````
import tensorflow as tf
import collections

Point = collections.namedtuple('Point', ['x', 'y'])
dataset = tf.data.Dataset.from_tensor_slices(Point([1, 2, 3], [4, 5, 6]))

print(dataset.element_spec)
``````
``````
Point(x=TensorSpec(shape=(), dtype=tf.int32, name=None), y=TensorSpec(shape=(), dtype=tf.int32, name=None))
``````

One easy mistake to make is passing a list of tuples or named tuples to `from_tensor_slices`. `from_tensor_slices` expects its input to be a structure of tensors, and will coerce Python `list`s (along with their contents) into tensors. For example, `[(1, 2), (3, 4)]` is seen as a 2-D tensor of integers, equivalent to `[[1, 2], [3, 4]]`, and the same applies to `[Point(1, 2), Point(3, 4)]`. So calling `tf.data.Dataset.from_tensor_slices([Point(1, 2), Point(3, 4)])` can make it look like named tuples aren't being respected, when in fact the whole list was converted to a single tensor before slicing.
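
To make the difference concrete, here is a minimal sketch that contrasts the two call patterns. The names `points`, `flat`, `columns`, and `structured` are only illustrative, and the `zip`-based transpose is just one possible workaround, not something the library provides or prescribes:

```python
import collections

import tensorflow as tf

Point = collections.namedtuple('Point', ['x', 'y'])

# Hypothetical input: one named tuple per element.
points = [Point(1, 2), Point(3, 4)]

# The list is coerced into a single 2-D tensor, so each dataset element
# is a plain tensor of shape (2,) and the field names are lost.
flat = tf.data.Dataset.from_tensor_slices(points)
print(flat.element_spec)

# Transpose the list into one Point whose fields each hold a column
# of values: Point(x=[1, 3], y=[2, 4]).
columns = Point(*(list(column) for column in zip(*points)))

# A named tuple of lists is treated as a structure of tensors, so each
# dataset element is a Point of scalars.
structured = tf.data.Dataset.from_tensor_slices(columns)
print(structured.element_spec)
```

The first spec should be a single `TensorSpec`, while the second should be a `Point` of `TensorSpec`s, matching the output shown earlier in this reply.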

I think this behavior is pretty unintuitive (it looked like a bug at first to me too). However, we can't change the behavior without breaking backwards compatibility, so I think the action item here is to improve the documentation to make it clear that the input is treated as a structure of Tensors, not a list of dataset elements.
