Ask questionsProblem with read and get batch from 2d array tfrecords dataset

URL(s) with the issue:

Description of issue (what needs changing):

Problem with read and get batch from 2d array tfrecords dataset

Clear description

Hello. I use Tensorflow 2.0 version. I have some problems with reading Tfrecords file when get batch.

First, this is my file.

import tensorflow as tf
import os
from glob import glob
import numpy as np

def serialize_example(batch, list1, list2):
    filename = "./train_set.tfrecords"
    writer =

    for i in range(batch):
        feature = {}
        feature1 = np.load(list1[i])
        feature2 = np.load(list2[i])
        print('feature1 shape {} feature2 shape {}'.format(feature1.shape, feature2.shape)) 
        feature['input'] = tf.train.Feature(float_list=tf.train.FloatList(value=feature1.flatten()))
        feature['target'] = tf.train.Feature(float_list=tf.train.FloatList(value=feature2.flatten()))

        features = tf.train.Features(feature=feature)
        example = tf.train.Example(features=features)
        serialized = example.SerializeToString()
        print("{}th input {} target {} finished".format(i, list1[i], list2[i]))

list_inp = sorted(glob('./input/2d_magnitude/*'))
list_tar = sorted(glob('./target/2d_magnitude/*'))

serialize_example(len(list_inp), list_inp, list_tar)

My input and target shapes are 2d array (Material of dataset is spectrogram). Therefore, my Tfrecords file includes two features likes [number_of_dataset, x, y]. About 100,000 dataset was successfully saved as Tfrecords file.

And I have problem when I read Tfrecords file to get batch. This is my code

import tensorflow as tf
import os
import numpy as np

shuffle_buffer_size = 50000
batch_size = 10
record_file = '/data2/dataset/tfrecords/train_set.tfrecords'

raw_dataset =
print('raw_dataset', raw_dataset) # ==> raw_dataset <TFRecordDatasetV2 shapes: (), types: tf.string>

raw_dataset = raw_dataset.repeat()
print('repeat', raw_dataset) # ==> repeat <RepeatDataset shapes: (), types: tf.string>

raw_dataset = raw_dataset.shuffle(shuffle_buffer_size)
print('shuffle', raw_dataset) # ==> shuffle <ShuffleDataset shapes: (), types: tf.string>

raw_dataset = raw_dataset.batch(batch_size, drop_remainder=True)
print('batch', raw_dataset) # ==> batch <BatchDataset shapes: (10,), types: tf.string>

raw_example = next(iter(raw_dataset)) 

parsed = tf.train.Example.FromString(raw_example.numpy()) # ==> RuntimeWarning: Unexpected end-group tag: Not all data was converted

print('parsed', parsed) # ==> ''

input = parsed.features.feature['input'].float_list.value
print('input', input) # ==> []
target = parsed.features.feature['target'].float_list.value
print('target', target) # ==> []

Here are results from code:

raw_dataset <TFRecordDatasetV2 shapes: (), types: tf.string>
repeat <RepeatDataset shapes: (), types: tf.string>
shuffle <ShuffleDataset shapes: (), types: tf.string>
batch <BatchDataset shapes: (10,), types: tf.string> RuntimeWarning: Unexpected end-group tag: Not all data was converted
  parsed = tf.train.Example.FromString(raw_example.numpy())
input []
target []

As a result, I wonder how I get the batch from Tfrecords file to train. RuntimeWarning: Unexpected end-group tag: Not all data was converted Could you give advice? Thank you very much.

Usage example


raw_dataset =

raw_dataset = raw_dataset.repeat()

raw_dataset = raw_dataset.shuffle(shuffle_buffer_size)

raw_dataset = raw_dataset.batch(batch_size, drop_remainder=True)

raw_example = next(iter(raw_dataset)) 

parsed = tf.train.Example.FromString(raw_example.numpy())

input = parsed.features.feature['input'].float_list.value
target = parsed.features.feature['target'].float_list.value

Answer questions aaudiber

@kaen2891, tf.train.Example.FromString parses a single example, not a batch of examples. See for how to read TFRecord files with


Related questions

ModuleNotFoundError: No module named 'tensorflow.contrib' hot 8
Error occurred when finalizing GeneratorDataset iterator hot 6
ModuleNotFoundError: No module named 'tensorflow.contrib'
When importing TensorFlow, error loading Hadoop
tf.keras.layers.Conv1DTranspose ?
tensorflow-gpu CUPTI errors hot 4
[TF 2.0] tf.keras.optimizers.Adam hot 4
Lossy conversion from float32 to uint8. Range [0, 1]. Convert image to uint8 prior to saving to suppress this warning. hot 4
TF2.0 AutoGraph issue hot 4
Tf.Keras metrics issue hot 4
module 'tensorflow' has no attribute 'ConfigProto' hot 4
TF 2.0 'Tensor' object has no attribute 'numpy' while using .numpy() although eager execution enabled by default hot 4
ModuleNotFoundError: No module named 'tensorflow.examples.tutorials' hot 4
AttributeError: module &#39;tensorflow.python.framework.op_def_registry&#39; has no attribute &#39;register_op_list&#39; hot 4
tensorflow2.0 detected 'xla_gpu' , but 'gpu' expected hot 3
Github User Rank List