Ask questions

Image augmentation using tf.keras.preprocessing.image.ImageDataGenerator and tf.datasets: model.fit() is running infinitely

What I need help with / What I was wondering

I am facing an issue while running the fit() function in TensorFlow (v2.2.0-rc4) with augmented images (using ImageDataGenerator) passed as a dataset. The fit() function runs indefinitely without stopping.

What I've tried so far

I tried it with the default code shared in the TensorFlow documentation.

Please find the code snippet below:

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Input, Dense

flowers = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)

img_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, rotation_range=20)

images, labels = next(img_gen.flow_from_directory(flowers))

print(images.dtype, images.shape)
print(labels.dtype, labels.shape)

train_data_gen = img_gen.flow_from_directory(
    batch_size=32,
    directory=flowers,
    shuffle=True,
    target_size=(256, 256),
    class_mode='categorical')

ds = tf.data.Dataset.from_generator(
    lambda: train_data_gen,
    output_types=(tf.float32, tf.float32),
    output_shapes=([32, 256, 256, 3], [32, 5]))

ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

it = iter(ds)
batch = next(it)
print(batch)

def create_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=images[0].shape))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(5, activation='softmax'))
    return model

model = create_model()
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=["accuracy"])
model.fit(ds, verbose=1, batch_size=32, epochs=1)

The last line of code, the call to fit(), runs indefinitely without stopping. I also tried passing steps_per_epoch = total_no_of_train_records/batch_size.

It would be nice if...

I would like you to confirm whether this is a bug in the TensorFlow datasets package and, if so, in which release it will be fixed.

Environment information

  • System: Google Colaboratory
  • Python version: v3.6.9
  • TensorFlow version: v2.2.0-rc4
tensorflow/tensorflow

Answer questions tomerk

Two things, because it has also come to my attention that the ImageDataGenerator apparently isn't always supposed to loop forever.


  1. When you use tf.data from_generator and specify a shape, the generator must always yield objects of that shape. In your call you're specifying a fixed batch size as part of the shape:
ds = tf.data.Dataset.from_generator(
    lambda: train_data_gen,
    output_types=(tf.float32, tf.float32),
    output_shapes=([32, 256, 256, 3], [32, 5]))

When it sees a partial batch of 22 on the last step, the shapes do not match, so it raises an error. You can fix this by leaving the batch size unspecified (as None) in from_generator:

ds = tf.data.Dataset.from_generator(
    lambda: train_data_gen,
    output_types=(tf.float32, tf.float32),
    output_shapes=([None, 256, 256, 3], [None, 5]))
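To see where a partial batch of 22 can come from, here is a small arithmetic sketch in plain Python. It assumes the flower_photos archive holds 3670 images, which is not stated in this thread; batch_sizes is a hypothetical helper written only for this illustration.

```python
import math


def batch_sizes(num_samples, batch_size):
    """Return the size of every batch in one pass over num_samples items."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        sizes.append(remainder)  # the final, partial batch
    return sizes


# Assumption: 3670 images in total; the thread itself only mentions
# the partial batch of 22.
NUM_IMAGES = 3670
BATCH_SIZE = 32

sizes = batch_sizes(NUM_IMAGES, BATCH_SIZE)
steps_per_epoch = math.ceil(NUM_IMAGES / BATCH_SIZE)

# Number of batches, size of the last batch, steps needed for one epoch.
print(len(sizes), sizes[-1], steps_per_epoch)
```

Under that assumption, one pass has 114 full batches plus a final batch of 22, which is why a fixed leading dimension of 32 in output_shapes cannot match the last step.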

  2. Apparently there is a setting where the ImageDataGenerator isn't supposed to loop forever and shouldn't require steps_per_epoch: when you pass the result of flow_from_directory directly to Keras fit without converting it to a dataset yourself. In this specific setting, the len information attached to the ImageDataGenerator sequences has historically been used as an implicit steps_per_epoch.

(This does not happen if you manually loop over the generator with a for loop, as shown in the documentation of the ImageDataGenerator.) This isn't the case for your example code, because when you convert it to a dataset manually, that cardinality information gets lost.
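The loss of length information can be illustrated without TensorFlow. The sketch below is only an analogy: LoopingBatches is a hypothetical stand-in for an iterator that knows its length but never stops, and wrap_as_plain_generator mimics handing it to a consumer through a generator, which hides that length. It makes no claim about Keras internals.

```python
import itertools
import math


class LoopingBatches:
    """Hypothetical stand-in: has a length, but iterating never ends."""

    def __init__(self, num_samples, batch_size):
        self.num_samples = num_samples
        self.batch_size = batch_size
        self._position = 0

    def __len__(self):
        # Number of batches in one pass over the data.
        return math.ceil(self.num_samples / self.batch_size)

    def __iter__(self):
        return self

    def __next__(self):
        start = self._position * self.batch_size
        stop = min(start + self.batch_size, self.num_samples)
        # Wrap around to the first batch instead of raising StopIteration.
        self._position = (self._position + 1) % len(self)
        return list(range(start, stop))


def wrap_as_plain_generator(batches):
    """Re-yield batches; the resulting generator object has no len()."""
    for batch in batches:
        yield batch


batches = LoopingBatches(num_samples=70, batch_size=32)
steps_per_epoch = len(batches)  # read the length before it is hidden

wrapped = wrap_as_plain_generator(batches)

# Without an explicit bound this would never finish; islice stops it
# after one epoch's worth of batches.
one_epoch = list(itertools.islice(wrapped, steps_per_epoch))
print([len(batch) for batch in one_epoch])
```

In the code from the question, the analogous bound would be an explicit steps_per_epoch passed to fit(), computed from the generator's length before wrapping it in a dataset.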

However, it seems we did introduce a regression in that setting at some point (though we're not positive which exact version of TF it was introduced in). It did make its way into the TF 2.0 release, and we're currently exploring whether to do a patch release to fix it.
