Error occurred when finalizing GeneratorDataset iterator

System information

  • OS Platform and Distribution: Arch Linux, 5.4.2-arch1-1-ARCH
  • TensorFlow installed from: binary
  • TensorFlow version: 2.1.0rc0-1
  • Keras version: 2.2.4-tf
  • Python version: 3.8
  • GPU model and memory: 2x GTX 1080 Ti 11GB

Describe the current behavior

Executing TensorFlow's MNIST handwriting example produces the error below. The error disappears if the code doesn't use OneDeviceStrategy or MirroredStrategy.

W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
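The `W` prefix marks this as a warning-level line from TensorFlow's C++ logging. A minimal sketch of hiding such lines, assuming the installed build honours the `TF_CPP_MIN_LOG_LEVEL` environment variable. This only silences the message and does nothing about whatever triggers it.

```python
import os

# Assumption: the variable has to be set before TensorFlow is imported.
# 0 = show everything, 1 = hide INFO, 2 = hide INFO and WARNING,
# 3 = hide INFO, WARNING and ERROR.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# import tensorflow as tf  # import only after the variable is set
```

The same value can be exported in the shell before launching the script. Note that level 2 also hides other warnings, such as the ptxas one, so it may be better kept for runs where the output is already understood.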

Code to reproduce the issue

import tensorflow as tf
import tensorflow_datasets as tfds
import time

from tensorflow.keras.optimizers import Adam

def build_model():
    filters = 48
    units = 24
    kernel_size = 7
    learning_rate = 1e-4
    model = tf.keras.Sequential([
      tf.keras.layers.Conv2D(filters=filters, kernel_size=(kernel_size, kernel_size), activation='relu', input_shape=(28, 28, 1)),
      tf.keras.layers.MaxPooling2D(),
      tf.keras.layers.Flatten(),
      tf.keras.layers.Dense(units, activation='relu'),
      tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam(learning_rate), metrics=['accuracy'])
    return model

datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True)
mnist_train, mnist_test = datasets['train'], datasets['test']

num_train_examples = info.splits['train'].num_examples
num_test_examples = info.splits['test'].num_examples

strategy = tf.distribute.OneDeviceStrategy(device='/gpu:0')

BUFFER_SIZE = 10000
BATCH_SIZE = 32

def scale(image, label):
  image = tf.cast(image, tf.float32)
  image /= 255
  return image, label

train_dataset = mnist_train.map(scale).shuffle(BUFFER_SIZE).repeat().batch(BATCH_SIZE).prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
eval_dataset = mnist_test.map(scale).repeat().batch(BATCH_SIZE).prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

with strategy.scope():
  model = build_model()

epochs=5
start = time.perf_counter()
model.fit(
        train_dataset,
        validation_data=eval_dataset,
        steps_per_epoch=num_train_examples/epochs,
        validation_steps=num_test_examples/epochs,
        epochs=epochs)
elapsed = time.perf_counter() - start
print('elapsed: {:0.3f}'.format(elapsed))
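A side note on the step counts: the script divides the example counts by `epochs` using `/`, which gives a float in Python 3. A minimal sketch of deriving integer step counts from the batch size instead, assuming the standard MNIST split sizes (60000 train, 10000 test). `steps_for` is a hypothetical helper, and nothing here is claimed to affect the warning.

```python
# Assumed values: MNIST split sizes and the BATCH_SIZE from the script above.
NUM_TRAIN_EXAMPLES = 60000
NUM_TEST_EXAMPLES = 10000
BATCH_SIZE = 32


def steps_for(num_examples: int, batch_size: int) -> int:
    """Hypothetical helper: number of full batches in one pass over the data."""
    return num_examples // batch_size


steps_per_epoch = steps_for(NUM_TRAIN_EXAMPLES, BATCH_SIZE)
validation_steps = steps_for(NUM_TEST_EXAMPLES, BATCH_SIZE)
print(steps_per_epoch, validation_steps)  # prints: 1875 312
```

These values could be passed as `steps_per_epoch` and `validation_steps` to `model.fit`, so that each epoch covers roughly one pass over the data.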
tensorflow/tensorflow

Answer by olk

I've downgraded my system:

  • Python 3.7.4
  • Tensorflow-2.1.0-rc1

Still facing the error:

Train for 30000.0 steps, validate for 5000.0 steps
Epoch 1/2
2019-12-17 19:21:54.361240: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2019-12-17 19:21:55.824790: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-17 19:21:56.980785: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.
30000/30000 [==============================] - 115s 4ms/step - loss: 0.0856 - accuracy: 0.9761 - val_loss: 0.0376 - val_accuracy: 0.9879
Epoch 2/2
29990/30000 [============================>.] - ETA: 0s - loss: 0.0152 - accuracy: 0.9958
2019-12-17 19:25:28.372294: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
30000/30000 [==============================] - 111s 4ms/step - loss: 0.0152 - accuracy: 0.9958 - val_loss: 0.0375 - val_accuracy: 0.9889
2019-12-17 19:25:40.010887: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
2019-12-17 19:25:40.031138: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled
elapsed: 226.391

Seems to be related to tensorflow-2.1.0-rc1.
