
issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

@renjie-liu Any update on this? Please let me know, and thanks.

jatkinson-CRL

comment created time in a month


issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

@renjie-liu There are numerous examples above that will produce a converted TFLite model exhibiting this problem. @Saduf2019 was nice enough to create a gist of one of these examples, which can be found here: https://colab.research.google.com/gist/Saduf2019/aedeb27168af0e2632d83292222044f5/untitled387.ipynb. If you really do need just a model file, I can save and upload a model converted by one of the above scripts; please let me know.
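(For reference, saving a converted model to a file is just a binary write, since convert() returns the flatbuffer as a bytes object. A minimal sketch, using an arbitrary stand-in model and filename rather than the exact models from the scripts above:)

```python
import tensorflow as tf

# A trivial Keras model just to have something to convert; any of the
# models from the scripts in this thread would work the same way.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# convert() returns the serialized flatbuffer as bytes, so saving is
# a plain binary write; 'model.tflite' is an arbitrary filename.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
```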

jatkinson-CRL

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

@renjie-liu Yes, this works properly with fixed input sizes in tf-nightly. The issue is that it does not support dynamic input sizes, the most important of which is the sequence length. Conv2DLstm supports these in regular TensorFlow. Furthermore, Conv2D supports dynamic input (width, height, channels) sizes in TFLite, so why doesn't Conv2DLstm? There should be no error with dynamic input shapes.
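(To make the comparison concrete, here is a minimal sketch of the Conv2D case working with dynamic spatial dimensions in TFLite — a trimmed-down version of the "working" model from the reproduction scripts, with illustrative sizes:)

```python
import numpy as np
import tensorflow as tf

# Conv2D model with dynamic height/width, mirroring the working model above.
inputs = tf.keras.Input(shape=(None, None, 1))
outputs = tf.keras.layers.Conv2D(3, 3, padding='same')(inputs)
model = tf.keras.Model(inputs, outputs)

tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()
interpreter = tf.lite.Interpreter(model_content=tflite_model)

# With plain Conv2D, resizing the input tensor at runtime is all that's
# needed; invoke() succeeds with no FILL error.
data = np.random.rand(1, 64, 32, 1).astype(np.float32)
interpreter.resize_tensor_input(interpreter.get_input_details()[0]['index'],
                                data.shape)
interpreter.allocate_tensors()
interpreter.set_tensor(interpreter.get_input_details()[0]['index'], data)
interpreter.invoke()
out = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])
print(out.shape)  # tracks the resized input
```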

jatkinson-CRL

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

@jvishnuvardhan Another thing I am noticing with tf-nightly: the output details for the Conv2DLstm model with a fixed input size are still wrong.

[{'name': 'Identity', 'index': 35, 'shape': array([1, 1, 1, 3], dtype=int32), 'shape_signature': array([-1, -1, -1,  3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

This indicates that the output shape is [1, 1, 1, 3]. This does not seem to affect the actual output size of the model: the output from inference is indeed the same shape as the input (besides the sequence length dimension being dropped, both are [1, 256, 128, 3] in these examples). In contrast, the output shape in the output details for the Conv2D model is correctly [1, 256, 128, 3]. Perhaps this is related to the error when using dynamic input sizes with Conv2DLstms?
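(For comparison, here is a sketch of how the reported and actual shapes can be cross-checked. With a fixed-size Conv2D model the two agree after allocate_tensors(); the layer sizes here are illustrative:)

```python
import numpy as np
import tensorflow as tf

# Fixed-size Conv2D model; after allocate_tensors() the shape reported in
# the output details should match what invoke() actually produces.
inputs = tf.keras.Input(shape=(256, 128, 3))
outputs = tf.keras.layers.Conv2D(3, 1, padding='same')(inputs)
model = tf.keras.Model(inputs, outputs)

interpreter = tf.lite.Interpreter(
    model_content=tf.lite.TFLiteConverter.from_keras_model(model).convert())
interpreter.allocate_tensors()
in_d = interpreter.get_input_details()[0]
out_d = interpreter.get_output_details()[0]

interpreter.set_tensor(in_d['index'],
                       np.random.rand(1, 256, 128, 3).astype(np.float32))
interpreter.invoke()
actual = interpreter.get_tensor(out_d['index'])

# For Conv2D these match; the Conv2DLstm case above instead reports
# [1, 1, 1, 3] while producing a full-size output.
print(tuple(out_d['shape']), actual.shape)
```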

jatkinson-CRL

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

@jvishnuvardhan Thanks for getting back to me so quickly! Very interesting; you are correct that this works with tf-nightly. I was able to reproduce that on my local machine with a nightly build from 5 days ago. However, support for dynamic input sizes is still highly desirable, and tf-nightly still fails with those.

I'm nervous about using tf-nightly in production for our systems. Is there a release planned any time soon that would contain at least this fixed size input inference functionality?

jatkinson-CRL

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

@jvishnuvardhan @Saduf2019 My team is trying to make some system design decisions that would be influenced by the resolution of this issue. We were hoping we could get a rough estimate of the timeline for resolving this. Do either of you have any idea what that looks like? Please let me know, and thank you!

jatkinson-CRL

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

I found another bug related to this. I decided to try using a fixed input size instead of dynamic sizes. This works as expected for Conv2D, but causes a segmentation fault for Conv2DLstms. Here is the associated code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# a function to convert the input tensorflow model to a tflite model and test the models
# with some test data
def convert_and_test(model, test_data):    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)

    # this is needed to convert tf.Slice, see https://github.com/tensorflow/tensorflow/issues/35590
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                           tf.lite.OpsSet.SELECT_TF_OPS]
    tflite_model = converter.convert()

    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    
    # print IO details
    print(input_details)
    print(output_details)
    
    # models have fixed input size, so no need to resize the input/output tensors
    interpreter.allocate_tensors()
    
    # Test regular model on data
    tf_results = model(test_data)
    
    # verify shape is correct
    print('Test data shape: ', test_data.shape)
    print('TF model output shape: ', tf_results.shape)
    
    # Test the model
    interpreter.set_tensor(input_details[0]['index'], test_data)
    
    try:
        interpreter.invoke()

        # get output if inference was successful
        tflite_results = interpreter.get_tensor(output_details[0]['index'])

        # verify shape is correct
        print('TF lite model output shape: ', tflite_results.shape)

        # Compare model results
        for tf_result, tflite_result in zip(tf_results, tflite_results):
            np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)

    except RuntimeError as err:
        print('Runtime Error Caught: %s' % err)


# define two helper functions for creating conv2d and conv2dlstms with some
# common features
def _conv_lstm(filters, kernel_size, dilation_rate, return_sequences):
    conv_layer = layers.ConvLSTM2D(filters=filters, 
                                    kernel_size=kernel_size,
                                    strides=(1, 1),
                                    padding='same',
                                    data_format='channels_last',
                                    dilation_rate=dilation_rate,
                                    activation='relu',
                                    recurrent_activation='hard_sigmoid',
                                    return_sequences=return_sequences)
    return conv_layer

def _conv(filters, kernel_size, dilation_rate):
    conv_layer = layers.Conv2D(filters=filters, 
                                    kernel_size=kernel_size,
                                    strides=(1, 1),
                                    padding='same',
                                    data_format='channels_last',
                                    dilation_rate=dilation_rate,
                                    activation='relu')
    return conv_layer

# not important for this example, some arbitrary value
num_classes = 3

# define a model that DOES work
batch_size = 1
width = 256
height = 128
channels = 1

# using fixed input sizes with regular Conv2D layers is successful
working_inputs = keras.Input(shape=(width, height, channels), name='image_sequence')
working_conv_1 = _conv(filters=8, kernel_size=(3, 3), dilation_rate=(1, 1))(working_inputs)
working_conv_bn_1 = layers.BatchNormalization()(working_conv_1)
working_outputs = layers.Conv2D(filters=num_classes, 
                        kernel_size=(1, 1),
                        strides=(1, 1),
                        padding='same',
                        data_format='channels_last',
                        dilation_rate=(1, 1),
                        activation=None)(working_conv_bn_1)
working_model = keras.Model(inputs=working_inputs, outputs=working_outputs)
working_model.summary(line_length=140)

# create some data to test model with
working_data = np.random.rand(batch_size, width, height, channels).astype(np.float32)

# show that the model invokes successfully
print('About to convert and test working fully convolutional model')
convert_and_test(working_model, working_data)

# define a model that DOES NOT work
# since we are using Conv2dLstms here, dimension 1 (0-based) is sequence length
seq_len = 3

# using fixed input sizes with Conv2DLstm layers results in a seg fault
failing_inputs = keras.Input(shape=(seq_len, width, height, channels), name='image_sequence')
failing_conv_lstm_1 = _conv_lstm(filters=8, kernel_size=(3, 3), dilation_rate=(1, 1), return_sequences=False)(failing_inputs)
failing_conv_lstm_bn_1 = layers.BatchNormalization()(failing_conv_lstm_1)
failing_outputs = layers.Conv2D(filters=num_classes, 
                                kernel_size=(1, 1),
                                strides=(1, 1),
                                padding='same',
                                data_format='channels_last',
                                dilation_rate=(1, 1),
                                activation=None)(failing_conv_lstm_bn_1)

failing_model = keras.Model(inputs=failing_inputs, outputs=failing_outputs)
failing_model.summary(line_length=140)

# create some data to test model with
failing_data = np.random.rand(batch_size, seq_len, width, height, channels).astype(np.float32)

# show that model fails to invoke
print('About to convert and test failing fully convolutional LSTM model')
convert_and_test(failing_model, failing_data)
jatkinson-CRL

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

After some research, I found an issue where the user manually resized output tensors as well as input tensors (https://github.com/tensorflow/tensorflow/issues/37012). I tried this, and it did fix the issue of the output tensor size not being automatically adjusted when the input tensor size changes, but unfortunately it does not fix the Runtime Error Caught: Fill dimensions must be >= 0Node number 5 (FILL) failed to invoke. thrown when invoke() is called. For completeness, I'm adding the updated test script with this change below:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# a function to convert the input tensorflow model to a tflite model and test the models
# with some test data
def convert_and_test(model, test_data, output_data_shape):    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)

    # this is needed to convert tf.Slice, see https://github.com/tensorflow/tensorflow/issues/35590
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                           tf.lite.OpsSet.SELECT_TF_OPS]
    tflite_model = converter.convert()

    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # since model has dynamic input shape, we need to reshape the input and output tensor each time that shape changes
    # output needs to be manually resized as well, see https://github.com/tensorflow/tensorflow/issues/37012
    interpreter.resize_tensor_input(input_details[0]['index'], test_data.shape)
    interpreter.resize_tensor_input(output_details[0]['index'], output_data_shape)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    
    # print IO details after resizing and tensor allocation
    print(input_details)
    print(output_details)

    # Test regular model on data
    tf_results = model(test_data)
    
    # verify shape is correct
    print('Test data shape: ', test_data.shape)
    print('TF model output shape: ', tf_results.shape)
    
    # Test the model
    interpreter.set_tensor(input_details[0]['index'], test_data)
    
    try:
        interpreter.invoke()

        # get output if inference was successful
        tflite_results = interpreter.get_tensor(output_details[0]['index'])

        # verify shape is correct
        print('TF lite model output shape: ', tflite_results.shape)

        # Compare model results
        for tf_result, tflite_result in zip(tf_results, tflite_results):
            np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)

    except RuntimeError as err:
        print('Runtime Error Caught: %s' % err)


# define two helper functions for creating conv2d and conv2dlstms with some
# common features
def _conv_lstm(filters, kernel_size, dilation_rate, return_sequences):
    conv_layer = layers.ConvLSTM2D(filters=filters, 
                                    kernel_size=kernel_size,
                                    strides=(1, 1),
                                    padding='same',
                                    data_format='channels_last',
                                    dilation_rate=dilation_rate,
                                    activation='relu',
                                    recurrent_activation='hard_sigmoid',
                                    return_sequences=return_sequences)
    return conv_layer

def _conv(filters, kernel_size, dilation_rate):
    conv_layer = layers.Conv2D(filters=filters, 
                                    kernel_size=kernel_size,
                                    strides=(1, 1),
                                    padding='same',
                                    data_format='channels_last',
                                    dilation_rate=dilation_rate,
                                    activation='relu')
    return conv_layer

# not important for this example, some arbitrary value
num_classes = 3

# define a model that DOES work
working_inputs = keras.Input(shape=(None, None, 1), name='image_sequence')
working_conv_1 = _conv(filters=8, kernel_size=(3, 3), dilation_rate=(1, 1))(working_inputs)
working_conv_bn_1 = layers.BatchNormalization()(working_conv_1)
working_outputs = layers.Conv2D(filters=num_classes, 
                        kernel_size=(1, 1),
                        strides=(1, 1),
                        padding='same',
                        data_format='channels_last',
                        dilation_rate=(1, 1),
                        activation=None)(working_conv_bn_1)
working_model = keras.Model(inputs=working_inputs, outputs=working_outputs)
working_model.summary(line_length=140)

# create some data to test model with
batch_size = 1
width = 256
height = 128
channels = 1
working_data = np.random.rand(batch_size, width, height, channels).astype(np.float32)
output_shape = (batch_size, width, height, num_classes)

# show that the model invokes successfully
print('About to convert and test working fully convolutional model')
convert_and_test(working_model, working_data, output_shape)

# define a model that DOES NOT work
failing_inputs = keras.Input(shape=(None, None, None, 1), name='image_sequence')
failing_conv_lstm_1 = _conv_lstm(filters=8, kernel_size=(3, 3), dilation_rate=(1, 1), return_sequences=False)(failing_inputs)
failing_conv_lstm_bn_1 = layers.BatchNormalization()(failing_conv_lstm_1)
failing_outputs = layers.Conv2D(filters=num_classes, 
                                kernel_size=(1, 1),
                                strides=(1, 1),
                                padding='same',
                                data_format='channels_last',
                                dilation_rate=(1, 1),
                                activation=None)(failing_conv_lstm_bn_1)

failing_model = keras.Model(inputs=failing_inputs, outputs=failing_outputs)
failing_model.summary(line_length=140)

# create some data to test model with
# since we are using Conv2dLstms here, dimension 1 (0-based) is sequence length
seq_len = 3
failing_data = np.random.rand(batch_size, seq_len, width, height, channels).astype(np.float32)
output_shape = (batch_size, width, height, num_classes)

# show that model fails to invoke
print('About to convert and test failing fully convolutional LSTM model')
convert_and_test(failing_model, failing_data, output_shape)
jatkinson-CRL

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

Please let me know if there's anything else I can do to help. I am trying to run a model with Conv2DLstms in a compute-constrained environment and would like to resolve this problem ASAP. Just wanted to reiterate this in case it gets lost in the noise of the message above.

jatkinson-CRL

comment created time in 2 months

issue openedtensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): tf-nightly-gpu: v1.12.1-39890-gf74cc7a696 2.4.0-dev20200821 (problem also happens on tensorflow-gpu v2.3.0)
  • Python version: 3.7.7
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:

  1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"

Describe the current behavior
All models containing Conv2DLstm layers (https://www.tensorflow.org/api_docs/python/tf/keras/layers/ConvLSTM2D) can be successfully converted to TFLite models, but they always fail when running inference using the TFLite interpreter.

Specifically, the following Runtime Error is thrown when the interpreter's invoke method is called: Runtime Error Caught: Fill dimensions must be >= 0Node number 7 (FILL) failed to invoke.

Describe the expected behavior
All models containing Conv2DLstm layers can successfully run inference using the TFLite interpreter.

Standalone code to reproduce the issue
Here is a standalone script that reproduces this issue with either tf-nightly or tensorflow r2.3.0.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# a function to convert the input tensorflow model to a tflite model and test the models
# with some test data
def convert_and_test(model, test_data):
    converter = tf.lite.TFLiteConverter.from_keras_model(model)

    # this is needed to convert tf.Slice, see https://github.com/tensorflow/tensorflow/issues/35590
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                           tf.lite.OpsSet.SELECT_TF_OPS]
    tflite_model = converter.convert()

    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    input_details = interpreter.get_input_details()

    # since model has dynamic input shape, we need to reshape the input tensor each time that shape changes
    interpreter.resize_tensor_input(input_details[0]['index'], test_data.shape)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # print IO details after resizing and tensor allocation
    print(input_details)
    print(output_details)

    # Test regular model on data
    tf_results = model(test_data)

    # verify shape is correct
    print('Test data shape: ', test_data.shape)
    print('TF model output shape: ', tf_results.shape)

    # Test the model
    interpreter.set_tensor(input_details[0]['index'], test_data)

    try:
        interpreter.invoke()

        # get output if inference was successful
        tflite_results = interpreter.get_tensor(output_details[0]['index'])

        # verify shape is correct
        print('TF lite model output shape: ', tflite_results.shape)

        # Compare model results
        for tf_result, tflite_result in zip(tf_results, tflite_results):
            np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)

    except RuntimeError as err:
        print('Runtime Error Caught: %s' % err)


# define two helper functions for creating conv2d and conv2dlstms with some
# common features
def _conv_lstm(filters, kernel_size, dilation_rate, return_sequences):
    conv_layer = layers.ConvLSTM2D(filters=filters,
                                   kernel_size=kernel_size,
                                   strides=(1, 1),
                                   padding='same',
                                   data_format='channels_last',
                                   dilation_rate=dilation_rate,
                                   activation='relu',
                                   recurrent_activation='hard_sigmoid',
                                   return_sequences=return_sequences)
    return conv_layer

def _conv(filters, kernel_size, dilation_rate):
    conv_layer = layers.Conv2D(filters=filters,
                               kernel_size=kernel_size,
                               strides=(1, 1),
                               padding='same',
                               data_format='channels_last',
                               dilation_rate=dilation_rate,
                               activation='relu')
    return conv_layer

# not important for this example, some arbitrary value
num_classes = 3

# define a model that DOES work
working_inputs = keras.Input(shape=(None, None, 1), name='image_sequence')
working_conv_1 = _conv(filters=8, kernel_size=(3, 3), dilation_rate=(1, 1))(working_inputs)
working_conv_bn_1 = layers.BatchNormalization()(working_conv_1)
working_outputs = layers.Conv2D(filters=num_classes,
                                kernel_size=(1, 1),
                                strides=(1, 1),
                                padding='same',
                                data_format='channels_last',
                                dilation_rate=(1, 1),
                                activation=None)(working_conv_bn_1)
working_model = keras.Model(inputs=working_inputs, outputs=working_outputs)
working_model.summary(line_length=140)

# create some data to test model with
working_data = np.random.rand(1, 256, 128, 1).astype(np.float32)

# show that the model invokes successfully
print('About to convert and test working fully convolutional model')
convert_and_test(working_model, working_data)

# define a model that DOES NOT work
failing_inputs = keras.Input(shape=(None, None, None, 1), name='image_sequence')
failing_conv_lstm_1 = _conv_lstm(filters=8, kernel_size=(3, 3), dilation_rate=(1, 1), return_sequences=False)(failing_inputs)
failing_conv_lstm_bn_1 = layers.BatchNormalization()(failing_conv_lstm_1)
failing_outputs = layers.Conv2D(filters=num_classes,
                                kernel_size=(1, 1),
                                strides=(1, 1),
                                padding='same',
                                data_format='channels_last',
                                dilation_rate=(1, 1),
                                activation=None)(failing_conv_lstm_bn_1)

failing_model = keras.Model(inputs=failing_inputs, outputs=failing_outputs)
failing_model.summary(line_length=140)

# create some data to test model with
# since we are using Conv2dLstms here, dimension 1 (0-based) is sequence length
failing_data = np.random.rand(1, 3, 256, 128, 1).astype(np.float32)

# show that model fails to invoke
print('About to convert and test failing fully convolutional LSTM model')
convert_and_test(failing_model, failing_data)

Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

One interesting thing I noticed is that the output_details for the model that fails does not seem to properly reshape the output after the call to allocate_tensors(). I'm not sure if that is related, but it was the one difference between the working model and failing model that I noticed (besides the RuntimeError, of course).

Please let me know if there is anything else I can do to help diagnose and fix this, or if you have any other questions. I'm very determined to solve this issue ASAP.

For completeness, here is the full output of the script above:

`2020-08-21 14:36:44.930663: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1 2020-08-21 14:36:45.796378: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1 2020-08-21 14:36:45.801517: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:42:00.0 name: TITAN RTX computeCapability: 7.5 coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 23.65GiB deviceMemoryBandwidth: 625.94GiB/s 2020-08-21 14:36:45.801546: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1 2020-08-21 14:36:45.803090: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10 2020-08-21 14:36:45.804734: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10 2020-08-21 14:36:45.804935: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10 2020-08-21 14:36:45.806469: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10 2020-08-21 14:36:45.807182: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10 2020-08-21 14:36:45.810354: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7 2020-08-21 14:36:45.812420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0 2020-08-21 14:36:45.812791: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA To enable them in other operations, rebuild TensorFlow with the 
appropriate compiler flags. 2020-08-21 14:36:45.837505: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2994095000 Hz 2020-08-21 14:36:45.839632: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ed46b621a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices: 2020-08-21 14:36:45.839661: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version 2020-08-21 14:36:46.496091: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ed46bcdcb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices: 2020-08-21 14:36:46.496148: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): TITAN RTX, Compute Capability 7.5 2020-08-21 14:36:46.498003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:42:00.0 name: TITAN RTX computeCapability: 7.5 coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 23.65GiB deviceMemoryBandwidth: 625.94GiB/s 2020-08-21 14:36:46.498048: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1 2020-08-21 14:36:46.498078: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10 2020-08-21 14:36:46.498094: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10 2020-08-21 14:36:46.498110: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10 2020-08-21 14:36:46.498126: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10 2020-08-21 14:36:46.498141: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10 2020-08-21 14:36:46.498157: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7 2020-08-21 14:36:46.501695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0 2020-08-21 14:36:46.501752: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1 2020-08-21 14:36:46.986756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-08-21 14:36:46.986826: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0 2020-08-21 14:36:46.986834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N 2020-08-21 14:36:46.988998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21417 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:42:00.0, compute capability: 7.5) WARNING:tensorflow:From /home/jatkinson/anaconda3/envs/tensorflow_2p3/lib/python3.7/site-packages/tensorflow/python/training/tracking/tracking.py:111: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version. Instructions for updating: This property should not be used in TensorFlow 2.0, as updates are applied automatically. 2020-08-21 14:36:47.466889: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them. WARNING:tensorflow:From /home/jatkinson/anaconda3/envs/tensorflow_2p3/lib/python3.7/site-packages/tensorflow/python/training/tracking/tracking.py:111: Layer.updates (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version. Instructions for updating: This property should not be used in TensorFlow 2.0, as updates are applied automatically. 
2020-08-21 14:36:47.753827: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2020-08-21 14:36:47.754022: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-08-21 14:36:47.755162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:42:00.0 name: TITAN RTX computeCapability: 7.5 coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 23.65GiB deviceMemoryBandwidth: 625.94GiB/s
2020-08-21 14:36:47.755198: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-21 14:36:47.755226: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-21 14:36:47.755237: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-21 14:36:47.755246: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-21 14:36:47.755256: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-21 14:36:47.755265: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-21 14:36:47.755275: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-21 14:36:47.755957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-21 14:36:47.755992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-21 14:36:47.755997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-08-21 14:36:47.756002: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-08-21 14:36:47.756718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21417 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:42:00.0, compute capability: 7.5)
2020-08-21 14:36:47.763790: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: graph_to_optimize
2020-08-21 14:36:47.763842: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0.008ms.
2020-08-21 14:36:47.763850: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0ms.
2020-08-21 14:36:47.796767: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:313] Ignored output_format.
2020-08-21 14:36:47.796837: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:316] Ignored drop_control_dependency.
2020-08-21 14:36:47.800774: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:42:00.0 name: TITAN RTX computeCapability: 7.5 coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 23.65GiB deviceMemoryBandwidth: 625.94GiB/s
2020-08-21 14:36:47.800822: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-21 14:36:47.800846: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-21 14:36:47.800856: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-21 14:36:47.800864: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-21 14:36:47.800873: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-21 14:36:47.800881: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-21 14:36:47.800890: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-21 14:36:47.801613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-21 14:36:47.801647: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-21 14:36:47.801653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-08-21 14:36:47.801658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-08-21 14:36:47.802417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21417 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:42:00.0, compute capability: 7.5)
2020-08-21 14:36:47.816103: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-21 14:36:49.126173: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-21 14:36:51.254963: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2020-08-21 14:36:51.255136: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2020-08-21 14:36:51.256116: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:42:00.0 name: TITAN RTX computeCapability: 7.5 coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 23.65GiB deviceMemoryBandwidth: 625.94GiB/s
2020-08-21 14:36:51.256160: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-21 14:36:51.256184: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-21 14:36:51.256194: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-21 14:36:51.256205: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-21 14:36:51.256227: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-21 14:36:51.256237: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-21 14:36:51.256248: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-21 14:36:51.256905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-21 14:36:51.256941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-21 14:36:51.256946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-08-21 14:36:51.256952: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-08-21 14:36:51.257671: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21417 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:42:00.0, compute capability: 7.5)
2020-08-21 14:36:51.267953: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: graph_to_optimize
2020-08-21 14:36:51.268000: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: Graph size after: 112 nodes (0), 133 edges (0), time = 1.388ms.
2020-08-21 14:36:51.268004: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: Graph size after: 112 nodes (0), 133 edges (0), time = 1.449ms.
2020-08-21 14:36:51.268008: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: functional_3_conv_lst_m2d_while_body_5034
2020-08-21 14:36:51.268012: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0.001ms.
2020-08-21 14:36:51.268015: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0.001ms.
2020-08-21 14:36:51.268019: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: functional_3_conv_lst_m2d_while_cond_5033
2020-08-21 14:36:51.268022: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0.001ms.
2020-08-21 14:36:51.268026: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0ms.
2020-08-21 14:36:51.342805: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:313] Ignored output_format.
2020-08-21 14:36:51.342872: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:316] Ignored drop_control_dependency.
INFO: Created TensorFlow Lite delegate for select TF ops.
2020-08-21 14:36:51.388179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: pciBusID: 0000:42:00.0 name: TITAN RTX computeCapability: 7.5 coreClock: 1.77GHz coreCount: 72 deviceMemorySize: 23.65GiB deviceMemoryBandwidth: 625.94GiB/s
2020-08-21 14:36:51.388213: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-08-21 14:36:51.388235: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-08-21 14:36:51.388244: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-21 14:36:51.388252: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-21 14:36:51.388260: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-21 14:36:51.388268: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-08-21 14:36:51.388276: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-08-21 14:36:51.388956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-21 14:36:51.388990: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-21 14:36:51.388996: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-08-21 14:36:51.389000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-08-21 14:36:51.389756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21417 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:42:00.0, compute capability: 7.5)
INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 16 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 2 nodes delegated out of 41 nodes with 1 partitions.

Model: "functional_1"
_________________________________________________________________
Layer (type)                                 Output Shape              Param #
=================================================================
image_sequence (InputLayer)                  [(None, None, None, 1)]   0
_________________________________________________________________
conv2d (Conv2D)                              (None, None, None, 8)     80
_________________________________________________________________
batch_normalization (BatchNormalization)     (None, None, None, 8)     32
_________________________________________________________________
conv2d_1 (Conv2D)                            (None, None, None, 3)     27
=================================================================
Total params: 139
Trainable params: 123
Non-trainable params: 16


About to convert and test working fully convolutional model
[{'name': 'image_sequence', 'index': 0, 'shape': array([ 1, 256, 128, 1], dtype=int32), 'shape_signature': array([-1, -1, -1, 1], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 9, 'shape': array([ 1, 256, 128, 3], dtype=int32), 'shape_signature': array([-1, -1, -1, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
Test data shape: (1, 256, 128, 1)
TF model output shape: (1, 256, 128, 3)
TF lite model output shape: (1, 256, 128, 3)

Model: "functional_3"
_________________________________________________________________
Layer (type)                                 Output Shape                    Param #
=================================================================
image_sequence (InputLayer)                  [(None, None, None, None, 1)]   0
_________________________________________________________________
conv_lst_m2d (ConvLSTM2D)                    (None, None, None, 8)           2624
_________________________________________________________________
batch_normalization_1 (BatchNormalization)   (None, None, None, 8)           32
_________________________________________________________________
conv2d_2 (Conv2D)                            (None, None, None, 3)           27
=================================================================
Total params: 2,683
Trainable params: 2,667
Non-trainable params: 16


About to convert and test failing fully convolutional LSTM model
[{'name': 'image_sequence', 'index': 0, 'shape': array([ 1, 3, 256, 128, 1], dtype=int32), 'shape_signature': array([-1, -1, -1, -1, 1], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 37, 'shape': array([1, 1, 1, 3], dtype=int32), 'shape_signature': array([-1, -1, -1, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
Test data shape: (1, 3, 256, 128, 1)
TF model output shape: (1, 256, 128, 3)
Runtime Error Caught: Fill dimensions must be >= 0Node number 5 (FILL) failed to invoke.

created time in 2 months

issue closedtensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary):
  • TensorFlow version (use command below):
  • Python version:
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:

  1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"

Describe the current behavior

Describe the expected behavior

Standalone code to reproduce the issue Provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to Colab/Jupyter/any notebook.

Other info / logs Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

closed time in 2 months

jatkinson-CRL

issue openedtensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary):
  • TensorFlow version (use command below):
  • Python version:
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:

  1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"

Describe the current behavior

Describe the expected behavior

Standalone code to reproduce the issue Provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to Colab/Jupyter/any notebook.

Other info / logs Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

created time in 2 months

issue commentaraffin/rl-baselines-zoo

Hyperparameter tuning fails with Optuna 2.0.0

Thanks!

jatkinson-CRL

comment created time in 3 months

issue openedaraffin/rl-baselines-zoo

Hyperparameter tuning fails with Optuna 2.0.0

Describe the bug
Hyperparameter tuning fails with Optuna >= v2.0.0.

Code example
Install optuna >= v2.0.0. Then, simply try to tune hyperparameters for any environment:

python train.py --algo ppo2 --env CartPole-v1 --optimize --n-trials 1000 --n-jobs 6 --sampler tpe --pruner median --verbose 1 --n-timesteps 1000

This almost immediately throws the error:

...
File "~/rl-baselines-zoo/utils/callbacks.py", line 31, in _on_step
    if self.trial.should_prune(self.eval_idx):
TypeError: should_prune() takes 1 positional argument but 2 were given

Taking a look at the optuna code, should_prune() in version 2.0.0 (https://github.com/optuna/optuna/blob/v2.0.0/optuna/trial/_trial.py#L570) does not have an optional step argument anymore. In versions <=1.5.0, should_prune() has an optional step argument (https://github.com/optuna/optuna/blob/v1.5.0/optuna/trial/_trial.py#L538).

Downgrading to optuna <= v1.5.0 indeed fixes this error. Thanks for the great package!
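A version-agnostic workaround for the callback is also possible. The sketch below is a hypothetical illustration, not the zoo's actual fix: the helper `call_should_prune` and the two dummy trial classes are made up for demonstration. It inspects the signature of `should_prune` and only passes the step when the installed Optuna still accepts one (in Optuna >= 2.0.0 the step is taken from the most recent `trial.report(value, step)` call, which the eval callback is assumed to make before pruning).

```python
import inspect

def call_should_prune(trial, step):
    """Call trial.should_prune in a way that works across Optuna versions.

    Optuna <= 1.5.0 defines should_prune(step=None); Optuna >= 2.0.0
    defines should_prune() with no arguments and relies on the last
    trial.report(value, step) call for the step.
    """
    if "step" in inspect.signature(trial.should_prune).parameters:
        return trial.should_prune(step)   # old API (<= 1.5.0)
    return trial.should_prune()           # new API (>= 2.0.0)

# Minimal stand-ins for the two APIs, used only to exercise the helper.
class OldTrial:
    def should_prune(self, step=None):
        return ("old", step)

class NewTrial:
    def should_prune(self):
        return ("new", None)

print(call_should_prune(OldTrial(), 3))  # -> ('old', 3)
print(call_should_prune(NewTrial(), 3))  # -> ('new', None)
```

With a helper like this, `_on_step` in `utils/callbacks.py` could call `call_should_prune(self.trial, self.eval_idx)` instead of `self.trial.should_prune(self.eval_idx)` and run under either Optuna release, at the cost of a small signature inspection per check.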

System Info
Describe the characteristics of your environment:

  • Describe how stable baselines was installed (pip, docker, source, ...) pip
  • GPU models and configuration Titan RTX
  • Python version 3.7.7
  • Tensorflow version 1.15.0
  • Gym version 0.17.2
  • Pybullet version n/a
  • Versions of any other relevant libraries n/a

Additional context
n/a

created time in 3 months

more