issue commenttensorflow/tensorflow

[RNN] Invoke the tflite model for inference with dynamic batchsize

Can you try setting the batch size to 1? Then run inference one by one?

Thank you for your answer. I also tried this, but calling inference in a loop introduces serious latency. Speed matters a lot on mobile, and we were hoping to put the items into one batch for parallel computation to reduce inference time.

The TFL RNN/LSTM kernel is stateful (the states are maintained internally), so it's hard to change batch_size at inference time. If you're fine with the extra binary size, it may be possible to have multiple models with different batch_size values.
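
A minimal sketch of that multi-model workaround, assuming one .tflite file has been exported per supported batch size (file names below are hypothetical): keep one interpreter per batch size and pick the matching one at inference time.

import numpy as np
import tensorflow as tf

# Hypothetical model files, one per supported batch size.
MODEL_PATHS = {1: "lstm_b1.tflite", 4: "lstm_b4.tflite", 8: "lstm_b8.tflite"}

# Build one interpreter per batch size up front.
interpreters = {}
for batch, path in MODEL_PATHS.items():
    interp = tf.lite.Interpreter(model_path=path)
    interp.allocate_tensors()
    interpreters[batch] = interp

def run(batch_input):
    # Pick the interpreter whose batch size matches the input.
    interp = interpreters[batch_input.shape[0]]
    input_details = interp.get_input_details()[0]
    output_details = interp.get_output_details()[0]
    interp.set_tensor(input_details["index"], batch_input.astype(np.float32))
    interp.invoke()
    return interp.get_tensor(output_details["index"])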

I see, thank you for your answer. Since TF supports a dynamic batch size at inference time, does TFLite have plans to support a dynamic batch size at inference time too?

TFL hasn't modeled resource variables yet, so we don't have any near-term plan. Sorry about that.

shendaw

comment created time in 16 hours

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

Haoliang, can you help take a look? Thanks!

jatkinson-CRL

comment created time in 19 hours

issue commenttensorflow/tensorflow

[RNN] Invoke the tflite model for inference with dynamic batchsize

Can you try setting the batch size to 1?

Then run inference one by one?
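
A minimal sketch of that batch-1 loop with the Python tf.lite.Interpreter (the model path and shapes are placeholders):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="rnn_b1.tflite")  # placeholder batch-1 model
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]

batch = np.ones((16, 28, 28), dtype=np.float32)  # placeholder: 16 items of shape (28, 28)
results = []
for item in batch:
    # Clear the internal RNN/LSTM state so independent items don't leak into each other.
    interpreter.reset_all_variables()
    interpreter.set_tensor(input_index, item[np.newaxis, ...])
    interpreter.invoke()
    results.append(interpreter.get_tensor(output_index))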

shendaw

comment created time in a day

Pull request review commenttensorflow/tensorflow

[TFLite 16x8] Fixes for TANH and LOGISTIC

 TfLiteStatus TanhPrepare(TfLiteContext* context, TfLiteNode* node) {
         (data->input_left_shift == 0 || data->input_left_shift == 1);
     if (!param_scale_pot) {
-      // In case of general scale parameter, we need to do a rescaling.
-      // Magic constant 4096:
-      // We need to scale down to (-2^3, 2^3) / 3 is kInputIntegerBits/ interval
-      // from 16-bit (-2^15, 2^15),
-      // so we need to multiply by
-      // 2^(15 - kInputIntegerBits) = 2^12 = 4096.
-      data->input_multiplier = static_cast<int32_t>(input->params.scale * 4096);
+      // Calculate multiplier to change input scale to 1/(3*4096)
+      // as required by the table lookup.
+      // In this scaling +/-2^17 represents +/-10.7
+
+      double multiplier = input->params.scale * 4096.0 * 3.0;
+      data->input_left_shift = 0;
+
+      while (multiplier <= 32767.0 / 2.0 && data->input_left_shift <= 30) {

Looks like this can be replaced with a std::frexp call.

wwwind

comment created time in 6 days

Pull request review commenttensorflow/tensorflow

[TFLite 16x8] Fixes for TANH and LOGISTIC

 TfLiteStatus TanhPrepare(TfLiteContext* context, TfLiteNode* node) {
         (data->input_left_shift == 0 || data->input_left_shift == 1);
     if (!param_scale_pot) {
-      // In case of general scale parameter, we need to do a rescaling.
-      // Magic constant 4096:
-      // We need to scale down to (-2^3, 2^3) / 3 is kInputIntegerBits/ interval
-      // from 16-bit (-2^15, 2^15),
-      // so we need to multiply by
-      // 2^(15 - kInputIntegerBits) = 2^12 = 4096.
-      data->input_multiplier = static_cast<int32_t>(input->params.scale * 4096);
+      // Calculate multiplier to change input scale to 1/(3*4096)

Thanks, it would be great to add this to the comments.

wwwind

comment created time in 6 days

Pull request review commenttensorflow/tensorflow

[TFLite 16x8] Fixes for TANH and LOGISTIC

 inline void Logistic(int32_t input_zero_point, int32_t input_range_radius,
   }
 }

-inline void Logistic(int32_t input_multiplier, int32_t input_size,
-                     const int16_t* ptr_input_data, int16_t* ptr_output_data) {
+inline void Logistic(int32_t input_multiplier, int32_t input_left_shift,
+                     int32_t input_size, const int16_t* ptr_input_data,
+                     int16_t* ptr_output_data) {
   // We use the LUT for sigmoid and take into account, that
   // tanh(x) = 2*sigmoid(2*x) - 1

-  int32_t input_data_mul = (input_multiplier > 0) ? input_multiplier : 1;
+  // We scale by 3/4 to expand range [-8,8]->[-10.7,10.7].
+  // In case of general parameter scale, multiplier 3 is taken into account
+  // in TanhPrepare function and it is included in
+  // input_multiplier already.
+
+  if (input_multiplier == 0) {  // power of two case
+    input_multiplier = 3 << input_left_shift;
+    input_left_shift = 0;
+  }
+
+  int32_t round = (input_left_shift > 0) ? 1 << (input_left_shift - 1) : 0;

   for (int i = 0; i < input_size; ++i, ptr_input_data++, ptr_output_data++) {
-    int32_t input_data = (*ptr_input_data) * input_data_mul;
+    int32_t input_data =
+        ((*ptr_input_data) * input_multiplier + round) >> input_left_shift;

-    // Scale by 3/4 to expand range [-8,8]->[-10.7,10.7] and
-    // we do interpolation on unsigned values.
-    uint32_t abs_input_data = 3 * abs(input_data);
+    // We do interpolation on unsigned values.
+    uint32_t abs_input_data = abs(input_data);

     // We divide by 2 power of 9, because
     // we need to divide by 2 in power of 7 for
     // the input conversion + 1/4 from the scale above.
-    uint8_t uh = abs_input_data >> 9;
-    uint32_t ua = sigmoid_table_uint16[uh];
-    uint32_t ub = sigmoid_table_uint16[uh + 1];
-    uint32_t ut = abs_input_data & 0x1ff;
+    uint32_t uh = abs_input_data >> 9;

Nit: please use a more meaningful name than uh.

wwwind

comment created time in 6 days

issue commenttensorflow/tensorflow

[TFLu] int8 ops slower than f32

We have NEON optimizations on Arm, but for micro, unfortunately, those SIMD instructions are not available.

Hi Pete, do you have any suggestions?

Thanks

Abhipray

comment created time in 13 days

issue commenttensorflow/tensorflow

TFLite TransposeConvV2 Operator Slow on x86 CPU Ubuntu

ruy is enabled by default for Arm devices, so the numbers should be similar.

CoreyCole

comment created time in 15 days

issue commenttensorflow/tensorflow

TFLite TransposeConvV2 Operator Slow on x86 CPU Ubuntu

Thanks for sharing! Will take a look!

CoreyCole

comment created time in 16 days

issue commenttensorflow/tensorflow

TFLite TransposeConvV2 Operator Slow on x86 CPU Ubuntu

I see, this is really strange. Can you also add --define=ruy_profiler=true when building the benchmark tool? Then we should be able to see the detailed profiling.

CoreyCole

comment created time in 16 days

issue commenttensorflow/tensorflow

TFLite: C++/Java: experimental kernel ctc_beam_search_decoder returns always buffer length=length+1

I mean, why does Python give you a 2-d tensor while Java gives you a 1-d tensor?

cefaci

comment created time in 16 days

issue commenttensorflow/tensorflow

TFLite: C++/Java: experimental kernel ctc_beam_search_decoder returns always buffer length=length+1

Hi Cefaci,

Looking at the results again, I think I'm a little bit confused.

The Python output is a 2-d tensor, shaped [1, 9], and the result is

[[ 20 11 47 12 47 27 26 26 26]]

while the Java output is a 1-d tensor, shaped [10], and the result is

[20, 11, 47, 12, 47, 27, 26, 26, 26, 0]

?

And we should expect a 2-d tensor, right?

"top_paths is set to 3", but we still only have 1 output, right (while that output contains all the paths)?

cefaci

comment created time in 17 days

issue commenttensorflow/tensorflow

TFLite TransposeConvV2 Operator Slow on x86 CPU Ubuntu

Hi, can you try adding

--define=tflite_with_ruy=true when building the benchmark tool?

thanks!

CoreyCole

comment created time in 18 days

issue commenttensorflow/tensorflow

TFLite: C++/Java: experimental kernel ctc_beam_search_decoder returns always buffer length=length+1

Thanks a lot for the debugging!

I wonder:

Why are there three outputs in Java? Shouldn't it be one output?

cefaci

comment created time in 20 days

issue commenttensorflow/tensorflow

TFLite: C++/Java: experimental kernel ctc_beam_search_decoder returns always buffer length=length+1

It would be great if we could have a minimal reproducible example (with just beam_search & sparse_to_dense).

Then we can compare the C++ inputs/outputs with the Java inputs/outputs.

+Xunkai, wdyt?
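
On the Python side, a minimal sketch of that comparison (the model path is a placeholder): printing the input/output details and output shapes makes it easy to line them up against the Java results.

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="ctc_model.tflite")  # placeholder path
interpreter.allocate_tensors()

print(interpreter.get_input_details())   # compare dtypes/shapes with the Java side
print(interpreter.get_output_details())  # e.g. a [1, 9] vs [10] mismatch shows up here

input_detail = interpreter.get_input_details()[0]
dummy = np.zeros(input_detail["shape"], dtype=input_detail["dtype"])
interpreter.set_tensor(input_detail["index"], dummy)
interpreter.invoke()
for out in interpreter.get_output_details():
    print(out["name"], interpreter.get_tensor(out["index"]).shape)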

cefaci

comment created time in 21 days

issue commenttensorflow/tensorflow

TFLite: C++/Java: experimental kernel ctc_beam_search_decoder returns always buffer length=length+1

Xunkai, it seems this might be related to the Java API. Can you help take a look? Thanks!

cefaci

comment created time in 21 days

issue commenttensorflow/tensorflow

TFLite: C++/Java: experimental kernel ctc_beam_search_decoder returns always buffer length=length+1

Hi,

Thanks for filing the bug. Do you happen to have the tflite model?

Also, does this only occur with the Java usage? Have you tried the Python tflite API?

cefaci

comment created time in 22 days

Pull request review commenttensorflow/tensorflow

[TFLite 16x8] Fixes for TANH and LOGISTIC

 TfLiteStatus TanhPrepare(TfLiteContext* context, TfLiteNode* node) {
         (data->input_left_shift == 0 || data->input_left_shift == 1);
     if (!param_scale_pot) {
-      // In case of general scale parameter, we need to do a rescaling.
-      // Magic constant 4096:
-      // We need to scale down to (-2^3, 2^3) / 3 is kInputIntegerBits/ interval
-      // from 16-bit (-2^15, 2^15),
-      // so we need to multiply by
-      // 2^(15 - kInputIntegerBits) = 2^12 = 4096.
-      data->input_multiplier = static_cast<int32_t>(input->params.scale * 4096);
+      // Calculate multiplier to change input scale to 1/(3*4096)

Can you explain more in the comments about how the math works here?

Thanks

wwwind

comment created time in 24 days

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

Thanks for the snippet. It looks like the issue is related to the TensorArray having a fixed shape of [1, -1, -1, 8].

Haoliang & YC, do you have any idea how to fix the issue?

jatkinson-CRL

comment created time in 24 days

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

I see, it's probably that some op does not support this behavior. Do you have the converted tflite model so we can take a look?

thanks

jatkinson-CRL

comment created time in a month

issue commenttensorflow/tensorflow

TFLite Inference Runtime Error with Models Containing Conv2dLstm

Hi Jatkinson,

It seems the input shape is not fixed?

Can you try setting a fixed input size and try again?

thanks

jatkinson-CRL

comment created time in a month

issue commenttensorflow/tensorflow

Cannot see per operator profiling in android studio with tensorflow lite models

Hi Xunkai & Lu, can you guys take a look?

thanks

anidh

comment created time in a month

issue commenttensorflow/tensorflow

FL16 model run on GPU

Hi Sachin, can you help take a look?

rxiang040

comment created time in a month

issue commenttensorflow/tensorflow

Convert saved model - issue generated shapes

Current tflite has limited dynamic shape support.

You can either set a fixed shape or call resizeInputShape.
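
For the resize path, a minimal sketch with the Python API (resize_tensor_input is the Python counterpart of the resize call mentioned above; the model path and shape are placeholders):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")  # placeholder path
input_index = interpreter.get_input_details()[0]["index"]

# Resize the input to the shape you actually want to feed, then reallocate.
interpreter.resize_tensor_input(input_index, [1, 320, 320, 3])  # placeholder shape
interpreter.allocate_tensors()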

hahmad2008

comment created time in a month

issue commenttensorflow/tensorflow

TFLITE Relocate Tensor Fail

Hi Karim, can you help take a look?

Thanks!

DeepakG19

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

[TFLite] 16x8 quantization: fixes for SLICE, TRANSPOSE operators

 std::string GetMinimumRuntimeVersionForModel(const Model& model) {
           {{OperatorType::kTranspose, 1}, "1.6.0"},
           {{OperatorType::kTranspose, 2}, "1.14.0"},
           {{OperatorType::kTranspose, 3}, "1.15.0"},
+          {{OperatorType::kTranspose, 5}, kPendingReleaseOpVersion},

Seems it's "2.3.0"?

+jdduke@

wwwind

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLITE Relocate Tensor Fail

sorry, I mean this:

Generated .tflite Input Details.
{'shape': array([ 1, 256, 256, 3], dtype=int32), 'quantization': (0.0, 0), 'dtype': <class 'numpy.float32'>, 'index': 0, 'name': 'image'}
{'shape': array([ 1, 256, 256, 1], dtype=int32), 'quantization': (0.0, 0), 'dtype': <class 'numpy.float32'>, 'index': 1, 'name': 'mask'}
{'shape': array([ 1, 64, 64, 1], dtype=int32), 'quantization': (0.0, 0), 'dtype': <class 'numpy.float32'>, 'index': 2, 'name': 'mask2'}
{'shape': array([ 1, 128, 128, 1], dtype=int32), 'quantization': (0.0, 0), 'dtype': <class 'numpy.float32'>, 'index': 3, 'name': 'mask4'}

Actual Input Detail
(1, 432, 492, 3) (1, 432, 492, 1) (1, 216, 246, 1) (1, 108, 123, 1)

Also, it's possible your model has some fixed shapes in the middle, which can break shape propagation.

You would need to build the model with dynamic shapes in the first place.

Can you try not using the input resize and see if it works?

DeepakG19

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLITE Relocate Tensor Fail

Not sure what model you're using, but

Actual Input Detail
(1, 432, 492, 3) (1, 432, 492, 1) (1, 216, 246, 1) (1, 108, 123, 1)

Allocating Tensors based on Actual Input Values
interpreter.resize_tensor_input(input_details[0]['index'], (1,h,w,3))
interpreter.resize_tensor_input(input_details[1]['index'], (1,h,w,1))
interpreter.resize_tensor_input(input_details[2]['index'], (1,int(h/4),int(w/4),1))
interpreter.resize_tensor_input(input_details[3]['index'], (1,int(h/2),int(w/2),1))
interpreter.allocate_tensors()

it seems like the resize here may be producing some inconsistent shapes.

DeepakG19

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLITE Relocate Tensor Fail

It seems the allocation failed because of the input resize.

Can you try not resizing the input?

thanks

DeepakG19

comment created time in 2 months

issue commenttensorflow/tensorflow

TFLite quantization silently converts bias tensor to int8 (int32 expected), causing interpreter to crash

Can you try setting the experimental quantizer to true and trying again?

Thanks!

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/python/lite.py#L421
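
A minimal sketch of that suggestion, assuming the flag meant here is the TF2 converter's experimental_new_quantizer attribute (the SavedModel path, input shape, and calibration data are placeholders):

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Placeholder calibration data; use real samples in practice.
    for _ in range(10):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# The toggle suggested above: route quantization through the newer quantizer.
converter.experimental_new_quantizer = True
tflite_model = converter.convert()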

rajansaini691

comment created time in 2 months

issue commenttensorflow/tensorflow

SPLIT_V for tensorflow lite micro

Hi Pete,

Do you mind taking a look?

Thanks!

Abhipray

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

[TFLite] 16x8 quantization: fixes for SLICE, TRANSPOSE operators

 TfLiteStatus Eval(TfLiteContext* context, TfLiteNode* node) {
     case kTfLiteInt8:
       TF_LITE_SLICE(int8_t, kernel_type);
       break;
+    case kTfLiteInt16:
+      TF_LITE_SLICE(int16_t, kernel_type);

can you update the version?

if it's missing/inconsistent, I feel like we should fix it.

wwwind

comment created time in 2 months

issue closedtensorflow/tensorflow

Creating outputs from the final_state returned by dynamic_rnn causes conversion to fail

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS 7
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (or github SHA if from source): r1.14

Command used to run the converter or code if you're using the Python API

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('./savedmodel')
tflite_model = converter.convert()
open("lstm.tflite", "wb").write(tflite_model)

The output from the converter invocation

Traceback (most recent call last):
  File "saved2lite.py", line 4, in <module>
    tflite_model = converter.convert()
  File "/workdir/env-cpu/lib/python2.7/site-packages/tensorflow/lite/python/lite.py", line 898, in convert
    **converter_kwargs)
  File "/workdir/env-cpu/lib/python2.7/site-packages/tensorflow/lite/python/convert.py", line 404, in toco_convert_impl
    input_data.SerializeToString())
  File "/workdir/env-cpu/lib/python2.7/site-packages/tensorflow/lite/python/convert.py", line 172, in toco_convert_protos
    "TOCO failed. See console for info.\n%s\n%s\n" % (stdout, stderr))
tensorflow.lite.python.convert.ConverterError: TOCO failed. See console for info.
2020-01-13 14:56:27.199876: F tensorflow/lite/toco/tooling_util.cc:918] Check failed: GetOpWithOutput(model, output_array) Specified output array "lstm/body/encoder/rnn/while/Identity_4" is not produced by any op in this graph. Is it a typo? This should not happen. If you trigger this error please send a bug report (with code to reproduce this error), to the TensorFlow Lite team.

Failure details

Model conversion fails when final_state is returned as an output

Any other info / logs

relevant code in model definition:

output, next_state = tf.compat.v1.lite.experimental.nn.dynamic_rnn(...)
c_out = next_state[0].c
h_out = next_state[0].h

relevant code in conversion to SavedModel (which does work):

tensor_info_c_out = tf.saved_model.utils.build_tensor_info(c_out)
tensor_info_h_out = tf.saved_model.utils.build_tensor_info(h_out)

prediction_signature = tf.saved_model.signature_def_utils.build_signature_def(
    inputs={'x': x},
    outputs={'y': output, 'c_out': c_out, 'h_out': h_out},
    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

The TFLite conversion works if I don't return the next state variables, c_out and h_out, but I need those to run the next step of the prediction.

If I run the conversion with tf-nightly (using the same SavedModel file saved with 1.14) I get a different, seemingly unrelated, error:

Some of the operators in the model are not supported by the standard TensorFlow Lite runtime. If those are native TensorFlow operators, you might be able to use the extended runtime by passing --enable_select_tf_ops, or by setting target_ops=TFLITE_BUILTINS,SELECT_TF_OPS when calling tf.lite.TFLiteConverter(). Otherwise, if you have a custom implementation for them you can disable this error with --allow_custom_ops, or by setting allow_custom_ops=True when calling tf.lite.TFLiteConverter(). Here is a list of builtin operators you are using: CAST, FULLY_CONNECTED, GATHER, MUL, NOT_EQUAL, RESHAPE, SOFTMAX, SQUEEZE, TOPK_V2. Here is a list of operators for which you will need custom implementations: TensorListFromTensor, TensorListReserve, TensorListStack, While.

closed time in 2 months

xkr1

issue commenttensorflow/tensorflow

The use of tflite Model of C3D Network in Android

Thai recently made a change to support >4-D dimensions; can you retry with Select TF ops?

Thanks
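
A minimal sketch of a conversion with Select TF ops enabled, assuming a TF2 SavedModel (the paths are placeholders). Note that the app then also needs the Select TF ops (Flex delegate) runtime dependency.

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("c3d_saved_model")  # placeholder
# Fall back to TensorFlow kernels for ops (e.g. >4-D cases) that the
# builtin TFLite kernels do not cover.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
tflite_model = converter.convert()
with open("c3d.tflite", "wb") as f:
    f.write(tflite_model)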

zxj11838

comment created time in 2 months

issue commenttensorflow/tensorflow

[RNN] Converting network with LSTM layer to int8 sefaults whole python

fused unidirectional lstm does not support integer-only just yet.

Jian is working on it.

ppershing

comment created time in 2 months

issue commenttensorflow/tensorflow

tf.lite.TFLiteConverter crashes when converting Keras model

Hi Jaesung,

can you throw a warning in the python layer?

thanks

haifengkao

comment created time in 2 months

issue commenttensorflow/tensorflow

Simple TFLite UNet slower on mobile GPU than CPU

Hi Chao, can you help take a look?

thanks

jakobknapp

comment created time in 2 months

issue commenttensorflow/tensorflow

We cannot duplicate the value since it's not constant. Failed to duplicate values for the stateful op.

Only the conversion requires a fixed shape; you can resize the input later at runtime.

invencode

comment created time in 2 months

issue commenttensorflow/tensorflow

tf.lite.TFLiteConverter crashes when converting Keras model

Sure, we can throw an exception if the batch size is not set during the conversion.

haifengkao

comment created time in 2 months

issue commenttensorflow/tensorflow

tf.lite.TFLiteConverter crashes when converting Keras model

Great! Shall we close the bug then?

The reason is that in 2.2 the LSTM ops were not fused, but now they are.

haifengkao

comment created time in 2 months

issue commenttensorflow/tensorflow

tf.lite.TFLiteConverter crashes when converting Keras model

Can you try to follow the example here:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/experimental_new_converter/Keras_LSTM_fusion_Codelab.ipynb

run_model = tf.function(lambda x: model(x))
# This is important, let's fix the input size.
BATCH_SIZE = 1
STEPS = 28
INPUT_SIZE = 28
concrete_func = run_model.get_concrete_function(
    tf.TensorSpec([BATCH_SIZE, STEPS, INPUT_SIZE], model.inputs[0].dtype))

# model directory.
MODEL_DIR = "keras_lstm"
model.save(MODEL_DIR, save_format="tf", signatures=concrete_func)

converter = tf.lite.TFLiteConverter.from_saved_model(MODEL_DIR)
tflite_model = converter.convert()

Thanks a lot!

Cheers,

haifengkao

comment created time in 2 months

issue commentgoogle-research/google-research

Question about exporting an integer-only MobileBERT to TF-Lite format.

I think the real issue is that the model was trained in the 1.x world but the quantization needs 2.x (it's easier for us to do internally).

@saberkun we probably need to migrate MobileBERT to 2.x ASAP.

wdyt?

nadongguri

comment created time in 2 months

issue commentgoogle-research/google-research

Question about exporting an integer-only MobileBERT to TF-Lite format.

@liufengdb

Can you help take a look?

Thanks

nadongguri

comment created time in 2 months

issue commentgoogle-research/google-research

Question about exporting an integer-only MobileBERT to TF-Lite format.

That's interesting, can you update to the latest tf-nightly and try again?

We're counting on the new quantizer.

thanks

nadongguri

comment created time in 2 months

pull request commenttensorflow/tensorflow

[TFLite, 16x8] 16x8 Reference kernel for the MEAN operator.

if you run the tensorflow/lite/tools/versioning/op_version_test it will fail, so can you update it?

thanks

wwwind

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

[TFLite, 16x8] 16x8 Reference kernel for the MEAN operator.

 int GetBuiltinOperatorVersion(const OpSignature& op_sig) {
     case BuiltinOperator_CONCATENATION:
     case BuiltinOperator_SOFTMAX:
+    case BuiltinOperator_MEAN:

can you update the test as well?

also the runtime version?

thanks!

wwwind

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

[TFLite, 16x8] 16x8 Reference kernel for the MEAN operator.

 QuantizeOpTest/INT8,30
-ConstInt8MeanOpTest.QuantizedDifferentScale
 ConstUint8(Max|Min)OpTest/.+,29
 ConstUint8(Mean)OpTest/.+
-ConstInt8(Mean|Max|Min)OpTest/.+,29

This is not supported by NNAPI, right?

wwwind

comment created time in 2 months

issue commenttensorflow/tensorflow

[RNN] Converting network with LSTM layer to int8 sefaults whole python

fused unidirectional_lstm does not support int8 yet (WIP), sorry about that.

if you use float, it should work.
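
A minimal sketch of the float-only route, assuming a simple Keras LSTM model (the model here is illustrative): leaving all quantization options off keeps the fused LSTM in float.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(20, input_shape=(28, 28)),
    tf.keras.layers.Dense(10),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Leave quantization off: no converter.optimizations / representative_dataset,
# so the fused LSTM stays in float and avoids the int8 path that is still WIP.
tflite_model = converter.convert()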

ppershing

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

[TFLite] 16x8 reference kernel for PAD operator

 string GetMinimumRuntimeVersionForModel(const Model& model) {
           {{OperatorType::kMul, 4}, kPendingReleaseOpVersion},
           {{OperatorType::kPad, 1}, "1.5.0"},
           {{OperatorType::kPad, 2}, "1.14.0"},
+          {{OperatorType::kPad, 3}, kPendingReleaseOpVersion},

we should modify tensorflow/lite/tools/versioning/runtime_version.cc

wwwind

comment created time in 3 months

Pull request review commenttensorflow/tensorflow

[TFLite, 16x8] 16x8 Reference kernel for the MEAN operator.

 TfLiteStatus EvalMean(TfLiteContext* context, TfLiteNode* node) {
               GetTensorData<int64_t>(temp_sum)));
       break;
     case kTfLiteInt8: {
-      tflite::MeanParams op_params;
-      op_params.axis_count = num_axis;
-      ResolveAxis(GetTensorData<int>(op_context.axis), num_axis, &op_params);
-      const TfLiteTensor* input = op_context.input;
-      // TODO(b/139102329): Handle all the cases in the combined reference
-      // method.
-      if (op_context.params->keep_dims && NumDimensions(input) == 4 &&
-          op_params.axis_count == 2 &&
-          ((op_params.axis[0] == 1 && op_params.axis[1] == 2) ||
-           (op_params.axis[0] == 2 && op_params.axis[1] == 1))) {
-        reference_integer_ops::Mean(
-            op_params, data->multiplier, data->shift, GetTensorShape(input),
-            GetTensorData<int8_t>(input), op_context.input->params.zero_point,
-            GetTensorShape(op_context.output),
-            GetTensorData<int8_t>(op_context.output),
-            op_context.output->params.zero_point);
-      } else if (input->params.zero_point ==
-                     op_context.output->params.zero_point &&
-                 input->params.scale == op_context.output->params.scale) {
-        TF_LITE_ENSURE(
-            context,
-            reference_ops::Mean(
-                GetTensorData<int8_t>(input), input->dims->data,
-                input->dims->size, GetTensorData<int8_t>(op_context.output),
-                op_context.output->dims->data, op_context.output->dims->size,
-                GetTensorData<int>(op_context.axis), num_axis,
-                op_context.params->keep_dims, GetTensorData<int>(temp_index),
-                GetTensorData<int>(resolved_axis),
-                GetTensorData<int>(temp_sum)));
-      } else {
-        TF_LITE_ENSURE(
-            context,
-            reference_ops::QuantizedMeanOrSum<>(
-                GetTensorData<int8_t>(input), input->params.zero_point,
-                input->params.scale, input->dims->data, input->dims->size,
-                GetTensorData<int8_t>(op_context.output),
-                op_context.output->params.zero_point,
-                op_context.output->params.scale, op_context.output->dims->data,
-                op_context.output->dims->size,
-                GetTensorData<int>(op_context.axis), num_axis,
-                op_context.params->keep_dims, GetTensorData<int>(temp_index),
-                GetTensorData<int>(resolved_axis), GetTensorData<int>(temp_sum),
-                /*compute_sum=*/false));
-      }
+      EvalMeanReferenceOps<int8_t>(context, op_context, num_axis, data,
+                                   temp_index, resolved_axis, temp_sum);
+    } break;
+    case kTfLiteInt16: {
+      EvalMeanReferenceOps<int16_t>(context, op_context, num_axis, data,

Keep the TF_LITE_ENSURE.

wwwind

comment created time in 3 months

Pull request review commenttensorflow/tensorflow

[TFLite, 16x8] 16x8 Reference kernel for the MEAN operator.

 TfLiteStatus EvalMean(TfLiteContext* context, TfLiteNode* node) {
               GetTensorData<int64_t>(temp_sum)));
       break;
     case kTfLiteInt8: {
-      tflite::MeanParams op_params;
-      op_params.axis_count = num_axis;
-      ResolveAxis(GetTensorData<int>(op_context.axis), num_axis, &op_params);
-      const TfLiteTensor* input = op_context.input;
-      // TODO(b/139102329): Handle all the cases in the combined reference
-      // method.
-      if (op_context.params->keep_dims && NumDimensions(input) == 4 &&
-          op_params.axis_count == 2 &&
-          ((op_params.axis[0] == 1 && op_params.axis[1] == 2) ||
-           (op_params.axis[0] == 2 && op_params.axis[1] == 1))) {
-        reference_integer_ops::Mean(
-            op_params, data->multiplier, data->shift, GetTensorShape(input),
-            GetTensorData<int8_t>(input), op_context.input->params.zero_point,
-            GetTensorShape(op_context.output),
-            GetTensorData<int8_t>(op_context.output),
-            op_context.output->params.zero_point);
-      } else if (input->params.zero_point ==
-                     op_context.output->params.zero_point &&
-                 input->params.scale == op_context.output->params.scale) {
-        TF_LITE_ENSURE(
-            context,
-            reference_ops::Mean(
-                GetTensorData<int8_t>(input), input->dims->data,
-                input->dims->size, GetTensorData<int8_t>(op_context.output),
-                op_context.output->dims->data, op_context.output->dims->size,
-                GetTensorData<int>(op_context.axis), num_axis,
-                op_context.params->keep_dims, GetTensorData<int>(temp_index),
-                GetTensorData<int>(resolved_axis),
-                GetTensorData<int>(temp_sum)));
-      } else {
-        TF_LITE_ENSURE(
-            context,
-            reference_ops::QuantizedMeanOrSum<>(
-                GetTensorData<int8_t>(input), input->params.zero_point,
-                input->params.scale, input->dims->data, input->dims->size,
-                GetTensorData<int8_t>(op_context.output),
-                op_context.output->params.zero_point,
-                op_context.output->params.scale, op_context.output->dims->data,
-                op_context.output->dims->size,
-                GetTensorData<int>(op_context.axis), num_axis,
-                op_context.params->keep_dims, GetTensorData<int>(temp_index),
-                GetTensorData<int>(resolved_axis), GetTensorData<int>(temp_sum),
-                /*compute_sum=*/false));
-      }
+      EvalMeanReferenceOps<int8_t>(context, op_context, num_axis, data,
+                                   temp_index, resolved_axis, temp_sum);
+    } break;
+    case kTfLiteInt16: {
+      EvalMeanReferenceOps<int16_t>(context, op_context, num_axis, data,
+                                    temp_index, resolved_axis, temp_sum);
     } break;
     case kTfLiteUInt8: {
-      // TODO(b/139102329): Handle all the cases in the combined reference
-      // method.
-      tflite::MeanParams op_params;
-      op_params.axis_count = num_axis;
-      ResolveAxis(GetTensorData<int>(op_context.axis), num_axis, &op_params);
-      if (op_context.params->keep_dims &&
-          NumDimensions(op_context.input) == 4 && op_params.axis_count == 2 &&
-          ((op_params.axis[0] == 1 && op_params.axis[1] == 2) ||
-           (op_params.axis[0] == 2 && op_params.axis[1] == 1))) {
-        reference_ops::Mean(op_params, GetTensorShape(op_context.input),
-                            GetTensorData<uint8_t>(op_context.input),
-                            op_context.input->params.zero_point,
-                            op_context.input->params.scale,
-                            GetTensorShape(op_context.output),
-                            GetTensorData<uint8_t>(op_context.output),
-                            op_context.output->params.zero_point,
-                            op_context.output->params.scale);
-      } else if (op_context.input->params.zero_point ==
-                     op_context.output->params.zero_point &&
-                 op_context.input->params.scale ==
-                     op_context.output->params.scale) {
-        TF_LITE_ENSURE(
-            context,
-            reference_ops::Mean(
-                GetTensorData<uint8_t>(op_context.input),
-                op_context.input->dims->data, op_context.input->dims->size,
-                GetTensorData<uint8_t>(op_context.output),
-                op_context.output->dims->data, op_context.output->dims->size,
-                GetTensorData<int>(op_context.axis), num_axis,
-                op_context.params->keep_dims, GetTensorData<int>(temp_index),
-                GetTensorData<int>(resolved_axis),
-                GetTensorData<int>(temp_sum)));
-      } else {
-        TF_LITE_ENSURE(
-            context,
-            reference_ops::QuantizedMeanOrSum<>(
-                GetTensorData<uint8_t>(op_context.input),
-                op_context.input->params.zero_point,
-                op_context.input->params.scale, op_context.input->dims->data,
-                op_context.input->dims->size,
-                GetTensorData<uint8_t>(op_context.output),
-                op_context.output->params.zero_point,
-                op_context.output->params.scale, op_context.output->dims->data,
-                op_context.output->dims->size,
-                GetTensorData<int>(op_context.axis), num_axis,
-                op_context.params->keep_dims, GetTensorData<int>(temp_index),
-                GetTensorData<int>(resolved_axis), GetTensorData<int>(temp_sum),
-                /*compute_sum=*/false));
-      }
+      EvalMeanReferenceOps<uint8_t>(context, op_context, num_axis, data,

We should keep the TF_LITE_ENSURE.

wwwind

comment created time in 3 months

Pull request review commenttensorflow/tensorflow

[TFLite, 16x8] 16x8 Reference kernel for the MEAN operator.

 void ResolveAxis(const int* axis_data, int axis_count,
   }
 }

+template <typename integer_type>
+TfLiteStatus EvalMeanReferenceOps(TfLiteContext* context,
+                                  const OpContext& op_context, int num_axis,
+                                  OpData* data, TfLiteTensor* temp_index,
+                                  TfLiteTensor* resolved_axis,
+                                  TfLiteTensor* temp_sum) {
+  tflite::MeanParams op_params;
+  op_params.axis_count = num_axis;
+  ResolveAxis(GetTensorData<int>(op_context.axis), num_axis, &op_params);
+  const TfLiteTensor* input = op_context.input;
+  // TODO(b/139102329): Handle all the cases in the combined reference
+  // method.
+  if (op_context.params->keep_dims && NumDimensions(input) == 4 &&
+      op_params.axis_count == 2 &&
+      ((op_params.axis[0] == 1 && op_params.axis[1] == 2) ||
+       (op_params.axis[0] == 2 && op_params.axis[1] == 1))) {
+    if (std::is_same<integer_type, uint8_t>::value) {
+      reference_ops::Mean(op_params, GetTensorShape(op_context.input),
+                          GetTensorData<uint8_t>(op_context.input),
+                          op_context.input->params.zero_point,
+                          op_context.input->params.scale,
+                          GetTensorShape(op_context.output),
+                          GetTensorData<uint8_t>(op_context.output),
+                          op_context.output->params.zero_point,
+                          op_context.output->params.scale);
+    } else {
+      reference_integer_ops::Mean(
+          op_params, data->multiplier, data->shift, GetTensorShape(input),
+          GetTensorData<integer_type>(input),
+          op_context.input->params.zero_point,
+          GetTensorShape(op_context.output),
+          GetTensorData<integer_type>(op_context.output),
+          op_context.output->params.zero_point);
+    }
+  } else if (input->params.zero_point == op_context.output->params.zero_point &&
+             input->params.scale == op_context.output->params.scale) {
+    TF_LITE_ENSURE(
+        context,
+        reference_ops::Mean(
+            GetTensorData<integer_type>(input), input->dims->data,
+            input->dims->size, GetTensorData<integer_type>(op_context.output),
+            op_context.output->dims->data, op_context.output->dims->size,
+            GetTensorData<int>(op_context.axis), num_axis,
+            op_context.params->keep_dims, GetTensorData<int>(temp_index),
+            GetTensorData<int>(resolved_axis), GetTensorData<int>(temp_sum)));
+  } else {
+    TF_LITE_ENSURE(
+        context,
+        reference_ops::QuantizedMeanOrSum<>(
+            GetTensorData<integer_type>(input), input->params.zero_point,
+            input->params.scale, input->dims->data, input->dims->size,
+            GetTensorData<integer_type>(op_context.output),
+            op_context.output->params.zero_point,
+            op_context.output->params.scale, op_context.output->dims->data,
+            op_context.output->dims->size, GetTensorData<int>(op_context.axis),
+            num_axis, op_context.params->keep_dims,
+            GetTensorData<int>(temp_index), GetTensorData<int>(resolved_axis),
+            GetTensorData<int>(temp_sum),
+            /*compute_sum=*/false));
+  }
+}

This should return a value here (the function is declared to return TfLiteStatus).

wwwind

comment created time in 3 months

issue commenttensorflow/tensorflow

Rolled RNN cannot be converted to INT8

Hi YC, do you have any idea about this?

Thanks!

MatteoArm

comment created time in 3 months

issue commenttensorflow/tensorflow

Rolled RNN cannot be converted to INT8

looks like the problem is related to pybind?

MatteoArm

comment created time in 3 months

issue closedtensorflow/tensorflow

[RNN] [TFLiteConverter.] Input tensors containing unknown dimensions fails when coupled with LSTM

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic (Google Colab's default environment)
  • TensorFlow installed from (source or binary): pip install tf-nightly
  • TensorFlow version (or github SHA if from source): 2.2.0-dev20200421

Command used to run the converter or code if you're using the Python API

If possible, please share a link to Colab/Jupyter/any notebook.

https://colab.research.google.com/drive/1-hjOM3qW5gY1PAqEZhFGHj7FFY5BaZsm

from tensorflow.lite.python import lite
from tensorflow.python import keras
import numpy as np

input_a = keras.layers.Input(shape=(3,3,), name='input_a')
interm_b = tf.keras.layers.LSTM(4, name='interm_1')(input_a)
output_c = keras.layers.Dense(1, name='dense_1')(interm_b)

model = tf.keras.models.Model(inputs=[input_a], outputs=[output_c])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.summary()

batch_size = 10
sample_input = np.ones((batch_size,3,3),dtype=np.float32)

expected_value = model.predict(sample_input)

converter = lite.TFLiteConverterV2.from_keras_model(model = model)
converter.experimental_new_converter = True
with open("model.tflite", "wb") as f:
    f.write(converter.convert())

interpreter = lite.Interpreter(model_path="model.tflite")
print(interpreter.get_input_details())
interpreter.resize_tensor_input(0,[batch_size, 3,3])
interpreter.allocate_tensors()
interpreter.set_tensor(0, sample_input)
interpreter.invoke()
interpreter.get_tensor(interpreter.get_output_details()[0]["index"])

The output from the converter invocation

Model: "model_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_a (InputLayer)         [(None, 3, 3)]            0         
_________________________________________________________________
interm_1 (LSTM)              (None, 4)                 128       
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 5         
=================================================================
Total params: 133
Trainable params: 133
Non-trainable params: 0
_________________________________________________________________
[{'name': 'input_a', 'index': 0, 'shape': array([1, 3, 3], dtype=int32), 'shape_signature': array([-1,  3,  3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-3-492497848c68> in <module>()
     27 interpreter.allocate_tensors()
     28 interpreter.set_tensor(0, sample_input)
---> 29 interpreter.invoke()
     30 interpreter.get_tensor(interpreter.get_output_details()[0]["index"])

/usr/local/lib/python3.6/dist-packages/tensorflow/lite/python/interpreter.py in invoke(self)
    512     """
    513     self._ensure_safe()
--> 514     self._interpreter.Invoke()
    515 
    516   def reset_all_variables(self):

RuntimeError: tensorflow/lite/kernels/concatenation.cc:74 t->dims->data[d] != t0->dims->data[d] (10 != 1)Node number 50 (CONCATENATION) failed to prepare.
Node number 10 (WHILE) failed to invoke.

Failure details

The conversion is successful, but the generated model cannot be resized to a variable batch size. Input tensors containing unknown dimensions fail when coupled with LSTM. The same script works just fine if one removes the creation of interm_b and passes input_a as the input to generate output_c.

closed time in 3 months

madarez

issue commenttensorflow/tensorflow

[RNN] [TFLiteConverter.] Input tensors containing unknown dimensions fails when coupled with LSTM

Our fused ops need the dimensions to be known because the states are maintained inside the kernel, so it is WAI (working as intended). :(

madarez

comment created time in 3 months
