[TFLu] int8 ops slower than f32


System information

  • Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • TensorFlow installed from (source or binary): pip install tf-nightly
  • Tensorflow version (commit SHA if source): 2.4.0-dev20200908
  • Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): Cortex M4f

Describe the problem

I compared the time that MicroInterpreter::Invoke() spends on each op of the same model, once with int8 quantization and once without. I also tried the CMSIS-NN kernels for some of the ops. The problem is that, apart from the fully connected op, every op takes the same time or longer with the int8 kernels.

The table below shows the average time, in ticks, spent in each op's Eval(). The first column is the int8-quantized model using the CMSIS-NN kernels for mul, add, and fullyconnected. The second column is the int8-quantized model using the reference kernels, and the third is the floating-point model.

op              q7cmsis  q7 ref  f32
fullyconnected     5990    8611
tanh              15114   15122
add                2686    2887
mul                2202    1834
sub                3301    3299
split_v             898     915
split               794     817
reshape             441     443
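
For anyone wanting to reproduce per-op averages like these, the bookkeeping could look roughly like the sketch below. It is not taken from TensorFlow Lite Micro: `OpProfiler`, `Record`, and `AverageTicks` are hypothetical names, and the code assumes the caller reads a free-running 32-bit tick counter immediately before and after each op's Eval(). How that counter is read on the target is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of TensorFlow Lite Micro.
// Accumulates tick counts per op name and reports the average per call.
class OpProfiler {
 public:
  // Adds one timed Eval() call. `start` and `end` are raw readings of a
  // free-running 32-bit tick counter (assumed); unsigned subtraction stays
  // correct across a single counter wrap-around.
  bool Record(const char* op_name, uint32_t start, uint32_t end) {
    assert(op_name != nullptr);
    Entry* e = FindOrAdd(op_name);
    if (e == nullptr) return false;  // table is full, sample dropped
    e->total_ticks += static_cast<uint32_t>(end - start);
    e->calls += 1;
    return true;
  }

  // Average ticks per call, or 0 if the op was never recorded.
  uint32_t AverageTicks(const char* op_name) const {
    for (int i = 0; i < count_; ++i) {
      if (std::strcmp(entries_[i].name, op_name) == 0) {
        return static_cast<uint32_t>(entries_[i].total_ticks /
                                     entries_[i].calls);
      }
    }
    return 0;
  }

 private:
  struct Entry {
    const char* name;      // op name, assumed to outlive the profiler
    uint64_t total_ticks;  // sum of all recorded durations
    uint32_t calls;        // number of recorded Eval() calls
  };

  // Returns the entry for `op_name`, creating it if there is room.
  Entry* FindOrAdd(const char* op_name) {
    for (int i = 0; i < count_; ++i) {
      if (std::strcmp(entries_[i].name, op_name) == 0) return &entries_[i];
    }
    if (count_ == kMaxOps) return nullptr;
    entries_[count_] = Entry{op_name, 0, 0};
    return &entries_[count_++];
  }

  static constexpr int kMaxOps = 16;  // fixed size, no heap allocation
  Entry entries_[kMaxOps] = {};
  int count_ = 0;
};
```

A caller would wrap each Eval() with two counter reads and pass them to `Record("tanh", start, end)`, then print `AverageTicks("tanh")` after a number of inferences. The fixed-size table avoids dynamic allocation, which tends to matter on a Cortex-M class target.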

The tanh kernel performs worst: it is 13x slower than its floating-point equivalent. Is this expected or known behavior?

Please provide the exact sequence of commands/steps when you ran into the problem

I have attached the models that I used for profiling.


Answer (renjie-liu)

We have optimized for NEON on Arm, but unfortunately those SIMD instructions are not available on micro targets.

Hi Pete, do you have any suggestions?


