Controlling the Quantization Graph-Rewriter

System information

Not applicable to this feature request.

Describe the problem

Our (Syntiant Corp's) neural network inference chips support a continuous range of parameter and activation quantization levels for reducing power consumption. Consequently, we aggressively tune our quantization levels for each application. Based on the research literature and product datasheets we are seeing, it is highly likely that other chip makers have similar requirements. TF's current graph re-writer approach finds matching blocks in the graph and wraps them in fake quantization operations. This approach serves our use cases poorly for the following reasons:

  1. Different layers can have different quantizations, but the graph re-writing approach is global to the graph.
  2. The graph re-writer attempts to heuristically match the properties of operations that should be re-written. This will generally work for traditional stored-program architectures, but when you are meddling with layers to match silicon, you need to drop into TensorBoard to figure out whether the re-writer picked up the unit. If the unit is not picked up, then you are better off not using the re-writer.
  3. We have little transparency into changes in the TF codebase on these features. With a more explicit specification of layer quantization, it would be possible to know when the quantization assumptions change, and we could track the latest releases of TF.
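
To make point 1 concrete, here is a minimal sketch in plain Python of what a fake quantization operation does when the number of levels is a free parameter per layer. This is an illustration under stated assumptions, not TF's implementation: the function `fake_quantize`, the `levels_per_layer` table, and the layer names are all hypothetical.

```python
def fake_quantize(x, num_levels, min_val, max_val):
    """Snap x to the nearest of num_levels evenly spaced values in [min_val, max_val].

    Hypothetical helper for illustration; it mimics the forward pass of a
    fake quantization op but is not TF's implementation.
    """
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")
    if not min_val < max_val:
        raise ValueError("min_val must be smaller than max_val")
    step = (max_val - min_val) / (num_levels - 1)
    clamped = min(max(x, min_val), max_val)  # saturate out-of-range values
    index = round((clamped - min_val) / step)  # nearest grid point
    return min_val + index * step


# Hypothetical per-layer settings: each layer gets its own number of levels,
# which a single graph-wide setting cannot express.
levels_per_layer = {"conv1": 5, "dense1": 3}
weights = {"conv1": [0.3, -0.7], "dense1": [0.3, -0.7]}

quantized = {
    name: [fake_quantize(w, levels_per_layer[name], -1.0, 1.0) for w in values]
    for name, values in weights.items()
}
```

The same weights end up on different grids depending on the layer they belong to, which is the behavior a graph-wide setting rules out.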

Our request: We would like to work with an API in which the quantization operations are more explicitly specified at the layer (Keras) or op level. We could then plug the API into our specification of neural network layers built to explicitly match the low-level operations implemented in silicon.
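
As a rough sketch of the kind of interface we have in mind, the plain-Python example below attaches an explicit quantization specification to each layer. Everything in it is hypothetical (`QuantSpec`, `QuantizedDense`, `describe`, and the parameter names); it is not an existing TF or Keras API, only an illustration of the shape of the request.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class QuantSpec:
    """Hypothetical explicit quantization spec: a grid of evenly spaced levels."""
    num_levels: int
    min_val: float
    max_val: float

    def apply(self, x: float) -> float:
        # Clamp to the range, then snap to the nearest level.
        step = (self.max_val - self.min_val) / (self.num_levels - 1)
        clamped = min(max(x, self.min_val), self.max_val)
        return self.min_val + round((clamped - self.min_val) / step) * step


class QuantizedDense:
    """Hypothetical dense layer whose quantization is stated by the caller."""

    def __init__(self, name: str, weights: List[List[float]],
                 weight_quant: Optional[QuantSpec] = None,
                 activation_quant: Optional[QuantSpec] = None):
        self.name = name
        self.weights = weights  # one row of weights per output unit
        self.weight_quant = weight_quant
        self.activation_quant = activation_quant

    def __call__(self, inputs: List[float]) -> List[float]:
        rows = self.weights
        if self.weight_quant is not None:
            rows = [[self.weight_quant.apply(w) for w in row] for row in rows]
        outputs = [sum(w * x for w, x in zip(row, inputs)) for row in rows]
        if self.activation_quant is not None:
            outputs = [self.activation_quant.apply(y) for y in outputs]
        return outputs


def describe(layers: List[QuantizedDense]) -> Dict[str, Tuple]:
    """Report exactly which layers are quantized and how, without heuristics."""
    return {l.name: (l.weight_quant, l.activation_quant) for l in layers}


fc1 = QuantizedDense("fc1", [[0.9, 0.8]],
                     weight_quant=QuantSpec(3, -1.0, 1.0),
                     activation_quant=QuantSpec(5, 0.0, 4.0))
fc2 = QuantizedDense("fc2", [[0.5]])  # deliberately left unquantized
```

Because the specification lives on the layer, a call such as `describe([fc1, fc2])` shows which units are quantized without inspecting the graph in TensorBoard, and any change in quantization assumptions would surface as a change to an explicit argument.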

Thank you for open sourcing TF and your efforts in supporting the community. :)

For reference:

Source code / logs

Not applicable to this feature request.


Answer from suharshs

We are working on a Keras approach to this that will allow the proper configurability. Closing this issue.

