genehwung/community 0

Stores documents used by the TensorFlow developer community

genehwung/EmacsEverywhere 0

AutoHotKey configuration for toggling on/off emacs keybindings in windows7

genehwung/model-analysis 0

Model analysis tools for TensorFlow

issue comment tensorflow/tensorflow

Resource exhausted: OOM when allocating tensor with TF 2.4.0

Similar issue here. OOM exception during a call to model.fit() while it is trying to allocate a tensor.

(0) Resource exhausted: OOM when allocating tensor with shape[3072,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

The recommendation I have seen everywhere on github and stackoverflow is to reduce the minibatch size. I don't think the batch size is the issue in my case since:

  1. The error occurs only in the 13th trial. Each trial uses a new instance of the same model architecture with batch size = 8, and the previous instance is discarded each time. Why would the first 12 trials fit comfortably into GPU memory and not the 13th?
  2. I can run 12 trials with batch size = 16. It also crashes at the 13th.
  3. nvidia-smi logs show that GPU RAM utilisation doesn't exceed 58% (batch size = 8) and returns to 0% before each trial (indicating that there is no GPU memory leak).

Reproduced with tensorflow 2.3.1, 2.4.0, 2.4.1; 1080 ti; ubuntu 20.04.

lmocsi

comment created time in 2 minutes
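
A mitigation often suggested for OOM errors that only appear after many sequential trials is to reset the Keras global state between trials. The sketch below is an assumption about the trial loop described in the comment, not a confirmed fix for this issue. `run_trials` and `train_one_trial` are hypothetical names; `tf.keras.backend.clear_session()` and `gc.collect()` are existing calls.

```python
import gc


def run_trials(train_one_trial, num_trials, clear_session=None):
    """Run trials sequentially, resetting Keras state after each one.

    train_one_trial: hypothetical callable that builds, fits, and scores a
        fresh model for the given trial index.
    clear_session: hook that resets framework state; defaults to
        tf.keras.backend.clear_session (TensorFlow is imported lazily).
    """
    if clear_session is None:
        import tensorflow as tf

        clear_session = tf.keras.backend.clear_session

    results = []
    for trial in range(num_trials):
        results.append(train_one_trial(trial))
        # Drop graph and layer state kept alive by the Keras backend, then
        # collect Python objects that still reference the old model.
        clear_session()
        gc.collect()
    return results
```

If the failure still appears at the same trial with this reset in place, that would point away from accumulated Keras state and towards something else, such as allocator fragmentation.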

Pull request review comment tensorflow/community

RFC: Tensorflow model optimization compression API

+# Tensorflow Model Optimization Compression API
+
+| Status        | Draft       |
+:-------------- |:---------------------------------------------------- |
+| **RFC #**     | TBD [NNN](https://github.com/tensorflow/community/pull/NNN) (update when you have community PR #)|
+| **Author(s)** | Jaehong Kim (kimjaehong@google.com), Alan Chiao (alanchiao@google.com), Jae Yoo (jaeyoo@google.com) |
+| **Sponsor**   | TBD (whomever@tensorflow.org)                 |
+| **Updated**   | 2020-12-21
+
+## Objective
+
+Build a Keras-base API and set of guidelines that help compression algorithm developer to implement their own model compression algorithm (e.g. [Weight Clustering](https://arxiv.org/abs/1510.00149), [WEST](https://arxiv.org/abs/1811.08417)) and provide a standard way to testing/benchmark and create their own user API for model developers that includes compressed model deployment to TF serving, TFLite, and tf.js.
+
+### Goals
+* Enables algorithms that optimize the weights of a model but not the activations, which includes all [traditional lossless compression algorithms](https://en.wikipedia.org/wiki/Lossless_compression#:~:text=Lossless%20compression%20is%20a%20class,reconstructed%20from%20the%20compressed%20data.).
+* Enables applying algorithms both during-training and post-training.
+* Enables decompressing the weights either before inference or during inference.
+
+### Non-Goals
+* Optimize the activations of a model for accelerated inference. (e.g.
+  [full-integer quantization](https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization) changes dtype of activations to integer from float.)
+* The algorithms that modify the output shape of a layer. (e.g. variant of structured pruning that reduces some output shape of a layer.)
+
+## Motivation
+
+Today, many compression researchers fork and modify model and layer code directly. For initial training research for a small number of architectures, this would be the simplest thing to do today, given the maximal flexibility on top of existing TF Core and Keras APIs. It’s not too bad since for weight optimization, there are only a few layers to consider (Dense, LSTM, Conv, and
+Embedding) for broad model coverage.
+
+With the compression API, algorithm developers can focus the core part of their algorithm. Once they implemented the algorithm, our API and guideline gave them a standard way to test, benchmark and export the model developer APIs for their compression algorithm.
+
+We had a small study for algorithm developer candidates for our compression APIs. It can help us to understand what kinds of requirements are needed to support several compression algorithms and what features are most important. More details are below.
+
+TF MOT already supports several optimization algorithms such as pruning, quantization aware training, and tensor encoding. Also, ARM contributed a weight clustering algorithm. Now we require a common part of these optimization algorithms. For the first step of that, we'd like to start from the compression algorithm (subset of optimization algorithm). because it's much easier than supporting all kinds of optimization algorithms and has a meaningful impact.
+
+## User Benefit
+
+In this design, we'd like to reduce the common engineering cost for the compression algorithm developers.
+
+* Write unit test model coverage test, and benchmark. Provide the comparisons of compression algorithms.
+* Deployment compressed model. (TF serving, TFLite, and tf.js)
+* Support TF 2.0 Keras features compatibility. (e.g. distributed training.)
+
+## Design Proposal
+
+We propose the compression algorithm API which helps algorithm developers create model developer APIs for their own compression algorithm.
+Our API also provides guidelines for testing and benchmark. For now, we only have guidelines to apply a compression algorithm for simple MNIST vision cases. We'd like to provide an example for tensorflow [official models](https://github.com/tensorflow/models/tree/master/official) in the future.
+
+### Tutorials and Examples
+We provide the tutorial for [SVD](https://en.wikipedia.org/wiki/Singular_value_decomposition) compression algorithm that shows how we implement the SVD algorithm using TFMOT compression API by colab. This tutorial includes:
+
+* Algorithm developer side.
+    1. The algorithm developer implementing the SVD algorithm uses the `WeightCompressionAlgorithm` class.
+
+        ```python
+        class SVD(algorithm.WeightCompressionAlgorithm):
+          """SVD compression module config."""
+
+          def __init__(self, params):
+            self.params = params
+
+          def init_training_weights(
+              self, pretrained_weight: tf.Tensor):
+            """Init function from pre-trained model case."""
+            rank = self.params.rank
+
+            # Dense Layer
+            if len(pretrained_weight.shape) == 2:
+              u, sv = tf_svd_factorization_2d(pretrained_weight, rank)
+            else:
+              raise NotImplementedError('Only for dimension=2 is supported.')
+
+            self.add_training_weight(
+                name='u',
+                shape=u.shape,
+                dtype=u.dtype,
+                initializer=tf.keras.initializers.Constant(u))
+            self.add_training_weight(
+                name='sv',
+                shape=sv.shape,
+                dtype=sv.dtype,
+                initializer=tf.keras.initializers.Constant(sv))
+
+          def project_training_weights(self, u: tf.Tensor, sv: tf.Tensor) -> tf.Tensor:
+            return tf.matmul(u, sv)
+
+          def get_compressible_weights(
+              self, original_layer: tf.keras.layers.Layer) -> List[str]:
+            rank = self.params.rank
+            if isinstance(original_layer, tf.keras.layers.Dense):
+              input_dim = original_layer.kernel.shape[0]
+              output_dim = original_layer.kernel.shape[1]
+              if input_dim * output_dim > (input_dim + output_dim) * rank:
+                return ['kernel']
+            return []
+        ```
+
+    1. Export the model developer API for the SVD algorithm.
+        ```python
+        class SVDParams(object):
+          """Define container for parameters for SVD algorithm."""
+
+          def __init__(self, rank):
+            self.rank = rank
+
+        def optimize(to_optimize: tf.keras.Model, params: SVDParams) -> tf.keras.Model:
+          """Model developer API for optimizing a model."""
+
+          def _optimize_layer(layer):
+            # Require layer to be built so that the SVD-factorized weights
+            # can be initialized from the weights.
+            if not layer.built:
+              raise ValueError(
+                  'Applying SVD currently requires passing in a built model')
+
+            return algorithm.create_layer_for_training(layer, algorithm=SVD(params))
+
+          return tf.keras.models.clone_model(
+              to_optimize, clone_function=_optimize_layer)
+        ```
+
+* Model developer side.
+    1. The model developer uses the SVD algorithm.
+        ```python
+        params = SVDParams(rank=32)
+        compressed_model = optimize(model, params)
+
+        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
+        compressed_model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
+
+        compressed_model.fit(x_train, y_train, epochs=2)
+        compressed_model.evaluate(x_test, y_test, verbose=2)
+        ```
+    1. Deploys their compressed model to TFLite model
+        ```python
+        compressed_model.save('/tmp/model_svd_compressed')
+
+        def tflite_convert(saved_model_path, tflite_path):
+          converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_path)
+          converted = converter.convert()
+          open(tflite_path, 'wb').write(converted)
+
+        tflite_convert('/tmp/model_svd_compressed',
+                       '/tmp/tflite/model_svd_compressed.tflite')
+        ```
+
+We also want to provide an example of well-known compression algorithms. Here’s algorithm list at least we have to provide:
+* [Weight clustering](https://arxiv.org/abs/1510.00149) : Most famous compression algorithm that can be used widely.
+* [WEST](https://arxiv.org/abs/1811.08417) : Example for language model area.
+* [Pruning](https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras) : Example for scheduling feature.
+
+### Weight compression algorithm API
+
+<p align="center">
+ <img src=20201221-tfmot-compression-api/class_graph.png />
+</p>
+
+This is an API for a layer weight based compression algorithm.
+
+First, we start from a pre-trained model which the model developer has. And then convert the pre-trained model to training phase model for compression fine-tuning training. During the convert to training phase model, We call `init_training_weights` for each tensor that we want to compress which is specified from the `get_compressible_weights` method.
+
+During the training phase, `project_training_weights` method is called for each training step. After fine-tuning training for compression is finished, we convert the training phase model to a compressed model. We only call the `compress_training_weights` function once for each compressible tensor for converting.
+
+Compressed model contains the `decompress_weights` function in the graph. It’s possible to call the `decompress_weights` for each inference step. To improve performance, we’ll cache the decompressed one depending on flags if we have enough space.
+
+```python
+class WeightCompressionAlgorithm(metaclass=abc.ABCMeta):
+  """Interface for weight compression algorithm that acts on a per-layer basis.
+
+     This allows both options of either decompressing during inference or
+     decompressing prior to inference (where compression occurs by applying a
+     tool such as zip to the model file).
+
+     This interface is a purely functional one.
+  """
+
+  @abc.abstractmethod
+  def get_compressible_weights(

Done. Thanks for reducing confusion.

Xhark

comment created time in 8 minutes
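
The size check inside `get_compressible_weights` in the RFC excerpt above can be worked through with plain arithmetic: a Dense kernel of shape `(input_dim, output_dim)` is replaced by two factors holding `(input_dim + output_dim) * rank` values in total, so factorization only pays off when that count is smaller than the original. The helper names and the layer sizes below are illustrative assumptions, not part of the RFC.

```python
def svd_parameter_counts(input_dim, output_dim, rank):
    """Return (original, factorized) parameter counts for a Dense kernel."""
    original = input_dim * output_dim
    # u has shape (input_dim, rank) and sv has shape (rank, output_dim).
    factorized = (input_dim + output_dim) * rank
    return original, factorized


def is_worth_compressing(input_dim, output_dim, rank):
    """Mirror the RFC's condition: compress only if it shrinks the kernel."""
    original, factorized = svd_parameter_counts(input_dim, output_dim, rank)
    return original > factorized


# A 784x128 kernel at rank 32 shrinks from 100352 to 29184 values.
print(svd_parameter_counts(784, 128, 32))
# A 128x10 kernel at rank 32 would grow (1280 -> 4416), so it is skipped.
print(is_worth_compressing(128, 10, 32))
```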

Pull request review comment tensorflow/community

RFC: Tensorflow model optimization compression API

+# Tensorflow Model Optimization Compression API
+
+| Status        | Draft       |
+:-------------- |:---------------------------------------------------- |
+| **RFC #**     | [342](https://github.com/tensorflow/community/pull/342) |
+| **Author(s)** | Jaehong Kim (kimjaehong@google.com), Alan Chiao (alanchiao@google.com), Jae Yoo (jaeyoo@google.com) |
+| **Sponsor**   | Francois Chollet (fchollet@google.com)                 |
+| **Updated**   | 2020-12-21
+
+## Objective
+
+Build a Keras-base API and set of guidelines that help compression algorithm developer to implement their own model compression algorithm (e.g. [Weight Clustering](https://arxiv.org/abs/1510.00149), [WEST](https://arxiv.org/abs/1811.08417)) and provide a standard way to testing/benchmark and create their own user API for model developers that includes compressed model deployment to TF serving, TFLite, and tf.js.
+
+### Goals
+* Enables algorithms that optimize the weights of a model but not the activations, which includes all [traditional lossless compression algorithms](https://en.wikipedia.org/wiki/Lossless_compression#:~:text=Lossless%20compression%20is%20a%20class,reconstructed%20from%20the%20compressed%20data.).
+* Enables applying algorithms both during-training and post-training.
+* Enables decompressing the weights either before inference or during inference.
+
+### Non-Goals
+* Optimize the activations of a model for accelerated inference. (e.g.
+  [full-integer quantization](https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization) changes dtype of activations to integer from float.)
+* The algorithms that modify the output shape of a layer. (e.g. variant of structured pruning that reduces some output shape of a layer.)
+
+## Motivation
+
+Today, many compression researchers fork and modify model and layer code directly. For initial training research for a small number of architectures, this would be the simplest thing to do today, given the maximal flexibility on top of existing TF Core and Keras APIs. It’s not too bad since for weight optimization, there are only a few layers to consider (Dense, LSTM, Conv, and
+Embedding) for broad model coverage.
+
+With the compression API, algorithm developers can focus the core part of their algorithm. Once they implemented the algorithm, our API and guideline gave them a standard way to test, benchmark and export the model developer APIs for their compression algorithm.
+
+We had a small study for algorithm developer candidates for our compression APIs. It can help us to understand what kinds of requirements are needed to support several compression algorithms and what features are most important. More details are below.
+
+TF MOT already supports several optimization algorithms such as pruning, quantization aware training, and tensor encoding. Also, ARM contributed a weight clustering algorithm. Now we require a common part of these optimization algorithms. For the first step of that, we'd like to start from the compression algorithm (subset of optimization algorithm). because it's much easier than supporting all kinds of optimization algorithms and has a meaningful impact.
+
+## User Benefit
+
+In this design, we'd like to reduce the common engineering cost for the compression algorithm developers.
+
+* Write unit test model coverage test, and benchmark. Provide the comparisons of compression algorithms.
+* Deployment compressed model. (TF serving, TFLite, and tf.js)
+* Support TF 2.0 Keras features compatibility. (e.g. distributed training.)
+
+## Design Proposal
+
+We propose the compression algorithm API which helps algorithm developers create model developer APIs for their own compression algorithm.
+Our API also provides guidelines for testing and benchmark. For now, we only have guidelines to apply a compression algorithm for simple MNIST vision cases. We'd like to provide an example for tensorflow [official models](https://github.com/tensorflow/models/tree/master/official) in the future.
+
+### Tutorials and Examples
+We provide the tutorial for [SVD](https://en.wikipedia.org/wiki/Singular_value_decomposition) compression algorithm that shows how we implement the SVD algorithm using TFMOT compression API by colab. This tutorial includes:
+
+#### Algorithm developer side
+1. The algorithm developer implementing the SVD algorithm uses the `WeightCompressor` class.
+
+```python
+class SVD(algorithm.WeightCompressor):
+  """SVD compression module config."""
+
+  def __init__(self, params):
+    self.params = params
+
+  def init_training_weights(
+      self, pretrained_weight: tf.Tensor):
+    """Init function from pre-trained model case."""
+    rank = self.params.rank
+
+    # Dense Layer
+    if len(pretrained_weight.shape) == 2:
+      u, sv = tf_svd_factorization_2d(pretrained_weight, rank)
+    else:
+      raise NotImplementedError('Only for dimension=2 is supported.')
+
+    self.add_training_weight(
+        name='u',
+        shape=u.shape,
+        dtype=u.dtype,
+        initializer=tf.keras.initializers.Constant(u))
+    self.add_training_weight(
+        name='sv',
+        shape=sv.shape,
+        dtype=sv.dtype,
+        initializer=tf.keras.initializers.Constant(sv))
+
+  def project_training_weights(self, u: tf.Tensor, sv: tf.Tensor) -> tf.Tensor:
+    return tf.matmul(u, sv)
+
+  def get_compressible_weights(
+      self, original_layer: tf.keras.layers.Layer) -> List[str]:
+    rank = self.params.rank
+    if isinstance(original_layer, tf.keras.layers.Dense):
+      input_dim = original_layer.kernel.shape[0]
+      output_dim = original_layer.kernel.shape[1]
+      if input_dim * output_dim > (input_dim + output_dim) * rank:
+        return ['kernel']
+    return []
+```
+
+2. Export the model developer API for the SVD algorithm.
+```python
+class SVDParams(object):
+  """Define container for parameters for SVD algorithm."""
+
+  def __init__(self, rank):
+    self.rank = rank
+
+def optimize(to_optimize: tf.keras.Model, params: SVDParams) -> tf.keras.Model:
+  """Model developer API for optimizing a model."""
+
+  def _optimize_layer(layer):
+    # Require layer to be built so that the SVD-factorized weights
+    # can be initialized from the weights.
+    if not layer.built:
+      raise ValueError(
+          'Applying SVD currently requires passing in a built model')
+
+    return algorithm.create_layer_for_training(layer, algorithm=SVD(params))
+
+  return tf.keras.models.clone_model(
+      to_optimize, clone_function=_optimize_layer)
+```
+
+#### Model developer side
+1. The model developer uses the SVD algorithm.
+```python
+params = SVDParams(rank=32)

Okay, that makes the user experience simpler.

I thought params & optimize were not part of the core API, because they can be different for each algorithm. But it's okay to shape the API like that in the SVD example.

SVD only has one optimize method, but other algorithms like pruning can have two steps. (1. original -> training model (mask + weight), 2. training model -> compressible model (sparse tensor))

I've updated RFCs to apply that. Thanks!

Xhark

comment created time in 8 minutes
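
The RFC excerpt above calls `tf_svd_factorization_2d` without showing its body. One plausible reading, offered here as an assumption and not as the RFC's implementation, is a truncated SVD that returns a left factor `u` and a combined `sv` factor whose product approximates the kernel, which is what `project_training_weights` multiplies back together. The sketch uses NumPy (`np.linalg.svd`) instead of TensorFlow to stay self-contained; `svd_factorization_2d` is a hypothetical name.

```python
import numpy as np


def svd_factorization_2d(weight, rank):
    """Split a 2-D weight into u (in, rank) and sv (rank, out) factors."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    u_r = u[:, :rank]
    # Fold the top singular values into the right factor so that
    # u_r @ sv_r is the best rank-`rank` approximation of `weight`.
    sv_r = s[:rank, None] * vt[:rank, :]
    return u_r, sv_r


# Usage: a matrix built from rank-2 factors is recovered at rank 2.
rng = np.random.default_rng(0)
kernel = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 4))
u, sv = svd_factorization_2d(kernel, rank=2)
print(u.shape, sv.shape, np.allclose(u @ sv, kernel))
```

Folding the singular values into one factor keeps the training-time projection a single matrix multiplication, matching the `tf.matmul(u, sv)` call in the excerpt.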

push event tensorflow/tensorflow

Lu Wang

commit sha b9559be1ad7f33e63b1907ff11932cc7c1fe46ea

Update the link in the documentation to use the stable 0.1 build

PiperOrigin-RevId: 353551348
Change-Id: I861b4822985b8e6e0ffc9b0947f2b8824b8163e3

push time in 17 minutes

issue comment tensorflow/tensorflow

from_dlpack unable to process arrays with column-major strides

Hi!

I was wondering if there is any progress on this issue.

Thanks!

Regards, Miguel

alecgunny

comment created time in an hour

issue comment tensorflow/tensorflow

Resource exhausted: OOM when allocating tensor with TF 2.4.0

Hello. Tagging myself into this issue and keeping it alive.

I believe that I might be facing this issue too. I have upgraded from tf1.14 to 2.4 recently and I think my code is good, although I need to generate some small generic standalone code in the process of debugging my code. It will take me a few weeks to get some code in a Colab to see if I can replicate it there. Will let you know the outcome. Please advise if anyone resolves this.

lmocsi

comment created time in 2 hours

issue comment tensorflow/tensorflow

tf.reduce_max silently produces incorrect answers on large tensors e.g., (2048,2048,1024) on RTX 3090

I tried the same thing in pytorch and it works correctly. So I do not think this is a problem with my drivers, or the basic formulation of my setup.

In [14]: torch_sector = torch_sector.cuda()
In [15]: torch_sector.dtype
Out[15]: torch.float32
In [16]: torch_sector.device
Out[16]: device(type='cuda', index=0)
In [17]: torch_sector.max()                                                                                                                                                                 
Out[17]: tensor(856689.3750, device='cuda:0')
In [18]: sector_rct.max()
Out[18]: 856689.4
varung

comment created time in 2 hours
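
One detail about the shape in the issue title is easy to verify: (2048, 2048, 1024) contains exactly 2^32 elements, which is more than a 32-bit index can address. Whether this is related to the incorrect `tf.reduce_max` result is an assumption and is not established in the thread; the arithmetic below only shows why this particular size sits on that boundary. The helper names are hypothetical.

```python
INT32_MAX = 2**31 - 1
UINT32_MAX = 2**32 - 1


def element_count(shape):
    """Multiply out the dimensions of a tensor shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def exceeds_32bit_index(shape):
    """True if a flat index over the tensor cannot fit in 32 bits."""
    return element_count(shape) > UINT32_MAX


shape = (2048, 2048, 1024)
print(element_count(shape))            # 4294967296, i.e. 2**32
print(element_count(shape) * 4 / 2**30)  # float32 size in GiB: 16.0
print(exceeds_32bit_index(shape))      # True
```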

issue comment tensorflow/tensorflow

Cannot convert model containing categorical_column_with_vocabulary_list op

Please take a look at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/hashtable/README.md in order to use the provided hash table op kernels as a custom op library.

icoffeebeans

comment created time in 2 hours

issue comment tensorflow/tensorflow

Cannot convert model containing categorical_column_with_vocabulary_list op

@abattery is this still the case?

Removed AddHashtableOps support in Python temporarily. However, you can still add this to an interpreter in C++.

If it's been added back, do you have any example code on how to use it from Python? As mentioned in this Stack Overflow post, I was able to add converter.allow_custom_ops = True to get past the tf.HashTableV2 missing custom implementation errors during conversion to tflite; however, I'm not clear on how to perform inference with the tflite model.

icoffeebeans

comment created time in 3 hours

issue comment tensorflow/tensorflow

Operator Softplus is not supported by the standard TensorFlow Lite runtime

Are you satisfied with the resolution of your issue?
Yes: https://docs.google.com/forms/d/e/1FAIpQLSfaP12TRhd9xSxjXZjcZFNXPGk4kc1-qMdv3gc6bEP90vY1ew/viewform?entry.85265664=Yes&entry.2137816233=https://github.com/tensorflow/tensorflow/issues/46625
No: https://docs.google.com/forms/d/e/1FAIpQLSfaP12TRhd9xSxjXZjcZFNXPGk4kc1-qMdv3gc6bEP90vY1ew/viewform?entry.85265664=No&entry.2137816233=https://github.com/tensorflow/tensorflow/issues/46625

bugreporter450

comment created time in 3 hours

issue comment tensorflow/tensorflow

Operator Softplus is not supported by the standard TensorFlow Lite runtime

duplicate issue of https://github.com/tensorflow/tensorflow/issues/46626

bugreporter450

comment created time in 3 hours

issue closed tensorflow/tensorflow

Operator Softplus is not supported by the standard TensorFlow Lite runtime

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): a GNU/Linux system with Linux kernel 4.15.0 on 1 6-core 3.60GHz Intel Core CPU i7-6850K with 64 GB RAM equipped with a NVIDIA Corporation GP102 GPUs
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): tensorflow2.1.0-GPU
  • Python version: 3.6

Describe the current behavior: When I converted the trained hdf5 model to tflite, the conversion failed because the following operator is not supported: Softplus

Describe the expected behavior: The hdf5 model should be successfully converted to the tflite format.

Standalone code to reproduce the issue:

import os

import keras

# Training configuration.
batch_size = 122
epochs = 148
num_classes = 10
save_dir = 'model'
model_name = 'trained_model.h5'

# Load Fashion-MNIST and add a trailing channel dimension.
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
img_rows, img_cols = x_train.shape[1], x_train.shape[2]

x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Scale pixel values to [0, 1] and one-hot encode the labels.
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# The final Dense layer uses the softplus activation named in the error.
model = keras.models.Sequential()
model.add(keras.layers.ThresholdedReLU(theta=0.3597445834106594))
model.add(keras.layers.MaxPooling2D(pool_size = (1, 1), strides = (1, 1), padding = 'valid'))

model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(num_classes, activation='softplus'))
model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_test, y_test))

# Make sure the output directory exists before saving the HDF5 file.
os.makedirs(save_dir, exist_ok=True)
model_path = os.path.join(save_dir, model_name)
model.save(model_path)

Other info / logs:

Exception: We are continually in the process of adding support to TensorFlow Lite for more ops. It would be helpful if you could inform us of how this conversion went by opening a github issue at https://github.com/tensorflow/tensorflow/issues/new?template=40-tflite-op-request.md
 and pasting the following:

Some of the operators in the model are not supported by the standard TensorFlow Lite runtime. If those are native TensorFlow operators, you might be able to use the extended runtime by passing --enable_select_tf_ops, or by setting target_ops=TFLITE_BUILTINS,SELECT_TF_OPS when calling tf.lite.TFLiteConverter(). Otherwise, if you have a custom implementation for them you can disable this error with --allow_custom_ops, or by setting allow_custom_ops=True when calling tf.lite.TFLiteConverter(). Here is a list of builtin operators you are using: CAST, FULLY_CONNECTED, GREATER, MAX_POOL_2D, MUL. Here is a list of operators for which you will need custom implementations: Softplus.

closed time in 3 hours

bugreporter450

issue comment tensorflow/tensorflow

Operator Softsign is not supported by the standard TensorFlow Lite runtime

Please consider enabling Softsign operator with TF select option. https://www.tensorflow.org/lite/guide/ops_select

bugreporter450

comment created time in 3 hours
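
The TF Select option recommended in the comment above is enabled on the converter through `target_spec.supported_ops`. The following is a minimal sketch under the assumption that the model is available as a SavedModel directory; `convert_with_select_tf_ops` and its argument are hypothetical names, and TensorFlow is imported lazily so that the function can be defined without it installed.

```python
def convert_with_select_tf_ops(saved_model_dir):
    """Convert a SavedModel to TFLite, allowing fallback to TF kernels."""
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # Prefer TFLite builtin kernels and fall back to selected TensorFlow
    # kernels for ops without a builtin equivalent.
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,
        tf.lite.OpsSet.SELECT_TF_OPS,
    ]
    return converter.convert()
```

A model converted this way needs a TFLite runtime that includes the select TF ops at inference time, as described in the linked guide.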

issue comment tensorflow/tensorflow

TF ConvertedModel: Invoke fails with "Node number X (CONCATENATION) failed to prepare" error

It is hard to reproduce your problem on my side. Is it possible to create a reproducible notebook and share it with us?

MaxxTr

comment created time in 3 hours

issue comment tensorflow/tensorflow

tflite model can not inference when use tensorflow op

You need to build your binary against a TFLite build from the same or a newer version of TensorFlow than the one used for the converter. I suspect that your TFLite model was converted with tf-nightly but your inference program is based on version 2.4.0.

zhaohb

comment created time in 3 hours

issue comment tensorflow/tensorflow

How the quantization on BERT

You can enable TF kernels fallback using TF Select. See instructions: https://www.tensorflow.org/lite/guide/ops_select

TLCFYBJJHYYSND

comment created time in 3 hours

issue opened tensorflow/tensorflow

Unable to tf.saved_model.load() from a trained Keras model

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.3.1
  • Python version: 3.8

Describe the current behavior: I have a SavedModel. When I attempt to load it in the JVM, it crashes the JVM (see https://github.com/tensorflow/java/issues/194). When I use tf.saved_model.load(), it fails with the complaint: The same saveable will be restored with two names: layer_with_weights-0/layer_with_weights-1/layer_with_weights-0/_table/.ATTRIBUTES/table.

The model does work in TF-Serving and when using keras.models.load_model().

Describe the expected behavior: The model should load everywhere and should not cause a core dump in the JVM.

Standalone code to reproduce the issue: The model is proprietary. I am struggling with how to triage the issue into something that I can share. If you have thoughts, please let me know!

Other info / logs: The SavedModel is created from a Keras model.

See https://github.com/tensorflow/java/issues/194 for some JVM dump logs.

created time in 3 hours

Pull request review comment tensorflow/tensorflow

Refactor ReshapeSparseTensor into a template+class

 limitations under the License.

 namespace tensorflow {

+using CPUDevice = Eigen::ThreadPoolDevice;
+
+namespace functor {
+
+template <>
+struct ReshapeSparseTensor<CPUDevice> {
+  Status operator()(OpKernelContext *context, const TensorShape &input_shape,
+                    const TensorShape &output_shape,
+                    typename TTypes<int64>::ConstMatrix input_indices,
+                    typename TTypes<int64>::Matrix output_indices) const {
+    (void)context;
+    int64 input_rank = input_shape.dims();
+    int64 output_rank = output_shape.dims();
+    int64 nnz = input_indices.dimension(0);

Nit: Let's make them const.

benbarsdell

comment created time in 4 hours

Pull request review comment tensorflow/tensorflow

Refactor ReshapeSparseTensor into a template+class

 limitations under the License.
 #ifndef TENSORFLOW_CORE_KERNELS_RESHAPE_UTIL_H_
 #define TENSORFLOW_CORE_KERNELS_RESHAPE_UTIL_H_

+#include "tensorflow/core/framework/tensor_shape.h"
+#include "tensorflow/core/framework/tensor_types.h"
+#include "tensorflow/core/lib/core/status.h"
+
 namespace tensorflow {

 class OpKernelContext;
 class Tensor;

 // Reshapes the input indices and input shape to the target shape.
+// Note: This template is explicitly instantiated for CPU device only.
+template <typename Device>
 void ReshapeSparseTensor(OpKernelContext *context,
                          const Tensor &input_indices_in,
                          const Tensor &input_shape_in,
                          const Tensor &target_shape_in, int output_indices_idx,
                          int output_shape_idx);

+namespace functor {
+
+template <typename Device>
+struct ReshapeSparseTensor {

Nit: For clarity, let's rename this to ReshapeSparseTensorFunctor.

benbarsdell

comment created time in 4 hours

Pull request review comment tensorflow/tensorflow

Refactor ReshapeSparseTensor into a template+class

 void ReshapeSparseTensor(OpKernelContext *context,
   for (int j = 0; j < output_shape.dims(); ++j) {
     output_shape_vec(j) = output_shape.dim_size(j);
   }
+
+  Tensor *result_indices = nullptr;
+  OP_REQUIRES_OK(context,
+                 context->allocate_output(output_indices_idx,
+                                          TensorShape({nnz, output_rank}),
+                                          &result_indices));
+  if (nnz > 0) {
+    OP_REQUIRES_OK(context, functor::ReshapeSparseTensor<Device>()(
+                                context, input_shape, output_shape,
+                                input_indices_in.matrix<int64>(),
+                                result_indices->matrix<int64>()));
+  }
 }

+#define EXPLICITLY_INSTANTIATE_FUNCTION(Device)                    \
+  template void ReshapeSparseTensor<Device>(                       \
+      OpKernelContext *context, const Tensor &input_indices_in,    \
+      const Tensor &input_shape_in, const Tensor &target_shape_in, \
+      int output_indices_idx, int output_shape_idx)
+EXPLICITLY_INSTANTIATE_FUNCTION(CPUDevice);
+#undef EXPLICITLY_INSTANTIATE_FUNCTION

Shouldn't this be in the header?

benbarsdell

comment created time in 4 hours

Pull request review comment tensorflow/tensorflow

Refactor ReshapeSparseTensor into a template+class

 limitations under the License.

 namespace tensorflow {

+using CPUDevice = Eigen::ThreadPoolDevice;
+
+namespace functor {
+
+template <>
+struct ReshapeSparseTensor<CPUDevice> {
+  Status operator()(OpKernelContext *context, const TensorShape &input_shape,

context is not used and should be removed from the parameter list. We can add it back later if you use it in the GPU implementation.

benbarsdell

comment created time in 4 hours

Pull request review comment tensorflow/community

RFC: Tensorflow model optimization compression API

+# Tensorflow Model Optimization Compression API
+
+| Status        | Draft       |
+:-------------- |:---------------------------------------------------- |
+| **RFC #**     | TBD [NNN](https://github.com/tensorflow/community/pull/NNN) (update when you have community PR #)|
+| **Author(s)** | Jaehong Kim (kimjaehong@google.com), Alan Chiao (alanchiao@google.com), Jae Yoo (jaeyoo@google.com) |
+| **Sponsor**   | TBD (whomever@tensorflow.org)                 |
+| **Updated**   | 2020-12-21
+
+## Objective
+
+Build a Keras-base API and set of guidelines that help compression algorithm developer to implement their own model compression algorithm (e.g. [Weight Clustering](https://arxiv.org/abs/1510.00149), [WEST](https://arxiv.org/abs/1811.08417)) and provide a standard way to testing/benchmark and create their own user API for model developers that includes compressed model deployment to TF serving, TFLite, and tf.js.
+
+### Goals
+* Enables algorithms that optimize the weights of a model but not the activations, which includes all [traditional lossless compression algorithms](https://en.wikipedia.org/wiki/Lossless_compression#:~:text=Lossless%20compression%20is%20a%20class,reconstructed%20from%20the%20compressed%20data.).
+* Enables applying algorithms both during-training and post-training.
+* Enables decompressing the weights either before inference or during inference.
+
+### Non-Goals
+* Optimize the activations of a model for accelerated inference. (e.g.
+  [full-integer quantization](https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization) changes dtype of activations to integer from float.)
+* The algorithms that modify the output shape of a layer. (e.g. variant of structured pruning that reduces some output shape of a layer.)
+
+## Motivation
+
+Today, many compression researchers fork and modify model and layer code directly. For initial training research for a small number of architectures, this would be the simplest thing to do today, given the maximal flexibility on top of existing TF Core and Keras APIs. It’s not too bad since for weight optimization, there are only a few layers to consider (Dense, LSTM, Conv, and
+Embedding) for broad model coverage.
+
+With the compression API, algorithm developers can focus the core part of their algorithm. Once they implemented the algorithm, our API and guideline gave them a standard way to test, benchmark and export the model developer APIs for their compression algorithm.
+
+We had a small study for algorithm developer candidates for our compression APIs. It can help us to understand what kinds of requirements are needed to support several compression algorithms and what features are most important. More details are below.
+
+TF MOT already supports several optimization algorithms such as pruning, quantization aware training, and tensor encoding. Also, ARM contributed a weight clustering algorithm. Now we require a common part of these optimization algorithms. For the first step of that, we'd like to start from the compression algorithm (subset of optimization algorithm). because it's much easier than supporting all kinds of optimization algorithms and has a meaningful impact.
+
+## User Benefit
+
+In this design, we'd like to reduce the common engineering cost for the compression algorithm developers.
+
+* Write unit test model coverage test, and benchmark. Provide the comparisons of compression algorithms.
+* Deployment compressed model. (TF serving, TFLite, and tf.js)
+* Support TF 2.0 Keras features compatibility. (e.g. distributed training.)
+
+## Design Proposal
+
+We propose the compression algorithm API which helps algorithm developers create model developer APIs for their own compression algorithm.
+Our API also provides guidelines for testing and benchmark. For now, we only have guidelines to apply a compression algorithm for simple MNIST vision cases. We'd like to provide an example for tensorflow [official models](https://github.com/tensorflow/models/tree/master/official) in the future.
+
+### Tutorials and Examples
+We provide the tutorial for [SVD](https://en.wikipedia.org/wiki/Singular_value_decomposition) compression algorithm that shows how we implement the SVD algorithm using TFMOT compression API by colab. This tutorial includes:
+
+* Algorithm developer side.
+    1. The algorithm developer implementing the SVD algorithm uses the `WeightCompressionAlgorithm` class.
+
+        ```python
+        class SVD(algorithm.WeightCompressionAlgorithm):
+          """SVD compression module config."""
+
+          def __init__(self, params):
+            self.params = params
+
+          def init_training_weights(
+              self, pretrained_weight: tf.Tensor):
+            """Init function from pre-trained model case."""
+            rank = self.params.rank
+
+            # Dense Layer
+            if len(pretrained_weight.shape) == 2:
+              u, sv = tf_svd_factorization_2d(pretrained_weight, rank)
+            else:
+              raise NotImplementedError('Only for dimension=2 is supported.')
+
+            self.add_training_weight(
+                name='u',
+                shape=u.shape,
+                dtype=u.dtype,
+                initializer=tf.keras.initializers.Constant(u))
+            self.add_training_weight(
+                name='sv',
+                shape=sv.shape,
+                dtype=sv.dtype,
+                initializer=tf.keras.initializers.Constant(sv))
+
+          def project_training_weights(self, u: tf.Tensor, sv: tf.Tensor) -> tf.Tensor:
+            return tf.matmul(u, sv)
+
+          def get_compressible_weights(
+              self, original_layer: tf.keras.layers.Layer) -> List[str]:
+            rank = self.params.rank
+            if isinstance(original_layer, tf.keras.layers.Dense):
+              input_dim = original_layer.kernel.shape[0]
+              output_dim = original_layer.kernel.shape[1]
+              if input_dim * output_dim > (input_dim + output_dim) * rank:
+                return ['kernel']
+            return []
+        ```
+
+    1. Export the model developer API for the SVD algorithm.
+        ```python
+        class SVDParams(object):
+          """Define container for parameters for SVD algorithm."""
+
+          def __init__(self, rank):
+            self.rank = rank
+
+        def optimize(to_optimize: tf.keras.Model, params: SVDParams) -> tf.keras.Model:
+          """Model developer API for optimizing a model."""
+
+          def _optimize_layer(layer):
+            # Require layer to be built so that the SVD-factorized weights
+            # can be initialized from the weights.
+            if not layer.built:
+              raise ValueError(
+                  'Applying SVD currently requires passing in a built model')
+
+            return algorithm.create_layer_for_training(layer, algorithm=SVD(params))
+
+          return tf.keras.models.clone_model(
+              to_optimize, clone_function=_optimize_layer)
+        ```
+
+* Model developer side.
+    1. The model developer uses the SVD algorithm.
+        ```python
+        params = SVDParams(rank=32)
+        compressed_model = optimize(model, params)
+
+        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
+        compressed_model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
+
+        compressed_model.fit(x_train, y_train, epochs=2)
+        compressed_model.evaluate(x_test, y_test, verbose=2)
+        ```
+    1. Deploys their compressed model to TFLite model
+        ```python
+        compressed_model.save('/tmp/model_svd_compressed')
+
+        def tflite_convert(saved_model_path, tflite_path):
+          converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_path)
+          converted = converter.convert()
+          open(tflite_path, 'wb').write(converted)
+
+        tflite_convert('/tmp/model_svd_compressed',
+                       '/tmp/tflite/model_svd_compressed.tflite')
+        ```
+
+We also want to provide an example of well-known compression algorithms. Here’s algorithm list at least we have to provide:
+* [Weight clustering](https://arxiv.org/abs/1510.00149) : Most famous compression algorithm that can be used widely.
+* [WEST](https://arxiv.org/abs/1811.08417) : Example for language model area.
+* [Pruning](https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras) : Example for scheduling feature.
+
+### Weight compression algorithm API
+
+<p align="center">
+ <img src=20201221-tfmot-compression-api/class_graph.png />
+</p>
+
+This is an API for a layer weight based compression algorithm.
+
+First, we start from a pre-trained model which the model developer has. And then convert the pre-trained model to training phase model for compression fine-tuning training. During the convert to training phase model, We call `init_training_weights` for each tensor that we want to compress which is specified from the `get_compressible_weights` method.
+
+During the training phase, `project_training_weights` method is called for each training step. After fine-tuning training for compression is finished, we convert the training phase model to a compressed model. We only call the `compress_training_weights` function once for each compressible tensor for converting.
+
+Compressed model contains the `decompress_weights` function in the graph. It’s possible to call the `decompress_weights` for each inference step. To improve performance, we’ll cache the decompressed one depending on flags if we have enough space.
+
+```python
+class WeightCompressionAlgorithm(metaclass=abc.ABCMeta):
+  """Interface for weight compression algorithm that acts on a per-layer basis.
+
+     This allows both options of either decompressing during inference or
+     decompressing prior to inference (where compression occurs by applying a
+     tool such as zip to the model file).
+
+     This interface is a purely functional one.
+  """
+
+  @abc.abstractmethod
+  def get_compressible_weights(

One point of confusion here is that it isn't clear whether the string represents the attribute name or the variable name (they're different: layer.kernel vs layer.kernel.name). Returning the actual weight object would make it clearer.

Xhark

comment created time in 5 hours
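
The ambiguity raised in the comment above can be made concrete with a small sketch. It uses hypothetical stand-ins (`FakeVariable`, `FakeDense`) instead of real Keras classes, and the `dense_1/kernel:0` naming is an assumption modeled on typical TF2 variable names. The two helper functions are illustrative only and are not part of the proposed API.

```python
class FakeVariable:
    """Hypothetical stand-in for a tf.Variable; it only carries a name."""

    def __init__(self, name):
        self.name = name


class FakeDense:
    """Hypothetical stand-in for a built Keras Dense layer."""

    def __init__(self, layer_name):
        # The attribute name is `kernel`; the variable name is assumed to be
        # scoped by the layer name, as is typical in TF2.
        self.kernel = FakeVariable(f"{layer_name}/kernel:0")
        self.bias = FakeVariable(f"{layer_name}/bias:0")


def compressible_by_attribute(layer):
    """Variant in the RFC draft: return attribute names as strings."""
    return ["kernel"]


def compressible_by_object(layer):
    """Variant suggested in the review: return the weight objects."""
    return [layer.kernel]


layer = FakeDense("dense_1")
attr_names = compressible_by_attribute(layer)
# The caller has to know each string is an attribute name to resolve it.
resolved = [getattr(layer, name) for name in attr_names]
direct = compressible_by_object(layer)

print(attr_names[0])   # the attribute name
print(direct[0].name)  # the variable name, which is a different string
```

Both variants end up at the same object here, but only the string-based one leaves room for confusing the attribute name with the variable name.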

Pull request review comment tensorflow/community

RFC: Tensorflow model optimization compression API

+To improve performance, we’ll cache the decompressed one depending on flags if we have enough space.
+
+```python
+class WeightCompressionAlgorithm(metaclass=abc.ABCMeta):
+  """Interface for weight compression algorithm that acts on a per-layer basis.
+
+     This allows both options of either decompressing during inference or
+     decompressing prior to inference (where compression occurs by applying a
+     tool such as zip to the model file).
+
+     This interface is a purely functional one.
+  """
+
+  @abc.abstractmethod
+  def get_compressible_weights(
+      self, original_layer: tf.keras.layers.Layer) -> List[str]:
+    """Define compressible weights for each layer.
+
+    Args:
+       original_layer: tf.keras.layers.Layer representing a layer from the
+       original model.
+
+    Returns:
+       List of attribute names as string representing list of compressible
+       weights for the given layer. (e.g. return value ['kernel'] means
+       layer.kernel is compressible.)
+    """
+
+  @abc.abstractmethod
+  def init_training_weights(
+      self, pretrained_weight: tf.Tensor):
+    """Initialize training weights for the training model. It calls the `add_training_weight` method several times to add training weights.
+
+    Args:
+      pretrained_weight: tf.Tensor of a pretrained weight of a layer that will
+        be compressed eventually.
+    """
+
+  def add_training_weight(
+      self, *args, **kwargs):
+    """Add training weight for the training model. This method is called from `init_training_weights`.
+
+    Args:
+      *args, **kwargs: args and kwargs for training_model.add_weight.
+    """
+
+  @abc.abstractmethod
+  def project_training_weights(self, *training_weights: tf.Tensor) -> tf.Tensor:
+    """Define a piece of the forward pass during training, which operates on a single compressible weight.
+    The default throws an error when training occurs.
+
+    Args:
+       *training_weights: tf.Tensors representing any variables used during
+         training, for a single compressible weight, in the order returned in
+         `init_training_weights`.
+
+    Returns:
+       tf.Tensor to set the compressible weight to.
+    """
+
+  def update_training_weight(self, index: integer, tensor: tf.Tensor):
+    """Update a training weight on an index to a given tensor value.
+
+    This method is for the case that training weight should update to specific
+    value not from the model optimizer. It'll throws an error if it can't
+    find the training weight.
+
+    Args:
+      index: integer indicates index of training weight to update.
+      tensor: tf.Tensor to update specific training weight.
+    """
+
+  @abc.abstractmethod
+  def compress_training_weights(self, *training_weights: tf.Tensor) -> List[tf.Tensor]:

Ok, sounds fine then. Thanks for the clarification.

Xhark

comment created time in 4 hours
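
The `get_compressible_weights` check in the quoted SVD example only returns `'kernel'` when factoring the kernel would reduce the parameter count. A minimal worked sketch of that arithmetic follows. The helper names are hypothetical, and it assumes the two factors have shapes `(input_dim, rank)` and `(rank, output_dim)`, which the quoted inequality implies but the RFC excerpt does not state.

```python
def parameter_counts(input_dim, output_dim, rank):
    """Return (dense, factored) parameter counts for one 2-D kernel.

    Assumes the factors have shapes (input_dim, rank) and (rank, output_dim).
    """
    dense = input_dim * output_dim
    factored = (input_dim + output_dim) * rank
    return dense, factored


def is_worth_compressing(input_dim, output_dim, rank):
    """Mirror the inequality used in the quoted get_compressible_weights."""
    dense, factored = parameter_counts(input_dim, output_dim, rank)
    return dense > factored


for shape in [(784, 128), (128, 10)]:
    dense, factored = parameter_counts(*shape, rank=32)
    print(shape, dense, factored, is_worth_compressing(*shape, rank=32))
```

With rank 32, a 784x128 kernel shrinks from 100352 to 29184 parameters, while a 128x10 kernel would grow from 1280 to 4416, so only the first would be reported as compressible.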

Pull request review comment tensorflow/community

RFC: Tensorflow model optimization compression API

+# Tensorflow Model Optimization Compression API
+
+| Status        | Draft       |
+:-------------- |:---------------------------------------------------- |
+| **RFC #**     | [342](https://github.com/tensorflow/community/pull/342) |
+| **Author(s)** | Jaehong Kim (kimjaehong@google.com), Alan Chiao (alanchiao@google.com), Jae Yoo (jaeyoo@google.com) |
+| **Sponsor**   | Francois Chollet (fchollet@google.com)                 |
+| **Updated**   | 2020-12-21
+
+## Objective
+
+Build a Keras-base API and set of guidelines that help compression algorithm developer to implement their own model compression algorithm (e.g. [Weight Clustering](https://arxiv.org/abs/1510.00149), [WEST](https://arxiv.org/abs/1811.08417)) and provide a standard way to testing/benchmark and create their own user API for model developers that includes compressed model deployment to TF serving, TFLite, and tf.js.
+
+### Goals
+* Enables algorithms that optimize the weights of a model but not the activations, which includes all [traditional lossless compression algorithms](https://en.wikipedia.org/wiki/Lossless_compression#:~:text=Lossless%20compression%20is%20a%20class,reconstructed%20from%20the%20compressed%20data.).
+* Enables applying algorithms both during-training and post-training.
+* Enables decompressing the weights either before inference or during inference.
+
+### Non-Goals
+* Optimize the activations of a model for accelerated inference. (e.g.
+  [full-integer quantization](https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization) changes dtype of activations to integer from float.)
+* The algorithms that modify the output shape of a layer. (e.g. variant of structured pruning that reduces some output shape of a layer.)
+
+## Motivation
+
+Today, many compression researchers fork and modify model and layer code directly. For initial training research for a small number of architectures, this would be the simplest thing to do today, given the maximal flexibility on top of existing TF Core and Keras APIs. It’s not too bad since for weight optimization, there are only a few layers to consider (Dense, LSTM, Conv, and
+Embedding) for broad model coverage.
+
+With the compression API, algorithm developers can focus the core part of their algorithm. Once they implemented the algorithm, our API and guideline gave them a standard way to test, benchmark and export the model developer APIs for their compression algorithm.
+
+We had a small study for algorithm developer candidates for our compression APIs. It can help us to understand what kinds of requirements are needed to support several compression algorithms and what features are most important. More details are below.
+
+TF MOT already supports several optimization algorithms such as pruning, quantization aware training, and tensor encoding. Also, ARM contributed a weight clustering algorithm. Now we require a common part of these optimization algorithms. For the first step of that, we'd like to start from the compression algorithm (subset of optimization algorithm). because it's much easier than supporting all kinds of optimization algorithms and has a meaningful impact.
+
+## User Benefit
+
+In this design, we'd like to reduce the common engineering cost for the compression algorithm developers.
+
+* Write unit test model coverage test, and benchmark. Provide the comparisons of compression algorithms.
+* Deployment compressed model. (TF serving, TFLite, and tf.js)
+* Support TF 2.0 Keras features compatibility. (e.g. distributed training.)
+
+## Design Proposal
+
+We propose the compression algorithm API which helps algorithm developers create model developer APIs for their own compression algorithm.
+Our API also provides guidelines for testing and benchmark. For now, we only have guidelines to apply a compression algorithm for simple MNIST vision cases. We'd like to provide an example for tensorflow [official models](https://github.com/tensorflow/models/tree/master/official) in the future.
+
+### Tutorials and Examples
+We provide the tutorial for [SVD](https://en.wikipedia.org/wiki/Singular_value_decomposition) compression algorithm that shows how we implement the SVD algorithm using TFMOT compression API by colab. This tutorial includes:
+
+#### Algorithm developer side
+1. The algorithm developer implementing the SVD algorithm uses the `WeightCompressor` class.
+
+```python
+class SVD(algorithm.WeightCompressor):
+  """SVD compression module config."""
+
+  def __init__(self, params):
+    self.params = params
+
+  def init_training_weights(
+      self, pretrained_weight: tf.Tensor):
+    """Init function from pre-trained model case."""
+    rank = self.params.rank
+
+    # Dense Layer
+    if len(pretrained_weight.shape) == 2:
+      u, sv = tf_svd_factorization_2d(pretrained_weight, rank)
+    else:
+      raise NotImplementedError('Only for dimension=2 is supported.')
+
+    self.add_training_weight(
+        name='u',
+        shape=u.shape,
+        dtype=u.dtype,
+        initializer=tf.keras.initializers.Constant(u))
+    self.add_training_weight(
+        name='sv',
+        shape=sv.shape,
+        dtype=sv.dtype,
+        initializer=tf.keras.initializers.Constant(sv))
+
+  def project_training_weights(self, u: tf.Tensor, sv: tf.Tensor) -> tf.Tensor:
+    return tf.matmul(u, sv)
+
+  def get_compressible_weights(
+      self, original_layer: tf.keras.layers.Layer) -> List[str]:
+    rank = self.params.rank
+    if isinstance(original_layer, tf.keras.layers.Dense):
+      input_dim = original_layer.kernel.shape[0]
+      output_dim = original_layer.kernel.shape[1]
+      if input_dim * output_dim > (input_dim + output_dim) * rank:
+        return ['kernel']
+    return []
+```
+
+2. Export the model developer API for the SVD algorithm.
+```python
+class SVDParams(object):
+  """Define container for parameters for SVD algorithm."""
+
+  def __init__(self, rank):
+    self.rank = rank
+
+def optimize(to_optimize: tf.keras.Model, params: SVDParams) -> tf.keras.Model:
+  """Model developer API for optimizing a model."""
+
+  def _optimize_layer(layer):
+    # Require layer to be built so that the SVD-factorized weights
+    # can be initialized from the weights.
+    if not layer.built:
+      raise ValueError(
+          'Applying SVD currently requires passing in a built model')
+
+    return algorithm.create_layer_for_training(layer, algorithm=SVD(params))
+
+  return tf.keras.models.clone_model(
+      to_optimize, clone_function=_optimize_layer)
+```
+
+#### Model developer side
+1. The model developer uses the SVD algorithm.
+```python
+params = SVDParams(rank=32)

This object seems like an implementation detail. And it isn't clear why optimize is separate from SVD.

We could simply do:

compressed_model = SVD(rank=32).optimize(model)

and have optimize be a method on the SVD class. The SVD class would contain the weights -- no need for a separate container object.

Xhark

comment created time in 4 hours
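
The API shape proposed in the comment above, with `optimize` as a method on the algorithm object and no separate params container, could look roughly like the sketch below. Everything here is a hypothetical stand-in: the "model" is a plain list of layer descriptions, not a `tf.keras.Model`, and `_optimize_layer` only tags layers rather than creating training layers.

```python
class SVD:
    """Hypothetical sketch: the algorithm object owns its own parameters."""

    def __init__(self, rank):
        self.rank = rank

    def _optimize_layer(self, layer):
        # Layers without a kernel are copied through unchanged.
        if "kernel_shape" not in layer:
            return dict(layer)
        # Tag the layer with the rank it would be factorized to.
        return {**layer, "svd_rank": self.rank}

    def optimize(self, model):
        """Return a new model description; the input is left untouched."""
        return [self._optimize_layer(layer) for layer in model]


# Stand-in "model": a list of layer descriptions.
model = [
    {"name": "flatten"},
    {"name": "dense_1", "kernel_shape": (784, 128)},
]

compressed_model = SVD(rank=32).optimize(model)
print(compressed_model)
```

The call site collapses to a single expression, which is the main point of the suggestion. Whether per-layer state should also live on the same object is a separate design question that this sketch does not settle.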

issue comment tensorflow/tensorflow

Building TF Lite C++ shared library for macOS, linux, iOS and Android

@alejouribesanchez I built your code with no issues. Are you sure you are linking against the correct library? To use the C API, build with the command below:

bazel build -c opt --config=opt  //tensorflow/lite/c:tensorflowlite_c

Then link your target with libtensorflowlite_c.dylib, as in the lines below:

ADD_LIBRARY(tensorflowlite_c SHARED IMPORTED)
set_property(TARGET tensorflowlite_c PROPERTY IMPORTED_LOCATION ${CMAKE_CURRENT_SOURCE_DIR}/lib/libtensorflowlite_c.dylib)

add_executable(TFLiteC main.cpp)
target_link_libraries(TFLiteC tensorflowlite_c)

How do I build TFLite for iOS and Android this way? Using bazel build -c opt --config=opt //tensorflow/lite:tensorflowlite, I got a Mach-O 64-bit dynamically linked shared library x86_64. How do I get iOS and Android arm64 builds?

ebraraktas

comment created time in 5 hours

issue comment tensorflow/tensorflow

help me to solve this i am running a windows machine 8 gb ram python 3.8 cudnn 6.5 cuda 11 geforce 210

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

seviksindia

comment created time in 5 hours

pull request comment tensorflow/tensorflow

Add blank_index parameter for ctc_greedy_decoder

Yes, I'm still valid!

pvarouktsis

comment created time in 5 hours

push event tensorflow/tensorflow

Christian Sigg

commit sha 1e66b5790c93a1752fd8ea92a146aff811f2ee95

[NFC] Internal change, simplify a copybara rule.

PiperOrigin-RevId: 353528845
Change-Id: Ic7bc176279e6e97227e5dfd7baca191ec33197eb


push time in 6 hours

push event tensorflow/tensorflow

Meghna Natraj

commit sha fb6d7a79d301636de0e494e26a97bd70e148d49b

Replace MNIST with a random dataset to avoid external network connections.

PiperOrigin-RevId: 353527450
Change-Id: Iecff06741d3e20316a5f4cb7a39ece27c0a65d86


push time in 6 hours

issue comment tensorflow/tensorflow

embedding_lookup causes running out of memory

As you are testing without EmbeddingBag, see:

https://github.com/google/jax/issues/3206
https://github.com/tensorflow/addons/issues/2201
https://github.com/tensorflow/tensorflow/issues/32675

shz0116

comment created time in 6 hours
