keras-team/keras-applications 1391

Reference implementations of popular deep learning models.

tanzhenyu/baselines-tf2 6

openai baselines with tensorflow 2.0

tanzhenyu/spinup-tf2 2

spinup tutorial with tensorflow 2.0

tanzhenyu/addons 0

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons

tanzhenyu/baselines 0

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

tanzhenyu/community 0

Stores documents used by the TensorFlow developer community

tanzhenyu/examples 0

TensorFlow examples

tanzhenyu/governance 0

Governance of the Keras API.

tanzhenyu/keras 0

Deep Learning for humans

started google/lifetime_value

started time in 2 days

pull request comment tensorflow/tensorflow

Correctly counting the number of params for TextVectorization

I'm having the same problem on Colab; here is the notebook. I tried to install nightly with `pip install -q tf-nightly` but I'm still seeing the same problem.

>>> tf.__version__
'2.1.0'

Hmm... if you installed nightly, why would the version be 2.1.0? It should be something like dev20200225.
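For reference, a quick way to check (the exact dev-date suffix below is an assumption; it changes daily):

>>> import tensorflow as tf
>>> tf.__version__
'2.2.0.dev20200225'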

lagejoao

comment created time in 2 days

PR opened tensorflow/community

RFC: Keras categorical inputs

Finalize design reviews from https://github.com/tensorflow/community/pull/188

+401 -0

0 comment

1 changed file

pr created time in 3 days

push event tanzhenyu/community

tanzhenyu

commit sha 591cdec9791c3b360b4ef85b875b27f012afc800

Finalize design reviews for Keras categorical inputs. Finalize design reviews.

view details

push time in 3 days

fork tanzhenyu/community

Stores documents used by the TensorFlow developer community

fork in 3 days

create branch tanzhenyu/community

branch : keras-categorical

created branch time in 3 days

Pull request review comment tensorflow/community

RFC: Keras categorical inputs

# Keras categorical inputs

| Status        | Proposed                                             |
|:--------------|:-----------------------------------------------------|
| **Author(s)** | Zhenyu Tan (tanzheny@google.com), Francois Chollet (fchollet@google.com) |
| **Sponsor**   | Karmel Allison (karmel@google.com), Martin Wicke (wicke@google.com) |
| **Updated**   | 2019-01-03                                           |

## Objective

This document proposes 4 new preprocessing Keras layers (`Lookup`, `CategoryCrossing`, `Vectorize`, `FingerPrint`), and an extension to an existing op (`tf.sparse.from_dense`) to allow users to:
* Perform feature engineering for categorical inputs
* Replace feature columns and `tf.keras.layers.DenseFeatures` with the proposed layers
* Introduce sparse inputs that work with Keras linear models and other layers that support sparsity

Other proposed layers for replacement of feature columns such as `tf.feature_column.bucketized_column` and `tf.feature_column.numeric_column` have been discussed [here](https://github.com/keras-team/governance/blob/master/rfcs/20190502-preprocessing-layers.md) and are not the focus of this document.

The proposed layers should support ragged tensors.

## Motivation

Specifically, by introducing the 4 layers, we aim to address these pain points:
* Users have to define both feature columns and Keras Inputs for the model, resulting in code duplication and deviation from the DRY (Do not repeat yourself) principle. See this [Github issue](https://github.com/tensorflow/tensorflow/issues/27416).
* Users with large-dimension categorical inputs will incur a large memory footprint and computation cost, if wrapped with an indicator column through `tf.keras.layers.DenseFeatures`.
* Currently there is no way to correctly feed a Keras linear model or dense layer with multivalent categorical inputs or weighted categorical inputs.

## User Benefit

We expect to get rid of the user pain points once migrating off feature columns.

## Example Workflows

Two example workflows are presented below. These workflows can be found at this [colab](https://colab.sandbox.google.com/drive/1cEJhSYLcc2MKH7itwcDvue4PfvrLN-OR#scrollTo=22sa0D19kxXY).

### Workflow 1

The first example gives an equivalent code snippet to the canned `LinearEstimator` [tutorial](https://www.tensorflow.org/tutorials/estimator/linear) on the Titanic dataset:

```python
dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
y_train = dftrain.pop('survived')

CATEGORICAL_COLUMNS = ['sex', 'n_siblings_spouses', 'parch', 'class', 'deck', 'embark_town', 'alone']
NUMERICAL_COLUMNS = ['age', 'fare']
# input list to create functional model.
model_inputs = []
# input list to feed linear model.
linear_inputs = []
for feature_name in CATEGORICAL_COLUMNS:
  feature_input = tf.keras.Input(shape=(1,), dtype=tf.string, name=feature_name, sparse=True)
  vocab_list = sorted(dftrain[feature_name].unique())
  # Map string values to indices
  x = tf.keras.layers.Lookup(vocabulary=vocab_list, name=feature_name)(feature_input)
  x = tf.keras.layers.Vectorize(num_categories=len(vocab_list))(x)
  linear_inputs.append(x)
  model_inputs.append(feature_input)

for feature_name in NUMERICAL_COLUMNS:
  feature_input = tf.keras.Input(shape=(1,), name=feature_name)
  linear_inputs.append(feature_input)
  model_inputs.append(feature_input)

linear_model = tf.keras.experimental.LinearModel(units=1)
linear_logits = linear_model(linear_inputs)
model = tf.keras.Model(model_inputs, linear_logits)

model.compile('sgd', loss=tf.keras.losses.BinaryCrossentropy(from_logits=True), metrics=['accuracy'])

dataset = tf.data.Dataset.from_tensor_slices((
```

use tf.split for that?

tanzhenyu

comment created time in 3 days

issue comment tensorflow/tensorflow

Add K Medoids Estimator to tf canned estimators

Hello, can I work on it? I would like to try it out.

Of course. Can you start by sending out design ideas according to the RFC rules?

tambulkar

comment created time in 5 days

issue comment tensorflow/tensorflow

Keras does not verify supports_masking

@omalleyt12 do you remember if we implicitly swallow it?

asadovsky

comment created time in 6 days

issue comment tensorflow/tensorflow

Named dictionary inputs and outputs for tf.keras.Model

IIUC this breaks backward compatibility if users were passing a dictionary before? Seems like a niche case we can drop for the broader benefit here...

huyng

comment created time in 6 days

pull request comment google-research/google-research

Add a more efficient implementation

Zhenyu, can you take a look? How do we build a custom kernel from source and make sure it is available with Bazel?

Done

hazimehh

comment created time in 8 days

Pull request review comment tensorflow/tensorflow

Comments for alternate implementation in Adadelta Paper #36785

 class Adadelta(optimizer_v2.OptimizerV2):
       ([pdf](http://arxiv.org/pdf/1212.5701v1.pdf))
    """
-
+  # def __init__(self, lr=1.0, rho=0.95, epsilon=None, decay=0., **kwargs):
+  # Adadelta function definition as per the paper by M.D. Zeiler (https://arxiv.org/pdf/1212.5701.pdf) where epsilon=1e-6 and learning rate=1.0

Please put this at the top of the docstring. Also please add a description, i.e., which section of the paper proposes the recommended learning rate, so users are well aware of it.

abhilash1910

comment created time in 8 days

Pull request review comment tensorflow/tensorflow

Comments for alternate implementation in Adadelta Paper #36785

 class Adadelta(optimizer_v2.OptimizerV2):
       ([pdf](http://arxiv.org/pdf/1212.5701v1.pdf))
    """
-
+  # def __init__(self, lr=1.0, rho=0.95, epsilon=None, decay=0., **kwargs):

No need for this.

abhilash1910

comment created time in 8 days

Pull request review comment google-research/google-research

Add a more efficient implementation

# coding=utf-8
# Copyright 2020 The Google Research Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Main code for creating the tree ensemble layer."""
import tensorflow as tf
from tensorflow import keras
from tensorflow.python.framework import ops
from tensorflow.keras.initializers import RandomUniform

# Assumes that neural_trees_ops.so is in the current directory.
tf_trees_module = tf.load_op_library('./neural_trees_ops.so')


# Register the custom gradient for NTComputeOutputOp.
@ops.RegisterGradient('NTComputeOutputOp')
def _nt_compute_input_and_internal_params_gradients_op_cc(
    op, grad_loss_wrt_tree_output, _):
  """Associated a custom gradient with an op."""
  output_logits_dim = op.get_attr('output_logits_dim')
  depth = op.get_attr('depth')
  parallelize_over_samples = op.get_attr('parallelize_over_samples')
  return [tf_trees_module.nt_compute_input_and_internal_params_gradients_op(
      grad_loss_wrt_tree_output, op.inputs[0], op.inputs[1], op.inputs[2],
      op.inputs[3], output_logits_dim, depth, parallelize_over_samples), None]


class TEL(keras.layers.Layer):
    """A custom layer containing additive differentiable decision trees.

    Each tree in the layer is composed of splitting (internal) nodes and leaves.
    A splitting node "routes" the samples left or right based on the
    corresponding activation. Samples can be routed in a hard way (i.e., sent
    to only one child) or in a soft way. The decision whether to hard or soft
    route is controlled by the smooth_step_param (see details below).
    The trees are modeled using smooth functions and can be optimized
    using standard continuous optimization methods (e.g., SGD).

    The layer can be combined with other Keras layers and can be used anywhere
    in the neural net.

    Attributes:
      output_logits_dim: Dimension of the output.
      trees_num: Number of trees in the layer.
      depth: Depth of each tree.
      smooth_step_param: A non-negative float. Larger values make the trees
        more likely to hard route samples (i.e., samples reach fewer leaves).
        Values >= 1 are recommended to exploit conditional computation.
        Note smooth_step_param = 1/gamma, where gamma is the parameter defined
        in the TEL paper.
      sum_outputs: Boolean. If true, the outputs of the trees will be added,
        leading to a 2D tensor of shape=[batch_size, output_logits_dim].
        Otherwise, the tree outputs are not added and the layer output is
        a 2D tensor of shape=[batch_size, trees_num * output_logits_dim].
      parallelize_over_samples: Boolean. If true, parallelizes the updates over
        the samples in the batch. Might lead to speedups when the number of
        trees is small (at the cost of extra memory consumption).
      split_initializer: A Keras initializer for the internal (splitting) nodes.
      leaf_initializer: A Keras initializer for the leaves.
      split_regularizer: A Keras regularizer for the internal (splitting) nodes.
      leaf_regularizer: A Keras regularizer for the leaves.
    Input shape: A tensor of shape=[batch_size, input_dim].
    Output shape: A tensor of shape=[batch_size, output_logits_dim] if
      sum_outputs=True. Otherwise, a tensor of shape=[batch_size, trees_num *
      output_logits_dim].
    """

    def __init__(self,
                 output_logits_dim,

same, use "units" here.

hazimehh

comment created time in 9 days

Pull request review comment google-research/google-research

Add a more efficient implementation

from tensorflow import keras

# Import the layer.
# If the current working directory is not tf_trees, then uncomment the following
# lines and change "/path/to/" to the parent directory of tf_trees.
# import sys
# sys.path.insert(1, '/path/to/tf_trees')
from tel import TEL
# The documentation of TEL can be accessed as follows
?TEL

# We will fit TEL on the Boston Housing regression dataset.
# First, load the dataset.
from keras.datasets import boston_housing
(x_train, y_train), (x_test, y_test) = boston_housing.load_data()

# Define the tree layer; here we choose 10 trees, each of depth 3.
# Note output_logits_dim is the dimension of the tree output.
# output_logits_dim = 1 in this case, but should be equal to the
# number of classes if used as an output layer in a classification task.
tree_layer = TEL(output_logits_dim=1, trees_num=10, depth=3)

maybe say "units" instead of "output_logits_dim" to be consistent with keras.layers.Dense?
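For illustration, the usage line above would then read (an assumed signature following the reviewer's suggestion, not the actual merged API):

tree_layer = TEL(units=1, trees_num=10, depth=3)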

hazimehh

comment created time in 9 days

Pull request review comment google-research/google-research

Add a more efficient implementation

 # The Tree Ensemble Layer
 
-This repository contains a new *tree ensemble layer* which can be used anywhere in a neural network. We provide a low-level Tensorflow implementation along with a high-level Keras API.
+This repository contains a new *tree ensemble layer* (TEL) for neural networks. The layer is differentiable so SGD can be used to train the neural network (including TEL). The layer supports conditional computation for both training and inference, i.e., when updating/evaluating a certain node in the tree, only the samples that reach that node are used in computations (this is to be contrasted with the dense computations in neural networks). We provide a low-level TensorFlow implementation along with a high-level Keras API.
 
-The layer is differentiable so SGD can be used to train the neural network (including the tree layer). The layer supports conditional computation for training and inference: when updating/evaluating a certain node in the tree, only the samples that reach that node are used in computations (this is to be contrasted with the dense computations in neural networks).
+More details to be added soon.
+
+## Installation
+The installation instructions below assume that you have Python and TensorFlow already installed. Use Method 1 if you have installed TensorFlow from source. Otherwise, use Method 2.
+
+### Method 1: Compile using Bazel
+First, copy the file "BUILD" (available in the tf_trees directory) to the directory "tensorflow/core/user_ops".
+Then, run the following command:
+```bash
+bazel build --config opt //tensorflow/core/user_ops:neural_trees_ops.so
+```
+
+### Method 2: Compile using G++

Also mention "TensorFlow binary installation" here.

hazimehh

comment created time in 9 days

Pull request review comment google-research/google-research

Add a more efficient implementation

 # The Tree Ensemble Layer
 
-This repository contains a new *tree ensemble layer* which can be used anywhere in a neural network. We provide a low-level Tensorflow implementation along with a high-level Keras API.
+This repository contains a new *tree ensemble layer* (TEL) for neural networks. The layer is differentiable so SGD can be used to train the neural network (including TEL). The layer supports conditional computation for both training and inference, i.e., when updating/evaluating a certain node in the tree, only the samples that reach that node are used in computations (this is to be contrasted with the dense computations in neural networks). We provide a low-level TensorFlow implementation along with a high-level Keras API.
 
-The layer is differentiable so SGD can be used to train the neural network (including the tree layer). The layer supports conditional computation for training and inference: when updating/evaluating a certain node in the tree, only the samples that reach that node are used in computations (this is to be contrasted with the dense computations in neural networks).
+More details to be added soon.
+
+## Installation
+The installation instructions below assume that you have Python and TensorFlow already installed. Use Method 1 if you have installed TensorFlow from source. Otherwise, use Method 2.
+
+### Method 1: Compile using Bazel
+First, copy the file "BUILD" (available in the tf_trees directory) to the directory "tensorflow/core/user_ops".
+Then, run the following command:
+```bash
+bazel build --config opt //tensorflow/core/user_ops:neural_trees_ops.so
+```
+
+### Method 2: Compile using G++
+From inside the tf_trees directory, run the following commands:
+```bash
+TF_CFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))') )
+TF_LFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))') )
+g++ -std=c++11 -shared neural_trees_ops.cc neural_trees_kernels.cc neural_trees_helpers.cc -o neural_trees_ops.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2
+```
+Note: On OS X, add the flag "-undefined dynamic_lookup" (without quotes) to the last command above.
 
 ## Example Usage
 In Keras, the layer can be used as follows:
 ```python
-from keras.models import Sequential
-from neural_trees_layer import NeuralTrees
+from tensorflow import keras
+# If the current working directory is not tf_trees, then uncomment the following
+# lines and change "/path/to/" to the parent directory of tf_trees.

Can this be done in some init file? It seems weird to have the user uncomment that.
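One possible shape for that, as a sketch of the suggestion (a hypothetical tf_trees/__init__.py, not the repo's actual code):

# tf_trees/__init__.py (hypothetical)
import os
import sys

# Make `from tel import TEL` work no matter where the package is imported from.
sys.path.insert(1, os.path.dirname(os.path.abspath(__file__)))
from tel import TEL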

hazimehh

comment created time in 9 days

Pull request review comment google-research/google-research

Add a more efficient implementation

 # The Tree Ensemble Layer
 
-This repository contains a new *tree ensemble layer* which can be used anywhere in a neural network. We provide a low-level Tensorflow implementation along with a high-level Keras API.
+This repository contains a new *tree ensemble layer* (TEL) for neural networks. The layer is differentiable so SGD can be used to train the neural network (including TEL). The layer supports conditional computation for both training and inference, i.e., when updating/evaluating a certain node in the tree, only the samples that reach that node are used in computations (this is to be contrasted with the dense computations in neural networks). We provide a low-level Tensorflow implementation along with a high-level Keras API.

"to train the neural network that is mixed with convolutional layers and TEL"

hazimehh

comment created time in 14 days

Pull request review comment google-research/google-research

Add a more efficient implementation

 # The Tree Ensemble Layer
 
-This repository contains a new *tree ensemble layer* which can be used anywhere in a neural network. We provide a low-level Tensorflow implementation along with a high-level Keras API.
+This repository contains a new *tree ensemble layer* (TEL) for neural networks. The layer is differentiable so SGD can be used to train the neural network (including TEL). The layer supports conditional computation for both training and inference, i.e., when updating/evaluating a certain node in the tree, only the samples that reach that node are used in computations (this is to be contrasted with the dense computations in neural networks). We provide a low-level TensorFlow implementation along with a high-level Keras API.
 
-The layer is differentiable so SGD can be used to train the neural network (including the tree layer). The layer supports conditional computation for training and inference: when updating/evaluating a certain node in the tree, only the samples that reach that node are used in computations (this is to be contrasted with the dense computations in neural networks).
+More details to be added soon.
+
+## Installation
+The installation instructions below assume that you have Python and TensorFlow already installed. Use Method 1 if you have installed TensorFlow from source. Otherwise, use Method 2.
+
+### Method 1: Compile using Bazel
+First, copy the file "BUILD" (available in the tf_trees directory) to the directory "tensorflow/core/user_ops".
+Then, run the following command:
+```bash
+bazel build --config opt //tensorflow/core/user_ops:neural_trees_ops.so

Let's say "If you have TensorFlow sources installed, you can make use of TensorFlow's build system to compile your op."

hazimehh

comment created time in 9 days

issue comment tensorflow/tensorflow

Pretrained Inception V4 to Keras Application Folder

Keras Applications is actually in the process of merging with the TF Model Garden. Maybe we should have a formal discussion of where this should go.

Alpheron

comment created time in 11 days

issue comment tensorflow/tensorflow

Contrib AdaMax implementation producing NaNs on GPU

Still having this problem on 1.15. Is there any solution besides updating to 2.0.0?

Did you try tf.contrib optimizer or tf.keras optimizer?
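For what it's worth, a minimal sketch of the tf.keras route (the toy model and learning rate here are assumptions, not from the issue):

import tensorflow as tf

# Swap the contrib AdaMax for the separately implemented Keras optimizer.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer=tf.keras.optimizers.Adamax(learning_rate=0.002), loss='mse')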

benleetownsend

comment created time in 11 days

pull request comment tensorflow/tensorflow

BaseDenseAttention now supports attention dropout

It broke internal tests, so @roumposg submitted a separate commit here.

claverru

comment created time in 12 days

issue comment tensorflow/tensorflow

Named dictionary inputs and outputs for tf.keras.Model

This seems like a good feature request to have; for now, the only support we have is through DenseFeatures.

huyng

comment created time in 13 days

pull request comment tensorflow/tensorflow

export CombinerPreprocessingLayer and Combiner

Haifeng, can you take a look at this? https://github.com/tensorflow/community/pull/188

It was reviewed and implemented, but I haven't merged it into master yet.

haifeng-jin

comment created time in 15 days

issue comment tensorflow/tensorflow

Keras model.train() should automatically run table initializers.

Hi @tanzhenyu

This is not just affecting 1.15. I have also tried 2.1, and the same colab can reproduce the issue in 2.1 as well by calling tf.compat.v1.disable_eager_execution().

Are there plans to fix this in 2.x, since it is not specific to 1.15?

"You can run tf.tables_initailizer". In 2.x, we also no longer have access to the underlying sessions, any suggestions on how to run the initializer in TF 2.1? Thanks.

If you use tf.compat.v1.disable_eager_execution(), everything runs in graph mode, so you need to use tf.compat.v1.Session() to initialize it.
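A minimal sketch of that graph-mode pattern (the toy table here is a placeholder, not the code from this issue):

import tensorflow as tf
tf.compat.v1.disable_eager_execution()

table = tf.lookup.StaticHashTable(
    tf.lookup.KeyValueTensorInitializer(['a', 'b'], [0, 1]), default_value=-1)
out = table.lookup(tf.constant(['a', 'c']))

with tf.compat.v1.Session() as sess:
  sess.run(tf.compat.v1.tables_initializer())  # runs all registered table initializers
  print(sess.run(out))  # [ 0 -1]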

yzhuang

comment created time in 15 days

pull request comment tensorflow/tensorflow

export CombinerPreprocessingLayer and Combiner

This will likely not pass tests since it requires a proto definition. If you need it, can you email me and Francois so we can do this for you?

haifeng-jin

comment created time in 15 days

issue closed tensorflow/tensorflow

Keras model.train() should automatically run table initializers.

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Centos 7, OS X 10.15
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.15 and 2.1
  • Python version: 2.7 and 3.6
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

Describe the current behavior

We have Keras layers that use HashTables (e.g. tf.lookup.StaticHashTable), and these tables use initializers such as tf.lookup.KeyValueTensorInitializer. When we perform model training using model.train() in non-eager mode, it does not run these table initializers and hence causes the training to crash. Currently, we work around the issue with an ugly hack like this, by saving a reference to the initializer and running it manually:

    if not tf.executing_eagerly():
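      # tf1 presumably aliases tensorflow.compat.v1 here (the import is not shown in the issue)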
      tf1.keras.backend.get_session().run(self.table._init_op)

Describe the expected behavior

Calling model.train() should initialize all initializers, including hash table initializers such as tf.lookup.KeyValueTensorInitializer.

Code to reproduce the issue

Any Keras layer using StaticHashTable would repro the problem. See https://gist.github.com/yzhuang/0744b487c7a5ab1b65a5b152a06cda7c#file-keraslayertableinitialization-ipynb

Suggestions

My suggestion is to support HashTable initialization or publish documentation / guidance on how to use HashTable with Keras layers.

Thanks! 🙏

closed time in 16 days

yzhuang

issue comment tensorflow/tensorflow

Keras model.train() should automatically run table initializers.

You can run tf.tables_initializer; Keras historically hasn't had tables, so it doesn't track them. Moving forward, since we don't have any immediate plans to release 1.16 or backport this, I think this might be the only way.

yzhuang

comment created time in 16 days

pull request comment tensorflow/tensorflow

Fix: Can't set None on TextVectorization layer's split parameter problem

Can you review it, please? @tanzhenyu

Let me know once all the runs have passed; then I will approve it.

rushabh-v

comment created time in 17 days

issue closed tensorflow/tensorflow

model.run_eagerly=False is much slower than model.run_eagerly=True

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS Linux release 7.6.1810
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.0
  • Python version: Python 3.7.5

Describe the current behavior

I tried to implement a simple FM algorithm in TensorFlow 2.0. I found that keras fit is very slow with the default params. If I change model.run_eagerly to True, the performance is better. Then I tried turning off eager execution with tf.compat.v1.disable_eager_execution(); the performance is the same as TF 1.14 with an Estimator.

  1. default
2019-12-12 15:37:04.152742: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2019-12-12 15:37:04.174603: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2500000000 Hz
2019-12-12 15:37:04.187437: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556f783dc5b0 executing computations on platform Host. Devices:
2019-12-12 15:37:04.187474: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
Epoch 1/1000
    121/Unknown - 31s 256ms/step - loss: 0.6940 - AUC: 0.4324   
  2. model.run_eagerly=True
2019-12-12 15:38:36.014767: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2019-12-12 15:38:36.038416: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2500000000 Hz
2019-12-12 15:38:36.051835: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b826158d40 executing computations on platform Host. Devices:
2019-12-12 15:38:36.051874: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
Epoch 1/1000
     96/Unknown - 7s 72ms/step - loss: 0.6902 - AUC: 0.3739    
  3. tf.compat.v1.disable_eager_execution()
WARNING:tensorflow:OMP_NUM_THREADS is no longer used by the default Keras config. To configure the number of threads, use tf.config.threading APIs.
2019-12-12 15:39:18.986171: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2019-12-12 15:39:19.008642: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2500000000 Hz
2019-12-12 15:39:19.020877: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x561448d34bb0 executing computations on platform Host. Devices:
2019-12-12 15:39:19.020914: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /home/luoxinchen/anaconda3/envs/tf2.0/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Train on 1000 steps
Epoch 1/1000
1000/1000 [==============================] - 1s 1ms/step - loss: 0.6942 - AUC: 0.4701  
Epoch 2/1000
1000/1000 [==============================] - 1s 633us/step - loss: 0.6939 - AUC: 0.4797
Epoch 3/1000
1000/1000 [==============================] - 1s 640us/step - loss: 0.6936 - AUC: 0.4908
Epoch 4/1000

Describe the expected behavior

I think keras fit with model.run_eagerly=False will use tf.function to wrap the training loop, and its performance should be close to that of disabling eager execution. But it performs awfully; it is even slower than model.run_eagerly=True.

Code to reproduce the issue

import os
import sys
import timeit

import numpy as np
import pandas as pd

import tensorflow as tf
from tensorflow import keras

# tf.compat.v1.disable_eager_execution()
tf.config.threading.set_inter_op_parallelism_threads(8)
os.environ['OMP_NUM_THREADS'] = '1'

bucket = int(1e7)

class MyModel(keras.Model):

  def __init__(self):
    super(MyModel, self).__init__()

  def build(self, input_shape):
    self.user_emb = self.add_weight(
        shape=(bucket + 1, 32),
        dtype=tf.float32,
        initializer=tf.keras.initializers.TruncatedNormal(),
        name="user_emb")
    self.item_emb = self.add_weight(
        shape=(bucket + 1, 32),
        dtype=tf.float32,
        initializer=tf.keras.initializers.TruncatedNormal(),
        name="item_emb")
    self.bias = tf.Variable(0.0)

  def call(self, inputs):
    user_id, item_id = inputs
    user_id = tf.reshape(user_id, [-1])
    item_id = tf.reshape(item_id, [-1])
    out = tf.gather(self.user_emb, user_id) * tf.gather(self.item_emb, item_id)
    out = tf.reduce_sum(out, axis=1, keepdims=True) + self.bias
    out = tf.sigmoid(out)
    return out


def main():

  def py_func(feats):
    label = feats['labels']
    return (feats['user_id'], feats['item_id']), label

  model = MyModel()

  dataset = tf.data.Dataset.from_tensor_slices({
      "user_id": np.random.randint(bucket, size=[1000, 1]),
      "item_id": np.random.randint(bucket, size=[1000, 1]),
      "labels": np.random.randint(2, size=[1000, 1])
  }).map(py_func)

  model.compile(
      keras.optimizers.SGD(0.01), 'binary_crossentropy', metrics=['AUC'])

  # model.run_eagerly = True
  model.fit(
      dataset,
      shuffle=False,
      workers=1,
      epochs=1000)

if __name__ == '__main__':
  main()


closed time in 18 days

doldre

issue comment tensorflow/tensorflow

model.run_eagerly=False is much slower than model.run_eagerly=True

This issue has been fixed. @doldre, you can verify it through tf-nightly. If this is not doable, you can also wait for 2.2.

doldre

comment created time in 18 days

pull request comment tensorflow/community

RFC: Keras categorical inputs

@tanzhenyu were there notes from the design review meeting that could be posted here?

Yeah I will update it soon.

tanzhenyu

comment created time in 20 days

delete branch tanzhenyu/addons

delete branch : crossing

delete time in 21 days

pull request comment tensorflow/addons

Add PolynomialCrossing to Addons

LGTM thanks for the PR!

Thanks for the review!

tanzhenyu

comment created time in 21 days

pull request comment tensorflow/addons

Add PolynomialCrossing to Addons

I think it's just a connection issue. I'll restart the tests.

Thanks!

tanzhenyu

comment created time in 22 days

pull request comment tensorflow/addons

Add PolynomialCrossing to Addons

@seanpmorgan Thanks Sean for the review! Any idea what the sanity check is doing? Error message: "the remote end hung up unexpectedly; early EOF; index-pack failed; Git fetch failed with exit code: 128" https://github.com/tensorflow/addons/pull/1018/checks?check_run_id=424954024

tanzhenyu

comment created time in 22 days

Pull request review comment tensorflow/addons

Add PolynomialCrossing to Addons

# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Implements Polynomial Crossing Layer."""

import tensorflow as tf
from typeguard import typechecked

from tensorflow_addons.utils import types


@tf.keras.utils.register_keras_serializable(package="Addons")
class PolynomialCrossing(tf.keras.layers.Layer):
    """Layer for Deep & Cross Network to learn explicit feature interactions.

    A layer that applies feature crossing in learning certain explicit
    bounded-degree feature interactions more efficiently. The `call` method
    accepts `inputs` as a tuple of size 2 tensors. The first input `x0` should be
    the input to the first `PolynomialCrossing` layer in the stack, or the input
    to the network (usually after the embedding layer), the second input `xi`
    is the output of the previous `PolynomialCrossing` layer in the stack, i.e.,
    the i-th `PolynomialCrossing` layer.

    The output is y = x0 * (W .* x) + bias + xi, where .* designates dot product.

    References
        See [R. Wang](https://arxiv.org/pdf/1708.05123.pdf)

    Example:

        ```python
        # after embedding layer in a functional model:
        input = tf.keras.Input(shape=(None,), name='index', dtype=tf.int64)
        x0 = tf.keras.layers.Embedding(input_dim=32, output_dim=6))
        x1 = PolynomialCrossing(projection_dim=None)((x0, x0))
        x2 = PolynomialCrossing(projection_dim=None)((x0, x1))
        logits = tf.keras.layers.Dense(units=10)(x2)
        model = tf.keras.Model(input, logits)
        ```

    Arguments:
        projection_dim: project dimension. Default is `None` such that a full
          (`input_dim` by `input_dim`) matrix is used.
        use_bias: whether to calculate the bias/intercept for this layer. If set to
          False, no bias/intercept will be used in calculations, e.g., the data is
          already centered.
        kernel_initializer: Initializer instance to use on the kernel matrix.
        bias_initializer: Initializer instance to use on the bias vector.
        kernel_regularizer: Regularizer instance to use on the kernel matrix.
        bias_regularizer: Regularizer instance to use on bias vector.

    Input shape:
        A tuple of 2 (batch_size, `input_dim`) dimensional inputs.

    Output shape:
        A single (batch_size, `input_dim`) dimensional output.
    """

    @typechecked
    def __init__(
        self,
        projection_dim: int = None,
        use_bias: bool = True,
        kernel_initializer: types.Initializer = "truncated_normal",
        bias_initializer: types.Initializer = "zeros",
        kernel_regularizer: types.Regularizer = None,
        bias_regularizer: types.Regularizer = None,
        **kwargs,
    ):
        super(PolynomialCrossing, self).__init__(**kwargs)

        self.projection_dim = projection_dim
        self.use_bias = use_bias
        self.kernel_initializer = tf.keras.initializers.get(kernel_initializer)
        self.bias_initializer = tf.keras.initializers.get(bias_initializer)
        self.kernel_regularizer = tf.keras.regularizers.get(kernel_regularizer)
        self.bias_regularizer = tf.keras.regularizers.get(bias_regularizer)

        self.supports_masking = True

    def build(self, input_shape):
        if not isinstance(input_shape, (tuple, list)) or len(input_shape) != 2:
            raise ValueError(
                "Input shapes must be a tuple or list of size 2, "
                "got {}".format(input_shape)
            )
        last_dim = input_shape[-1][-1]
        if self.projection_dim is None:
            kernel_shape = [last_dim, last_dim]
        else:
            if self.projection_dim != last_dim:
                raise ValueError(
                    "The case where `projection_dim` != last "
                    "dimension of the inputs is not supported yet, got "
                    "`projection_dim` {}, and last dimension of input "
                    "{}".format(self.projection_dim, last_dim)
                )
            kernel_shape = [last_dim, self.projection_dim]
        self.kernel = self.add_weight(
            "kernel",
            shape=kernel_shape,
            initializer=self.kernel_initializer,
            regularizer=self.kernel_regularizer,
            dtype=self.dtype,
            trainable=True,
        )
        if self.use_bias:
            self.bias = self.add_weight(
                "bias",
                shape=[last_dim],
                initializer=self.bias_initializer,
                regularizer=self.bias_regularizer,
                dtype=self.dtype,
                trainable=True,
            )
        self.built = True

    def call(self, inputs):
        if not isinstance(inputs, (tuple, list)) or len(inputs) != 2:
            raise ValueError(
                "Inputs to the layer must be a tuple or list of size 2, "
                "got {}".format(inputs)
            )
        x0, x = inputs
        outputs = x0 * tf.matmul(x, self.kernel) + x
        if self.use_bias:
            outputs = tf.add(outputs, self.bias)
        return outputs

    def get_config(self):

Done.

tanzhenyu

comment created time in 22 days

Pull request review comment tensorflow/addons

Add PolynomialCrossing to Addons

[...]

    def build(self, input_shape):
        if not isinstance(input_shape, (tuple, list)) or len(input_shape) != 2:
            raise ValueError(
                "Input shapes must be a tuple or list of size 2, "

Done.

tanzhenyu

comment created time in 22 days

Pull request review comment tensorflow/addons

Add PolynomialCrossing to Addons

# Copyright 2020 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for PolynomialCrossing layer."""

import numpy as np
import tensorflow as tf

from tensorflow_addons.layers.polynomial import PolynomialCrossing
from tensorflow_addons.utils import test_utils


@test_utils.run_all_in_graph_and_eager_modes
class PolynomialCrossingTest(tf.test.TestCase):
    # Do not use layer_test due to multiple inputs.

    def test_full_matrix(self):
        x0 = np.random.random((12, 5))
        x = np.random.random((12, 5))
        layer = PolynomialCrossing(projection_dim=None)
        layer([x0, x])

Done.

tanzhenyu

comment created time in 22 days

push event tanzhenyu/addons

tanzhenyu

commit sha daef7c8f87b3a1b3925f775b3d99636aaf8d659e

Update polynomial_test.py update.

view details

push time in 22 days

push event tanzhenyu/addons

tanzhenyu

commit sha 9d0982bc2f6af2c13f983130cabda49272820e6e

Update polynomial_test.py fix kernel initializer

view details

push time in 22 days

push event tanzhenyu/addons

tanzhenyu

commit sha f9bc09fd80950cb06b8a63fc394955dd1137a799

Update polynomial_test.py adding var init.

view details

push time in 22 days

push event tanzhenyu/addons

tanzhenyu

commit sha 276956f7865a72755497e74f269c7e1730f644b9

Update polynomial_test.py black the test

view details

push time in 22 days

push event tanzhenyu/addons

tanzhenyu

commit sha a0872ac884ea2bcc69af725991befc27d8136fad

Update polynomial_test.py Added a few more tests.

view details

push time in 22 days

pull request comment tensorflow/addons

Add PolynomialCrossing to Addons

The majority of issues seem to stem from formatting. Could you please run `black ./` or `bash tools/run_docker.sh -c 'make code-format'`?

Done.

tanzhenyu

comment created time in 23 days

push event tanzhenyu/addons

tanzhenyu

commit sha 806cadf72c551e9044a800924860630ac2806ad3

more indentation fix More indentitation fix.

view details

push time in 23 days

push event tanzhenyu/addons

tanzhenyu

commit sha 5dbe61c2ea2cd5177b29d4e903d559eaadc8cd2d

Add PolynomialCrossing to Addonds 2 Update commit per Addons format.

view details

push time in 23 days

PR opened tensorflow/addons

Add PolynomialCrossing to Addons

This is from internal design to open-source the DCN network from paper https://arxiv.org/abs/1708.05123

+216 -0

0 comment

5 changed files

pr created time in 23 days

push event tanzhenyu/addons

tanzhenyu

commit sha 18e3edb2b1a020c941b9844073de399e917dc3ec

Add PolynomialCrossing to Addons This is from internal design to open-source the DCN network from paper https://arxiv.org/abs/1708.05123

view details

push time in 23 days

create branch tanzhenyu/addons

branch : crossing

created branch time in 23 days

fork tanzhenyu/addons

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons

fork in 23 days

push event tanzhenyu/community

Zhenyu Tan

commit sha 4df736cc2aa5bc325a2488d2b733ff21024522ac

Finalize design based on review feedbacks.

view details

push time in a month

issue comment tensorflow/tensorflow

Significant prediction slowdown after model.compile()

Thanks for the docstring update, also for the explanation. I'm always interested!

Can confirm that model(x) has the same runtime as predict_on_batch(x), i.e. the v2 path is still slightly slower. It's OK for my use case though, so thanks again.

Another note for users: it's possible to specify model.run_eagerly = False before compiling. With this and the model(x) call, I am getting almost the same performance as in v1, without globally disabling eager execution.

P.S.: Sorry for the many edits of this post.

You can just compile(..., run_eagerly=False) if that's what you need
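For example (a minimal sketch):

model.compile('sgd', loss='mse', run_eagerly=False)  # equivalent to setting model.run_eagerly = False before compiling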

off99555

comment created time in a month

Pull request review comment tensorflow/community

RFC: Keras categorical inputs

# Keras categorical inputs

| Status        | Proposed                                             |
|:--------------|:-----------------------------------------------------|
| **Author(s)** | Zhenyu Tan (tanzheny@google.com), Francois Chollet (fchollet@google.com) |
| **Sponsor**   | Karmel Allison (karmel@google.com), Martin Wicke (wicke@google.com) |
| **Updated**   | 2019-01-03                                           |

## Objective

This document proposes 4 new preprocessing Keras layers (`CategoryLookup`, `CategoryCrossing`, `CategoryEncoding`, `CategoryHashing`), and an extension to an existing op (`tf.sparse.from_dense`) to allow users to:
* Perform feature engineering for categorical inputs
* Replace feature columns and `tf.keras.layers.DenseFeatures` with the proposed layers
* Introduce sparse inputs that work with Keras linear models and other layers that support sparsity

Other proposed layers for replacement of feature columns such as `tf.feature_column.bucketized_column` and `tf.feature_column.numeric_column` have been discussed [here](https://github.com/keras-team/governance/blob/master/rfcs/20190502-preprocessing-layers.md) and are not the focus of this document.

## Motivation

Specifically, by introducing the 4 layers, we aim to address these pain points:
* Users have to define both feature columns and Keras Inputs for the model, resulting in code duplication and deviation from the DRY (Do not repeat yourself) principle. See this [Github issue](https://github.com/tensorflow/tensorflow/issues/27416).
* Users with large-dimension categorical inputs will incur a large memory footprint and computation cost, if wrapped with an indicator column through `tf.keras.layers.DenseFeatures`.
* Currently there is no way to correctly feed a Keras linear model or dense layer with multivalent categorical inputs or weighted categorical inputs.

## User Benefit

We expect to get rid of the user pain points once migrating off feature columns.

## Example Workflows

Two example workflows are presented below. These workflows can be found at this [colab](https://colab.sandbox.google.com/drive/1cEJhSYLcc2MKH7itwcDvue4PfvrLN-OR#scrollTo=22sa0D19kxXY).

### Workflow 1

The first example gives an equivalent code snippet to the canned `LinearEstimator` [tutorial](https://www.tensorflow.org/tutorials/estimator/linear) on the Titanic dataset:

```python
dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
y_train = dftrain.pop('survived')

CATEGORICAL_COLUMNS = ['sex', 'n_siblings_spouses', 'parch', 'class', 'deck', 'embark_town', 'alone']
NUMERICAL_COLUMNS = ['age', 'fare']
# input list to create functional model.
model_inputs = []
# input list to feed linear model.
linear_inputs = []
for feature_name in CATEGORICAL_COLUMNS:
  feature_input = tf.keras.Input(shape=(1,), dtype=tf.string, name=feature_name, sparse=True)
  vocab_list = sorted(dftrain[feature_name].unique())
  # Map string values to indices
  x = tf.keras.layers.CategoryLookup(vocabulary=vocab_list, name=feature_name)(feature_input)
  x = tf.keras.layers.CategoryEncoding(num_categories=len(vocab_list))(x)
  linear_inputs.append(x)
  model_inputs.append(feature_input)

for feature_name in NUMERICAL_COLUMNS:
  feature_input = tf.keras.Input(shape=(1,), name=feature_name)
  linear_inputs.append(feature_input)
  model_inputs.append(feature_input)

linear_model = tf.keras.experimental.LinearModel(units=1)
linear_logits = linear_model(linear_inputs)
model = tf.keras.Model(model_inputs, linear_logits)

model.compile('sgd', loss=tf.keras.losses.BinaryCrossentropy(from_logits=True), metrics=['accuracy'])

dataset = tf.data.Dataset.from_tensor_slices((
  (tf.sparse.from_dense(dftrain.sex, "Unknown"), tf.sparse.from_dense(dftrain.n_siblings_spouses, -1),
  tf.sparse.from_dense(dftrain.parch, -1), tf.sparse.from_dense(dftrain['class'], "Unknown"), tf.sparse.from_dense(dftrain.deck, "Unknown"),
  tf.expand_dims(dftrain.age, axis=1), tf.expand_dims(dftrain.fare, axis=1)),
  y_train)).batch(batch_size).repeat(n_epochs)

model.fit(dataset)
```

### Workflow 2

The second example gives instructions on how to transition from categorical feature columns to the proposed layers. Note one difference for the vocab categorical columns: instead of providing a pair of mutually exclusive `default_value` and `num_oov_buckets` (where `default_value` represents the value to map an out-of-vocab input to, and `num_oov_buckets` represents the value range [len(vocab), len(vocab)+num_oov_buckets) an out-of-vocab input is mapped into via a hashing function), we believe in practice out-of-vocab values should be mapped to the head, i.e., [0, num_oov_tokens), and in-vocab values should be mapped to [num_oov_tokens, num_oov_tokens+len(vocab)).

1. Categorical vocab list column

Original:
```python
fc = tf.feature_column.categorical_feature_column_with_vocabulary_list(
       key, vocabulary_list, dtype, default_value, num_oov_buckets)
```
Proposed:
```python
x = tf.keras.Input(shape=(1,), name=key, dtype=dtype)
layer = tf.keras.layers.CategoryLookup(
            vocabulary=vocabulary_list, num_oov_tokens=num_oov_buckets)
out = layer(x)
```

2. Categorical vocab file column

Original:
```python
fc = tf.feature_column.categorical_column_with_vocab_file(
       key, vocabulary_file, vocabulary_size, dtype,
       default_value, num_oov_buckets)
```
Proposed:
```python
x = tf.keras.Input(shape=(1,), name=key, dtype=dtype)
layer = tf.keras.layers.CategoryLookup(
            vocabulary=vocabulary_file, num_oov_tokens=num_oov_buckets)
out = layer(x)
```
Note: `vocabulary_size` is only valid if `adapt` is called. Otherwise, if the user desires to look up only the first K vocabulary entries in the vocab file, shrink the vocab file to its first K lines.

3. Categorical hash column

Original:
```python
fc = tf.feature_column.categorical_column_with_hash_bucket(
       key, hash_bucket_size, dtype)
```
Proposed:
```python
x = tf.keras.Input(shape=(1,), name=key, dtype=dtype)
layer = tf.keras.layers.CategoryHashing(num_bins=hash_bucket_size)
out = layer(x)
```

4. Categorical identity column

Original:
```python
fc = tf.feature_column.categorical_column_with_identity(
       key, num_buckets, default_value)
```
Proposed:
```python
x = tf.keras.Input(shape=(1,), name=key, dtype=dtype)
layer = tf.keras.layers.Lambda(lambda x: tf.where(tf.logical_or(x < 0, x > num_buckets), tf.fill(dims=tf.shape(x), value=default_value), x))
```

That's why we proposed tf.sparse.from_dense(value, ignore_value). Our philosophy here is tensor type in / tensor type out, i.e., you can explicitly convert to sparse tensors before the input gets here, which is pretty useful in TF Transform, where all preprocessing is done in a memory-efficient way.
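For illustration, a sketch of the ignore_value semantics using only existing ops (today's tf.sparse.from_dense treats zeros as implicit; this helper is hypothetical, not the proposed implementation):

import tensorflow as tf

def from_dense_with_ignore(dense, ignore_value):
  # Keep only the entries that differ from ignore_value; the rest become implicit.
  mask = tf.not_equal(dense, ignore_value)
  return tf.SparseTensor(
      indices=tf.where(mask),
      values=tf.boolean_mask(dense, mask),
      dense_shape=tf.shape(dense, out_type=tf.int64))

st = from_dense_with_ignore(tf.constant([['a', 'Unknown'], ['b', 'c']]), 'Unknown')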

tanzhenyu

comment created time in a month

issue comment tensorflow/tensorflow

Significant prediction slowdown after model.compile()

@tanzhenyu Great. To clarify, is the speedup for TF 2.1+ only, or also 2.0? (if latter, is 2.1 even faster?)

I believe this should be universal across 2.x versions.

off99555

comment created time in a month

issue closed tensorflow/tensorflow

Significant prediction slowdown after model.compile()

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • TensorFlow installed from (source or binary): pip install tensorflow
  • TensorFlow version (use command below): 2.0.0
  • Python version: 3.7
  • CUDA/cuDNN version: CUDA=10.0, cuDNN=7.6.4
  • GPU model and memory: GTX 1060 6GB

Describe the current behavior

The prediction speed is slowed down a lot after the model.compile() call.

Describe the expected behavior

Speed should not be affected. Users call the predict function assuming it will be fast, because we use it all the time in production. It should not surprise users.

Code to reproduce the issue

https://nbviewer.jupyter.org/github/off99555/TensorFlowExperiments/blob/master/test-prediction-speed-after-compile.ipynb?flush_cache=true

closed time in a month

off99555

issue comment tensorflow/tensorflow

Significant prediction slowdown after model.compile()

I have updated the doc and also tested the performance of model(x) in nightly. Closing it for now. Thanks all for reporting and for the collaborative work!

off99555

comment created time in a month

pull request comment tensorflow/community

RFC: Keras categorical inputs

Should be ready in TF 2.2

Do you know when TF 2.2 will be released? We are designing a data transformation process with Keras preprocessing layers or feature_column for SQLFlow and ElasticDL based on TF 2.x.

@goldiegadde will have more accurate info on this. If you have special requirements, feel free to comment, or ask @goldiegadde if you want to participate in the design review.

tanzhenyu

comment created time in a month

issue comment tensorflow/tensorflow

Significant prediction slowdown after model.compile()

@ttbrunner Ah yeah that was the commit, thanks.

OK, so we follow the adapter pattern to convert numpy arrays and dataframes to a dataset first, and have a single path for execution. Apparently the slowdown is mainly two things: 1) the construction of the dataset, and 2) creating the tf.function for predict. (Check TensorLikeDataAdapter under /python/keras/engine/data_adapter.py if you're interested.)

@off99555 @OverLordGoldDragon @ttbrunner So here's what I would recommend going forward:

  1. You can predict the output using the model call, not model predict; i.e., calling model(x) is much faster because there is no "conversion to dataset" step, and it directly calls a cached tf.function (see the sketch after this list). However, be aware that if you have batch norm layers or any other layers that behave differently between training and inference, make sure to call model(x, training=False).

  2. I will update the docstring to recommend the model call and explain that predict is meant for large datasets.

SG?
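A minimal sketch of the two paths (toy model and data; assumes TF 2.x with a compiled tf.keras model):

```python
import numpy as np
import tensorflow as tf

# Toy model standing in for any compiled tf.keras model.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile('sgd', loss='mse')

x = np.random.rand(32, 4).astype(np.float32)

# Fast path for small inputs: direct call, no dataset conversion,
# uses a cached tf.function. training=False matters if the model has
# batch norm / dropout layers.
y_fast = model(x, training=False)

# predict() wraps x in a tf.data.Dataset and traces a predict function;
# prefer it for large inputs that don't fit in a single batch.
y_pred = model.predict(x)
```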

off99555

comment created time in a month

pull request commenttensorflow/community

RFC: Keras categorical inputs

With tf.feature_column, we can use embedding_column to wrap a categorical column and convert its output to a dense tensor. Can the Keras category layers support functionality like embedding_column? Meanwhile, tf.keras.layers.Embedding cannot accept SparseTensor inputs, which may be the output of CategoryLookup. I have created issue tensorflow/tensorflow#33880 about embedding lookup with a SparseTensor.

You can feed a dense tensor input to CategoryLookup, which gives you a dense tensor output, and feed that into tf.keras.layers.Embedding. Maybe we should support sparse input in the embedding layer.
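A minimal sketch of that dense path (CategoryLookup is the layer proposed in this RFC, so the constructor below follows the proposal, not a released API):

```python
import tensorflow as tf

vocab = ["a", "b", "c", "d"]
inp = tf.keras.Input(shape=(1,), dtype=tf.string)
# Hypothetical layer from this RFC: maps strings to dense int ids,
# with id 0 reserved for the single OOV token.
ids = tf.keras.layers.CategoryLookup(vocabulary=vocab, num_oov_tokens=1)(inp)
# Dense ids feed the stock Embedding layer directly; +1 row for the OOV id.
emb = tf.keras.layers.Embedding(input_dim=len(vocab) + 1, output_dim=8)(ids)
model = tf.keras.Model(inp, emb)
```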

Looking forward to an embedding layer with sparse input. And do you have a timeline for implementing the proposed preprocessing Keras layers?

Should be ready in TF 2.2

tanzhenyu

comment created time in a month

issue commenttensorflow/tensorflow

Significant prediction slowdown after model.compile()

@off99555 @OverLordGoldDragon I tried to reproduce this with tf-nightly but it doesn't seem to occur anymore. Put differently, even without compile it is now just as slow, so this is more a case of TensorFlow 1.15 being much faster than TensorFlow 2.x -- can you confirm?
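For reference, a rough timing harness of the kind used for this comparison (toy model; illustrative, not the original benchmark):

```python
import time
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(8,))])
x = np.random.rand(1, 8).astype(np.float32)

start = time.time()
for _ in range(100):
    model.predict(x)  # time the same loop before/after model.compile()
print("predict x100:", time.time() - start, "s")
```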

off99555

comment created time in a month

issue closedtensorflow/tensorflow

tf.train.AdamOptimizer doesn't work with custom TPU training loop

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.15
  • Python version: 3.x
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: n/a
  • GPU model and memory: n/a

Describe the current behavior Run this Colab notebook with a TPU accelerator: https://colab.research.google.com/drive/1bsgSNK3aK9sETlplIPVpAa-yc4q1S3sA

When running the above notebook with tf.train.AdamOptimizer, we get:

ValueError: in converted code:

    <ipython-input-22-807c7cf92c68>:21 simple_model_fn  *
        train_op = tf.train.AdamOptimizer().minimize(y)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py:413 minimize
        name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py:569 apply_gradients
        self._distributed_apply, args=(grads_and_vars, global_step, name))
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_lib.py:1940 merge_call
        return self._merge_call(merge_fn, args, kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_lib.py:1947 _merge_call
        return merge_fn(self._strategy, *args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py:717 _distributed_apply
        non_slot_devices, finish, args=(self, update_ops), group=False)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_lib.py:1577 update_non_slot
        return self._update_non_slot(colocate_with, fn, args, kwargs, group)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/tpu_strategy.py:580 _update_non_slot
        result = fn(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py:713 finish
        return self._finish(update_ops, "update")
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/adam.py:228 _finish
        beta1_power, beta2_power = self._get_beta_accumulators()
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/adam.py:115 _get_beta_accumulators
        return (self._get_non_slot_variable("beta1_power", graph=graph),
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py:868 _get_non_slot_variable
        if hasattr(non_slot, "_distributed_container"):
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/values.py:827 __getattr__
        return super(TPUVariableMixin, self).__getattr__(name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/values.py:389 __getattr__
        return getattr(self.get(), name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/values.py:834 get
        return super(TPUVariableMixin, self).get(device=device)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/values.py:324 get
        return self._device_map.select_for_device(self._values, device)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/values.py:219 select_for_device
        (device, self._devices, device_util.current()))

    ValueError: Device /job:worker/replica:0/task:0/device:CPU:0 not found in ('/job:worker/replica:0/task:0/device:TPU:0', '/job:worker/replica:0/task:0/device:TPU:1', '/job:worker/replica:0/task:0/device:TPU:2', '/job:worker/replica:0/task:0/device:TPU:3', '/job:worker/replica:0/task:0/device:TPU:4', '/job:worker/replica:0/task:0/device:TPU:5', '/job:worker/replica:0/task:0/device:TPU:6', '/job:worker/replica:0/task:0/device:TPU:7') (current device /job:worker/replica:0/task:0/device:CPU:0)

This code runs just fine with tf.train.MomentumOptimizer and tf.keras.optimizers.Adam (run the same code with the optimizer_type form variable set to KerasAdam or Momentum).

Describe the expected behavior The code should run without error using tf.train.AdamOptimizer, just like it does for the other optimizers.

Code to reproduce the issue https://colab.research.google.com/drive/1bsgSNK3aK9sETlplIPVpAa-yc4q1S3sA

closed time in a month

sharvil

issue commenttensorflow/tensorflow

tf.train.AdamOptimizer doesn't work with custom TPU training loop

@sharvil Unfortunately there will be no further 1.x releases (i.e., no 1.16), so please keep using Keras optimizers. If filtering gradients for unconnected variables is desired, which does make sense, please file another issue for it. Closing this for now.
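A minimal sketch of the recommended switch in a custom TPU loop (cluster connection details elided; names follow the 1.15-era API):

```python
import tensorflow as tf

resolver = tf.distribute.cluster_resolver.TPUClusterResolver()  # args elided
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)
with strategy.scope():
    # Use the Keras optimizer instead of tf.train.AdamOptimizer().
    optimizer = tf.keras.optimizers.Adam()
```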

sharvil

comment created time in a month

pull request commentkeras-team/governance

RFC: Keras categorical input.

Why do we need to replicate the RFC? It is hard to follow duplicated threads

Per Francois' request. Please follow the TensorFlow RFC; this is merely a mirror.

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

[review diff context elided — quoted RFC excerpt on the CategoryLookup `max_tokens`/`adapt` arguments]

You're right, the adapt call is more useful in TF Transform than in Keras if you have a large dataset; we still provide adapt in Keras for smaller datasets. The way to pass statistical values is through tft.compute_and_apply_vocabulary, which previously used raw TF operations and is targeted to be replaced by Keras preprocessing layers.
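For context, a minimal sketch of that path in TF Transform today (the feature name is illustrative):

```python
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    # Computes the vocabulary over the full dataset and applies it,
    # mapping out-of-vocab values into one OOV bucket.
    return {
        'feature_ids': tft.compute_and_apply_vocabulary(
            inputs['feature'], num_oov_buckets=1),
    }
```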

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

[review diff context elided — quoted RFC excerpt on the CategoryLookup `max_tokens` argument]

This really depends on what the execution engine is; it's not part of the layer's responsibility but rather the responsibility of ProcessingStage. So in Keras the only choice for now (same as the TextVectorization layer) is a single process, because Keras is not designed for distributed execution. In TFX, which relies on Apache Beam to execute TensorFlow operations, this will run as a distributed process.
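For illustration, a minimal in-process adapt call with the existing TextVectorization layer (the proposed layers' adapt is intended to behave the same way):

```python
import tensorflow as tf

layer = tf.keras.layers.experimental.preprocessing.TextVectorization()
data = tf.constant([["a b"], ["b c"], ["a c"]])
# adapt() builds the vocabulary in a single process, in memory.
layer.adapt(data)
print(layer.get_vocabulary())
```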

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

[review diff context elided — quoted RFC excerpt on CategoryLookup with vocab files]

I think this layer is more complementary to that, i.e., tf.data can parse records and generate the vocab file, or read the vocab file and do other processing while still returning string tensors. This layer takes that output and converts it to indices before it gets to the embedding.
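A minimal sketch of that division of labor (feature values illustrative; CategoryLookup is the hypothetical layer proposed here):

```python
import tensorflow as tf

# tf.data side: parse records and derive a vocabulary, still yielding strings.
ds = tf.data.Dataset.from_tensor_slices(["red", "blue", "red", "green"])
vocab = sorted({s.numpy().decode() for s in ds})  # ['blue', 'green', 'red']

# Model side: the proposed layer converts strings to indices just before
# the embedding.
ids_layer = tf.keras.layers.CategoryLookup(vocabulary=vocab)
```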

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

[review diff context elided — quoted RFC excerpt on the proposed `to_sparse` op]

Makes sense. Let's just extend the tf.sparse.from_dense op.

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

[review diff context elided — quoted RFC excerpt on the proposed `to_sparse` op and its default ignore values]

Yeah good point. I wasn't aware of this op. We should just extend it. Done.

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs


Done.

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs


Done.

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs


Yeah it is confusing. Updated.

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs


Both: an example in each layer description, and a code snippet below.

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs


Yeah that's better. Done.

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

> (quoted review context, abridged) CategoryCrossing docstring: "This layer transforms multiple categorical inputs to categorical outputs by Cartesian product, and hash the output if necessary."

Done.

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

> (quoted review context, abridged) CategoryLookup `vocabulary` arg: "the vocabulary to lookup the input. If it is a file, it represents the source vocab file; if it is a list/tuple, it represents the source vocab list; if it is None, the vocabulary can later be set."

Done.

tanzhenyu

comment created time in 2 months

push eventtanzhenyu/community

Zhenyu Tan

commit sha f86ee916943429ffa5a64886bfb3dcf5a942d0a9

Addressing Jiri's comment.

view details

push time in 2 months

issue commenttensorflow/tensorflow

Saving GRU with dropout to SavedModel fails

See the above link. I don't think LSTM works either.
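
For reference, a minimal repro sketch (assumed from the reports above: recurrent_dropout set, then exporting in the SavedModel format):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(4, recurrent_dropout=0.2, input_shape=(3, 2))
])
# Reported to fail when saving in the SavedModel format.
model.save("/tmp/lstm_savedmodel", save_format="tf")
```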

fhausmann

comment created time in 2 months

issue commenttensorflow/tensorflow

error on saving RNN layer with recurrent_dropout parameter as saved_model

https://github.com/tensorflow/tensorflow/issues/33247

deaputri

comment created time in 2 months


issue commenttensorflow/tensorflow

Unable to save TensorFlow Keras LSTM model to SavedModel format

This does seem like a bug

tmartin293

comment created time in 2 months


pull request commenttensorflow/community

RFC: Keras categorical inputs

With tf.feature_column, we can use embedding_column to wrap a category_column and convert the category_column output to a dense tensor. Can the Keras category layers support the same functionality as embedding_column? Meanwhile, tf.keras.layers.Embedding cannot accept SparseTensor inputs, which may be the output of CategoryLookup. I have created issue tensorflow/tensorflow#33880 about embedding lookup with a SparseTensor.

You can use a dense tensor input to CategoryLookup, which gives you a dense tensor output, and feed that into tf.keras.layers.Embedding.

Maybe we should support sparse inputs in the embedding layer.
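
A minimal sketch of that flow, assuming the CategoryLookup layer proposed in this RFC (name and signature are the proposal, not a released API):

```python
import tensorflow as tf

x = tf.keras.Input(shape=(1,), dtype=tf.string)
# Hypothetical layer from the RFC: maps strings to dense int indices.
ids = tf.keras.layers.CategoryLookup(vocabulary=["a", "b", "c"])(x)
# Dense indices feed directly into the stock Embedding layer.
emb = tf.keras.layers.Embedding(input_dim=4, output_dim=8)(ids)  # 3 vocab + 1 OOV
model = tf.keras.Model(x, emb)
```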

tanzhenyu

comment created time in 2 months

issue commenttensorflow/tensorflow

Adam implementation differs from paper (applies bias B_2 correction to \epsilon)

Before 3.0 we can flag-gate this behavior, though, and warn if the incompatible-with-the-paper flag is used.

@tanzhenyu can you implement the right behavior and gate it behind an optimizer constructor argument / different class?

Sounds good.
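
For reference, a sketch of the two update forms under discussion (toy scalars standing in for Adam's moment slots; not the actual optimizer code):

```python
import tensorflow as tf

lr, beta1, beta2, eps, t = 0.001, 0.9, 0.999, 1e-7, 10.0
m, v = tf.constant(0.1), tf.constant(0.01)  # first/second-moment estimates

# Paper (Algorithm 1): eps is added to the bias-corrected sqrt(v_hat).
m_hat = m / (1 - beta1 ** t)
v_hat = v / (1 - beta2 ** t)
paper_step = lr * m_hat / (tf.sqrt(v_hat) + eps)

# Current TF form folds both corrections into lr, which effectively scales
# eps by sqrt(1 - beta2 ** t) relative to the paper.
lr_t = lr * tf.sqrt(1.0 - beta2 ** t) / (1 - beta1 ** t)
tf_step = lr_t * m / (tf.sqrt(v) + eps)
```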

bmc2-stripe

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

add usage example to vgg16.py

def VGG16(include_top=True,
  ...
  Raises:
      ValueError: in case of invalid argument for `weights`,
          or invalid input shape.

  Usage Example:
  ```python
    >> from tensorflow.keras.applications.vgg16 import VGG16
    >> # Including the top layer (the last dense layer responsible for classification)
    >> vgg16_model = VGG16(input_shape = (224,224,3) , include_top = True)
    >> vgg16_model.summary()
    ```

Please indent correctly

Ron-Rocks

comment created time in 2 months

issue commenttensorflow/tensorflow

Adam implementation differs from paper (applies bias B_2 correction to \epsilon)

Thanks for the report. We have noticed this, and decided we can only make this behavior change in 3.0.

bmc2-stripe

comment created time in 2 months

issue closedtensorflow/tensorflow

optimizer.apply_gradients() logs warnings using Tensor.name which is not supported by eager execution

System information

  • Have I written custom code: Yes
  • OS Platform and Distribution: Ubuntu 18.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version: v2.0.0-rc2-26-g64c3d38 2.0.0
  • Python version: 3.7.4
  • CUDA/cuDNN version: 10.1/7.6.5
  • GPU model and memory: RTX 2070 super 8gb

Describe the current behavior When using a gradient tape in eager mode, if the gradient computation fails and returns None, the apply_gradients() function will attempt to log a warning using Tensor.name, which isn't supported in eager execution. The exact line can be found here. This should not be a breaking issue, because it is simply a logged warning and the code should continue to execute; however, in eager mode it raises an AttributeError due to Tensor.name. A similar issue can be found above on line 1039; however, that one is less serious, as the code would terminate due to the ValueError anyway.

Describe the expected behavior A warning is logged and the code continues to execute.

Code to reproduce the issue There is a workaround for RTX GPUs at the top per the comment in #24828

import tensorflow as tf
import numpy as np

conv1_filters = 32
conv1_window = 3
input_dims = 50
num_classes = 10

random_normal = tf.initializers.RandomNormal()

# Magic fix for RTX GPUs
# gpus = tf.config.experimental.list_physical_devices('GPU')
# for gpu in gpus:
#   tf.config.experimental.set_memory_growth(gpu, True)

weights = {
  'wc1': tf.Variable(random_normal([conv1_window, input_dims, conv1_filters])),
  'out': tf.Variable(random_normal([conv1_filters, num_classes]))
}

biases = {
  # This line is the one that's wrong. Here I forgot to wrap the tf.zeros in a tf.Variable which is how I discovered the issue.
  'bc1': tf.zeros(conv1_filters),
  'out': tf.Variable(tf.zeros(num_classes))
}

def conv1d(x, W, b, stride=1):
  """
  Conv1D wrapper, with bias and relu activation.
  """
  x = tf.nn.conv1d(x, W, stride=stride, padding='SAME')
  x = tf.nn.bias_add(x, b)
  return tf.nn.relu(x)

def model(inputs):
  x = inputs
  x = conv1d(x, weights['wc1'], biases['bc1'])
  x = tf.add(tf.matmul(x, weights['out']), biases['out'])
  return tf.nn.softmax(x)

def cross_entropy(y_pred, y_true):
  y_pred = tf.clip_by_value(y_pred, 1e-9, 1.)
  return -tf.reduce_sum(y_true * tf.math.log(y_pred)) / tf.reduce_sum(y_true)

def train_one_batch(optimizer, minibatch_x, minibatch_y):
  with tf.GradientTape() as g:
    pred = model(minibatch_x)
    loss = cross_entropy(pred, minibatch_y)
  trainable_variables = list(weights.values()) + list(biases.values())
  gradients = g.gradient(loss, trainable_variables)
  optimizer.apply_gradients(zip(gradients, trainable_variables))

batch_size = 2
sequence_len = 4
x = tf.zeros([batch_size, sequence_len, input_dims], dtype=tf.float32)
y = tf.ones([batch_size, sequence_len], dtype=tf.int64)
y_onehot = tf.one_hot(y, depth=num_classes)
optimizer = tf.optimizers.Adam()
train_one_batch(optimizer, x, y_onehot)

Other info / logs

Traceback (most recent call last):
  File "code/tf_testcase.py", line 58, in <module>
    train_one_batch(optimizer, x, y_onehot)
  File "code/tf_testcase.py", line 50, in train_one_batch
    optimizer.apply_gradients(zip(gradients, trainable_variables))
  File "/home/ikhatri/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/optimizer_v2/optimizer_v2.py", line 427, in apply_gradients
    grads_and_vars = _filter_grads(grads_and_vars)
  File "/home/ikhatri/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/optimizer_v2/optimizer_v2.py", line 1029, in _filter_grads
    ([v.name for v in vars_with_empty_grads]))
  File "/home/ikhatri/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/keras/optimizer_v2/optimizer_v2.py", line 1029, in <listcomp>
    ([v.name for v in vars_with_empty_grads]))
  File "/home/ikhatri/miniconda3/envs/tf2/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1090, in name
    "Tensor.name is meaningless when eager execution is enabled.")
AttributeError: Tensor.name is meaningless when eager execution is enabled.

closed time in 2 months

ikhatri

issue commenttensorflow/tensorflow

optimizer.apply_gradients() logs warnings using Tensor.name which is not supported by eager execution

This is not really an optimizer issue when the "variable" is actually a "tensor".
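
To illustrate, reusing the names from the repro above: the fix on the user side is to make 'bc1' a real variable, so that a gradient (and a .name) exists.

```python
biases = {
  'bc1': tf.Variable(tf.zeros(conv1_filters)),  # was a plain Tensor in the repro
  'out': tf.Variable(tf.zeros(num_classes))
}
```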

ikhatri

comment created time in 2 months

issue commenttensorflow/tensorflow

tf.train.AdamOptimizer doesn't work with custom TPU training loop

On the other hand, using GradientTape is the way to go in 2.0. So filtering out variables with None gradient doesn't seem too bad.
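
A sketch of what that filtering looks like in a custom training step today (hypothetical loss_fn/variables; the zip-filter line is the point):

```python
import tensorflow as tf

optimizer = tf.optimizers.Adam()

def train_step(loss_fn, variables):
    with tf.GradientTape() as tape:
        loss = loss_fn()
    grads = tape.gradient(loss, variables)
    # Skip (None, var) pairs instead of erroring out.
    grads_and_vars = [(g, v) for g, v in zip(grads, variables) if g is not None]
    optimizer.apply_gradients(grads_and_vars)
```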

sharvil

comment created time in 2 months

issue commenttensorflow/tensorflow

tf.train.AdamOptimizer doesn't work with custom TPU training loop

@sharvil we provided tf.keras optimizers for use with eager mode, distribution strategy, etc. Those things are bundled through 2.0. We probably don't have plans to fix this for the tf.train optimizers anymore, given there will be no more 1.x major versions released. That said, if you believe the None gradient is truly a hassle, we can consider making it a warning instead of an error. But we might need help with concrete examples / use cases before we make the decision

sharvil

comment created time in 2 months

issue closedtensorflow/tensorflow

tf.feature_column.shared_embeddings supports eager mode

System information

  • TensorFlow version (you are using): TF 2.0.0
  • Are you willing to contribute it (Yes/No): No

Describe the feature and the current behavior/state. In tensorflow-2.0.0, tf.feature_column.shared_embeddings does not support eager mode.

Will this change the current api? How? No.

Who will benefit from this feature? tf.feature_column.shared_embeddings is a common feature column API; we can use it to share embedding parameters across multiple categorical columns.
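
For context, a minimal graph-mode sketch of the API in question (hypothetical feature names):

```python
import tensorflow as tf

vocab = ["a", "b", "c"]
watched = tf.feature_column.categorical_column_with_vocabulary_list("watched", vocab)
impression = tf.feature_column.categorical_column_with_vocabulary_list("impression", vocab)
# One embedding table shared by both categorical columns.
shared = tf.feature_column.shared_embeddings([watched, impression], dimension=8)
```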

Any Other info.

closed time in 2 months

workingloong

issue commenttensorflow/tensorflow

tf.feature_column.shared_embeddings supports eager mode

We don't have plans for supporting additional features with feature columns at this point. The right solution will come with this RFC.

workingloong

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

> (quoted review context, abridged) "We also propose a `to_sparse` op to convert dense tensors to sparse tensors given user specified ignore values. This op can be used in both `tf.data` or TF Transform. In previous feature column world, "" is ignored for dense string input and -1 is ignored for dense int input."

If we don't need the functionality of sparse_output = to_sparse(sparse_input), then from_dense is probably better. This "imagined" functionality is not used anywhere, though. In TFT, I think any tf.io.VarLenFeature should automatically come in as a sparse input; we just need to call SparseTensor.from_dense for any tf.io.FixedLenFeature.
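
A quick sketch of the from_dense direction with today's op (note: the released tf.sparse.from_dense takes no ignore_value; it drops zeros, and the empty string counts as zero for string tensors):

```python
import tensorflow as tf

dense = tf.constant([["A", ""], ["", "C"]])
sp = tf.sparse.from_dense(dense)  # "" is treated as the zero value for strings
print(sp.indices.numpy())         # [[0 0] [1 1]]
print(sp.values.numpy())          # [b'A' b'C']
```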

WDYT?

tanzhenyu

comment created time in 2 months

Pull request review commenttensorflow/community

RFC: Keras categorical inputs

> (quoted review context, abridged) `tf.to_sparse(input, ignore_value)`: "Convert dense/sparse tensor to sparse while dropping user specified values."

To allow users to filter specified values. For example, if the original input is already sparse, with indices = [[0, 0], [1, 0], [1, 1]] and values = ['A', '', 'C'], the user can still filter '' from it.
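A minimal sketch of that filtering semantics using existing TF ops (the proposed `tf.to_sparse` is not a released API; `tf.sparse.retain` stands in here to show the intended behavior):

```python
import tensorflow as tf

# An already-sparse input containing an empty-string entry to drop.
sp = tf.sparse.SparseTensor(indices=[[0, 0], [1, 0], [1, 1]],
                            values=['A', '', 'C'],
                            dense_shape=[2, 2])

# Retain only the entries whose value differs from the ignore value ''.
filtered = tf.sparse.retain(sp, tf.not_equal(sp.values, ''))
# filtered.values -> [b'A', b'C']; the [1, 0] entry is dropped.
```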

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs

[Quoted RFC diff context omitted (duplicate of the RFC text quoted above); the comment below responds to the proposed `CategoryCrossing` layer docstring: "If any input is sparse, then output is sparse, otherwise dense."]

Good question. This is the only layer that can accept multiple inputs; the other APIs accept only a single Tensor/SparseTensor. So with multiple inputs, if any one of them is sparse, the output will be sparse.
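A sketch of that sparsity propagation, assuming the proposed `CategoryCrossing` layer ships with the signature given in this RFC (hypothetical usage, not a released API):

```python
import tensorflow as tf

# One dense and one sparse symbolic input to the same crossing layer.
dense_in = tf.keras.Input(shape=(1,), dtype=tf.int64, name='a')
sparse_in = tf.keras.Input(shape=(1,), dtype=tf.int64, name='b', sparse=True)

cross = tf.keras.layers.CategoryCrossing(num_bins=100)  # proposed layer
out = cross([dense_in, sparse_in])  # any sparse input => sparse output
```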

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs

[Quoted RFC diff context omitted (duplicate); the comment below responds to the `vocabulary` argument of the proposed `CategoryLookup` layer: a vocab file, a vocab list/tuple, or None to be set later.]
  1. The format of the file is the same as a) any other TFX vocab file, or b) this test file: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/feature_column/testdata/warriors_vocabulary.txt

  2. Users coming from the feature-columns world will set it during init, but this layer also allows users to call `adapt` to derive/set the vocabulary from a dataset.
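A sketch of both paths, assuming the proposed `CategoryLookup` layer and its `adapt` method as described in this RFC (hypothetical usage):

```python
import tensorflow as tf

# Path 1: vocabulary fixed at construction time (feature-columns style).
lookup_static = tf.keras.layers.CategoryLookup(vocabulary=['a', 'b', 'c'])

# Path 2: vocabulary derived from data by calling adapt() on a dataset.
lookup_adapted = tf.keras.layers.CategoryLookup(max_tokens=1000)
dataset = tf.data.Dataset.from_tensor_slices([['a'], ['b'], ['b'], ['c']])
lookup_adapted.adapt(dataset)  # builds the vocabulary from the dataset
```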

tanzhenyu

comment created time in 2 months

Pull request review comment tensorflow/community

RFC: Keras categorical inputs

[Quoted RFC diff context omitted (duplicate); the comment below responds to the class definition of the proposed `CategoryLookup` layer.]

Thanks for the reminder!

tanzhenyu

comment created time in 2 months

issue comment tensorflow/tensorflow

tf.train.AdamOptimizer doesn't work with custom TPU training loop

@sharvil @swghosh when you call `get_gradients`, I think it should return error information about which variable is missing a gradient?
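For reference, a minimal way to surface the same information eagerly with `tf.GradientTape` (variable names are illustrative):

```python
import tensorflow as tf

w_used = tf.Variable(2.0, name='w_used')
w_unused = tf.Variable(3.0, name='w_unused')  # never contributes to the loss

with tf.GradientTape() as tape:
  loss = w_used * w_used

grads = tape.gradient(loss, [w_used, w_unused])
missing = [v.name for g, v in zip(grads, [w_used, w_unused]) if g is None]
print(missing)  # ['w_unused:0'] -> the variable with no gradient
```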

sharvil

comment created time in 2 months

push event tanzhenyu/community

Zhenyu Tan

commit sha de3f777fb8342ec17c2bad317af0eac57242c37b

Add some default sections.

view details

push time in 2 months

PR opened keras-team/governance

RFC: Keras categorical input.

Replicated from https://github.com/tensorflow/community/pull/188

+370 -0

0 comment

1 changed file

pr created time in 2 months
