
keras-team/keras-applications 1547

Reference implementations of popular deep learning models.

keras-team/keras-cv 51

Industry-strength computer vision workflows with Keras

tanzhenyu/baselines-tf2 6

openai baselines with tensorflow 2.0

tanzhenyu/spinup-tf2 2

spinup tutorial with tensorflow 2.0

tanzhenyu/addons 0

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons

tanzhenyu/baselines 0

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

tanzhenyu/community 0

Stores documents used by the TensorFlow developer community

tanzhenyu/examples 0

TensorFlow examples

tanzhenyu/governance 0

Governance of the Keras API.

pull request comment tensorflow/tensorflow

Resubmit "Keras grouped convolutions"

Thanks for approving, looks like CI is also happy this time 💚

Great, thanks for fixing!

lgeiger

comment created time in 12 hours

issue comment tensorflow/tensorflow

Keras fails to feed ragged/sparse inputs with correct input placeholder

Does it still fail? Looks like it's fixed in nightly -- even the colab suggests so

shkarupa-alex

comment created time in 13 hours

pull request comment tensorflow/tensorflow

Added ResNext models

Thanks Jake. I realized that 1) this needs API proto generation, which you probably wouldn't be able to do; 2) this needs a basic unit test; 3) there's a bug in the load_weights line of code: you need a comma in front of by_name.

So I'm gonna create a commit instead and acknowledge your PR inside the commit.

jaketae

comment created time in 14 hours

pull request comment tensorflow/tensorflow

Resubmit "Keras grouped convolutions"

I'm not sure why it didn't show up in external CI. But here's what we got:

  File "/tensorflow/python/keras/layers/convolutional_test.py", line 296, in test_conv3d
    self._run_test(kwargs, expected_output_shape)
  File "/tensorflow/python/keras/layers/convolutional_test.py", line 265, in _run_test
    expected_output_shape=expected_output_shape)
  File "/tensorflow/python/framework/test_util.py", line 1717, in decorated
    result = f(self, *args, **kwargs)
  File "/tensorflow/python/keras/testing_utils.py", line 227, in layer_test
    model.train_on_batch(input_data, actual_output)
  File "/tensorflow/python/keras/engine/training.py", line 1476, in train_on_batch
    logs = train_function(iterator)
  File "/tensorflow/python/eager/def_function.py", line 766, in __call__
    result = self._call(*args, **kwds)
  File "/tensorflow/python/eager/def_function.py", line 826, in _call
    return self._stateless_fn(*args, **kwds)
  File "/tensorflow/python/eager/function.py", line 2812, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/tensorflow/python/eager/function.py", line 1838, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/tensorflow/python/eager/function.py", line 1915, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/tensorflow/python/eager/function.py", line 549, in call
    ctx=ctx)
  File "/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.UnimplementedError:  Hit a case for convolution that is not implemented on GPU.
	 [[{{node cluster_131_1/xla_compile}}]] [Op:__inference_train_function_7135]
lgeiger

comment created time in 4 days

pull request comment tensorflow/addons

Incorporate low-rank techniques into DCN.

LGTM thanks! Apologies for the delay in review.

Thanks for the review!

tanzhenyu

comment created time in 5 days

pull request comment tensorflow/addons

Incorporate low-rank techniques into DCN.

Thanks for the PR! Mostly LGTM, one change requested in the testing and then a few questions.

Gentle ping :-)

tanzhenyu

comment created time in 5 days

Pull request review comment tensorflow/addons

Incorporate low-rank techniques into DCN.

 def test_full_matrix():
     np.testing.assert_allclose([[0.55, 0.8, 1.05]], output)
 
+@pytest.mark.usefixtures("maybe_run_functions_eagerly")
+def test_low_rank_matrix():
+    x0 = np.asarray([[0.1, 0.2, 0.3]]).astype(np.float32)
+    x = np.asarray([[0.4, 0.5, 0.6]]).astype(np.float32)
+    layer = PolynomialCrossing(projection_dim=1, kernel_initializer="ones")
+    output = layer([x0, x])
+    np.testing.assert_allclose([[0.55, 0.8, 1.05]], output)
+
+
 @pytest.mark.usefixtures("maybe_run_functions_eagerly")
 def test_invalid_proj_dim():
     with pytest.raises(ValueError) as exception_info:
         x0 = np.random.random((12, 5))
         x = np.random.random((12, 5))
         layer = PolynomialCrossing(projection_dim=6)
         layer([x0, x])
-    assert "is not supported yet" in str(exception_info.value)
+    assert "should be smaller than last_dim / 2" in str(exception_info.value)

Good point. Done

tanzhenyu

comment created time in 8 days

Pull request review comment tensorflow/addons

Incorporate low-rank techniques into DCN.

 def build(self, input_shape):
             )
         last_dim = input_shape[-1][-1]
         if self.projection_dim is None:
-            kernel_shape = [last_dim, last_dim]
+            self.kernel = self.add_weight(
+                "kernel",
+                shape=[last_dim, last_dim],
+                initializer=self.kernel_initializer,
+                regularizer=self.kernel_regularizer,
+                dtype=self.dtype,
+                trainable=True,
+            )
         else:
-            if self.projection_dim != last_dim:
+            if self.projection_dim < 0 or self.projection_dim > last_dim / 2:

so when last_dim=6, projection_dim <= 3; when last_dim=7, projection_dim <= 3; when last_dim=8, projection_dim <= 4. I think this is intended.
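
For reference, here's a tiny standalone sketch of how that boundary plays out (a hypothetical helper, just mirroring the check in this diff):

def check_projection_dim(projection_dim, last_dim):
    if projection_dim < 0 or projection_dim > last_dim / 2:
        raise ValueError(
            "`projection_dim` should be smaller than last_dim / 2, got "
            "projection_dim={}, last_dim={}".format(projection_dim, last_dim))

check_projection_dim(3, 6)  # ok: 3 <= 3.0
check_projection_dim(3, 7)  # ok: 3 <= 3.5
check_projection_dim(4, 8)  # ok: 4 <= 4.0
# check_projection_dim(4, 7) would raise, since 4 > 3.5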

tanzhenyu

comment created time in 8 days

Pull request review comment tensorflow/addons

Incorporate low-rank techniques into DCN.

 def build(self, input_shape):
             )
         last_dim = input_shape[-1][-1]
         if self.projection_dim is None:
-            kernel_shape = [last_dim, last_dim]
+            self.kernel = self.add_weight(
+                "kernel",
+                shape=[last_dim, last_dim],
+                initializer=self.kernel_initializer,
+                regularizer=self.kernel_regularizer,
+                dtype=self.dtype,
+                trainable=True,
+            )
         else:
-            if self.projection_dim != last_dim:
+            if self.projection_dim < 0 or self.projection_dim > last_dim / 2:
                 raise ValueError(
-                    "The case where `projection_dim` != last "
-                    "dimension of the inputs is not supported yet, got "
-                    "`projection_dim` {}, and last dimension of input "
-                    "{}".format(self.projection_dim, last_dim)
+                    "`projection_dim` should be smaller than last_dim / 2 to improve"

No strong preference on either case. Though I'd rather be restrictive in the first case and then relax the constraints if it's proven to be effective in modeling.

tanzhenyu

comment created time in 8 days

push event tanzhenyu/addons

Zhenyu Tan

commit sha 4eae11473c1d20fa5f82c8af747852d57293959b

addressing comments.

view details

push time in 8 days

Pull request review comment tensorflow/addons

Incorporate low-rank techniques into DCN.

 class PolynomialCrossing(tf.keras.layers.Layer):
         ```
 
     Arguments:
-        projection_dim: project dimension. Default is `None` such that a full
-          (`input_dim` by `input_dim`) matrix is used.
+        projection_dim: project dimension to reduce the computational cost.
+          Default is `None` such that a full (`input_dim` by `input_dim`)
+          matrix W is used. If enabled, a low-rank matrix W = U*V will be used,
+          where U is of size `input_dim` by `projection_dim` and V is of size
+          `projection_dim` by `input_dim`. `projection_dim` need to be smaller
+          than `input_dim`/2 to improve the model efficiency.

Not sure if I follow: the projection_dim semantics and default value are not changed; we merely add the capacity for low-rank decomposition?

tanzhenyu

comment created time in 8 days

issue comment tensorflow/addons

Inconsistent APIs within tfa.image and between tf.image and tfa.image

As of today, NHWC is the only format the CPU supports. On CUDA, it will mostly convert to NCHW first and call an efficient cuDNN kernel, except when it's float16 on Volta (V100) or newer; and this is all done by default.

So yeah I think we should be consistent with tf.image

WindQAQ

comment created time in 13 days

issue comment tensorflow/addons

Inconsistent APIs within tfa.image and between tf.image and tfa.image

I believe we could drop it. The original channels_first support exists mostly from when CUDA didn't handle it very well, so channels_first ops were placed on the CPU (IIUC).

WindQAQ

comment created time in 13 days

pull request comment tensorflow/tensorflow

Support Keras grouped convolutions

This seems to be reverted in dd2ea87.

It broke the GPU unit tests which were added in this PR.

Somewhat obscure error message:

Traceback (most recent call last):
  File "/third_party/tensorflow/python/keras/layers/convolutional_test.py", line 393, in test_group_conv
    self.assertAllClose(layer(inputs), expected_outputs, rtol=1e-5)
  File "/third_party/tensorflow/python/framework/test_util.py", line 1217, in decorated
    return f(*args, **kwds)
  File "/third_party/tensorflow/python/framework/test_util.py", line 2598, in assertAllClose
    self._assertAllCloseRecursive(a, b, rtol=rtol, atol=atol, msg=msg)
  File "/build/work/01080b4d142eb76348ad5b7462050a4cb61f/google3/runfiles/google3/third_party/tensorflow/python/framework/test_util.py", line 2558, in _assertAllCloseRecursive
    (path_str, path_str, msg)))
  File "/third_party/tensorflow/python/framework/test_util.py", line 2451, in _assertArrayLikeAllClose
    a = self._GetNdArray(a)
  File "/build/work/01080b4d142eb76348ad5b7462050a4cb61f/google3/runfiles/google3/third_party/tensorflow/python/framework/test_util.py", line 2445, in _GetNdArray
    a = self.evaluate(a)
  File "/third_party/tensorflow/python/framework/test_util.py", line 2151, in evaluate
    return sess.run(tensors)
  File "/build/work/01080b4d142eb76348ad5b7462050a4cb61f/google3/runfiles/google3/third_party/tensorflow/python/framework/test_util.py", line 1683, in run
    return super(ErrorLoggingSession, self).run(*args, **kwargs)
  File "/third_party/tensorflow/python/client/session.py", line 958, in run
    run_metadata_ptr)
  File "/third_party/tensorflow/python/client/session.py", line 1181, in _run
    feed_dict_tensor, options, run_metadata)
  File "/third_party/tensorflow/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/third_party/tensorflow/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
third_party.tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
  (0) Failed precondition: Read of uninitialized variable
	 [[node conv1d/BiasAdd/ReadVariableOp (defined at /third_party/tensorflow/python/keras/layers/convolutional_test.py:393) ]]
	 [[xla_compile]]
	 [[xla_run/_1]]
  (1) Failed precondition: Read of uninitialized variable
	 [[node conv1d/BiasAdd/ReadVariableOp (defined at /third_party/tensorflow/python/keras/layers/convolutional_test.py:393) ]]
	 [[xla_compile]]

lgeiger

comment created time in 15 days

pull request comment tensorflow/addons

Incorporate low-rank techniques into DCN.

@bhack it doesn't seem this PR has a reviewer assigned. Who can help approve this? (Since I'm the owner of this file.)

tanzhenyu

comment created time in 16 days

issue comment tensorflow/addons

Formalize process for code merging into tf.image, tfa.image, and keras-preprocessing

  1. (I think) TFA is the default place to start experimenting with ops/layers. When new algorithms come up and many people would like to use them / contribute them to the TF ecosystem, we'd always recommend TFA as a starting point. Whether TFA decides to accept this or not is completely up to the community leads, i.e., mainly you (and the owners of tfa.image in this case). When something becomes widely used, we need sponsorship from inside TF to a) migrate it to core, b) maintain it for the long term.
  2. In the particular case of the ImageProjectiveTransformV2 op, the preprocessing layers (which will live in core) rely on it, and I decided I would sponsor it in core, hence the commit (which also mentioned that we need to deprecate the TFA version). What could have been done better: I should probably have raised an issue right after it was committed (it slipped my mind, apparently). But a sync would require someone who is very up-to-date with both core and TFA, and AFAIK there isn't a formal process. What I would propose is: let's ask TF for some sponsorship in each area, i.e., have someone from TF attend your monthly meetings on image-related, loss/metrics-related, text-related, optimizer-related topics, etc. -- that can be a single person, or several of them. Let's discuss this.

We are trying to improve the process at: tensorflow/community#241 (upstreaming) tensorflow/community#242 (downstreaming)

I think we could solve 1. and 2. if we find a common vision in these templates.

On the TF side I think you just need to add, to the internal and public PR/review triage checklist an extra check to verify if the code is already in SIGs.

The main issue with 3 is that if we don't have a minimal short-range roadmap of the ops planned for keras preprocessing by the TF team's development activities and the GSoC student activities, we are at risk of getting PRs here in Addons from contributors that conflict with, duplicate, or overlap that work.

So I think nobody here wants to waste free contributors' time, and so the best solution is to have a minimal overview of the plan in this area.

E.g. if we know that the TF team is not going to add new image operators, and we have a public list of ops in the GSoC roadmap, it would be clear to potential contributors what kind of image op PRs we are looking for here in Addons.

We don't have any short-term plans to add ops. For keras preprocessing, they are described here and already implemented. What AutoAugment requires (such as solarize, sample pairing) will not be moved into core. Francois and I are discussing the roadmap for keras_image, which might include some of these.

seanpmorgan

comment created time in 16 days

issue comment tensorflow/addons

Formalize process for code merging into tf.image, tfa.image, and keras-preprocessing

Currently there is no communication or criteria for what is being merged into the different repositories. This creates duplicated code within the ecosystem and is wasting contributor time/effort.

As an example, we've tried to extend ImageProjectiveTransformV2 without realizing that TF-core had already implemented the same functionality: tensorflow/tensorflow@2699281

@tanzhenyu Could you give us some clarification on what the internal process is for adding components to tf.image? Can we publish a roadmap of features that are planned to be added and can we make verifying against Addons part of the process for adding new things going forward?

@dynamicwebpaige Could you help clarify what's being contributed to Keras in this GSoC project: https://summerofcode.withgoogle.com/projects/#4863446367076352

The auto-augment issue caused us to receive a lot of PRs implementing functionality into Addons which looks as though it may now be going directly to Keras.

CC @karmel for visibility and to see if you have any ideas for how we can formalize this process.

Related issues (To be expanded upon because there are many): #1779 #1126 #1275

Sorry @seanpmorgan @bhack, apparently I missed this comment!

Let me try to answer your question from several perspectives:

  1. (I think) TFA is the default place to start experimenting with ops/layers. When new algorithms come up and many people would like to use them / contribute them to the TF ecosystem, we'd always recommend TFA as a starting point. Whether TFA decides to accept this or not is completely up to the community leads, i.e., mainly you (and the owners of tfa.image in this case). When something becomes widely used, we need sponsorship from inside TF to a) migrate it to core, b) maintain it for the long term.
  2. In the particular case of the ImageProjectiveTransformV2 op, the preprocessing layers (which will live in core) rely on it, and I decided I would sponsor it in core, hence the commit (which also mentioned that we need to deprecate the TFA version). What could have been done better: I should probably have raised an issue right after it was committed (it slipped my mind, apparently). But a sync would require someone who is very up-to-date with both core and TFA, and AFAIK there isn't a formal process. What I would propose is: let's ask TF for some sponsorship in each area, i.e., have someone from TF attend your monthly meetings on image-related, loss/metrics-related, text-related, optimizer-related topics, etc. -- that can be a single person, or several of them. Let's discuss this.
  3. Because of 2), things need to be discussed case-by-case. Basic image ops like translate and rotate should exist in core. For auto-augment and rand-augment, if I'm not wrong (correct me if I am) these seem to be mostly used in EfficientNet, and I'd like to see them live in TFA (at least for a while) -- they are preprocessing for sure, but they won't show up in tf.keras.layers.preprocessing for a while unless they're proven to be a general technique for improving accuracy.

Cheers,

seanpmorgan

comment created time in 17 days

issue comment tensorflow/tensorflow

tf.keras.layers.experimental.preprocessing.RandomXXX API is missing "nearest" as an option for fill_mode.

@tanzhenyu

https://keras.io/api/preprocessing/image/#imagedatagenerator-class

tf.keras.preprocessing.image.ImageDataGenerator( ... channel_shift_range=0.0, fill_mode="nearest", cval=0.0, ... )

Makes sense. We could support this mode, but it wouldn't be the default, and I probably wouldn't be able to get it in before 2.3. If you need this, contributions are welcome!

kechan

comment created time in 18 days

pull request comment tensorflow/addons

Incorporate low-rank techniques into DCN.

 #11 [black-test 5/6] RUN black --check /addons
#11 2.708 would reformat /addons/tensorflow_addons/layers/polynomial.py
#11 3.128 would reformat /addons/tensorflow_addons/layers/tests/polynomial_test.py
#11 8.070 Oh no! 💥 💔 💥

Are you running the hook? It requires docker https://github.com/tensorflow/addons/blob/master/tools/pre-commit.sh

docker is a challenge :-) Thanks!

tanzhenyu

comment created time in 18 days

push event tanzhenyu/addons

Zhenyu Tan

commit sha 6d7e4e2a323b72176b07792c66aa53619c7bdfb5

Add final fix from black.

view details

push time in 18 days

issue comment tensorflow/tensorflow

tf.keras.layers.experimental.preprocessing.RandomXXX API is missing "nearest" as an option for fill_mode.

@tanzhenyu Not sure what you mean by "masked out" exactly. But minor masking out is sometimes exactly what data augmentation intends to do, e.g. cropping out, or randomly obscuring part of an object. Also, optimal image augmentation is domain specific, so there's no need to say which works better or worse. FYI: I did try "reflect" for my task, and I obtained slightly worse accuracy (although I can't robustly attribute it to this choice).

This feature has, after all, been in Keras for over 2 years, and it is the default. If the broader TF community thinks it ought not to exist, I am fine with that. I will just have to implement my own...

What do you mean by default for Keras?

kechan

comment created time in 18 days

push event tanzhenyu/addons

Zhenyu Tan

commit sha 4b8782826aa3e3002c7b8afd531292a9805d6555

Another fix.

view details

push time in 19 days

pull request comment tensorflow/addons

Incorporate low-rank techniques into DCN.

Do you have the hooks at https://github.com/tensorflow/addons/blob/master/CONTRIBUTING.md#coding-style?

Hmm...doesn't give me any changes?

tanzhenyu

comment created time in 19 days

issue comment tensorflow/tensorflow

tf.keras.layers.experimental.preprocessing.RandomXXX API is missing "nearest" as an option for fill_mode.

@tanzhenyu I have used the old Keras data augmentation API for a while now, and I am pretty sure I don't want interpolation set to "nearest". I really want fill_mode to be "nearest". The effect is that, for example, if you right-shift the image, the "void" is filled by copying the edge pixels of the photo over and over until that void is filled up.

Keras actually used this as the default. I would guess that most people find it a reasonable choice for most image tasks. So leaving this out of the new API is not ideal.

In that case you may end up with the main content of your image being masked out. reflect would honestly work better in this case.

kechan

comment created time in 19 days

pull request comment tensorflow/addons

Incorporate low-rank techniques into DCN.

@seanpmorgan Not sure what's failing here?

tanzhenyu

comment created time in 19 days

push event tanzhenyu/addons

Zhenyu Tan

commit sha 757d7203368b474635ad5aedc14b78fef71447f3

Final fix.

view details

push time in 19 days

push event tanzhenyu/addons

Zhenyu Tan

commit sha 72fcf6d122402ed1c66aa8d9f81fe52b1008bac6

Fix.

view details

push time in 19 days

push event tanzhenyu/addons

Zhenyu Tan

commit sha d36d778c00f1afbe884b998aff9a5faac4043448

Fix by black ./.

view details

push time in 19 days

PR opened tensorflow/addons

Incorporate low-rank techniques into DCN.

It supports a low-rank kernel W = U * V, where U \in R^{last_dim x projection_dim} and V \in R^{projection_dim x last_dim}; it also introduces a flag diag_scale that increases the diagonal of kernel W by diag_scale.
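
For intuition, a minimal numpy sketch (not the actual layer code) of the decomposition and its parameter savings:

import numpy as np

last_dim, projection_dim, diag_scale = 8, 2, 0.1
U = np.random.randn(last_dim, projection_dim)  # last_dim x projection_dim
V = np.random.randn(projection_dim, last_dim)  # projection_dim x last_dim
W = U @ V + diag_scale * np.eye(last_dim)      # low-rank kernel with boosted diagonal

print(W.shape)                        # (8, 8)
print(last_dim * last_dim)            # 64 parameters for the full kernel
print(2 * last_dim * projection_dim)  # 32 parameters for the low-rank factors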

+66 -21

0 comment

2 changed files

pr created time in 19 days

push event tanzhenyu/addons

Zhenyu Tan

commit sha a4e5c39bcda375eb5a64392b65f53c3fde1e85b6

Incorporate low-rank techniques into DCN. It supports a low-rank kernel W = U * V, where U \in R^{last_dim x projection_dim} and V \in R^{projection_dim x last_dim}; Introduces a flag diag_scale that increases the diagonal of kernel W by diag_scale.

view details

push time in 19 days

fork tanzhenyu/addons

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons

fork in 19 days

issue comment tensorflow/tensorflow

tf.keras.layers.experimental.preprocessing.RandomXXX API is missing "nearest" as an option for fill_mode.

To be more specific:

-- we have fill_mode, which specifies, when the output coordinate maps to an out-of-range input coordinate, how we compute an in-range input coordinate; so reflect, wrap, and constant are valid modes.
-- we have interpolation, which specifies what to do when the computed equivalent input coordinate is a float, not an int. In this case, nearest and bilinear are valid modes.
-- a potential extra mode is spline interpolation, which is what scipy supports today.
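
As a small illustration of that distinction (using scipy.ndimage.shift purely as a stand-in: its `mode` plays the role of fill_mode, and its spline `order` the role of interpolation, with 0 = nearest and 1 = bilinear):

import numpy as np
from scipy import ndimage

image = np.arange(16, dtype=np.float32).reshape(4, 4)
# Shift right by 1.5 pixels: the vacated left edge exercises fill_mode,
# and the half-pixel offset exercises interpolation.
print(ndimage.shift(image, shift=(0, 1.5), order=1, mode="reflect"))   # bilinear + reflect
print(ndimage.shift(image, shift=(0, 1.5), order=0, mode="constant"))  # nearest + constant fill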

kechan

comment created time in 19 days

issue comment tensorflow/addons

Deprecate ImageProjectiveTransformV2

@bhack this issue is for what needs to be done to deprecate ImageProjectiveTransformV2.

Please create a new issue to discuss how we can clarify the tf.image, tfa.image, and Keras image preprocessing relationship. Alternatively, add it to the meeting agenda and let us know who you want us to try to get to the meeting for an in-person discussion.

Yeah I believe those are separate issues. For ImageProjectiveTransformV2 we could either 1) completely remove it from tfa.image, or 2) keep a reference to core.

WindQAQ

comment created time in 22 days

issue comment tensorflow/tensorflow

A high learning rate may cause a nan or an inf loss with tf.keras.optimizers.SGD

@gdhy9064 A high learning rate is usually the root cause of many NaN problems. You can try a lower value, or an optimizer with an adaptive learning rate such as Adam.
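
A minimal sketch of both suggestions (the learning-rate values are purely illustrative):

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
# Either lower the SGD learning rate...
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3), loss="mse")
# ...or switch to an adaptive optimizer such as Adam.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")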

gdhy9064

comment created time in a month

Pull request review comment tensorflow/tensorflow

Add Usage Example to Keras Nadam Optimizer

 class Nadam(optimizer_v2.OptimizerV2):
 
   References
     See [Dozat, T., 2015](http://cs229.stanford.edu/proj2015/054_report.pdf).
+
+  Usage Example:

I would really like to see examples like this: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/optimizer_v2/adam.py#L73-L79
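
For context, the linked Adam docstring example is along these lines; a comparable Nadam snippet (a sketch, not the final docstring text) would be:

import tensorflow as tf

opt = tf.keras.optimizers.Nadam(learning_rate=0.2)
var1 = tf.Variable(10.0)
loss = lambda: (var1 ** 2) / 2.0        # d(loss)/d(var1) == var1
step_count = opt.minimize(loss, [var1]).numpy()
print("{:.1f}".format(var1.numpy()))    # var1 has moved towards the minimum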

jedlimlx

comment created time in 2 months

issue comment tensorflow/tensorflow

tf.keras GradientTape: get gradient with respect to input

Can you get the gradient of the input by calling:

tape.gradient(loss, input)

instead of:

tape.gradient(model.input)

?
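
i.e., something along these lines (a minimal sketch; the model and input here are placeholders):

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
x = tf.random.normal((4, 3))
with tf.GradientTape() as tape:
    tape.watch(x)  # inputs are not variables, so watch them explicitly
    loss = tf.reduce_mean(tf.square(model(x)))
grad_wrt_input = tape.gradient(loss, x)  # gradient of the loss w.r.t. the input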

dmus

comment created time in 2 months

issue comment tensorflow/tensorflow

Shape information is lost with DepthwiseConv2D

Looks like the issue comes from backend.depthwise_conv2d, using tf.transpose(x, (0, 3, 1, 2))

jnd77

comment created time in 2 months


issue closed tensorflow/tensorflow

TF 2.0 - Gradient of 'tf.keras.layers.Dense with bias' produces non-deterministic result

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): pip install tf-nightly-gpu-2.0-preview==2.0.0.dev20190826
  • TensorFlow version (use command below): v1.12.1-9705-g0fbc138 2.0.0-dev20190826
  • Python version: 3.6.9
  • CUDA/cuDNN version: 10.0.0/7.3.1
  • GPU model and memory: Titan Xp 11Gb

Describe the current behavior (1) The following code produces the same 'numpy_data0.pkl', 'initial_params0.pkl', and 'loss0.pkl' every time (which means same data, same parameters, same loss), but 'grad0.pkl' changes. I checked this with the 'diff' command on the generated files. (2) This seems to happen only with the TensorFlow 2.0 GPU version. I checked the code with tf-nightly-2.0-preview==2.0.0.dev20190830 (CPU version) and it was ok (= shows a deterministic result). (3) Using a custom dense layer + tf.keras.layers.ReLU() was also ok (= shows a deterministic result). The custom dense layer was

class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, num_outputs):
        super(MyDenseLayer, self).__init__()
        self.num_outputs = num_outputs
    def build(self, input_shape):
        self.kernel = self.add_variable("kernel", initializer=tf.keras.initializers.GlorotUniform(),
                                        shape=[int(input_shape[-1]),
                                               self.num_outputs])
        self.bias = self.add_variable("bias", initializer=tf.zeros_initializer,
                                        shape=[self.num_outputs])
    def call(self, input):
        return tf.matmul(input, self.kernel) + self.bias

And net with

net = tf.keras.Sequential()
net.add(MyDenseLayer(100))
net.add(tf.keras.layers.ReLU())
net.add(MyDenseLayer(100))
net.add(tf.keras.layers.ReLU())
net.add(MyDenseLayer(1))
net.build((None, input_dim))

(+) When the 'use_bias=False' option was applied on the hidden layers, it was ok. (= shows a deterministic result)

Describe the expected behavior Since cuDNN is forced to behave deterministically (os.environ['TF_CUDNN_DETERMINISTIC'] = 'true'), and all the data/parameters/losses are the same, the grad is expected to be the same.

Code to reproduce the issue

import os
import pickle
import random
import numpy as np
import tensorflow as tf

os.environ['TF_CUDNN_DETERMINISTIC'] = 'true'

seed = 1234
np.random.seed(seed)
tf.random.set_seed(seed)
random.seed(seed)

# NN Model
input_dim = 5
net = tf.keras.Sequential()
net.add(tf.keras.layers.Dense(100, activation=tf.nn.relu, kernel_initializer=None))
net.add(tf.keras.layers.Dense(100, activation=tf.nn.relu, kernel_initializer=None))
net.add(tf.keras.layers.Dense(1, activation=None, kernel_initializer=None))
net.build((None, input_dim))

# Initial v_params
initial_v_params = net.variables

# Update NN Model one-step
x = np.random.normal(loc=0, scale=1., size=[1000, input_dim])
y = np.random.normal(loc=0, scale=1., size=[1000])

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(y - net(x)))
grad = tape.gradient(loss, net.trainable_variables)

# Tag for comparing files
tag = 1

with open('./numpy_data{}.pkl'.format(tag), 'wb') as f:
    pickle.dump([x, y], f)

with open('./initial_params{}.pkl'.format(tag), 'wb') as f:
    pickle.dump(initial_v_params, f)

with open('./loss{}.pkl'.format(tag), 'wb') as f:
    pickle.dump(loss, f)

with open('./grad{}.pkl'.format(tag), 'wb') as f:
    pickle.dump(grad, f)

closed time in 2 months

movinghoon

issue comment tensorflow/tensorflow

TF 2.0 - Gradient of 'tf.keras.layers.Dense with bias' produces non-deterministic result

Closing this based on above comments. Thanks all!

movinghoon

comment created time in 2 months

issue comment tensorflow/tensorflow

TimeseriesGenerator for labeled time-series such as sensor data

Thanks for the write-up. I'm not sure I understand the use case here -- this generator is mostly for RNN models, where we do need the target to be the label value at the last time step. Are you using this for other things?
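
For reference, the intended usage is roughly (a minimal sketch):

import numpy as np
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

data = np.arange(10, dtype=np.float32).reshape(10, 1)
gen = TimeseriesGenerator(data, targets=data, length=3, batch_size=1)
x, y = gen[0]
print(x)  # the window [[0.], [1.], [2.]]
print(y)  # [[3.]] -- the target is the value at the step after the window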

mmalekzadeh

comment created time in 2 months

pull request comment keras-team/autokeras

Use preprocessing layers for categorical encoding

https://github.com/keras-team/autokeras/pull/1090/files#diff-b2a4808b666256ed34518e4234d24721R37

Have you tested this in tf-nightly? I remember this being an issue, but it was resolved recently. If not, please file an issue!

haifeng-jin

comment created time in 2 months

issue closed tensorflow/tensorflow

sigmoid is ignored when calculating loss by calling method model.fit

<em>Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template</em>

System information

  • Windows

Version: ('v2.0.0-rc2-26-g64c3d382ca', '2.0.0')

Describe the current behavior I got an incorrect loss from the history returned by calling model.fit. You can see the correct and incorrect results by changing the parameter "error" in my code.

Code to reproduce the issue

import math
import numpy as np
import tensorflow as tf

error = True
n_features = 100
batch = 2

"""
model
"""
x = tf.keras.Input(shape=(n_features,), dtype=tf.float32)
w = tf.Variable([1.0] * n_features)
b = tf.Variable(1.0)
z = tf.reduce_sum(w * x, axis=1, keepdims=True) + b

"""
loss is incorrect if error is true
"""
if error:
    y_ = tf.sigmoid(z)
else:
    y_ = 1.0 / (1.0 + math.e ** (-z))

m = tf.keras.Model(inputs=x, outputs=y_)

"""
loss
"""
optimizer=tf.keras.optimizers.SGD(learning_rate=0.001)
loss = tf.keras.losses.BinaryCrossentropy()
m.compile(optimizer = optimizer, loss = loss)

"""
train dataset
"""
x = np.array([[1.0 for i in range(n_features)]] * batch, dtype=np.float32)
y = np.array([0.0] * batch, dtype=np.float32)

"""
get correct loss
"""
logits = m(x)
l = loss(y, logits)

"""
get incorrect loss
"""
history = m.fit(x, y)

"""
history.history['loss'] != l.numpy()
"""
print(history.history)
print(l.numpy())

closed time in 2 months

tornadoyi

issue comment tensorflow/tensorflow

sigmoid is ignored when calculating loss by calling method model.fit

Closing it for now. Let us know if the previous comments didn't help you :-)

tornadoyi

comment created time in 2 months

issue comment tensorflow/tensorflow

Shape information is lost with DepthwiseConv2D

Thanks for the report! We wouldn't be able to make 1.x changes at this moment (or back-port them). This works fine in tf-nightly; please check the code below:

import numpy as np
import tensorflow as tf

# Try first with channels_last
input_tensor = np.random.random((32, 12, 12, 32)).astype(np.float32)
input_tensor_shape = (None, None, None, 32)
depthwise_conv = tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3), data_format='channels_last',
                                                  dilation_rate=(2, 2), use_bias=False, input_shape=input_tensor_shape[1:])
print(depthwise_conv(input_tensor).shape)

# Then with channels_first
input_tensor = np.random.random((32, 32, 12, 12)).astype(np.float32)
input_tensor_shape = (None, 32, None, None)
depthwise_conv = tf.keras.layers.DepthwiseConv2D(kernel_size=(3, 3), data_format='channels_first',
                                                  dilation_rate=(2, 2), use_bias=False, input_shape=input_tensor_shape[1:])
print(depthwise_conv(input_tensor).shape)
(32, 8, 8, 32)
(32, 32, 8, 8)
jnd77

comment created time in 2 months

issue closed tensorflow/tensorflow

Shape information is lost with DepthwiseConv2D

<em>Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template</em>

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 1.15
  • CUDA/cuDNN version: 10.2
  • GPU model and memory: RTX 2060

Describe the current behavior

In some cases (use_bias=False, dilation_rate > 1, data_format='channels_first'), the shape information is lost after DepthwiseConv2D.

Describe the expected behavior

It should behave the same way for channels_last and channels_first. When running the code below, channels_last prints shape=(?, ?, ?, 32) and channels_first prints shape=(?, ?, ?, ?). It should be shape=(?, 32, ?, ?)

Standalone code to reproduce the issue

Here for the gist

Other info / logs Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

closed time in 2 months

jnd77

issue comment tensorflow/tensorflow

Discrepancy between keras.layers.Reshape and tf.keras.layers.Reshape

This is an issue that happened in eager mode, seems the solution is:

if not tf.executing_eagerly():
  # Set the static shape for the result, since it might be lost during
  # array_ops.reshape, e.g., some `None` dims in the result could be inferred.
  result.set_shape(self.compute_output_shape(inputs.shape))
return result

We made a commit which was reverted. Waiting for another commit.

rgov

comment created time in 2 months

issue closed tensorflow/tensorflow

tf.keras.applications download path should be made configurable.

The download path for the models downloaded using tf.keras.applications seems to be hardcoded to ~/.keras/models.
This path should be made configurable. It would benefit people who have little storage space in their home folder, which is common on many large-scale computational clusters.

closed time in 2 months

indranaut

issue comment tensorflow/tensorflow

tf.keras.applications download path should be made configurable.

Closing this for now. Let us know if you have any other questions.

indranaut

comment created time in 2 months

issue closed tensorflow/tensorflow

Add implementation of LAYER conv1d_transpose

Would it be possible that someone implements a conv1d_transpose layer?

There is a tf.contrib.nn.conv1d_transpose implementation, but it has many problems, such as the need to hardcode the batch size.

System information

  • TensorFlow version (you are using): 1.3
  • Are you willing to contribute it (Yes/No): Yes.

Will this change the current api? How? It will create a new implementation of a layer.

Who will benefit with this feature? Many users, in particular people working on 1d time series, etc.

closed time in 2 months

marianogabitto

issue comment tensorflow/tensorflow

Add implementation of LAYER conv1d_transpose

Fixed. Closing it.

marianogabitto

comment created time in 2 months

issue comment tensorflow/tensorflow

tf.keras.layers.Conv1DTranspose ?

Thank you @tanzhenyu - that was really quick!

P.S. I guess it also resolves #29157

Yeah, resolving that too. Thanks

yoshihikoueno

comment created time in 2 months

issue closed tensorflow/tensorflow

tf.keras.layers.Conv1DTranspose ?

This is somewhat related to the issue #8729, which is already solved. In the issue, tf.nn.conv1d_transpose was requested and implemented in the end.

But the corresponding function in tf.layers or tf.keras is missing. In other words, there's no function like tf.layers.conv1d_transpose, tf.keras.layers.Conv1DTranspose.

Can you please implement it? Since there's already tf.nn.conv1d_transpose, I guess it doesn't take so much time to implement it.

closed time in 2 months

yoshihikoueno

issue comment tensorflow/tensorflow

tf.keras.layers.Conv1DTranspose ?

This is fixed and should be available in tf-nightly tomorrow. Closing it.

yoshihikoueno

comment created time in 2 months

pull request comment tensorflow/tensorflow

Make keras model load compatible with old version of models

Not sure if this is desired. Can't you just update your config?

Thanks for review, @tanzhenyu!

We can update our model to work with tf-2.1. However, I think it will save time for users if the Keras API in tf-2.x can stay compatible with old models generated by tf-1.x. Otherwise, there may be other users who hit similar issues to ours.

Note that tf-2.0 can load the above model file, but some recent changes in tf-2.1 break it. I'm not sure if that is on purpose, or whether some edge cases were forgotten.

Can you be more specific about 1) what the original config was, 2) what error message it produces, and 3) what the alternative solutions are? A unit test would be desired as well.

feihugis

comment created time in 2 months

PR closed tensorflow/tensorflow

TemporalDropout1D (labels: cla: yes, comp:keras, size:S, stat:awaiting response)

Removing words and estimating from context is one of the recent language modeling tasks. This kind of noise could be useful as regularization as well.

+33 -0

5 comments

1 changed file

halidziya

pr closed time in 2 months

pull request comment tensorflow/tensorflow

TemporalDropout1D

Closing this PR now. See discussions above. Let us know if you have other concerns.

halidziya

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

TemporalDropout1D

 def _get_noise_shape(self, inputs):
     input_shape = array_ops.shape(inputs)
     noise_shape = (input_shape[0], 1, input_shape[2])
     return noise_shape
+
+
+@keras_export('keras.layers.TemporalDropout1D')
+class TemporalDropout1D(Dropout):
+  """Temporal 1D version of Dropout.
+
+  This version performs the same function as Dropout, however it drops
+  entire 1D temporal elements instead of individual elements.
+
+  Arguments:
+    rate: Float between 0 and 1. Fraction of the input units to drop.
+
+  Call arguments:
+    inputs: A 3D tensor.
+    training: Python boolean indicating whether the layer should behave in
+      training mode (adding dropout) or in inference mode (doing nothing).
+
+  Input shape:
+    3D tensor with shape:
+    `(samples, timesteps, channels)`
+
+  Output shape:
+    Same as input.
+  """
+
+  def __init__(self, rate, **kwargs):
+    super(TemporalDropout1D, self).__init__(rate, **kwargs)
+    self.input_spec = InputSpec(ndim=3)
+
+  def _get_noise_shape(self, inputs):
+    input_shape = array_ops.shape(inputs)
+    noise_shape = (input_shape[0], input_shape[2], 1)

That's not TemporalDropout then. This is a useful feature though, which I think you can simply achieve by calling tf.keras.layers.Dropout(0.5, noise_shape=[None, 10, 1])(inp, training=True)

Without any subclassing or having a new endpoint.
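
E.g., a minimal sketch of that suggestion:

import numpy as np
import tensorflow as tf

inp = tf.constant(np.ones((10, 10, 10), dtype=np.float32) * 2.0)
# noise_shape broadcasts the mask over the channel axis, so each timestep
# is either kept (scaled to 4) or zeroed out across all of its channels.
out = tf.keras.layers.Dropout(0.5, noise_shape=[None, 10, 1])(inp, training=True)
print(out[0])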

halidziya

comment created time in 2 months

pull request comment tensorflow/tensorflow

Make keras model load compatible with old version of models

Not sure if this is desired. Can't you just update your config?

feihugis

comment created time in 2 months

issue comment tensorflow/tensorflow

tf.keras.layers.Conv1DTranspose ?

Somehow this fell off my radar. I will get this done today.

yoshihikoueno

comment created time in 2 months

issue comment tensorflow/tensorflow

sigmoid is ignored when calculating loss by calling method model.fit

I did some digging. This is an issue since this commit

So neither loss is wrong. It's just a difference in how we numerically approximate INF. More specifically, the binary_crossentropy loss is bce = -(y*log(p) + (1-y)*log(1-p)), and here y=0, p=sigmoid(logits)=1, logits=101. This is theoretically infinite; however, in graph mode our computation relies on tf.nn.sigmoid_cross_entropy_with_logits, which per its docstring computes max(logits, 0) - logits * y + log(1 + exp(-abs(logits))), so the loss is the same as the logits value, 101, which is what history.loss reports.

On the other hand, in eager mode our computation first clips output = clip(p, epsilon, 1 - epsilon) and then computes -(y * log(output + epsilon) + (1 - y) * log(1 - output + epsilon)) == -log(2 * epsilon), which ends up different from the above.
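
Numerically, a minimal sketch of both paths:

import numpy as np

y, logits = 0.0, 101.0
eps = 1e-7  # the Keras backend epsilon

# Graph mode: the tf.nn.sigmoid_cross_entropy_with_logits formulation.
graph_loss = max(logits, 0) - logits * y + np.log1p(np.exp(-abs(logits)))
print(graph_loss)  # ~101.0

# Eager mode: clip the probability first, then take the log.
p = np.clip(1 / (1 + np.exp(-logits)), eps, 1 - eps)
eager_loss = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
print(eager_loss)  # -log(2 * eps), roughly 15.4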

@tornadoyi I wouldn't think this is a huge issue, since this is a very saturated case and only a matter of how we approximate INF, and it's not how we usually initialize kernels. Let us know if you still have concerns.

tornadoyi

comment created time in 2 months

pull request comment tensorflow/tensorflow

Use case clarification comments #36785

Yep, I resolved all conflicts internally and marked it as contributed by @abhilash1910

abhilash1910

comment created time in 2 months

issue comment tensorflow/tensorflow

sigmoid is ignored when calculating loss by calling method model.fit

Looking through the code, I don't think what the history stores is the loss; it's actually the output (before activation)

tornadoyi

comment created time in 2 months

pull request comment tensorflow/tensorflow

Support Keras grouped convolutions

I'm slightly worried about the position of this argument. We would usually put this at the end of all positional arguments for backward compatibility reasons.

Personally I think the current placement makes the most sense. It is before activation and use_bias that change the output after doing the convolution and before configuration of initializers and constraints. But it is the last argument that changes the actual convolution operations (i.e. after filters, kernel_size, strides, dilation, ...). This is also where PyTorch places the groups argument though that shouldn't influence this decision too much.

I wouldn't expect most users to change the data_format format keyword, so my gut feeling is that every keyword argument after data_format will likely not be addressed via positional arguments anyway. But I agree technically this might be a breaking change for some users. I am not sure what the general policy for TensorFlow is, but adding it behind the name keyword seems not very consistent. Are there any plans to make some arguments keyword only, so that future additions like this can be made without risking breaking user code?

Yeah, putting groups here makes sense. Though for compatibility reasons I would imagine putting it after activation would be better, since most users specify arguments up through activation. (For ResNet they might set use_bias=False as well.) But I believe they would most likely set this via keyword anyway.

AFAIK we don't have this plan. If you would like to contribute, that'd be great.

lgeiger

comment created time in 2 months

pull request comment tensorflow/tensorflow

Support Keras grouped convolutions

@tanzhenyu I updated the goldens in 212616f.

Unfortunately the api_compatibility_test segfaults on my machine, but it seems to work when only checking the keras.layers API endpoints, so hopefully CI will be happy with it.

Sorry for the cumbersome review process, but a fast response time for non-trivial changes is sadly not possible for me when the TF build takes an entire night to run 😉

No worries. We're working on faster build times these days. I'm slightly worried about the position of this argument. We would usually put this at the end of all positional arguments for backward compatibility reasons.

lgeiger

comment created time in 2 months

issue closed tensorflow/tensorflow

Additional mode for loading of segmentation data in tf.keras.preprocessing.image.ImageDataGenerator.flow_from_dataframe()

System information

  • TensorFlow version (you are using): 2.1.0
  • Are you willing to contribute it (Yes/No): Yes

Describe the feature and the current behavior/state.

As of now, one has to come up with a workaround to load segmentation data with tf.keras.preprocessing.image.ImageDataGenerator, i.e., to load images and segmentation masks in one generator. A current workaround could look like:

datagen = tf.keras.preprocessing.image.ImageDataGenerator()

df = pd.DataFrame({'images': images,
                   'masks': masks})
# seed, so that image_generator and mask_generator will rotate and shuffle equivalently
seed = 42
image_generator = datagen.flow_from_dataframe(df, 
                                      directory='.', 
                                      x_col='images', 
                                      y_col='masks', 
                                      batch_size=1, 
                                      class_mode=None, 
                                      seed=seed)
mask_generator = datagen.flow_from_dataframe(df, 
                                     directory='.', 
                                     x_col='masks', 
                                     y_col='images', # Or whatever 
                                     batch_size=1, 
                                     class_mode=None, 
                                     seed=seed)

train_generator = zip(image_generator, mask_generator)

# same for validation data generator

model.fit(x=train_generator, epochs=EPOCHS, 
                validation_data=validation_generator, use_multiprocessing=False)

It would be much cleaner if there were an option for class_mode (e.g. class_mode='mask') that allows you to load the files specified by the paths in y_col as images.

Will this change the current api? How?

An additional option for the argument class_mode will be added.

Who will benefit with this feature? Everyone who looks for a simple way to load images and corresponding segmentation data.

Any Other info.

closed time in 2 months

RobinBaumann

issue comment tensorflow/tensorflow

Additional mode for loading of segmentation data in tf.keras.preprocessing.image.ImageDataGenerator.flow_from_dataframe()

I'm gonna close this for now, since we all agreed we should implement the dataset way. Please file a PR once it's ready. Thanks!

RobinBaumann

comment created time in 2 months

issue comment tensorflow/tensorflow

Training with GPU on TF 2.0 is much slower than on TF 1.14 if set a large number to `input_dim` of `tf.keras.layers.Embedding`

@goldiegadde and @tanzhenyu I have tried tf 2.2.0-rc2 with eager mode disabled and the issue is gone. But with eager mode enabled, both tf 2.2.0-rc2 and tf 1.15.2 are slower.

Eager mode: Disabled

Tensorflow version:  2.2.0-rc2
Tensorflow eager mode:  False
is_gpu_available:  True
Train on 10000 samples
Epoch 1/3
10000/10000 [==============================] - 1s 87us/sample - loss: 0.8003
Epoch 2/3
10000/10000 [==============================] - 1s 90us/sample - loss: 0.7889
Epoch 3/3
10000/10000 [==============================] - 1s 90us/sample - loss: 0.7758
Tensorflow version:  1.15.2
Tensorflow eager mode:  False
is_gpu_available:  True
Train on 10000 samples
Epoch 1/3
10000/10000 [==============================] - 1s 95us/sample - loss: 0.7089
Epoch 2/3
10000/10000 [==============================] - 0s 31us/sample - loss: 0.7039
Epoch 3/3
10000/10000 [==============================] - 0s 32us/sample - loss: 0.6972

Eager mode: Enabled

Tensorflow version:  2.2.0-rc2
Tensorflow eager mode:  True
is_gpu_available:  True
Epoch 1/3
40/40 [==============================] - 7s 181ms/step - loss: 0.7136
Epoch 2/3
40/40 [==============================] - 7s 178ms/step - loss: 0.7085
Epoch 3/3
40/40 [==============================] - 7s 177ms/step - loss: 0.7034
Tensorflow version:  1.15.2
Tensorflow eager mode:  True
is_gpu_available:  True
Train on 10000 samples
Epoch 1/3
10000/10000 [==============================] - 9s 947us/sample - loss: 0.9751
Epoch 2/3
10000/10000 [==============================] - 8s 845us/sample - loss: 0.9588
Epoch 3/3
10000/10000 [==============================] - 8s 848us/sample - loss: 0.9409

So, I think the issue is solved.

Thank you all.

Awesome!

DeviLeo

comment created time in 2 months

pull request comment tensorflow/tensorflow

Support Keras grouped convolutions

@tanzhenyu I improved the tests in 52160bc and rebased onto master so that the API snapshot update won't introduce any merge conflicts.

Yep I'm not sure if you can run bazel run third_party/tensorflow/tools/api/tests:api_compatibility_test -- --update_goldens True

I am trying it now, but it's still compiling 😞. Feel free to update the API snapshots during the internal merge.

There is no "internal merge" before this PR passes all CI builds (unfortunately). If the bazel run doesn't work, you can probably manually edit the .pbtxt files.

lgeiger

comment created time in 2 months

Pull request review comment tensorflow/tensorflow

TemporalDropout1D

 def _get_noise_shape(self, inputs):
     input_shape = array_ops.shape(inputs)
     noise_shape = (input_shape[0], 1, input_shape[2])
     return noise_shape
+
+
+@keras_export('keras.layers.TemporalDropout1D')
+class TemporalDropout1D(Dropout):
+  """Temporal 1D version of Dropout.
+
+  This version performs the same function as Dropout, however it drops
+  entire 1D temporal elements instead of individual elements.
+
+  Arguments:
+    rate: Float between 0 and 1. Fraction of the input units to drop.
+
+  Call arguments:
+    inputs: A 3D tensor.
+    training: Python boolean indicating whether the layer should behave in
+      training mode (adding dropout) or in inference mode (doing nothing).
+
+  Input shape:
+    3D tensor with shape:
+    `(samples, timesteps, channels)`
+
+  Output shape:
+    Same as input.
+  """
+
+  def __init__(self, rate, **kwargs):
+    super(TemporalDropout1D, self).__init__(rate, **kwargs)
+    self.input_spec = InputSpec(ndim=3)
+
+  def _get_noise_shape(self, inputs):
+    input_shape = array_ops.shape(inputs)
+    noise_shape = (input_shape[0], input_shape[2], 1)

The time_step axis is axis=1, so,

SpatialDropout1D(0.5)(np.ones((10,10,10))*2.0, training=True)[0, 0]
<tf.Tensor: shape=(10, 10), dtype=float32, numpy=
array([[0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.],
       [0., 4., 0., 0., 4., 4., 4., 0., 4., 0.]], dtype=float32)>

means the dropout is applied along the time_step axis, i.e., feature 0 is dropped out, so output[0, :, 0] is all 0, and feature 1 is not dropped out, so output[0, :, 1] is all 4. Correct?

halidziya

comment created time in 2 months

issue comment tensorflow/tensorflow

tf.keras.applications download path should be made configurable.

I think this can be achieved via:

  1. using tf.keras.utils.get_file, and configure where you wanna download your checkpoint to, say file_path.
  2. passing weights=file_path when you're creating the model.

Let us know if this works for you.
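
For example (a sketch: the weight-file URL below follows the standard keras-applications release location, and the cache path is purely illustrative):

import tensorflow as tf

file_path = tf.keras.utils.get_file(
    "resnet50_notop.h5",
    ("https://storage.googleapis.com/tensorflow/keras-applications/resnet/"
     "resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5"),
    cache_dir="/data/keras_cache",  # any directory with enough space
)
model = tf.keras.applications.ResNet50(weights=file_path, include_top=False)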

indranaut

comment created time in 2 months

pull request comment tensorflow/tensorflow

Support Keras grouped convolutions

Looks like api_compatibility_test fails on CI. I'll rebase and update the API goldens. This might take a few hours on my machine though since I don't have a warm build cache.

Though it'd be nice to add test for DepthwiseConv as well

Good idea, I will do that once TF finishes building on my machine. I only realized that this can be used for testing when writing #36773 (comment)

Yep I'm not sure if you can run bazel run third_party/tensorflow/tools/api/tests:api_compatibility_test -- --update_goldens True?

lgeiger

comment created time in 2 months

issue closed tensorflow/tensorflow

[TF 2.0] categorical_column_with_vocabulary_list not usable in custom training loop

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version (use command below): v2.0.0-beta0-16-g1d91213fe7 2.0.0-beta1
  • Python version: 3.6.8
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

Describe the current behavior

Outside of fit, e.g. in a custom training loop, categorical_column_with_vocabulary_list results in an error. I have provided a modified version of Classifying Structured Data which demonstrates this.

The error is ValueError: Column dtype and SparseTensors dtype must be compatible. key: thal, column dtype: <dtype: 'string'>, tensor dtype: <dtype: 'int32'>

Describe the expected behavior

Code runs without causing an error

Code to reproduce the issue

It should be directly copy-paste-able

import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split


def df_to_dataset(df, shuffle=True, batch_size=32):
    df = df.copy()
    labels = df.pop('target')
    ds = tf.data.Dataset.from_tensor_slices(
        (dict(df), labels)
    )
    if shuffle:
        ds = ds.shuffle(buffer_size=len(df))
    ds = ds.batch(batch_size)
    return ds


def generate_features():
    feature_columns = []
    feature_layer_inputs = {}


    thal = tf.feature_column.categorical_column_with_vocabulary_list(
          'thal', ['fixed', 'normal', 'reversible'])
    thal_one_hot = tf.feature_column.indicator_column(thal)
    feature_columns.append(thal_one_hot)
    feature_layer_inputs['thal'] = tf.keras.Input(shape=(1,), name='thal', dtype=tf.string)

    return feature_columns, feature_layer_inputs


def create_model(feature_columns, feature_layer_inputs):
    input_layer = tf.keras.layers.DenseFeatures(feature_columns)
    inputs = input_layer(feature_layer_inputs)

    l1 = tf.keras.layers.Dense(128, activation='relu')(inputs)
    l2 = tf.keras.layers.Dense(128, activation='relu')(l1)

    output = tf.keras.layers.Dense(1, activation='sigmoid')(l2)

    model = tf.keras.Model(
        inputs=[v for v in feature_layer_inputs.values()],
        outputs=[output]
    )
    return model


def make_loss(loss_object):
    def loss(model, x, y):
        y_pred = model(x)
        return loss_object(y_true=y, y_pred=y_pred)
    return loss


def grad(model, inputs, targets, loss):
    with tf.GradientTape() as tape:
        loss_value = loss(model, inputs, targets)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)


def fit(epochs, train_ds, model, optimizer, loss_obj):
    loss = make_loss(loss_obj)
    for epoch in range(epochs):
        for i, (x, y) in enumerate(train_ds):
            loss_values, grad_values = grad(model, x, y, loss)
            optimizer.apply_gradients(zip(grad_values, model.trainable_variables))


if __name__ == '__main__':
    URL = 'https://storage.googleapis.com/applied-dl/heart.csv'
    df = pd.read_csv(URL)
    CUSTOM_TRAINING = True

    train, test = train_test_split(df, test_size=0.2)
    train, val = train_test_split(train, test_size=0.2)

    # hardcoded stuff
    batch_size = 32
    train_ds = df_to_dataset(train, batch_size=batch_size)

    # Create model and features
    feature_columns, feature_layer_inputs = generate_features()
    model = create_model(feature_columns, feature_layer_inputs)

    if CUSTOM_TRAINING:
        print('Trying custom training')
        bce = tf.keras.losses.BinaryCrossentropy()
        adam = tf.keras.optimizers.Adam()
        fit(epochs=5, train_ds=train_ds,
            model=model, optimizer=adam, loss_obj=bce)
    else:
        print('Using pre-defined fit')
        model.compile(optimizer='adam',
                      loss='binary_crossentropy',
                      metrics=['accuracy'])
        model.fit(train_ds, epochs=5)

If you flip the CUSTOM_TRAINING variable between True and False (line 72) you'll see what I mean.

Other info / logs Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

  1. The complete (relevant) stacktrace is
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1758, in <module>
    main()
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1752, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1147, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/ian.quah/PycharmProjects/tf2/datasets/issues.py", line 90, in <module>
    model=model, optimizer=adam, loss_obj=bce)
  File "/Users/ian.quah/PycharmProjects/tf2/datasets/issues.py", line 65, in fit
    loss_values, grad_values = grad(model, x, y, loss)
  File "/Users/ian.quah/PycharmProjects/tf2/datasets/issues.py", line 57, in grad
    loss_value = loss(model, inputs, targets)
  File "/Users/ian.quah/PycharmProjects/tf2/datasets/issues.py", line 50, in loss
    y_pred = model(x)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py", line 753, in call
    return self._run_internal_graph(inputs, training=training, mask=mask)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py", line 895, in _run_internal_graph
    output_tensors = layer(computed_tensors, **kwargs)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 712, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 474, in call
    self._state_manager)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 4299, in get_dense_tensor
    return transformation_cache.get(self, state_manager)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 2562, in get
    transformed = column.transform_feature(self, state_manager)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 4238, in transform_feature
    transformation_cache, state_manager)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 3714, in get_sparse_tensors
    transformation_cache.get(self, state_manager), None)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 2562, in get
    transformed = column.transform_feature(self, state_manager)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 3692, in transform_feature
    return self._transform_input_tensor(input_tensor)
  File "/anaconda3/envs/mlpl/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column_v2.py", line 3668, in _transform_input_tensor
    self.key, self.dtype, input_tensor.dtype))
ValueError: Column dtype and SparseTensors dtype must be compatible. key: thal, column dtype: <dtype: 'string'>, tensor dtype: <dtype: 'int32'>
  2. Placing a debugger on line 50 leads me to feature_column_v2.py, specifically _transform_input_tensor.

The input_tensor arg to _transform_input_tensor is

SparseTensor(indices=tf.Tensor(
[[ 0  0]
 [ 1  0]
 [ 2  0]
 [ 3  0]
 [ 4  0]
 [ 5  0]
 [ 6  0]
 [ 7  0]
 [ 8  0]
 [ 9  0]
 [10  0]
 [11  0]
 [12  0]
 [13  0]
 [14  0]
 [15  0]
 [16  0]
 [17  0]
 [18  0]
 [19  0]
 [20  0]
 [21  0]
 [22  0]
 [23  0]
 [24  0]
 [25  0]
 [26  0]
 [27  0]
 [28  0]
 [29  0]
 [30  0]
 [31  0]], shape=(32, 2), dtype=int64), values=tf.Tensor(
[49 34 58 46 59 47 55 58 41 68 62 51 61 46 39 48 37 41 51 59 51 70 60 57
 54 60 52 44 65 49 44 59], shape=(32,), dtype=int32), dense_shape=tf.Tensor([32  1], shape=(2,), dtype=int64))

which seems strange: it's as if the transformation cache forgot that it had already transformed those features.

closed time in 2 months

IanQS

issue closedtensorflow/tensorflow

How can I share the weights between two CNN layers with different dilations in TensorFlow 2.0?

How can I share the weights between two CNN layers with different dilations in TensorFlow 2.0? In TensorFlow 1.x, I could just use tf.variable_scope with tf.AUTO_REUSE.
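One way in TF2 is to create the kernel variable once and apply tf.nn.conv2d with different dilations; a minimal sketch (class name and shapes are illustrative, not from the thread):

import tensorflow as tf

class SharedDilatedConv(tf.keras.layers.Layer):
    """Applies one shared kernel at two different dilation rates."""

    def __init__(self, filters, kernel_size, **kwargs):
        super(SharedDilatedConv, self).__init__(**kwargs)
        self.filters = filters
        self.kernel_size = kernel_size

    def build(self, input_shape):
        in_channels = int(input_shape[-1])
        # Created once; both convolutions below reuse the same variable,
        # which is what tf.variable_scope(..., reuse=tf.AUTO_REUSE) gave
        # you in TF1.
        self.kernel = self.add_weight(
            name='kernel',
            shape=(self.kernel_size, self.kernel_size,
                   in_channels, self.filters))

    def call(self, inputs):
        y1 = tf.nn.conv2d(inputs, self.kernel, strides=1,
                          padding='SAME', dilations=1)
        y2 = tf.nn.conv2d(inputs, self.kernel, strides=1,
                          padding='SAME', dilations=2)
        return y1 + y2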

closed time in 2 months

HuaYZhao

issue commenttensorflow/tensorflow

How can I share the weights between two CNN layers with different dilations in TensorFlow 2.0?

Closing it. Let us know if you have further questions!

HuaYZhao

comment created time in 2 months

issue closedtensorflow/tensorflow

tensorflow.contrib.integrate.odeint does not work with the default __call__ method of subclasses of tf.keras.layers.Layer

<em>Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template</em>

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 19.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): both 1.14 and 1.15-rc0
  • Python version: 3.7
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: 10.1/7.0
  • GPU model and memory: Titan RTX, 24 GB

Describe the current behavior For TensorFlow 1.14: when the ODE function is defined using the default __call__ method of any subclass of tf.keras.layers.Layer, a ValueError is raised, stating that a VarIsInitializedOp has been marked as not fetchable. To avoid this issue, the ODE function has to be defined using either pure TensorFlow functions or the call method of the subclass instead of __call__. For TensorFlow 1.15-rc0: the same code runs forever with no error message. The same workaround (using call instead of __call__) also applies here.

Describe the expected behavior tensorflow.contrib.integrate.odeint should work with the default __call__ method of any subclass of tf.keras.layers.Layer under both TensorFlow 1.14 and 1.15-rc0.

Code to reproduce the issue

import tensorflow as tf
from tensorflow.python.keras import backend, layers, Input
backend.clear_session()
from tensorflow.contrib.integrate import odeint

x = Input(shape=(10,))
l0 = layers.Dense(units=10)
l0.build(x.shape)  # this line is necessary only to make call work

def ode_func(h, t, *ode_params):
    return l0(h)  # l0.call(h) avoids the error which is hacky

ts = tf.constant([0., 1.], dtype=tf.float64)
h_ts = odeint(ode_func, x, ts)
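For completeness, the hacky workaround from the inline comment, spelled out (it sidesteps the Keras-history bookkeeping that __call__ triggers):

def ode_func_workaround(h, t, *ode_params):
    return l0.call(h)  # call skips the __call__ machinery, so odeint succeeds

h_ts = odeint(ode_func_workaround, x, ts)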

Other info / logs

For TensorFlow 1.14 the error is ValueError: '.../VarIsInitializedOp' not fetchable; detailed logs:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-ad0e95799922> in <module>()
     13 print('code start')
     14 
---> 15 h_ts = odeint(ode_func, x, ts)
     16 
     17 print('code finished')

23 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/integrate/python/ops/odes.py in odeint(func, y0, t, rtol, atol, method, options, full_output, name)
    538         full_output=full_output,
    539         name=scope,
--> 540         **options)
    541 
    542 

/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/integrate/python/ops/odes.py in _dopri5(func, y0, t, rtol, atol, full_output, first_step, safety, ifactor, dfactor, max_num_steps, name)
    403         lambda _, __, ___, i: i < num_times,
    404         interpolate, (solution, history, rk_state, 1),
--> 405         name='interpolate_loop')
    406 
    407     y = solution.stack(name=scope)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py in while_loop(cond, body, loop_vars, shape_invariants, parallel_iterations, back_prop, swap_memory, name, maximum_iterations, return_same_structure)
   3499       ops.add_to_collection(ops.GraphKeys.WHILE_CONTEXT, loop_context)
   3500     result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants,
-> 3501                                     return_same_structure)
   3502     if maximum_iterations is not None:
   3503       return result[1]

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py in BuildLoop(self, pred, body, loop_vars, shape_invariants, return_same_structure)
   3010       with ops.get_default_graph()._mutation_lock():  # pylint: disable=protected-access
   3011         original_body_result, exit_vars = self._BuildLoop(
-> 3012             pred, body, original_loop_vars, loop_vars, shape_invariants)
   3013     finally:
   3014       self.Exit()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py in _BuildLoop(self, pred, body, original_loop_vars, loop_vars, shape_invariants)
   2935         expand_composites=True)
   2936     pre_summaries = ops.get_collection(ops.GraphKeys._SUMMARY_COLLECTION)  # pylint: disable=protected-access
-> 2937     body_result = body(*packed_vars_for_body)
   2938     post_summaries = ops.get_collection(ops.GraphKeys._SUMMARY_COLLECTION)  # pylint: disable=protected-access
   2939     if not nest.is_sequence_or_composite(body_result):

/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/integrate/python/ops/odes.py in interpolate(solution, history, rk_state, i)
    381             lambda rk_state, *_: t[i] > rk_state.t1,
    382             adaptive_runge_kutta_step, (rk_state, history, 0),
--> 383             name='integrate_loop')
    384         y = _interp_evaluate(rk_state.interp_coeff, rk_state.t0, rk_state.t1,
    385                              t[i])

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py in while_loop(cond, body, loop_vars, shape_invariants, parallel_iterations, back_prop, swap_memory, name, maximum_iterations, return_same_structure)
   3499       ops.add_to_collection(ops.GraphKeys.WHILE_CONTEXT, loop_context)
   3500     result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants,
-> 3501                                     return_same_structure)
   3502     if maximum_iterations is not None:
   3503       return result[1]

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py in BuildLoop(self, pred, body, loop_vars, shape_invariants, return_same_structure)
   3010       with ops.get_default_graph()._mutation_lock():  # pylint: disable=protected-access
   3011         original_body_result, exit_vars = self._BuildLoop(
-> 3012             pred, body, original_loop_vars, loop_vars, shape_invariants)
   3013     finally:
   3014       self.Exit()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py in _BuildLoop(self, pred, body, original_loop_vars, loop_vars, shape_invariants)
   2935         expand_composites=True)
   2936     pre_summaries = ops.get_collection(ops.GraphKeys._SUMMARY_COLLECTION)  # pylint: disable=protected-access
-> 2937     body_result = body(*packed_vars_for_body)
   2938     post_summaries = ops.get_collection(ops.GraphKeys._SUMMARY_COLLECTION)  # pylint: disable=protected-access
   2939     if not nest.is_sequence_or_composite(body_result):

/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/integrate/python/ops/odes.py in adaptive_runge_kutta_step(rk_state, history, n_steps)
    345       with ops.control_dependencies(
    346           [check_underflow, check_max_num_steps, check_numerics]):
--> 347         y1, f1, y1_error, k = _runge_kutta_step(func, y0, f0, t0, dt)
    348 
    349       with ops.name_scope('error_ratio'):

/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/integrate/python/ops/odes.py in _runge_kutta_step(func, y0, f0, t0, dt, tableau, name)
    121       ti = t0 + alpha_i * dt
    122       yi = y0 + _scaled_dot_product(dt_cast, beta_i, k)
--> 123       k.append(func(yi, ti))
    124 
    125     if not (tableau.c_sol[-1] == 0 and tableau.c_sol[:-1] == tableau.beta[-1]):

<ipython-input-3-ad0e95799922> in ode_func(h, t, *ode_params)
      8 
      9 def ode_func(h, t, *ode_params):
---> 10     return l0(h)  # l0.call(h) avoids the error which is hacky
     11 
     12 ts = tf.constant([0., 1.], dtype=tf.float64)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    559       # framework.
    560       if base_layer_utils.needs_keras_history(inputs):
--> 561         base_layer_utils.create_keras_history(inputs)
    562 
    563     # Handle Keras mask propagation from previous layer to current layer.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py in create_keras_history(tensors)
    198     keras_tensors: The Tensors found that came from a Keras Layer.
    199   """
--> 200   _, created_layers = _create_keras_history_helper(tensors, set(), [])
    201   return created_layers
    202 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py in _create_keras_history_helper(tensors, processed_ops, created_layers)
    244             constants[i] = backend.function([], op_input)([])
    245       processed_ops, created_layers = _create_keras_history_helper(
--> 246           layer_inputs, processed_ops, created_layers)
    247       name = op.name
    248       node_def = op.node_def.SerializeToString()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py in _create_keras_history_helper(tensors, processed_ops, created_layers)
    244             constants[i] = backend.function([], op_input)([])
    245       processed_ops, created_layers = _create_keras_history_helper(
--> 246           layer_inputs, processed_ops, created_layers)
    247       name = op.name
    248       node_def = op.node_def.SerializeToString()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py in _create_keras_history_helper(tensors, processed_ops, created_layers)
    242             constants[i] = op_input
    243           else:
--> 244             constants[i] = backend.function([], op_input)([])
    245       processed_ops, created_layers = _create_keras_history_helper(
    246           layer_inputs, processed_ops, created_layers)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py in __call__(self, inputs)
   3251     inputs = nest.flatten(inputs)
   3252 
-> 3253     session = get_session(inputs)
   3254     feed_arrays = []
   3255     array_vals = []

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py in get_session(op_input_list)
    460   if not _MANUAL_VAR_INIT:
    461     with session.graph.as_default():
--> 462       _initialize_variables(session)
    463   return session
    464 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/backend.py in _initialize_variables(session)
    877     # marked as initialized.
    878     is_initialized = session.run(
--> 879         [variables_module.is_variable_initialized(v) for v in candidate_vars])
    880     uninitialized_vars = []
    881     for flag, v in zip(is_initialized, candidate_vars):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    948     try:
    949       result = self._run(None, fetches, feed_dict, options_ptr,
--> 950                          run_metadata_ptr)
    951       if run_metadata:
    952         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1156     # Create a fetch handler to take care of the structure of fetches.
   1157     fetch_handler = _FetchHandler(
-> 1158         self._graph, fetches, feed_dict_tensor, feed_handles=feed_handles)
   1159 
   1160     # Run request and get response.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in __init__(self, graph, fetches, feeds, feed_handles)
    485         self._ops.append(True)
    486       else:
--> 487         self._assert_fetchable(graph, fetch.op)
    488         self._fetches.append(fetch)
    489         self._ops.append(False)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _assert_fetchable(self, graph, op)
    498     if not graph.is_fetchable(op):
    499       raise ValueError(
--> 500           'Operation %r has been marked as not fetchable.' % op.name)
    501 
    502   def fetches(self):

ValueError: Operation 'odeint/interpolate_loop/interpolate/integrate_loop/runge_kutta_step/VarIsInitializedOp' has been marked as not fetchable.


closed time in 2 months

richardwth

issue commenttensorflow/tensorflow

tensorflow.contrib.integrate.odeint does not work with the default __call__ method of subclasses of tf.keras.layers.Layer

We don't expect contrib to work with Keras. Closing it for now.

richardwth

comment created time in 2 months

issue commenttensorflow/tensorflow

tf.keras accepts incorrect CNN input shapes from tf.data.Dataset when eager execution is disabled

I think we should error out. The fact that:

  1. it doesn't put any warning in model.fit
  2. it only puts a warning in model.call

is a little bit troubling. Here's where it swallows the error: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/engine/network.py#L958-L964. Reassigning to the author of that function.
bfmat

comment created time in 2 months

issue commenttensorflow/tensorflow

DepthwiseConv2D missing dilation_rate argument (& higher performance)

This is fixed. For the performance issue, please file another bug. I don't see that as a Keras issue; the performance needs to be investigated at either the op level or the kernel level.

dajoke

comment created time in 2 months

issue closedtensorflow/tensorflow

DepthwiseConv2D missing dilation_rate argument (& higher performance)

<em>Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template</em>

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): (yes)
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): ongoing, reference github source https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/layers/convolutional.py#L1686-L1877
  • TensorFlow version (use command below): TF2 rc
  • Python version: (3)
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:

  1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
  2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"

Describe the current behavior

Conv, Conv2D, and backend.depthwise_conv2d support dilation_rate, but DepthwiseConv2D does not.

Describe the expected behavior

DepthwiseConv2D needs the dilation_rate argument. It is supported by the underlying code and mentioned in the DepthwiseConv2D documentation, but is missing from the layer's constructor.

Note that while the revised DepthwiseConv2D below works correctly, forward-inference performance is significantly lower than it should be. DepthwiseConv2D with dilation is a very useful operation for full-resolution feature matching with a low operation count, so improving the execution code would be valuable.

Other info / logs

Proposed working revised version (sorry, not adept with GitHub).


@keras_export('keras.layers.DepthwiseConv2D')
class DepthwiseConv2D(Conv2D):
  """Depthwise separable 2D convolution.
  Depthwise Separable convolutions consists in performing
  just the first step in a depthwise spatial convolution
  (which acts on each input channel separately).
  The `depth_multiplier` argument controls how many
  output channels are generated per input channel in the depthwise step.
  Arguments:
    kernel_size: An integer or tuple/list of 2 integers, specifying the
      height and width of the 2D convolution window.
      Can be a single integer to specify the same value for
      all spatial dimensions.
    strides: An integer or tuple/list of 2 integers,
      specifying the strides of the convolution along the height and width.
      Can be a single integer to specify the same value for
      all spatial dimensions.
      Specifying any stride value != 1 is incompatible with specifying
      any `dilation_rate` value != 1.
    padding: one of `'valid'` or `'same'` (case-insensitive).
    depth_multiplier: The number of depthwise convolution output channels
      for each input channel.
      The total number of depthwise convolution output
      channels will be equal to `filters_in * depth_multiplier`.
    data_format: A string,
      one of `channels_last` (default) or `channels_first`.
      The ordering of the dimensions in the inputs.
      `channels_last` corresponds to inputs with shape
      `(batch, height, width, channels)` while `channels_first`
      corresponds to inputs with shape
      `(batch, channels, height, width)`.
      It defaults to the `image_data_format` value found in your
      Keras config file at `~/.keras/keras.json`.
      If you never set it, then it will be 'channels_last'.
    dilation_rate: an integer or tuple/list of 2 integers, specifying
      the dilation rate to use for dilated convolution.
      Can be a single integer to specify the same value for
      all spatial dimensions.
      Currently, specifying any `dilation_rate` value != 1 is
      incompatible with specifying any stride value != 1.
    activation: Activation function to use.
      If you don't specify anything, no activation is applied
      (ie. 'linear' activation: `a(x) = x`).
    use_bias: Boolean, whether the layer uses a bias vector.
    depthwise_initializer: Initializer for the depthwise kernel matrix.
    bias_initializer: Initializer for the bias vector.
    depthwise_regularizer: Regularizer function applied to
      the depthwise kernel matrix.
    bias_regularizer: Regularizer function applied to the bias vector.
    activity_regularizer: Regularizer function applied to
      the output of the layer (its 'activation').
    depthwise_constraint: Constraint function applied to
      the depthwise kernel matrix.
    bias_constraint: Constraint function applied to the bias vector.
  Input shape:
    4D tensor with shape:
    `[batch, channels, rows, cols]` if data_format='channels_first'
    or 4D tensor with shape:
    `[batch, rows, cols, channels]` if data_format='channels_last'.
  Output shape:
    4D tensor with shape:
    `[batch, filters, new_rows, new_cols]` if data_format='channels_first'
    or 4D tensor with shape:
    `[batch, new_rows, new_cols, filters]` if data_format='channels_last'.
    `rows` and `cols` values might have changed due to padding.
  """

  def __init__(self,
               kernel_size,
               strides=(1, 1),
               padding='valid',
               depth_multiplier=1,
               data_format=None,
               activation=None,
               use_bias=True,
               dilation_rate=(1, 1),  # new argument; default matches Conv2D
               depthwise_initializer='glorot_uniform',
               bias_initializer='zeros',
               depthwise_regularizer=None,
               bias_regularizer=None,
               activity_regularizer=None,
               depthwise_constraint=None,
               bias_constraint=None,
               **kwargs):
    super(DepthwiseConv2D, self).__init__(
        filters=None,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activation,
        use_bias=use_bias,
        bias_regularizer=bias_regularizer,
        activity_regularizer=activity_regularizer,
        bias_constraint=bias_constraint,
        **kwargs)
    self.depth_multiplier = depth_multiplier
    self.depthwise_initializer = initializers.get(depthwise_initializer)
    self.depthwise_regularizer = regularizers.get(depthwise_regularizer)
    self.depthwise_constraint = constraints.get(depthwise_constraint)
    self.bias_initializer = initializers.get(bias_initializer)

  def build(self, input_shape):
    if len(input_shape) < 4:
      raise ValueError('Inputs to `DepthwiseConv2D` should have rank 4. '
                       'Received input shape:', str(input_shape))
    input_shape = tensor_shape.TensorShape(input_shape)
    if self.data_format == 'channels_first':
      channel_axis = 1
    else:
      channel_axis = 3
    if input_shape.dims[channel_axis].value is None:
      raise ValueError('The channel dimension of the inputs to '
                       '`DepthwiseConv2D` '
                       'should be defined. Found `None`.')
    input_dim = int(input_shape[channel_axis])
    depthwise_kernel_shape = (self.kernel_size[0],
                              self.kernel_size[1],
                              input_dim,
                              self.depth_multiplier)

    self.depthwise_kernel = self.add_weight(
        shape=depthwise_kernel_shape,
        initializer=self.depthwise_initializer,
        name='depthwise_kernel',
        regularizer=self.depthwise_regularizer,
        constraint=self.depthwise_constraint)

    if self.use_bias:
      self.bias = self.add_weight(shape=(input_dim * self.depth_multiplier,),
                                  initializer=self.bias_initializer,
                                  name='bias',
                                  regularizer=self.bias_regularizer,
                                  constraint=self.bias_constraint)
    else:
      self.bias = None
    # Set input spec.
    self.input_spec = InputSpec(ndim=4, axes={channel_axis: input_dim})
    self.built = True

  def call(self, inputs):
    outputs = backend.depthwise_conv2d(
        inputs,
        self.depthwise_kernel,
        strides=self.strides,
        padding=self.padding,
        dilation_rate=self.dilation_rate,  # now forwarded to the backend op
        data_format=self.data_format)

    if self.use_bias:
      outputs = backend.bias_add(
          outputs,
          self.bias,
          data_format=self.data_format)

    if self.activation is not None:
      return self.activation(outputs)

    return outputs

  @tf_utils.shape_type_conversion
  def compute_output_shape(self, input_shape):
    if self.data_format == 'channels_first':
      rows = input_shape[2]
      cols = input_shape[3]
      out_filters = input_shape[1] * self.depth_multiplier
    elif self.data_format == 'channels_last':
      rows = input_shape[1]
      cols = input_shape[2]
      out_filters = input_shape[3] * self.depth_multiplier

    rows = conv_utils.conv_output_length(rows, self.kernel_size[0],
                                         padding=self.padding,
                                         stride=self.strides[0],
                                         dilation=self.dilation_rate[0])
    cols = conv_utils.conv_output_length(cols, self.kernel_size[1],
                                         padding=self.padding,
                                         stride=self.strides[1],
                                         dilation=self.dilation_rate[1])
    if self.data_format == 'channels_first':
      return (input_shape[0], out_filters, rows, cols)
    elif self.data_format == 'channels_last':
      return (input_shape[0], rows, cols, out_filters)

  def get_config(self):
    config = super(DepthwiseConv2D, self).get_config()
    config.pop('filters')
    config.pop('kernel_initializer')
    config.pop('kernel_regularizer')
    config.pop('kernel_constraint')
    config['depth_multiplier'] = self.depth_multiplier
    config['depthwise_initializer'] = initializers.serialize(
        self.depthwise_initializer)
    config['depthwise_regularizer'] = regularizers.serialize(
        self.depthwise_regularizer)
    config['depthwise_constraint'] = constraints.serialize(
        self.depthwise_constraint)
    return config
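For reference, a quick usage sketch of the layer with dilation (assuming a TF build that includes the fix referenced in the comment above):

import tensorflow as tf

layer = tf.keras.layers.DepthwiseConv2D(
    kernel_size=3, padding='same', dilation_rate=(2, 2))
y = layer(tf.random.normal((1, 56, 56, 32)))
print(y.shape)  # (1, 56, 56, 32); depth_multiplier defaults to 1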


closed time in 2 months

dajoke

Pull request review commenttensorflow/tensorflow

Support Keras grouped convolutions

@@ def __init__(self, rank, @@
   def build(self, input_shape):
     input_shape = tensor_shape.TensorShape(input_shape)
     input_channel = self._get_input_channel(input_shape)
-    kernel_shape = self.kernel_size + (input_channel, self.filters)
+    if input_channel % self.groups != 0:
+      raise ValueError(
+          'The number of input channels must be evenly divisible by the number '
+          'of groups. Received groups={}, but the input has {} channels '
+          '(full input shape is {}).'.format(self.groups, input_channel,
+                                             input_shape))
+    kernel_shape = self.kernel_size + (input_channel // self.groups,
+                                       self.filters)
 
     self.kernel = self.add_weight(

@lgeiger Though to be more concise, I think when we call tf.nn.conv2d the inputs don't need to be split, because the op can infer groups from the input channels and the kernel shape? (The docstring of that op needs to be updated as well.)

lgeiger

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

Support Keras grouped convolutions

@@ def __init__(self, rank, @@
   def build(self, input_shape):
     input_shape = tensor_shape.TensorShape(input_shape)
     input_channel = self._get_input_channel(input_shape)
-    kernel_shape = self.kernel_size + (input_channel, self.filters)
+    if input_channel % self.groups != 0:
+      raise ValueError(
+          'The number of input channels must be evenly divisible by the number '
+          'of groups. Received groups={}, but the input has {} channels '
+          '(full input shape is {}).'.format(self.groups, input_channel,
+                                             input_shape))
+    kernel_shape = self.kernel_size + (input_channel // self.groups,
+                                       self.filters)
 
     self.kernel = self.add_weight(

Oh I refreshed the page before your comment. Yeah this seems legit.

lgeiger

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

Support Keras grouped convolutions

@@ def __init__(self, rank, @@
   def build(self, input_shape):
     input_shape = tensor_shape.TensorShape(input_shape)
     input_channel = self._get_input_channel(input_shape)
-    kernel_shape = self.kernel_size + (input_channel, self.filters)
+    if input_channel % self.groups != 0:
+      raise ValueError(
+          'The number of input channels must be evenly divisible by the number '
+          'of groups. Received groups={}, but the input has {} channels '
+          '(full input shape is {}).'.format(self.groups, input_channel,
+                                             input_shape))
+    kernel_shape = self.kernel_size + (input_channel // self.groups,
+                                       self.filters)
 
     self.kernel = self.add_weight(

Ok, I'm assuming this will end up calling tf.nn.conv2d with inputs of shape (H, W, IN_CHANNEL) and filters of shape (K, K, IN_CHANNEL/GROUPS, OUT_CHANNEL), and under the hood (at the kernel level) it will infer the groups, compute the per-group conv outputs of OUT_CHANNEL/GROUPS channels each, and concatenate them before returning.

lgeiger

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

Support Keras grouped convolutions

@@ def __init__(self, rank, @@
   def build(self, input_shape):
     input_shape = tensor_shape.TensorShape(input_shape)
     input_channel = self._get_input_channel(input_shape)
-    kernel_shape = self.kernel_size + (input_channel, self.filters)
+    if input_channel % self.groups != 0:
+      raise ValueError(
+          'The number of input channels must be evenly divisible by the number '
+          'of groups. Received groups={}, but the input has {} channels '
+          '(full input shape is {}).'.format(self.groups, input_channel,
+                                             input_shape))
+    kernel_shape = self.kernel_size + (input_channel // self.groups,
+                                       self.filters)
 
     self.kernel = self.add_weight(

Right, so filters==output_channels. What I don't understand here is, let's say we want to map a (55, 55, 32) conv input to a (27, 27, 64) conv output with groups=2. So apparently we need 2 different kernels, both with shape (3, 3, 16, 64). The first kernel will map the first half (55, 55, 16) to (27, 27, 64), and the second kernel will map the second half (55, 55, 16) to (27, 27, 64). And the final result would be the sum of the two. So either we need 2 kernels in 1 conv layer, or we need 2 conv layers. Am I missing something here? (Feel free to point out where I'm wrong)
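For reference, the semantics that resolve this confusion: each group produces filters // groups output channels and the group outputs are concatenated, not summed, so a single kernel variable of shape (3, 3, 16, 64) covers both groups. A minimal sketch, assuming a TF version where Conv2D accepts a groups argument:

import tensorflow as tf

x = tf.random.normal((1, 55, 55, 32))

# Grouped conv: one kernel variable of shape (3, 3, 32 // 2, 64).
y = tf.keras.layers.Conv2D(64, 3, groups=2)(x)   # -> (1, 53, 53, 64)

# Structurally equivalent wiring (weights here are independent, so the
# numbers differ, but the channel bookkeeping is the same):
x1, x2 = tf.split(x, 2, axis=-1)                 # two (1, 55, 55, 16) halves
y_ref = tf.concat(
    [tf.keras.layers.Conv2D(32, 3)(x1),          # each group: 64 // 2 filters
     tf.keras.layers.Conv2D(32, 3)(x2)],
    axis=-1)                                     # -> (1, 53, 53, 64)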

lgeiger

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

Support Keras grouped convolutions

@@ def __init__(self, rank, @@
   def build(self, input_shape):
     input_shape = tensor_shape.TensorShape(input_shape)
     input_channel = self._get_input_channel(input_shape)
-    kernel_shape = self.kernel_size + (input_channel, self.filters)
+    if input_channel % self.groups != 0:
+      raise ValueError(
+          'The number of input channels must be evenly divisible by the number '
+          'of groups. Received groups={}, but the input has {} channels '
+          '(full input shape is {}).'.format(self.groups, input_channel,
+                                             input_shape))
+    kernel_shape = self.kernel_size + (input_channel // self.groups,
+                                       self.filters)
 
     self.kernel = self.add_weight(

I'm a little confused here: don't we need self.groups kernels, each of shape (kernel_size,) + (input_channel // self.groups, self.filters)?

lgeiger

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

Support Keras grouped convolutions

@@ def __init__(self, rank, @@
     if filters is not None and not isinstance(filters, int):
       filters = int(filters)
     self.filters = filters
+    self.groups = groups
+    if filters is not None and filters % self.groups != 0:

What if self.groups = None? Should we allow that and make None == 1?

lgeiger

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

Support Keras grouped convolutions

@@ def test_conv3d_dynamic_shape(self): @@
             input_shape=(None, 3, None, None, None),
             input_data=input_data)
 
+class GroupedConvTest(keras_parameterized.TestCase):
+  @parameterized.named_parameters(
+      ('Conv1D', keras.layers.Conv1D),
+      ('Conv2D', keras.layers.Conv2D),
+      ('Conv3D', keras.layers.Conv3D),
+  )
+  def test_group_conv_incorrect_use(self, layer):
+    with self.assertRaisesRegexp(ValueError, 'The number of filters'):
+      layer(16, 3, groups=3)
+    with self.assertRaisesRegexp(ValueError, 'The number of input channels'):
+      layer(16, 3, groups=4).build((32, 12, 12, 3))
+
+  @parameterized.named_parameters(
+      ('Conv1D', keras.layers.Conv1D, (32, 12, 32)),
+      ('Conv2D', keras.layers.Conv2D, (32, 12, 12, 32)),
+      ('Conv3D', keras.layers.Conv3D, (32, 12, 12, 12, 32)),
+  )
+  def test_group_conv(self, layer, input_shape):
+    if test.is_gpu_available(cuda_only=True):
+      with self.cached_session(use_gpu=True):

I think we are using with test_util.use_gpu(): these days.

lgeiger

comment created time in 2 months

pull request commenttensorflow/tensorflow

Support Keras grouped convolutions

Somehow this wasn't on my radar. Looking into it now.

lgeiger

comment created time in 2 months

issue closedtensorflow/tensorflow

inconsistent default parameters for adagrad optimizer

System information TensorFlow 2.1.0 (tested on anaconda package and docker image)

Describe the current behavior

The default value for the initial_accumulator_value parameter of the Adagrad optimizer is different depending on whether it is passed as a string or as an instance of the optimizer class. This may lead to drastic differences in learning behavior, which is not apparent from the code.

Describe the expected behavior

Both variants should use the same default parameters.

Standalone code to reproduce the issue

import tensorflow as tf

model = tf.keras.models.Model()
model.compile(optimizer='adagrad')
model.optimizer.get_config()
# {'name': 'Adagrad', 'learning_rate': 0.001, 'decay': 0.0, 'initial_accumulator_value': 0.0, 'epsilon': 1e-07}

tf.keras.optimizers.Adagrad().get_config()
# {'name': 'Adagrad', 'learning_rate': 0.001, 'decay': 0.0, 'initial_accumulator_value': 0.1, 'epsilon': 1e-07}
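Until the defaults are reconciled, a workaround sketch is to pass a configured instance so the value is explicit rather than depending on the string-vs-instance code path:

import tensorflow as tf

model = tf.keras.models.Model()
# Explicit instance: initial_accumulator_value is now unambiguous.
model.compile(
    optimizer=tf.keras.optimizers.Adagrad(initial_accumulator_value=0.1))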

closed time in 2 months

chstem

issue commenttensorflow/tensorflow

inconsistent default parameters for adagrad optimizer

This is fixed here

chstem

comment created time in 2 months

Pull request review commenttensorflow/tensorflow

Use case clarification comments #36785

 class Adadelta(optimizer_v2.OptimizerV2):   don't have to set an initial learning rate. In this version, initial   learning rate can be set, as in most other Keras optimizers. -  Args:-    learning_rate: A `Tensor`, floating point value, or a schedule that is a-      `tf.keras.optimizers.schedules.LearningRateSchedule`. The learning rate.-      To match the exact form in the original paper use 1.0.-    rho: A `Tensor` or a floating point value. The decay rate.-    epsilon: A `Tensor` or a floating point value.  A constant epsilon used-             to better conditioning the grad update.-    name: Optional name prefix for the operations created when applying-      gradients.  Defaults to `"Adadelta"`.-    **kwargs: Keyword arguments. Allowed to be one of-      `"clipnorm"` or `"clipvalue"`.-      `"clipnorm"` (float) clips gradients by norm; `"clipvalue"` (float) clips-      gradients by value.--  Reference:

I think you should revert this; it resolves the merge conflict by going back to the previous version instead of the latest one.

abhilash1910

comment created time in 2 months

pull request commenttensorflow/tensorflow

Added Custom Dense layer specially for SNN's

Maybe contribute this to TensorFlow Addons?

Sanyam8055

comment created time in 2 months

issue closedtensorflow/tensorflow

Negative sampling / candidate sampling with Keras API

<em>Please make sure that this is a feature request. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:feature_template</em>

System information

  • TensorFlow version (you are using): 2.0
  • Are you willing to contribute it (Yes/No): Yes

Describe the feature and the current behavior/state.

It looks like tf.keras has no built-in functions for candidate sampling like TF 1.x had. As far as I can tell, the official Keras word2vec example (https://www.tensorflow.org/tutorials/text/word_embeddings) doesn't use negative sampling, even though it's an integral part of producing quality word embeddings.

Will this change the current api? How?

The original TensorFlow losses had options for negative sampling, including different distributions to sample from. So perhaps some new loss functions and candidate-sampling options will need to be added.
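For context, the TF1-era candidate-sampling ops still exist under tf.nn in TF2 and can be wired into a custom training step; a minimal sketch (all sizes hypothetical):

import tensorflow as tf

vocab_size, embed_dim, num_sampled = 10000, 128, 64

# Output-side embedding table and bias used by the sampled loss.
softmax_w = tf.Variable(tf.random.normal([vocab_size, embed_dim]))
softmax_b = tf.Variable(tf.zeros([vocab_size]))

def negative_sampling_loss(hidden, labels):
    # hidden: (batch, embed_dim) activations; labels: (batch, 1) int64 ids.
    # Draws num_sampled negatives from a log-uniform distribution by default.
    return tf.reduce_mean(
        tf.nn.sampled_softmax_loss(
            weights=softmax_w,
            biases=softmax_b,
            labels=labels,
            inputs=hidden,
            num_sampled=num_sampled,
            num_classes=vocab_size))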

Who will benefit with this feature?

Anyone who is training embedding models.

closed time in 2 months

Santosh-Gupta

issue commenttensorflow/tensorflow

Negative sampling / candidate sampling with Keras API

Closing this. If the above article doesn't help you, let us know!

Santosh-Gupta

comment created time in 2 months

issue commenttensorflow/tensorflow

Negative sampling / candidate sampling with Keras API

@Santosh-Gupta What do you mean by "The original tensorflow losses had options for negative sampling, including different distributions to select from"? Which of those options is missing in TF2?

Santosh-Gupta

comment created time in 2 months

issue commenttensorflow/tensorflow

Need a way to get Intermediate Layer Inputs/Activations for tf.keras Models

Closing it for now. Feel free to re-open if you still have concerns :-)

n2cholas

comment created time in 2 months

issue closedtensorflow/tensorflow

Need a way to get Intermediate Layer Inputs/Activations for tf.keras Models

<em>Please make sure that this is a feature request. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:feature_template</em>

System information

  • TensorFlow version (you are using): 2.0
  • Are you willing to contribute it (Yes/No): Yes, if there is a consensus on how it should be designed.

Describe the feature and the current behavior/state. In eager mode, there is no way to access a tf.keras model's layer inputs/outputs during training (as far as I can tell, please correct me if I'm wrong). In TF 1.x (graph mode), this was not a problem, since you could use <s>layer.input or layer.output</s> layer.inbound_nodes or layer.outbound_nodes to get these tensors and use those values, but this is no longer possible in eager mode.

PyTorch solves this issue by allowing users to register hooks on layers, which are essentially functions that are called before/after the forward/backward pass on a layer. Here is the code for register_forward_hook in PyTorch.

Alternatively, the input/output properties of a layer could store a reference to the tensors used in the most recent forward pass.

Will this change the current api? How? Yes, depending on how this is implemented. If a hooks approach is used, this public method would have to be added to tf.keras.layers.Layer. If the input/output property approach is used, these properties would have new behavior in eager mode.

Who will benefit with this feature? Being able to access, record, manipulate, or otherwise use layer inputs and outputs for models during training/inference is generally very useful.

A specific example is the K-FAC optimization algorithm, which uses each layer's inputs and pre-activation gradients to approximate the Fisher information matrix. The current implementation does not support eager. PyTorch implementations (e.g. this one) of this algorithm use hooks to do this.

Another use case is visualizing intermediate activations of CNNs. This example uses layer.outputs in TF 1.x + Keras to grab the right tensors and then create an augmented model. This process would be greatly simplified by allowing access to intermediate activations without augmenting the model.
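(For reference, a minimal sketch of that augmented-model workaround in TF2; the layer name is invented for illustration and it only works for functional models:)

import tensorflow as tf

inputs = tf.keras.Input(shape=(32,))
x = tf.keras.layers.Dense(16, activation='relu', name='hidden')(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)

# Auxiliary model that exposes the intermediate activation alongside the
# original prediction; no retraining or model surgery required.
extractor = tf.keras.Model(
    inputs=model.inputs,
    outputs=[model.get_layer('hidden').output, model.output])

hidden_act, preds = extractor(tf.random.normal((4, 32)))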

Any Other info. Here is a related issue about getting intermediate activations.

EDIT (2019-10-20): I learned that layer.inbound_nodes and layer.outbound_nodes used to have this behavior, not layer.input and layer.output. layer.input and layer.output track the tensors that are created when model.build(input_shape) is called. When you use the model as a callable on your own input (i.e. predictions = model(inputs)), new tensors are created (reusing the model's weights/architecture). In TF 1.x, inputs would be added to the inbound_nodes list and predictions would be added to the outbound_nodes list during this call. Now, since in eager mode the model is called on a new EagerTensor for every training step, it is not reasonable to add that many new tensors to these lists, so the property was deprecated. Since this feature used to exist, I think it's important for a reasonable replacement to exist in TF2 (such as hooks or a layer property tracking the most recent inputs/outputs).

closed time in 2 months

n2cholas