
Optimizer clipvalue and clipnorm not working in TensorFlow 2.0

System information

  • Python 3 Google Compute Engine backend (GPU)

Describe the current behavior

clipvalue and clipnorm in optimizers do nothing!

Describe the expected behavior

With clipvalue=0 or clipnorm=0, no training should occur (gradients should be 0!), but the network still trains, and with a large learning rate the loss goes to NaN.
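To make the expectation concrete, here is a minimal pure-Python sketch of the clipping math (not TensorFlow's implementation). The helper names `clip_by_value`, `clip_by_norm` and `sgd_step` are hypothetical and exist only for illustration. With a threshold of 0, every clipped gradient component is zero, so even a very large learning rate should leave the weights untouched.

```python
import math

def clip_by_value(grads, clipvalue):
    """Clamp each gradient component to [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in grads]

def clip_by_norm(grads, clipnorm):
    """Rescale the gradient so that its L2 norm is at most clipnorm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= clipnorm:
        return list(grads)  # already within the threshold
    return [g * clipnorm / norm for g in grads]

def sgd_step(weights, grads, lr):
    """Plain gradient descent update (illustrative only)."""
    return [w - lr * g for w, g in zip(weights, grads)]

weights = [1.0, -2.0]
grads = [3.0, -4.0]

# A threshold of 0 zeroes the gradient, so the update is a no-op
# regardless of the learning rate.
after_clipvalue = sgd_step(weights, clip_by_value(grads, 0.0), lr=10.0)
after_clipnorm = sgd_step(weights, clip_by_norm(grads, 0.0), lr=10.0)

print(after_clipvalue == weights, after_clipnorm == weights)
```

If clipping were applied as expected, the reported behavior (weights changing at every iteration) could not happen with these settings.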

Code to reproduce the issue

[image]

The gradient is clearly not zero, since the network is modified at each iteration.

[image]

Sanity check by setting lr=0: no training occurs when lr=0, as expected.

tensorflow/tensorflow

Answer from tomerk

@karlhjm no, we still disable it in 2.2 w/ distribution strategies enabled.

@zaccharieramzi yes, happy to elaborate: There are two possible places to clip when you have distribution strategies enabled:

  • before gradients get aggregated (usually wrong)
  • after gradients get aggregated (usually right & what people expect)

We want it working with the second case (clipping after gradients are aggregated). The issue is that the optimizers are written so that clipping happens in the code before aggregation does.
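A small pure-Python sketch of why the two places give different results. It is not the optimizer code: the helper names are hypothetical, the two-replica gradients are made-up numbers, and summation is assumed as the aggregation step.

```python
import math

def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def clip_by_norm(vec, clipnorm):
    """Rescale vec so that its L2 norm is at most clipnorm."""
    norm = l2_norm(vec)
    if norm <= clipnorm:
        return list(vec)
    return [v * clipnorm / norm for v in vec]

def aggregate(per_replica):
    """Sum gradients component-wise across replicas (assumed aggregation)."""
    return [sum(components) for components in zip(*per_replica)]

# Made-up gradients for the same variable on two replicas.
per_replica_grads = [[3.0, 4.0], [0.3, 0.4]]
clipnorm = 1.0

# Case 1: clip on each replica, then aggregate (usually wrong).
clip_then_aggregate = aggregate(
    [clip_by_norm(g, clipnorm) for g in per_replica_grads]
)

# Case 2: aggregate first, then clip once (usually what people expect).
aggregate_then_clip = clip_by_norm(aggregate(per_replica_grads), clipnorm)

print("clip then aggregate, norm:", l2_norm(clip_then_aggregate))
print("aggregate then clip, norm:", l2_norm(aggregate_then_clip))
```

In this example, clipping before aggregation yields an applied gradient whose norm exceeds clipnorm, while clipping after aggregation respects the bound.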

We looked into changing this, but it would have required either:

  1. API changes that break existing users of optimizer apply_gradients/other non-minimize methods
  2. changing the signatures of methods optimizer implementers need to implement, breaking existing custom optimizers

So rather than:

  • quietly doing clipping in the wrong place
  • increasing churn & breaking existing users or existing custom optimizers just for this individual feature

We instead decided to leave this disabled for now. We'll roll support for this into a larger optimizer refactoring that solves a larger set of issues. (RFC for that is at https://github.com/tensorflow/community/pull/234)
