Rohan Jain (rohan100jain), @google, Mountain View

rohan100jain/Beatameister 2

Music 256a beat boxing project

rohan100jain/Movie-Visualization 2

Creating a visualization for movies

rohan100jain/Cloud-Prize 0

Description and terms for the Netflix Cloud Prize, which runs from March-September 2013. Read the rules, fork to your GitHub account to create a Submission, then send us your email address.

rohan100jain/community 0

Stores documents used by the TensorFlow developer community

rohan100jain/Snowcleaning 0

Snow Cleaning solution for TopCoder Marathon Match 79

rohan100jain/tensorboard 0

TensorFlow's Visualization Toolkit

rohan100jain/tensorflow 0

An Open Source Machine Learning Framework for Everyone

issue comment tensorflow/tensorflow

tf.math.maximum example is written incorrectly.

Thanks! Feel free to submit a PR

abhinavsp0730

comment created time in a day

issue comment tensorflow/tensorflow

More elaborate logging implementation (implementing TFLogSinks, etc.)

We're not actively working on this right now but would love it if you could contribute a PR here. Will be more than happy to review / support.

csachs

comment created time in 5 days

issue closed tensorflow/tensorflow

Thread hang for Stage/MapStage op when setting inter_op_parallelism_threads=1

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • TensorFlow version (use command below): tf-nightly
  • Python version: 3.6.8

Describe the current behavior

The thread will hang when inter_op_parallelism_threads is set to 1.

Standalone code to reproduce the issue

from six.moves import queue as Queue
import threading

import tensorflow as tf
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import ops
from tensorflow.python.framework.ops import disable_eager_execution
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import data_flow_ops
from tensorflow.python.platform import test

disable_eager_execution()
tf.config.threading.set_inter_op_parallelism_threads(num_threads=1)  # passes if set to 2


class ThreadHangTest(test.TestCase):
    """test Stage/MapStage"""

    def testStage(self):
        capacity = 3
        with ops.device(test.gpu_device_name()):
            x = array_ops.placeholder(dtypes.int32, name='x')
            stager = data_flow_ops.StagingArea([dtypes.int32, ], capacity=capacity, shapes=[[]])

        queue = Queue.Queue()
        with self.session() as sess:
            def thread_run():
                for i in range(capacity + 1):
                    sess.run(stager.put([x]), feed_dict={x: i})
                    queue.put(0)

            t = threading.Thread(target=thread_run)
            t.daemon = True
            t.start()

            try:
                for i in range(capacity + 1):
                    queue.get(timeout=1)
            except Queue.Empty:
                pass

            for i in range(capacity):
                sess.run(stager.get())

    def testMapStage(self):
        capacity = 3
        with ops.device(test.gpu_device_name()):
            x = array_ops.placeholder(dtypes.int32, name='x')
            pi = array_ops.placeholder(dtypes.int64, name='pi')
            map_stager = data_flow_ops.MapStagingArea([dtypes.int32, ], capacity=capacity, shapes=[[]])

        queue = Queue.Queue()
        with self.session() as sess:
            def thread_run():
                for i in range(capacity + 1):
                    sess.run(map_stager.put(pi, [x], [0]), feed_dict={x: i, pi: i})
                    queue.put(0)

            t = threading.Thread(target=thread_run)
            t.daemon = True
            t.start()

            try:
                for i in range(capacity + 1):
                    queue.get(timeout=1)
            except Queue.Empty:
                pass

            for i in range(capacity):
                sess.run(map_stager.get())


if __name__ == '__main__':
    test.main()

closed time in 8 days

GHGmc2

issue comment tensorflow/tensorflow

Thread hang for Stage/MapStage op when setting inter_op_parallelism_threads=1

I think this is working as intended. The issue is that the StagingArea.put() operation blocks because you're pushing in one more element than the capacity. That blocked put() occupies a thread.

That means the StagingArea.get() operation, which is supposed to pop an element and unblock the put(), can't proceed because there is no thread left to run it on.
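
For reference, here's a rough sketch (my own adaptation of the repro above, not code from the report) of the same pattern with two inter-op threads, where the blocked put() no longer starves the get():

import threading

import tensorflow as tf
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import data_flow_ops

tf.compat.v1.disable_eager_execution()
# Leave at least one inter-op thread free so get() can run while put() blocks.
tf.config.threading.set_inter_op_parallelism_threads(2)

capacity = 3
x = array_ops.placeholder(tf.int32, name='x')
stager = data_flow_ops.StagingArea([tf.int32], capacity=capacity, shapes=[[]])

with tf.compat.v1.Session() as sess:
    def producer():
        for i in range(capacity + 1):  # the last put() blocks until a get() frees a slot
            sess.run(stager.put([x]), feed_dict={x: i})

    t = threading.Thread(target=producer, daemon=True)
    t.start()

    for _ in range(capacity + 1):
        print(sess.run(stager.get()))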

GHGmc2

comment created time in 8 days

issue comment tensorflow/tensorflow

TF2 Warning: Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.

I can trace the warning down to https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/module/module.py#L338, where we pass a set() to nest.flatten, which triggers the warning. I don't think this is something you need to worry about. Assigning to Tom, who wrote tf.Module.
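
For reference, a tiny sketch of what triggers it (assuming the public tf.nest.flatten goes through the same code path as the internal nest used in module.py):

import tensorflow as tf

# Flattening a set emits the "Sets are not currently considered sequences"
# warning; flattening a list or dict does not.
tf.nest.flatten({"a", "b"})   # warns
tf.nest.flatten(["a", "b"])   # no warning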

songs18

comment created time in 12 days

issue closed tensorflow/tensorflow

Adam optimizing performance issue

Hi, using tensorflow 2.2.0-rc3 on Ubuntu 18.04 and Python 3.8.2, I've converted my code to both compat.v1 and v2, which was required by tf2onnx. There is a huge difference in speed between v1 and v2, and I was wondering how to solve this. My v2 implementation is:

	built_model = tf.keras.Model(inputs=model.inputs, outputs=model.outputs)
	model.load_weights()

	@tf.function
	def train_fn(inputs, targets):
		with tf.GradientTape() as tape:
			outputs = built_model(inputs)
			loss, metrics, _ = dataset.loss(outputs, targets)

		gradients = tape.gradient(loss, built_model.trainable_variables)
		optimizer.apply_gradients(zip(gradients, built_model.trainable_variables))
		return loss, metrics, learning_rate, global_step

while my v1 is:

	session = tf.compat.v1.Session()
	tf.compat.v1.keras.backend.set_session(session)
	model.load_weights(session=session)

	loss, metrics, targets_pl = dataset.loss(model.outputs)
	tvars = tf.compat.v1.trainable_variables()
	gvs = optimizer.get_gradients(loss, tvars)
	train_op = optimizer.apply_gradients(zip(gvs, tvars))

	def train_fn(inputs, targets):
		loss_, _, global_step_, metrics_, lr_ = session.run([loss, train_op, global_step, metrics, learning_rate], feed_dict=dict(zip(model.inputs+targets_pl, inputs+targets)))
		return loss_, metrics_, lr_, global_step_

I also have a predict function that runs equally fast in both versions. It looks like the gradient computation is what's slow. I would like to migrate everything to v2. Thank you for your help.

closed time in 13 days

christopher5106

issue comment tensorflow/tensorflow

Adam optimizing performance issue

I was able to identify and debug the issue: the problem was that we were forcing all the embedding layer variables onto the CPU, and that was slowing down the model. https://github.com/tensorflow/tensorflow/commit/46f7108d78c6a3c0854fe66ce1cd92e5ebb3d6e2?diff=split should fix the issue, and from some preliminary testing it's probably even 5-10% faster than before (not rigorously tested, though).
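
If you want to double-check on your side, a quick sketch like the one below (a toy model of mine, not your code) prints which device each variable ended up on, so you can confirm the embedding variables are no longer pinned to the CPU:

import tensorflow as tf

# Toy model just to illustrate inspecting variable placement.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=64),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),
])
model.build(input_shape=(None, 10))

for v in model.variables:
    print(v.name, v.device)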

christopher5106

comment created time in 13 days

issue comment tensorflow/tensorflow

tf.py_function could return a dictionary of tensors

I think this would be a nice feature, but I think you can work around the limitation by constructing your py_function appropriately, since tf.data supports dictionaries:

import tensorflow as tf

def tokenizer(x):
    return {"input_ids": [101, 13366, 2131, 1035, 6819, 2094, 1035, 102],
            "attention_mask": [1, 1, 1, 1, 1, 1, 1, 1]}

def py_func(x):
    d = tokenizer(x)
    return list(d.values())

def ds_map_fn(x):
    flattened_output = tf.py_function(py_func, [x], [tf.int32, tf.int32])
    return {"input_ids": flattened_output[0], "attention_mask": flattened_output[1]}

ds = tf.data.Dataset.range(2)
ds = ds.map(ds_map_fn)

for value in ds:
    print(value)

Ceceu

comment created time in 14 days

issue closed tensorflow/tensorflow

TF 2.1: inserting into MutableHashTable results into error

System information

  • Have written custom code
  • OS Platform and Distribution: Linux Ubuntu 18.04
  • TensorFlow installed from binary: pip install tensorflow-gpu==2.1.0
  • TensorFlow version: 2.1.0
  • Python version: 3.6
  • CUDA/cuDNN version: 10/7.0
  • GPU model and memory: GTX1070 and 6GB

Describe the current behavior

I am trying to insert some key:value pairs into a MutableHashTable, which fails with the error shown below.

Describe the expected behavior

Using the contrib equivalent of MutableHashTable does not produce this error.

Standalone code to reproduce the issue

tf.compat.v1.disable_eager_execution()

CHARMAP = ['', '', ''] + list('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')

with tf.device('/gpu:0'):
    table = tf.raw_ops.MutableHashTable(
            key_dtype=tf.int64,
            value_dtype=tf.string,
    )

insert = table.insert(tf.constant(list(range(len(CHARMAP))), dtype=tf.int64),
                     tf.constant(CHARMAP)
                     )

Other info / logs

Here is the output:

 File "test.py", line 12, in <module>
    insert = table.insert(tf.constant(list(range(len(CHARMAP))), dtype=tf.int64),
AttributeError: 'Tensor' object has no attribute 'insert'

closed time in 15 days

vladdders

issue comment tensorflow/tensorflow

TF 2.1: inserting into MutableHashTable results into error

tf.raw_ops.MutableHashTable returns a tensor handle corresponding to the table, which then needs to be passed to an init op / insert op, etc., rather than used directly.

I think what you want is tf.lookup.experimental.DenseHashTable: https://www.tensorflow.org/api_docs/python/tf/lookup/experimental/DenseHashTable

import tensorflow as tf
tf.compat.v1.disable_eager_execution()

CHARMAP = ['', '', ''] + list('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')

with tf.device('/gpu:0'):
    table = tf.lookup.experimental.DenseHashTable(
        key_dtype=tf.int64,
        value_dtype=tf.string,
        default_value='_',
        # empty_key / deleted_key must not collide with real keys (0..len(CHARMAP)-1).
        empty_key=-1,
        deleted_key=-2,
    )

table.insert(tf.constant(list(range(len(CHARMAP))), dtype=tf.int64),
             tf.constant(CHARMAP))

Please let me know if you have any issues.

vladdders

comment created time in 15 days

issue closed tensorflow/tensorflow

Memory leakage when converting to tensor

import numpy as np
import tensorflow as tf

for i in range(5000):
    print(i)
    array = np.random.random((1024, 1024))
    tf.convert_to_tensor(array, dtype=tf.float32)

TensorFlow version is 1.14.0, NumPy version is 1.17.0, Python version is 3.6.8. The process is killed when i ~= 2400 on my machine. The command "watch -d free -m" shows that free memory decreases over time until it gets close to zero, and then the process crashes.

I did not find a way to free the memory from the unreferenced tensors

Best, Benoît

closed time in 15 days

benoitkoenig

issue comment tensorflow/tensorflow

Memory leakage when converting to tensor

Can confirm that there isn't any memory increase with TF2 (as per Dan's last comment). Closing issue for now. Please let me know if you run into problems.
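
For anyone who wants to re-check locally, a sketch along these lines (the report's loop under TF2 eager execution) should stay flat, since unreferenced EagerTensors are released as soon as the Python reference goes away:

import numpy as np
import tensorflow as tf  # TF 2.x, eager execution enabled by default

for i in range(5000):
    array = np.random.random((1024, 1024))
    # The resulting EagerTensor is unreferenced and freed on the next iteration.
    tf.convert_to_tensor(array, dtype=tf.float32)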

benoitkoenig

comment created time in 15 days

issue comment tensorflow/tensorflow

Calling custom op changes data type

Was curious how you resolved the problem? Usually when we have a list of potential values for Attrs, it's for the types that we'd like to have kernels for, e.g. what you do with the "T" attr there. For the "alpha" attribute, it should be either a float or a double. If you want to support both, you can do

REGISTER_OP("Mean2D")
    .Attr("T: {float, double}")
    .Attr("alpha_type: {float, double}")
    .Attr("alpha: alpha_type")
    .Input("img: T")
    .Output("out: T")

and then you can add that as a type constraint in your kernel registration. You'll then have to templatize your kernel etc.

There is still a problem, though: if you had done

REGISTER_OP("Mean2D")
    .Attr("T: {float, double}")
    .Attr("alpha: double")
    .Input("img: T")
    .Output("out: T")

you would run into issues with the kernel, because GetAttr isn't overloaded for double, as you pointed out. Would you be willing to do a PR for this? It simply involves adding a double overload to GetNodeAttr here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/node_def_util.h#L192

Bidski

comment created time in 20 days

issue comment tensorflow/tensorflow

Calling custom op changes data type

This is definitely strange behavior. Let's try to find a minimal reproduction so that we can figure out what the problem is. One idea would be to run this op by itself, without it being part of a Keras model, feeding in some random values of img and alpha that mimic the failure case.
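
Something along these lines could serve as a starting point (a sketch only; the mean2d.so path and the generated op name are placeholders for however your build actually exposes the op):

import numpy as np
import tensorflow as tf

# Hypothetical library path and op name: adjust to match your build.
mean2d_module = tf.load_op_library('./mean2d.so')

img = tf.constant(np.random.random((1, 32, 32, 3)), dtype=tf.float32)
out = mean2d_module.mean2d(img=img, alpha=0.5)
print(out.dtype, out.shape)  # check whether the output dtype changes unexpectedly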

Bidski

comment created time in 25 days
