Zongheng Yang (concretevitamin)
UC Berkeley @ucbrise · Berkeley, CA
https://zongheng.me/
CS PhD student at UC Berkeley

concretevitamin/haskeme 3

A Scheme interpreter written in Haskell.

concretevitamin/NLP_Stanford_Coursera 3

Several project-like assignments for Coursera.org's Stanford NLP class.

concretevitamin/join-order-benchmark 2

Join Order Benchmark queries and join graph visualizations

concretevitamin/cs294-ai-sys-sp19 1

CS294; AI For Systems and Systems For AI

concretevitamin/dotfiles_public 1

A subset of my dotfiles, potentially modified so that they don't contain sensitive information.

concretevitamin/hive-testbench 1

Testbench for experimenting with Apache Hive at any data scale.

concretevitamin/acme 0

A library of reinforcement learning components and agents

concretevitamin/arrow 0

Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.

concretevitamin/assignment1 0

Part 1 of the introductory assignment on building our own compiler

fork concretevitamin/neurocard

State-of-the-art neural cardinality estimators for join queries

fork in 3 days

started neurocard/neurocard

started time in 3 days

push event neurocard/neurocard

Zongheng Yang

commit sha 752ac3a3dabedcfa34e85e5cbd8bc37a103a11c1

Public release of NeuroCard.

push time in 3 days

create branch neurocard/neurocard

branch : master

created branch time in 7 days

created repository neurocard/neurocard

created time in 7 days

fork concretevitamin/transformers

🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.

https://huggingface.co/transformers

fork in 8 days

issue comment ray-project/ray

[rllib] DQN Torch policy + parametric is broken on GPU

dqn_torch_policy.py may need to be fixed too.

concretevitamin

comment created time in a month

Pull request review comment ray-project/ray

[WIP] Try fixing torch GPU and masking errors

 def forward(self, input_dict, state, seq_lens):
         # Mask out invalid actions (use -inf to tag invalid).
         # These are then recognized by the EpsilonGreedy exploration component
         # as invalid actions that are not to be chosen.
-        inf_mask = torch.clamp(
-            torch.log(action_mask), -float("inf"), float("inf"))
+        inf_mask = torch.clamp(torch.log(action_mask), -1e15, 1e15)
+        inf_mask = inf_mask.to(self.device)

This may not be the culprit causing the device issue. All inputs should've been transferred to the device already.

ericl

comment created time in a month

Pull request review comment ray-project/ray

[WIP] Try fixing torch GPU and masking errors

 def forward(self, input_dict, state, seq_lens):
         # Mask out invalid actions (use -inf to tag invalid).
         # These are then recognized by the EpsilonGreedy exploration component
         # as invalid actions that are not to be chosen.
-        inf_mask = torch.clamp(
-            torch.log(action_mask), -float("inf"), float("inf"))
+        inf_mask = torch.clamp(torch.log(action_mask), -1e15, 1e15)

It may be more intuitive to leave the max value as 0 (log(1)).

ericl

comment created time in a month

issue comment ray-project/ray

[rllib] Action mask support using -inf for PyTorch is broken

Just that change isn't enough. See https://github.com/ray-project/ray/blob/master/rllib/utils/exploration/epsilon_greedy.py#L142 .

On Mon, Aug 17, 2020 at 3:15 PM Eric Liang notifications@github.com wrote:

@concretevitamin https://github.com/concretevitamin does this fix the example https://github.com/ray-project/ray/pull/10168/files ?

concretevitamin

comment created time in a month

issue opened ray-project/ray

[rllib] Action mask support using -inf for PyTorch is broken

What is the problem?

Ray 0.8.6.

Reproduction (REQUIRED)

# Set inside "num_gpus": 1.
rllib/examples$ p parametric_actions_cartpole.py  --torch

hits the following error:


  File "/home/ubuntu/anaconda3/envs/exp/lib/python3.7/site-packages/tree/__init__.py", line 516, in <listcomp>
    [func(*args) for args in zip(*map(flatten, structures))])
  File "/home/ubuntu/anaconda3/envs/exp/lib/python3.7/site-packages/ray/rllib/utils/torch_ops.py", line 121, in mapping
    item.cpu().detach().numpy()
RuntimeError: CUDA error: device-side assert triggered

Using this same setup on my custom environment & (torch) model, the error manifests explicitly as:

  • Using StochasticSampling, plus making sure my model returns -1e15 for invalid actions: works.
  • Using EpsilonGreedy, plus making sure my model returns -float('inf') for invalid actions: ERROR; model weights quickly get trained to NaN.

Suggestion: perhaps using float.min gives better numerical stability.
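The arithmetic behind this can be sketched in plain Python, without rllib; the logits and mask below are made-up values for illustration. IEEE 754 defines inf * 0 = NaN, so a -inf-masked logit that later meets a zero multiplier (as happens in loss computations) poisons everything downstream, while a large finite sentinel like -1e15 stays finite and still loses every argmax:

```python
import math

# Hypothetical logits for four actions; actions 1 and 3 are invalid.
logits = [2.0, 0.5, -1.0, 1.5]
valid = [1.0, 0.0, 1.0, 0.0]

# Masking with -inf: IEEE 754 defines inf * 0 = NaN, so any later step
# that multiplies a masked entry by zero (e.g. a done-mask in a loss)
# produces NaN, which then propagates through training.
inf_masked = [l if v else -math.inf for l, v in zip(logits, valid)]
print(math.isnan(inf_masked[1] * 0.0))   # True

# Masking with a large finite sentinel keeps all arithmetic finite.
fin_masked = [l if v else -1e15 for l, v in zip(logits, valid)]
print(math.isnan(fin_masked[1] * 0.0))   # False

# Both variants still exclude invalid actions from a greedy argmax.
print(max(range(len(fin_masked)), key=fin_masked.__getitem__))  # 0
```

This is consistent with the observed behavior: the -1e15 sentinel works while -float('inf') drives weights to NaN.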

If we cannot run your script, we cannot fix your issue.

  • [x] I have verified my script runs in a clean environment and reproduces the issue.
  • [ ] I have verified the issue also occurs with the latest wheels.

created time in a month

issue opened ray-project/ray

[rllib] DQN Torch policy + parametric is broken on GPU

What is the problem?

See below. Ray 0.8.6.

Reproduction (REQUIRED)

# Modified script with "num_gpus": 1.
rllib/examples$ p parametric_actions_cartpole.py --run=DQN --torch

hits the following error:

  File "/home/ubuntu/anaconda3/envs/exp/lib/python3.7/site-packages/ray/rllib/agents/dqn/dqn_torch_policy.py", line 218, in build_q_losses
    torch.where(q_t > -float("inf"), q_t, torch.tensor(0.0)) *
RuntimeError: Expected tensor to have CUDA Backend, but got tensor with CPU Backend (while checking arguments for CUDA_tensor_apply4)

If we cannot run your script, we cannot fix your issue.

  • [x] I have verified my script runs in a clean environment and reproduces the issue.
  • [ ] I have verified the issue also occurs with the latest wheels.

created time in a month

issue comment ray-project/ray

[rllib] Torch policy + non-empty custom_model_config errors out

The problem is that rllib is unpacking this dict for me.

On Mon, Aug 10, 2020 at 02:06 Sven Mika notifications@github.com wrote:

Can you try getting the pool information in your Model class' c'tor from config["model"]["custom_model_config"]["pool"], instead of an actual pool arg? It should work then.


concretevitamin

comment created time in a month

issue opened ray-project/ray

[rllib] Torch policy + non-empty custom_model_config errors out

What is the problem?

On Ray 0.8.6, a Torch policy + non-empty custom_model_config + custom model crashes.

Reproduction (REQUIRED)

  File "/home/ubuntu/anaconda3/envs/exp/lib/python3.7/site-packages/ray/rllib/policy/torch_policy_template.py", line 120, in __init__
    **self.config["model"].get("custom_model_config", {}))
  File "/home/ubuntu/anaconda3/envs/exp/lib/python3.7/site-packages/ray/rllib/models/catalog.py", line 360, in get_model_v2
    model_config, name, **model_kwargs)
TypeError: __init__() got an unexpected keyword argument 'pool'

Reproducible by any custom Torch model with a non-empty custom_model_config.

This error shows up if I set any custom_model_config: {'pool': xxx}, where pool is something my custom torch model can interpret; the rllib torch policy template unpacks this dict into keyword arguments, which causes the error.
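The failure mode can be reproduced without rllib at all; the class and key names below are illustrative stand-ins, not rllib's actual API. When a framework forwards custom_model_config entries as **kwargs into a constructor that does not declare them, Python raises exactly this TypeError:

```python
# Stand-in for a custom model whose __init__ does not declare 'pool'.
class CustomTorchModel:
    def __init__(self, obs_space, action_space):
        self.obs_space = obs_space
        self.action_space = action_space

custom_model_config = {"pool": object()}

# What the policy template effectively does: unpack the dict as kwargs.
try:
    CustomTorchModel("obs", "act", **custom_model_config)
except TypeError as e:
    print(e)  # ... got an unexpected keyword argument 'pool'

# One possible workaround: have the model accept (and ignore or use)
# extra kwargs, or read the values out of
# model_config["custom_model_config"] directly in the constructor.
class TolerantModel:
    def __init__(self, obs_space, action_space, **kwargs):
        self.pool = kwargs.get("pool")

model = TolerantModel("obs", "act", **custom_model_config)
assert model.pool is custom_model_config["pool"]
```

Whether unpacking or passing the dict whole is the intended contract is the open question in this issue.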

If we cannot run your script, we cannot fix your issue.

  • [ ] I have verified my script runs in a clean environment and reproduces the issue.
  • [ ] I have verified the issue also occurs with the latest wheels.

created time in 2 months

issue comment pytorch/pytorch

nn.Parameter{List,Dict} not copied to gpus in forward pass when nn.DataParallel is used

This is a serious issue preventing us from upgrading to 1.6. Working around ParameterList (e.g., assign directly as attributes) is non-ideal as it breaks loading prior checkpoints. Adding glue code to load prior checkpoints would work, but it's something I don't expect from a stable release.

Also related to issue #42327.

grtzsohalf

comment created time in 2 months

issue comment pytorch/pytorch

Wrong in nn.ParameterList and nn.DataParallel, pytorch1.6

This is a serious issue preventing us from upgrading to 1.6. Working around ParameterList (e.g., assign directly as attributes) is non-ideal as it breaks loading prior checkpoints. Adding glue code to load prior checkpoints would work, but it's something I don't expect from a stable release.

Also related to issue #36035.

yfsong0709

comment created time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 4803315b034b566cc834e45e7330b6c05f9d1b3b

updates

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 861810949a18646b73ee974a01b3355db3cd3530

First commit

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 8f3795090d2657e70a25e9606b693ff24e5b61dd

First commit

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha d50424db3af04c4013fa3b59091eb06224c39bf8

First commit

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 42e92895b27a36e851c7d1e4d0ce1b44068c2289

First commit

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 7b1ce30d2b0cb630ca0988048152baff5ac7400e

First commit

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 6e8a8ab617a09e907439e4785a215d8aa2d8e650

First commit

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha f53996352503614c435e058dad2175dc59e2b26b

updates

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 42265ab013c541299b20620ad047f8a7ea14d961

Delete README.md

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha c0bdc05cd79d97fd228ba0cf029cc914599fe64f

First commit

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha e1161fad954d86b15881e0f6e75fc3f4f0b14d71

updates

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha fe8dded2151263524c72412628d2ec63382767d4

updates

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha e82a986ff123f90ed9be8ccfdd6b42772bf84a47

Delete _config.yml

push time in 2 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha 04bc15459a1a132d419731a820bb3bc1096c099f

First commit

push time in 2 months

started google/trax

started time in 3 months

started google/neural-tangents

started time in 3 months

issue comment wandb/client

Git diff & code saving not working for Ray Tune

@lavanyashukla @vanpelt do you mind re-opening this to track? This issue isn't resolved.

concretevitamin

comment created time in 3 months

push event var-skip/var-skip.github.io

Zongheng Yang

commit sha d7f271c8b4912b7415be5caa2f4fa65783c6c4db

Set theme jekyll-theme-minimal

push time in 3 months

create branch var-skip/var-skip-var-skip.github.io

branch : master

created branch time in 3 months

created repository var-skip/var-skip-var-skip.github.io

created time in 3 months

created repository var-skip/var-skip.github.io

created time in 3 months
