Doyup Lee (LeeDoYup) · POSTECH, Pohang · Machine Learning Researcher

LeeDoYup/AnoGAN-tf 202

Unofficial Tensorflow Implementation of AnoGAN (Anomaly GAN)

LeeDoYup/FixMatch-pytorch 98

Unofficial Pytorch code for "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence" in NeurIPS'20. This repo contains reproduced checkpoints.

LeeDoYup/DeblurGAN-tf 55

Unofficial tensorflow (tf) implementation of DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks

LeeDoYup/bottom-up-attention-tf 33

Unofficial tensorflow implementation of "Bottom-up and Top-down attention for VQA" (TF v. 1.13)

LeeDoYup/CapsNet-tf 10

Unofficial implementation of Capsule Networks, Dynamic Routing between capsules (by tensorflow)

LeeDoYup/Gaussian-Process-Gpy 6

Gaussian Process Regression with Gpy

started numba/numba

started time in 3 hours

issue comment LeeDoYup/FixMatch-pytorch

I've never seen code that's so neat in the field of ML.

I am really happy to hear that. Thank you!!

concinntyTurtle

comment created time in 2 days

started facebookresearch/fairscale

started time in 7 days

started ildoonet/evonorm

started time in 8 days

started digantamisra98/EvoNorm

started time in 9 days

started marco-rudolph/differnet

started time in 9 days

started benathi/fastswa-semi-sup

started time in 11 days

issue comment LeeDoYup/FixMatch-pytorch

Training time

@YBZh We are testing the dev/speedup branch: https://github.com/LeeDoYup/FixMatch-pytorch/tree/dev/speedup. We expect it to increase training speed by about 15% without performance changes. After checking all the experiments, we will merge it into the master branch. (Meanwhile, we have already confirmed that the computation is logically identical in PyTorch, so feel free to use the branch if you need it.)

YBZh

comment created time in 14 days

issue comment LeeDoYup/FixMatch-pytorch

Training time

We are testing the dev/speedup branch: https://github.com/LeeDoYup/FixMatch-pytorch/tree/dev/speedup. We expect it to increase training speed by about 15% without performance changes. After checking all the experiments, we will merge it into the master branch. (Meanwhile, we have already confirmed that the computation is logically identical in PyTorch, so feel free to use the branch if you need it.)

dreamflasher

comment created time in 14 days

create branch LeeDoYup/FixMatch-pytorch

branch : dev/speedup

created branch time in 14 days

issue comment LeeDoYup/FixMatch-pytorch

NCCL error when train with single node & multi gpus

Also, it is not related to the implementation, but to NCCL.

Oliver-ss

comment created time in 18 days

issue closed LeeDoYup/FixMatch-pytorch

NCCL error when train with single node & multi gpus

Hi, I tried to run the command

python train.py --world-size 1 --rank 0 --multiprocessing-distributed --num_labels 4000 --save_name cifar10_4000 --dataset cifar10 --num_classes 10

but the following exception was raised:

Traceback (most recent call last):
  File "train.py", line 316, in <module>
    main(args)
  File "train.py", line 62, in main
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
  File "/home/yuansong/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/yuansong/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/home/yuansong/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 119, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/yuansong/.local/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/data/FixMatch-pytorch/train.py", line 155, in main_worker
    device_ids=[args.gpu])
  File "/home/yuansong/.local/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 333, in __init__
    self.broadcast_bucket_size)
  File "/home/yuansong/.local/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 549, in _distributed_broadcast_coalesced
    dist._broadcast_coalesced(self.process_group, tensors, buffer_size)
RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:492, unhandled system error, NCCL version 2.4.8

I am not sure whether you met a similar problem before. If so, how did you solve it? Thanks!

closed time in 18 days

Oliver-ss

issue comment LeeDoYup/FixMatch-pytorch

NCCL error when train with single node & multi gpus

Inactive issue, so I close it.

Oliver-ss

comment created time in 18 days

issue closed LeeDoYup/FixMatch-pytorch

Training time

Nice work!! The results seem really attractive.
It takes me more than one hour to train 10K iterations on ONE M40 GPU, so the total training time for ONE CIFAR-10 experiment is about 2^20 / 10K * 1 h = 100 h ~ 4 days!! Is this normal??

As you said, CIFAR-10 training takes about 16 hours (0.7 days) with V100x4 GPUs. Does your training fully utilize all FOUR GPUs, or just ONE of them??

closed time in 18 days

YBZh
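The back-of-the-envelope estimate above can be checked directly (a sketch; the 10K-iterations-per-hour throughput is the number quoted in the issue, and 2**20 is the FixMatch iteration budget):

```python
# Reproduce the issue's estimate: total wall-clock time for one CIFAR-10 run
# at the quoted single-M40 throughput of 10K iterations per hour.
total_iters = 2 ** 20        # FixMatch iteration budget
iters_per_hour = 10_000      # throughput reported in the issue
hours = total_iters / iters_per_hour
days = hours / 24
print(f"{hours:.1f} h ~ {days:.1f} days")  # → 104.9 h ~ 4.4 days
```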

issue comment LeeDoYup/FixMatch-pytorch

Training time

That is why we release the trained models: to contribute to the AI community.

YBZh

comment created time in 18 days

issue comment LeeDoYup/FixMatch-pytorch

Training time

YES, it is normal. The training runs for 2^20 (about 1,000,000) iterations with batch size 64 * 8.

Plain supervised training of CIFAR has 50,000/batch_size iterations per epoch. Consistency regularization therefore requires many more training iterations.

We checked that our code uses full data parallelism with DDP, and there is no computational bottleneck (all GPUs keep > 90% utilization during training).

YBZh

comment created time in 18 days

delete branch LeeDoYup/FixMatch-pytorch

delete branch : Yeongjae-patch-1

delete time in 18 days

push event LeeDoYup/FixMatch-pytorch

Yeongjae Cheon

commit sha 79d97620e71ca664c68ac283670206c51e96bef0

minor bug fix

view details

Doyup Lee

commit sha 0e0b492f1cb110a43c765c55105b5f94c13f45fd

Merge pull request #4 from LeeDoYup/Yeongjae-patch-1 minor bug fix

view details

push time in 18 days

issue closed hj-n/c_math_viewer

Python support

Thanks for the awesome project. I wonder whether you have any plan to support Python.

closed time in 20 days

LeeDoYup

issue comment hj-n/c_math_viewer

Python support

Okay, thanks. I look forward to the release of the Python support!

LeeDoYup

comment created time in 20 days

issue opened hj-n/c_math_viewer

Python support

Thanks for the awesome project. I wonder whether you have any plan to support Python.

created time in 21 days

issue comment LeeDoYup/FixMatch-pytorch

NCCL error when train with single node & multi gpus

Hello. I think the error is related to NCCL, not the implementation.

I recommend trying the NCCL tests here: https://github.com/NVIDIA/nccl-tests

Oliver-ss

comment created time in 22 days

issue comment LeeDoYup/FixMatch-pytorch

Training time

Thanks for making the FIRST issue! Yes, training WRN-28-2 on CIFAR-10 takes 16 hours. The training time does not depend on the amount of labeled data, because all training samples are used regardless of labeling. In addition, we checked that there is no computational bottleneck in training. We also checked that other implementations require similar training time, and our code is faster than the other PyTorch code that supports DDP.

I agree that consistency-regularization-based SSL methods take a long time to train. The fundamental reason is that FixMatch uses neither an external dataset nor a pretrained model, but builds hidden representations from the unlabeled data of the downstream task (CIFAR-10). In addition, FixMatch requires 2^20 iterations, which is far more than supervised learning (150 epochs with batch_size = 128 is about 60,000 iterations).

When the amount of labels is enough for an easy task (e.g. 4000 labels of CIFAR-10), plausible accuracy is reached at an early stage of training (at roughly < 100,000 iterations?). However, to improve the accuracy further, FixMatch needs the remaining ~900,000 iterations.

dreamflasher

comment created time in 23 days

push event LeeDoYup/FixMatch-pytorch

LeeDoYup

commit sha 920b17c702bc7650bdceb25ffc6298281cc895fc

add files

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha a1de02b87cae94d88a382b14314778c3ffa8c1d3

Update README.md

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha edc93b29cc650e2e3eb682dddfdd76067b672396

Update README.md

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha 9a4983ce5027d34ff33b63135f8c2605a72676cc

Update README.md

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha a79748f02c10bf0ecb9488992604ff1c95aaf263

Update README.md

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha 24119de8b61e5d488878b14e1acb955f75cb4b93

Update README.md

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha 40781fa9cd6c6938992724f520fbd62354329bec

Update README.md

view details

push time in 24 days

MemberEvent

push event LeeDoYup/FixMatch-pytorch

LeeDoYup

commit sha f58483d431079e5184eda2f0636d2ff978985773

add files

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha 1f5d3d4176abafdf4f4ea0e25859645e29296f21

Update README.md

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha 8a6bf4d1723c1ad98048a26f3e9cd6f1f30b36dd

Remove unnecessaries

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha 9e2054dbf690ad91e93a55fc2693e95d184e9147

Update README.md typo

view details

push time in 24 days

push event LeeDoYup/FixMatch-pytorch

Doyup Lee

commit sha 466c153d0eb8cc03ad0e27456dfac8dd2ebcd707

Update README.md typo

view details

push time in 24 days

started LeeDoYup/FixMatch-pytorch

started time in 24 days

push event LeeDoYup/FixMatch-pytorch

LeeDoYup

commit sha e65d95cbddef6cc49963131c726a9bed8a72142c

add files

view details

push time in 24 days

create branch LeeDoYup/FixMatch-pytorch

branch : main

created branch time in 24 days

created repository LeeDoYup/FixMatch-pytorch

Unofficial Pytorch code for "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence" in NeurIPS'20. This repo contains reproduced checkpoints.

created time in 24 days

issue comment google-research/fixmatch

Model for CIFAR-100

--filter=128 same issue: https://github.com/google-research/fixmatch/issues/26

bkj

comment created time in 25 days

started benathi/fastswa-semi-sup

started time in 25 days

started huggingface/transformers

started time in a month

started kakaobrain/nlp-paper-reading

started time in a month

started wjmaddox/swa_gaussian

started time in a month

issue opened google-research/fixmatch

Leaky ReLU in ResNet

Standard ResNets are known to use the ReLU activation function, but I found that your implementation uses Leaky ReLU instead.

Does replacing ReLU with Leaky ReLU affect the results?

created time in a month
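For reference, the two activations differ only in how negatives are handled. A minimal sketch (the negative slope alpha=0.1 is an assumption for illustration, not a value confirmed from the fixmatch repo):

```python
def relu(x: float) -> float:
    # Standard ReLU: negative inputs are clamped to zero.
    return max(0.0, x)

def leaky_relu(x: float, alpha: float = 0.1) -> float:
    # Leaky ReLU: negative inputs keep a small slope alpha instead of dying.
    return x if x > 0.0 else alpha * x

print(relu(-2.0), leaky_relu(-2.0))  # → 0.0 -0.2
```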

started NVIDIA/apex

started time in a month

push event LeeDoYup/LeeDoYup

Doyup Lee

commit sha d5dd743a47c6e067e97829c36d43b921cd84f4f3

Update README.md

view details

push time in a month

started nng555/ssmba

started time in a month

PR opened pytorch/examples

Fix random seed bug of DDP in the ImageNet example

The random seed has to be set in main_worker, not in def main(). I found that although the seed is set in def main(), each process in distributed training gets an individual & different seed.

+5 -3

0 comment

1 changed file

pr created time in a month
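The fix can be sketched in plain Python (a hypothetical simplification of the ImageNet example, not the PR's actual diff; main_worker stands in for the function mp.spawn launches): seeding inside each worker makes every process start from the same deterministic RNG state.

```python
import random

def main_worker(gpu: int, seed: int) -> list:
    # Seed INSIDE the spawned worker, not in main(): with the 'spawn' start
    # method each child re-initializes its RNG, so a seed set only in the
    # parent does not make the workers deterministic or identical.
    random.seed(seed)
    return [random.random() for _ in range(3)]

# Every "process" now draws an identical stream for the same seed.
assert main_worker(0, seed=42) == main_worker(1, seed=42)
```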

push event LeeDoYup/examples

Doyup Lee

commit sha c3204dd5ffe384d40bb3d5d23febea463088e4c6

Fix random seed bug of DDP in the ImageNet example the random seed has to be set in the main_worker, not in the `def main()`. I found that although the seed is set in `def main()`, each process in distributed training has individual & different seeds.

view details

push time in a month

fork LeeDoYup/examples

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

fork in a month

issue comment google-research/fixmatch

Reproduce Fixmatch with RandAug

Oh thanks, I wanted to check whether the PyTorch implementation has many more parameters than 1.5 M. Thank you for cross-checking!! (I think it takes much more time than the CIFAR-10 case.)

When you set filters=128, what was the number of trainable params??? (I know the 32-filter model has 1.5 M params.)

It has 23.41 M params.

YUE-FAN

comment created time in a month
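The 23.41 M figure is consistent with a rough scaling argument (an assumption used here for sanity-checking, not something stated in the repo): convolution weight counts are in_ch * out_ch * k * k, so a WRN's parameter count grows roughly with the square of the filter multiplier.

```python
params_32 = 1.5e6         # trainable params reported for the 32-filter model
scale = (128 / 32) ** 2   # ~quadratic growth in the filter multiplier: 16x
estimate = params_32 * scale
print(f"{estimate / 1e6:.1f} M")  # → 24.0 M, close to the reported 23.41 M
```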

issue closed google-research/fixmatch

Total parameters of cifar100 model

Hello. Thanks for the awesome study.

Could you let me know the total number of parameters of the cifar100 model (--filter 1280)??

closed time in a month

LeeDoYup

issue comment google-research/fixmatch

Total parameters of cifar100 model

answered here: https://github.com/google-research/fixmatch/issues/25

LeeDoYup

comment created time in a month

issue comment google-research/fixmatch

Reproduce Fixmatch with RandAug

When you set filters=128, what was the number of trainable params??? (I know the 32-filter model has 1.5 M params.)

YUE-FAN

comment created time in a month

issue opened google-research/fixmatch

Total parameters of cifar100 model

Hello. Thanks for the awesome study.

Could you let me know the total number of parameters of the cifar100 model (--filter 1280)??

created time in a month

issue closed google-research/fixmatch

Validation Data

Hello. First of all, thanks for the great study!! I have a question about validation data.

To select an optimal model, how are the validation data selected?

  • 5 different labeled datasets & compute the minimum test error?
  • A distinct validation dataset (separate from the training & test datasets) for model selection? (Then, how is it selected?)

After reading the paper and code, I think the former is what you implemented, but I want to check.

closed time in a month

LeeDoYup

started bikestra/bikestra.github.com

started time in a month

issue opened mtoneva/example_forgetting

Do these codes work?

I think this code is not runnable.

created time in a month

started mtoneva/example_forgetting

started time in a month

started HaohanWang/ImageNet-Sketch

started time in a month

started facebookresearch/fair_self_supervision_benchmark

started time in a month

issue comment kekmodel/FixMatch-pytorch

Weight decay in EMA?

I think the weight decay is not applied in the optimizer, but in the EMA update.

bkj

comment created time in a month
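An illustrative sketch of that idea (plain Python; ema_step and the values of m and wd are assumptions for illustration, not the repo's actual code): weight decay shrinks the live weights before each EMA step instead of being added to the optimizer's loss.

```python
def ema_step(ema, params, m=0.999, wd=5e-4):
    # Apply weight decay to the raw parameters first, then fold them into
    # the exponential moving average. m and wd are illustrative values.
    decayed = [p * (1.0 - wd) for p in params]
    return [m * e + (1.0 - m) * p for e, p in zip(ema, decayed)]

ema = ema_step([1.0], [2.0])  # one scalar "weight", for demonstration
```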

issue opened google-research/fixmatch

Validation Data

Hello. First of all, thanks for the great study!! I have a question about validation data.

To select an optimal model, how are the validation data selected?

  • 5 different labeled datasets & compute the minimum test error?
  • A distinct validation dataset (separate from the training & test datasets) for model selection? (Then, how is it selected?)

After reading the paper and code, I think the former is what you implemented, but I want to check.

created time in a month

issue comment facebookresearch/moco

Buffer of BN in EMA update

Thanks for the quick reply! I want to check the things below.

ResNet-50 uses batch normalization, and the default setting uses the running variables, so the momentum encoder uses them. However, I think they are also calculated when the momentum encoder runs forward(im_k).

LeeDoYup

comment created time in a month

issue opened facebookresearch/moco

Buffer of BN in EMA update

Hello. Thanks for the awesome project! I have a question.

I wonder why the EMA update doesn't track the running mean and variance of BN.

    @torch.no_grad()
    def _momentum_update_key_encoder(self):
        """
        Momentum update of the key encoder
        """
        for param_q, param_k in zip(self.encoder_q.parameters(), self.encoder_k.parameters()):
            param_k.data = param_k.data * self.m + param_q.data * (1. - self.m)

I think the code below is right, because the EMA model has poor performance when the running variables of BN are not tracked.

    @torch.no_grad()
    def _momentum_update_key_encoder(self):
        """
        Momentum update of the key encoder
        """
        for param_q, param_k in zip(self.encoder_q.parameters(), self.encoder_k.parameters()):
            param_k.data = param_k.data * self.m + param_q.data * (1. - self.m)
        
        # buffer update
        for buffer_q, buffer_k in zip(self.encoder_q.buffers(), self.encoder_k.buffers()):
            buffer_k.copy_(buffer_q)

created time in a month
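The point can be demonstrated without torch (a hypothetical toy model; Encoder, w and running_mean are stand-ins, not MoCo's real classes): an EMA over parameters() alone never touches buffers such as BN running statistics, so they must be copied separately.

```python
class Encoder:
    def __init__(self, w, running_mean):
        self.w = w                        # learnable parameter (EMA-updated)
        self.running_mean = running_mean  # BN-style buffer (not a parameter)

def momentum_update(q, k, m=0.999):
    # Parameters follow the EMA rule, as in _momentum_update_key_encoder.
    k.w = k.w * m + q.w * (1.0 - m)
    # Buffers are copied verbatim; without this line k.running_mean would
    # stay stale, and the key encoder's BN statistics would be wrong.
    k.running_mean = q.running_mean

q, k = Encoder(1.0, 0.5), Encoder(0.0, 0.0)
momentum_update(q, k)
```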

push event LeeDoYup/Anomaly_Detection_VQA

Doyup Lee

commit sha dd35690e3f5d35497134f0acb06228e0d73bafad

Update README.md

view details

push time in a month

create branch LeeDoYup/Anomaly_Detection_VQA

branch : master

created branch time in a month

created repository LeeDoYup/Anomaly_Detection_VQA

Official implementation of "Regularizing Attention Networks for Anomaly Detection in VQA"

created time in a month

started wkentaro/gdown

started time in a month

started gyoogle/tech-interview-for-developer

started time in a month

started subeeshvasu/Awesome-Learning-with-Label-Noise

started time in 2 months

started huggingface/knockknock

started time in 2 months

started modestyachts/cifar-10.2

started time in 2 months

started modestyachts/ImageNetV2_pytorch

started time in 2 months

started modestyachts/imagenet-testbed

started time in 2 months

started modestyachts/CIFAR-10.1

started time in 2 months

started fabienbaradel/cophy

started time in 2 months

started modAL-python/modAL

started time in 2 months

fork LeeDoYup/FixMatch-pytorch

Unofficial PyTorch implementation of "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence"

fork in 2 months

started wbaek/theconf

started time in 2 months

PR opened valencebond/FixMatch_pytorch

Update README.md

fix wrong dataset path: dataset -> data

+2 -2

0 comment

1 changed file

pr created time in 2 months

push event LeeDoYup/FixMatch_pytorch

Doyup Lee

commit sha 669e18adc44f57fe4bdc704441a8ceac70b65501

Update README.md fix wrong dataset path: dataset -> data

view details

push time in 2 months

fork LeeDoYup/FixMatch_pytorch

Unofficial PyTorch implementation of "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence"

fork in 2 months

started rwightman/pytorch-image-models

started time in 2 months

started modestyachts/imagenet-vid-robust-example

started time in 2 months

started google-research/big_transfer

started time in 2 months

started facebookresearch/FixRes

started time in 2 months

issue closed zveryansky/python-arulesviz

Is there any plan to support other visualization in R package?

Hello. Thanks for the awesome project. I want to ask whether you are going to add the other visualizations (matrix-based, grouped matrix, graph-b, two-key, ..) from the R package??

closed time in 2 months

LeeDoYup

issue comment zveryansky/python-arulesviz

Is there any plan to support other visualization in R package?

Thanks for the fast reply. I close this issue, and will reopen if any detailed idea comes up.

LeeDoYup

comment created time in 2 months

started tommyod/Efficient-Apriori

started time in 2 months

started postech-db-lab-starlab/NL2SQL

started time in 2 months

issue opened zveryansky/python-arulesviz

Is there any plan to support other visualization in R package?

Hello. Thanks for the awesome project. I want to ask whether you are going to add the other visualizations (matrix-based, grouped matrix, graph-b, two-key, ..) from the R package??

created time in 2 months

started deepmind/deepmind-research

started time in 2 months

started arogozhnikov/einops

started time in 2 months

started fastai/numerical-linear-algebra

started time in 2 months

started lucidrains/byol-pytorch

started time in 2 months
