da03/Attention-OCR 1010

Visual Attention based OCR

da03/lightlda 16

Distributed LDA: takes raw text as input and outputs a topic-word table.

da03/im2markup 4

Neural model for converting Image-to-Markup (by Yuntian Deng github.com/da03)

da03/mm 3

Manage SSH key files of multiple machines and create nicknames for them.

da03/OpenNMT-py 3

Open-Source Neural Machine Translation in PyTorch http://opennmt.net/

da03/im2latex-dataset 2

Python tools for creating a suitable dataset for OpenAI's im2latex task: https://openai.com/requests-for-research/#im2latex

da03/Im2Text-1 1

Im2Text extension to OpenNMT

da03/LSTMVis 1

Visualization Toolbox for Long Short Term Memory networks (LSTMs)

da03/nti.plasTeX 1

NTI's fork of plasTeX.

push event nlp-course/data

da03

commit sha ef311eaee631790cb5f98b7239a625eb5216ddf8

.

push time in 7 hours

push event nlp-course/data

da03

commit sha 69800cd2be52ab90543fa4081621ca8828c7b387

.

push time in 7 hours

push event nlp-course/data

da03

commit sha 21aaa7c5de8d5813e04dc42ba750b31222ba8d76

.

push time in 8 hours

push event nlp-course/data

da03

commit sha 2a5edc8408802ac60abbea1f16f04582f1f2aaa7

.

push time in 8 hours

push event nlp-course/data

da03

commit sha 212de225aae90107203ab0915703dfa5905b6552

add LR png

push time in 9 hours

push event nlp-course/data

da03

commit sha f782c717b0ce257b812bb6884bc4c9efad70cd03

add LR

push time in 9 hours

issue comment harvardnlp/im2markup

error importation cudnn

Yeah I think so! The only problem is that the runtime gets disconnected if it's idle for a certain period of time, and the instance is freed, so all progress is lost. Therefore, you might want to mount your Google Drive and save progress (checkpoints) there.
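
For example, mounting Drive from a notebook cell looks roughly like this (a minimal sketch; the checkpoint path is illustrative, not a path from this project):

```python
# Mount Google Drive inside the Colab runtime so saved checkpoints
# survive the instance being reclaimed.
from google.colab import drive

drive.mount('/content/drive')

# Illustrative location only: point the trainer's checkpoint directory
# here so a fresh runtime can resume from the last saved state.
checkpoint_dir = '/content/drive/MyDrive/im2markup_checkpoints'
```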

theoeiferman

comment created time in 17 days

issue comment harvardnlp/im2markup

[regarding real dataset] Please respond

That's surprising. What are the sizes of those images?

vyaslkv

comment created time in 18 days

issue comment harvardnlp/im2markup

[regarding real dataset] Please respond

Cool, that will work if you do proper tokenization: the label should be something like "( 5 + 2 sqrt 3 ) / ( 7 + 4 sqrt 3 ) = a - b sqrt 3" (separated by blanks). The algorithm should work with any output format.
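
Roughly, such a tokenization could look like the following Python sketch (hypothetical: the command list and regex are illustrative, not this project's actual tokenizer):

```python
import re

# Match known multi-character commands first, then fall back to
# single-character tokens; the label is the tokens joined by blanks.
COMMANDS = ['sqrt']  # extend with frac, sum, ... for a real dataset

def tokenize(formula: str) -> str:
    pattern = '|'.join(COMMANDS) + r'|[A-Za-z0-9]|\S'
    return ' '.join(re.findall(pattern, formula))

print(tokenize('(5+2sqrt3)/(7+4sqrt3)=a-bsqrt3'))
# ( 5 + 2 sqrt 3 ) / ( 7 + 4 sqrt 3 ) = a - b sqrt 3
```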

vyaslkv

comment created time in 19 days

issue comment harvardnlp/im2markup

[regarding real dataset] Please respond

BTW, if you get a GPU instance, I would recommend using this Dockerfile to save you the trouble of installing LuaTorch: https://github.com/OpenNMT/OpenNMT/blob/master/Dockerfile

vyaslkv

comment created time in 20 days

issue comment harvardnlp/im2markup

[regarding real dataset] Please respond

Regarding hardware, I think it's almost impossible to train on CPU; it would probably take forever. On GPU, training would take less than a day even with 100k images. On AWS, any GPU configuration is probably fine, since your dataset of 20k images is small.

Regarding dataset size, I think 20k is a bit small; combining it with im2latex-100k might give some reasonable results, but ideally you would need about 100k real images to train on. Besides, are your images of roughly the same font size? If not, standard image normalization techniques (such as denoising and resizing to a common font size) might produce better results.
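
For the normalization step, here is a minimal PIL sketch of the idea (the target height and threshold are placeholders, not the im2markup preprocessing defaults):

```python
from PIL import Image

TARGET_HEIGHT = 64  # placeholder "common font size" in pixels

def normalize(path: str) -> Image.Image:
    # Grayscale, rescale to a common height (a proxy for equal font
    # size), then apply a crude binarization as light denoising.
    img = Image.open(path).convert('L')
    w, h = img.size
    scale = TARGET_HEIGHT / h
    img = img.resize((max(1, round(w * scale)), TARGET_HEIGHT),
                     Image.LANCZOS)
    return img.point(lambda p: 255 if p > 128 else 0)
```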

vyaslkv

comment created time in 20 days

issue comment harvardnlp/im2markup

[regarding real dataset] Please respond

Thanks for your interest in our work; my email is dengyuntian@seas.harvard.edu. One thing to note is that we only work on public datasets (or datasets that can be released later), so that the public can benefit from our research.

Alternatively, if you want to keep the dataset private, you can also consider cloud computing services such as Amazon EC2, Google GCE, or Microsoft Azure, which provide GPU instances billed by the hour.

vyaslkv

comment created time in 21 days

push event nlp-course/data

da03

commit sha bc555bed53f5bdddb462afc70c40d19deace15cf

.

push time in a month

fork da03/nbconvert-examples

Examples that illustrate how nbconvert can be used

fork in a month

issue closed Edward-Sun/structured-nart

Optimization hyper-parameters used in the paper

Dear authors,

May I ask for the hyper-parameters used in your paper for IWSLT (the smaller model) and WMT (the full model)? Such as learning rate, warmup steps, batch size, max learning rate, etc.

Thanks in advance!

Best, Yuntian

closed time in a month

da03

issue comment Edward-Sun/structured-nart

Optimization hyper-parameters used in the paper

That's really helpful! I'll study the source code to make sure I get all hyper-parameters right. Thanks!

da03

comment created time in a month

push event da03/CS187

da03

commit sha 8472b26d7d4d789ee691fdd7a3f163f08ce4d225

.

push time in a month

push event da03/CS187

da03

commit sha b45ea88c654a2f0f713e591c65711d628aaa6f59

.

push time in a month

push event da03/CS187

da03

commit sha 14a68f4a38faefbe7dcee403eef3338f5d558dff

.

push time in a month

issue comment harvardnlp/im2markup

error importation cudnn

Yes, you are right that OpenNMT-py uses PyTorch while this project uses LuaTorch. PyTorch does not require GPUs (you can do a CPU-only installation), but again, training might be extremely slow without GPUs.

For the missing onmt_preprocess issue, have you installed OpenNMT-py following the instructions here? https://github.com/OpenNMT/OpenNMT-py

theoeiferman

comment created time in a month

create branch da03/CS187

branch : master

created branch time in a month

created repository da03/CS187

created time in a month

issue comment Edward-Sun/structured-nart

Optimization hyper-parameters used in the paper

Thank you for your reply! My questions are more on the optimization side, such as batch size and learning rate schedule (the fairseq instructions only cover WMT with 4 GPUs, not IWSLT, which I assume uses a single GPU). Plus, unfortunately the fairseq instructions cannot reproduce the WMT results reported in the paper (I need to double-check whether they implemented CRF or DCRF, but it seems to be a few BLEU points off even assuming it's CRF).

da03

comment created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

Oh, that explains why: this code (or cudnn) only supports CUDA and cannot run on systems without GPUs. While this version (https://opennmt.net/OpenNMT-py/im2text.html, code at https://github.com/OpenNMT/OpenNMT-py) supports CPU-only training, doing so would be extremely slow without the parallelism provided by GPUs. Another option is to use a cloud computing service such as Amazon EC2, Google GCE, or Microsoft Azure and rent a GPU instance.

theoeiferman

comment created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

Oh no, so it seems nvidia-docker does not work on Mac... I have never used GPUs on a Mac, but I think with a proper CUDA installation (https://docs.nvidia.com/cuda/cuda-installation-guide-mac-os-x/index.html), you should get both nvcc --version and nvidia-smi working.

theoeiferman

comment created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

Hmm, I suspect that your CUDA driver version might be too outdated (what's the output of nvcc --version and nvidia-smi?), which would cause issues both for require cudnn and for installing nvidia-docker. There are actually CUDA drivers available for Mac: https://www.nvidia.com/en-us/drivers/cuda/mac-driver-archive/. Fixing the driver version might solve all of these problems.

theoeiferman

comment created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

I think it should be nvidia-docker run -it operating_lua /bin/bash, but it might be better to check the Docker documentation directly.

theoeiferman

comment created time in a month

issue opened Edward-Sun/structured-nart

Optimization hyper-parameters used in the paper

Dear authors,

May I ask for the hyper-parameters used in your paper for IWSLT (the smaller model) and WMT (the full model)? Such as learning rate, warmup steps, batch size, max learning rate, etc.

Thanks in advance!

Best, Yuntian

created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

BTW, I think you might need to use nvidia-docker (https://github.com/NVIDIA/nvidia-docker) to support using GPUs inside a Docker container.

theoeiferman

comment created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

For the first question, I think you need to put the Dockerfile inside a folder; running docker build . inside that folder will then generate the image.

For the second question, it allows using Lua inside the Docker container only.

theoeiferman

comment created time in a month

started Edward-Sun/structured-nart

started time in a month

issue comment harvardnlp/im2markup

error importation cudnn

Oh, it seems to be some issue with installing cudnn. Installing Torch correctly might be hard; would you mind using Docker? Here is a Dockerfile that can be used directly: https://github.com/OpenNMT/OpenNMT/blob/master/Dockerfile

theoeiferman

comment created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

Sorry, I meant entering "th" first, then, in the prompt, entering "require cudnn".

theoeiferman

comment created time in a month

issue comment harvardnlp/im2markup

error importation cudnn

Hmm, can you try "require cudnn" in a Torch prompt (opened with the command "th") and see if that works?

theoeiferman

comment created time in a month

issue comment pytorch/fairseq

How to reproduce vanilla NAT performance on raw iwslt16?

@George0828Zhang I think iter-decode-max-iter should be set to 0.

George0828Zhang

comment created time in 2 months

push event da03/fairseq

cascaded-generation

commit sha 99e4a17de522102def36066941d163744cf8cd26

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha d6c5569ca3b2f9d7a5795bd21de4b4eec7b92936

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha 5bd876a22c37f3fe435e10382dabbd7be73941a8

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha be0ccc5fbab2e88265b3876e70e48b2b968402fa

.

push time in 2 months

push event da03/fairseq

cascaded-generation

commit sha 706bcaf24654cc0bf6f00446ec2281ffd72222f4

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha 9d1c92d7514b886d689aaea7102ee5ac1552b76c

.

push time in 2 months

push event da03/fairseq

cascaded-generation

commit sha c765843f2b4630f65753ab39195495dc2738e598

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha be71fcda94f5a6039dd9c0e7959ab6d4b46ce5d9

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha cd82a5afb7ff4c0355422f94586c0ea8024f1816

.

push time in 2 months

push event da03/fairseq

cascaded-generation

commit sha 6f9bbc05019cacededb1af2a3a847b90815dcd83

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha 7e0f163f3136dc547d2644435dc663e615495c2a

.

push time in 2 months

push event harvardnlp/cascaded-generation

cascaded-generation

commit sha 2d9d0f3d00c95f454034e6ab1c12a4d605762b47

.

push time in 2 months

push event harvardnlp/cascaded-generation

da03

commit sha bed49c24c242adfcbf1e08b6b51b373274ed96ad

.

push time in 2 months

push event da03/fairseq

da03

commit sha b9e64f8792c92a3cd8fae7ee70f3fe04e8c8eb2c

.

push time in 2 months

push event da03/fairseq

da03

commit sha b5a2a6e5c809fe74322f37b1dc6d29a3f4a6370c

.

push time in 2 months

started harvardnlp/cascaded-generation

started time in 2 months

push event harvardnlp/cascaded-generation

da03

commit sha 5d604bb1ca480394f0dbdf453eed6d94f60fbb6e

.

push time in 2 months

push event harvardnlp/cascaded-generation

da03

commit sha b6f6b8b9c52d59e6b5cd2e7e5025ec72d8384412

.

push time in 2 months

push event da03/fairseq

cascaded-generation

commit sha d91401578a3703e4ebae1f7eeb5d6bbd9a0a61d9

.

push time in 2 months

push event da03/fairseq

da03

commit sha b4026782ff551d21526c002e5acd041ede722f0b

.

push time in 2 months

push event da03/fairseq

da03

commit sha c094e71b48ece9670ac1432606e3381cb06d2e3c

.

push time in 2 months

push event da03/fairseq

da03

commit sha b21b31dfac998acd4073c143c1628b4e87c84b12

.

push time in 2 months

push event da03/fairseq

da03

commit sha b2ba625b9edba9a60d83e101f406b7594226b870

.

push time in 2 months

push event da03/fairseq

da03

commit sha 09762f113e731da7a70f0d4f64794617721c2054

.

push time in 2 months

push event da03/fairseq

da03

commit sha a837cc094a93ca0db867ea17d2ca56f76ab21f9b

.

push time in 2 months

push event da03/fairseq

da03

commit sha b032421df672a017945dace155c4994959339600

.

push time in 2 months

push event da03/fairseq

da03

commit sha 2836305749e8917536b28dabe8f2c1496c4e7968

.

da03

commit sha af68bd8a6f23502cab1290ac32f7043560d10424

.

push time in 2 months

push event da03/fairseq

da03

commit sha 1becb5572c3092272f2d2f63fbdc10f29101bafc

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 7fa67ddce2c9b408df4f95523896719e1b076cc8

.

Yuntian Deng

commit sha 1934ae9a8f23401bf507d1956e7b6500296a6f79

.

da03

commit sha 1353b3fc78f00d717a93f32646db9fd1dfeb6445

.

da03

commit sha c6b7c713668c746511226ec22bc40b053cbac9cd

.

da03

commit sha 6643ac9ca627fc2907eaead3c61705c199846b82

.

da03

commit sha 22df6ba146965b70d85723969fb4fe449882f22c

Merge https://github.com/da03/fairseq into neurips

push time in 2 months

push event harvardnlp/cascaded-generation

da03

commit sha c6b7c713668c746511226ec22bc40b053cbac9cd

.

push time in 2 months

push event harvardnlp/cascaded-generation

da03

commit sha 1353b3fc78f00d717a93f32646db9fd1dfeb6445

.

push time in 2 months

push event da03/fairseq

da03

commit sha a665736b292176d5ce4bef6d111bb9b69df52bcb

.

push time in 2 months

push event harvardnlp/cascaded-generation

Yuntian Deng

commit sha 1934ae9a8f23401bf507d1956e7b6500296a6f79

.

push time in 2 months

push event harvardnlp/cascaded-generation

Yuntian Deng

commit sha 7fa67ddce2c9b408df4f95523896719e1b076cc8

.

push time in 2 months

create branch harvardnlp/cascaded-generation

branch : master

created branch time in 2 months

created repository harvardnlp/cascaded-generation

created time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha d8760eaa5d0a3c360d4b6a899c95d20247240eba

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha e053f076f984a7b7ad39ef7dffed82e62ed0751c

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 07cf20de2a23acb5ef2a881ad4835e319717d4d9

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha d152cb36dd2d5416b24253d84e0e5f7b4fd3bda2

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 479ff281c5e08aaf3bceb93f9273231d2c80831c

.

Yuntian Deng

commit sha 8e37fd2769faff5e38010a82e087afe145d9e7d9

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 9edbe4e068243b492418f056e5c26d5f308a27c0

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 41ecb0cbf58bcb3ef80c21af4cf0695aee097298

.

Yuntian Deng

commit sha 52adc9de06f40ad84d83c1ccaa213cd5d6ba7835

Merge https://github.com/da03/fairseq

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 9288ceb67e084a349de0e751fe5d7c474e3f998d

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 8b7287fa379262d945c56e50940411b7bb6949d8

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 181c993bad920108c9493b9a779c0963b0472b1c

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha b4c4039d7db3d1289ab5266254cc94aa742aba32

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha b3051c79494f08b42d026ecbc06bc8194beff832

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 738407b931dc399ce59f7cb29340d5d4ba62b985

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 8d14aea4fb116a370a9124a027ff0ddf0f00167e

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha e8161926a307db1d581b5389df9708f112d327f3

.

push time in 2 months

push event da03/matplotlib

Yuntian Deng

commit sha 616d8e053001166ca2d3247bfb874932499a2b9b

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 56963f6b43851e6e35b105845ffa8c00f04ccd76

.

push time in 2 months

push event da03/fairseq

Yuntian Deng

commit sha 2b6a173262a2e8a6cb3cbdf7b0f60c4305e143df

.

push time in 2 months

push event da03/matplotlib

Kent

commit sha c01883ad0dd4c051bde7c49e563a639a680bda76

Add force_zorder parameter

Kent

commit sha 76bb560f109486dfb3e30c6b68eedadc05a18d49

Typos

Kent

commit sha 988be5021c5440ecaa845f790c8813117af88484

Formatting: add blank line

Yuntian Deng

commit sha 9d32d80608efdd25e870b8fbe9571278f3798d49

.

push time in 2 months

fork da03/matplotlib

matplotlib: plotting with Python

https://matplotlib.org

fork in 2 months

push event da03/fairseq

Yuntian Deng

commit sha fdc1adec169c86fa522a7308b46afb18760225f9

.

push time in 2 months

issue comment OpenNMT/OpenNMT-py

Which loss function is used in im2text?

It's the cross-entropy loss, the same as in other sequence prediction tasks such as translation.
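
In PyTorch terms, the objective looks roughly like this (a generic sketch of sequence cross-entropy, not OpenNMT-py's exact loss code; shapes and the padding id are illustrative):

```python
import torch
import torch.nn.functional as F

# Decoder outputs: (batch, seq_len, vocab); gold tokens: (batch, seq_len).
logits = torch.randn(2, 5, 100)
targets = torch.randint(0, 100, (2, 5))
PAD = 0  # assumed padding token id

# Flatten the time steps and average cross-entropy over non-pad tokens --
# the same objective as in text-to-text translation.
loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                       targets.view(-1), ignore_index=PAD)
```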

mrgloom

comment created time in 3 months

push event da03/fairseq

Yuntian Deng

commit sha e490412b28b4f2489b800b9a9a563917ee8199cb

.

push time in 3 months

push event da03/fairseq

Yuntian Deng

commit sha eb34b0e1211922d1a333aa11698d095f083b99d7

.

push time in 3 months

issue closed harvardnlp/im2markup

There is a bug in preprocess_latex.js

When I used your script to normalize formulas, I found that an illegal LaTeX symbol 'ule' was generated. After checking, I found that the symbol was originally '\rule', but the program mistook the '\r' at the start of the symbol for a carriage return. The problem was solved when I changed "\rule{" to "\\rule{" on line 344 of preprocess_latex.js.
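
The pitfall is easy to reproduce in any language with C-style escape sequences; here is a small Python demonstration (the actual bug was in JavaScript) of why the unescaped backslash corrupts '\rule':

```python
# In a string literal the two characters \r parse as one carriage
# return, so "\rule{" silently loses its backslash.
bad = "\rule{"    # CR + "ule{": five characters, no backslash at all
good = "\\rule{"  # escaped backslash: the six-character text \rule{

print(len(bad), len(good))  # 5 6
print(good[0] == "\\")      # True: the LaTeX command survives intact
```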

closed time in 3 months

hengyeliu

issue comment harvardnlp/im2markup

There is a bug in preprocess_latex.js

Thanks!

hengyeliu

comment created time in 3 months

push event harvardnlp/im2markup

unknown

commit sha 99bff2004061a68fc39262306686c9765df85632

update preprocess_latex.js

Yuntian Deng

commit sha 2b68f4edd3798b763d7044de2ea332ed06381413

Merge pull request #37 from hengyeliu/im2latexpr: update preprocess_latex.js

push time in 3 months

PR merged harvardnlp/im2markup

update preprocess_latex.js

update preprocess_latex.js at line 344.

+1 -1

0 comments

1 changed file

hengyeliu

pr closed time in 3 months

push event da03/fairseq

Yuntian Deng

commit sha df9c2ec0a8fa0e7d6c857cad60025b21f33dddac

.

push time in 3 months
