
allenai/allennlp 8761

An open-source NLP research library, built on PyTorch.

allenai/acl2018-semantic-parsing-tutorial 376

Materials from the ACL 2018 tutorial on neural semantic parsing

allenai/allennlp-as-a-library-example 109

A simple example for how to build your own model using AllenNLP as a dependency.

allenai/allennlp-semparse 56

A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP

allenai/allennlp-language-modeling 16

An experimental plugin adding language model implementations and training methods to AllenNLP.

allenai/allennlp-guide 9

Code and material for the AllenNLP Guide

allenai/deep_qa_experiments 8

Scala framework for running experiments with allenai/deep_qa

issue comment nitishgupta/nmn-drop

Training NMN model on full DROP (with numerical answers)

Also see our ACL paper on why performance on the ICLR subset is very likely worse when you train the model on all of these questions. (As ACL is ongoing right now, here's a link to the conference page for the paper.)

amritasaha1812

comment created time in 2 days

issue comment allenai/allennlp

Add new attribute in Token class

You should be able to just do something like tokens = [MyToken(token.text, ...) for token in tokens] in your dataset reader. You shouldn't have to override tokenizers or token indexers at all.
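
For concreteness, a minimal sketch of that pattern (the subclass name and extra attribute here are hypothetical, and this assumes Token can be subclassed with a plain __init__):

from allennlp.data.tokenizers import Token

class MyToken(Token):
    # Hypothetical subclass carrying one extra attribute alongside the standard ones.
    def __init__(self, text, my_feature=None, **kwargs):
        super().__init__(text=text, **kwargs)
        self.my_feature = my_feature

tokens = [Token("Hello"), Token("WORLD")]  # stands in for the tokenizer's output
tokens = [MyToken(token.text, my_feature=token.text.isupper()) for token in tokens]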

Zessay

comment created time in 2 days

issue closed allenai/allennlp

A parameter problem

In the file pretrained_transformer_mismatched_embedder.py, there is a * in the __init__ (line 43). I haven't seen this parameter before (I only know *args and **kwargs) and can't understand its use here. Could you please explain the usage of this parameter? Thank you!

closed time in 2 days

Zessay

issue comment allenai/allennlp

A parameter problem

It forces all following arguments to be passed by name (with the idea of minimizing accidental order swaps for long argument lists). https://stackoverflow.com/questions/14301967/bare-asterisk-in-function-arguments
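
A quick illustration with a made-up function, just to show the mechanics:

def embed(tokens, *, max_length=512):
    # everything after the bare * must be passed by keyword
    return tokens[:max_length]

embed([1, 2, 3], max_length=2)  # fine
# embed([1, 2, 3], 2)           # TypeError: embed() takes 1 positional argument but 2 were given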

Zessay

comment created time in 2 days

push event allenai/allennlp

Ryo Takahashi

commit sha 8229aca3be784ae3af5cd4edec2749124e6b6cba

Fix pretrained model initialization (#4439) * Add failing test * Use copy_ instead of slicing * Update CHANGELOG


push time in 2 days

PR merged allenai/allennlp

Fix pretrained model initialization

This PR fixes #4427. As described there, slicing a 0-dim tensor raises an IndexError, so I used the copy_ method instead.

I added scalar parameters to the models in pretrained_model_initializer_test.py and added a test case, which had failed before this fix.

+11 -1

0 comment

3 changed files

reiyw

pr closed time in 2 days

issue closed allenai/allennlp

PretrainedModelInitializer fails to initialize a model with a 0-dim tensor


Checklist


  • [x] I have verified that the issue exists against the master branch of AllenNLP.
  • [x] I have read the relevant section in the contribution guide on reporting bugs.
  • [x] I have checked the issues list for similar or identical bug reports.
  • [x] I have checked the pull requests list for existing proposed fixes.
  • [x] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the master branch.
  • [x] I have included in the "Description" section below a traceback from any exceptions related to this bug.
  • [x] I have included in the "Related issues or possible duplicates" section below all related issues and possible duplicate issues (If there are none, check this box anyway).
  • [x] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
  • [x] I have included in the "Environment" section below the output of pip freeze.
  • [x] I have included in the "Steps to reproduce" section below a minimally reproducible example.

Description


Python traceback:


Traceback (most recent call last):
  File "test.py", line 27, in <module>
    applicator(net)
  File "/Users/reiyw/.ghq/ghq/github.com/allenai/allennlp/allennlp/nn/initializers.py", line 482, in __call__
    initializer(parameter, parameter_name=name)
  File "/Users/reiyw/.ghq/ghq/github.com/allenai/allennlp/allennlp/nn/initializers.py", line 406, in __call__
    tensor.data[:] = source_weights[:]
IndexError: slice() cannot be applied to a 0-dim tensor.


PretrainedModelInitializer fails to initialize a model with 0-dim tensors. The error occurs at: https://github.com/allenai/allennlp/blob/637dbb159082999c546ac2fc64746b88e5c9d1b5/allennlp/nn/initializers.py#L405-L406 The cause is that slicing cannot be applied to a 0-dim tensor. Instead of slicing the tensor here, we can avoid the error by using the copy_ method:

tensor.data.copy_(source_weights.data)
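
As a standalone illustration of the difference (not from the original report):

import torch

scalar = torch.tensor(1.0)      # 0-dim tensor
source = torch.tensor(2.0)

# scalar.data[:] = source[:]    # raises IndexError: slice() cannot be applied to a 0-dim tensor
scalar.data.copy_(source.data)  # works for 0-dim and n-dim tensors alike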

I have confirmed that this change will not break the pretrained model initializer test.

Another workaround is to use a 1-dim tensor instead of a 0-dim tensor to represent a scalar, but I think it's better to fix it so that others don't face the same problem.

If that change is appropriate, I would be happy to submit a PR.

Related issues or possible duplicates

  • None

Environment

OS: OS X

Python version: 3.7.7

Output of pip freeze:


-e git+ssh://git@github.com/allenai/allennlp.git@9c4dfa544e85e00d636beb0026a08e40dcdb6404#egg=allennlp
apex @ git+https://github.com/NVIDIA/apex.git@44532b30a4fad442f00635a0f4c8f241b06c2315
appdirs==1.4.4
attrs==19.3.0
black==19.10b0
bleach==3.1.5
blis==0.4.1
boto3==1.14.12
botocore==1.17.12
catalogue==1.0.0
certifi==2020.6.20
chardet==3.0.4
click==7.1.2
codecov==2.1.7
colorama==0.4.3
coverage==5.1
cycler==0.10.0
cymem==2.0.3
docutils==0.15.2
filelock==3.0.12
flake8==3.8.3
flaky==3.6.1
future==0.18.2
h5py==2.10.0
idna==2.10
importlib-metadata==1.7.0
Jinja2==2.11.2
jmespath==0.10.0
joblib==0.15.1
jsonnet==0.16.0
jsonpickle==1.4.1
keyring==21.2.1
kiwisolver==1.2.0
livereload==2.6.2
lunr==0.5.8
Markdown==3.2.2
markdown-include==0.5.1
MarkupSafe==1.1.1
matplotlib==3.2.2
mccabe==0.6.1
mkdocs==1.1.2
mkdocs-material==5.3.3
mkdocs-material-extensions==1.0
more-itertools==8.4.0
murmurhash==1.0.2
mypy==0.782
mypy-extensions==0.4.3
nltk==3.5
nr.collections==0.0.1
nr.databind.core==0.0.16
nr.databind.json==0.0.11
nr.interface==0.0.3
nr.metaclass==0.0.5
nr.parsing.date==0.1.0
nr.pylang.utils==0.0.3
nr.stream==0.0.4
numpy==1.19.0
overrides==3.1.0
packaging==20.4
pathspec==0.8.0
pathtools==0.1.2
pkginfo==1.5.0.1
plac==1.1.3
pluggy==0.13.1
preshed==3.0.2
protobuf==3.12.2
py==1.9.0
py-cpuinfo==6.0.0
pycodestyle==2.6.0
pyflakes==2.2.0
Pygments==2.6.1
pymdown-extensions==7.1
pyparsing==2.4.7
pytest==5.4.3
pytest-benchmark==3.2.3
pytest-cov==2.10.0
python-dateutil==2.8.1
PyYAML==5.3.1
readme-renderer==26.0
regex==2020.6.8
requests==2.24.0
requests-toolbelt==0.9.1
responses==0.10.15
rfc3986==1.4.0
ruamel.yaml==0.16.10
ruamel.yaml.clib==0.2.0
s3transfer==0.3.3
sacremoses==0.0.43
scikit-learn==0.23.1
scipy==1.5.0
sentencepiece==0.1.91
six==1.15.0
spacy==2.3.0
srsly==1.0.2
tensorboardX==2.0
thinc==7.4.1
threadpoolctl==2.1.0
tokenizers==0.7.0
toml==0.10.1
torch==1.5.1
tornado==6.0.4
tqdm==4.47.0
transformers==2.11.0
twine==3.2.0
typed-ast==1.4.1
typing-extensions==3.7.4.2
urllib3==1.25.9
wasabi==0.7.0
wcwidth==0.2.5
webencodings==0.5.1
zipp==3.1.0


Steps to reproduce

Example source:


import tempfile
import pathlib

import torch

from allennlp.nn import InitializerApplicator
from allennlp.nn.initializers import PretrainedModelInitializer


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # 0-dim tensor
        self.scalar = torch.nn.Parameter(torch.tensor(1.0))


net = Net()
temp_dir = pathlib.Path(tempfile.mkdtemp())
weights_file = temp_dir / "weights.th"
torch.save(net.state_dict(), weights_file)

initializer = PretrainedModelInitializer(weights_file)
applicator = InitializerApplicator([("scalar", initializer)])
applicator(net)


closed time in 2 days

reiyw

issue comment allenai/allennlp

Where is the part-of-speech and proper-name category label collection?

What model are you talking about? The label space used depends on the data used to train the model. For POS tags, we typically use spaCy to predict those. Other syntax labels most often ultimately derive from the Penn Treebank.
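
For example, a minimal spaCy snippet showing where those labels come from (pos_ is the coarse universal tag; tag_ is the fine-grained tag, which for English models is the Penn Treebank tag set):

import spacy

nlp = spacy.load("en_core_web_sm")
for token in nlp("The dog ate the apple."):
    print(token.text, token.pos_, token.tag_)  # e.g. "dog NOUN NN"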

hugess

comment created time in 2 days

issue comment allenai/allennlp

allennlp.common.checks.ConfigurationError: key "token_embedders" is required at location "model.text_field_embedder."

You're using an old model that is not compatible with allennlp 1.0. An updated model can be found here: https://storage.googleapis.com/allennlp-public-models/elmo-constituency-parser-2020.02.10.tar.gz.

563017732

comment created time in 2 days

issue closed allenai/allennlp

allennlp.common.checks.ConfigurationError: key "token_embedders" is required at location "model.text_field_embedder."

Describe the bug

I0706 10:33:02.516595 140677189486400 archival.py:171] extracting archive file /home/jiawen/.allennlp/cache/60c14844468543e4329ce7e8d3444fa1f9f7057b4b0de5b3f4a597eb57113d32.73aa20bab6336a582588814d8458d040b59536ca1f60b6a769a2da61c7aa3c9a to temp dir /tmp/tmp15lrx2nb
I0706 10:33:06.706750 140677189486400 params.py:247] type = from_instances
I0706 10:33:06.706835 140677189486400 vocabulary.py:323] Loading token dictionary from /tmp/tmp15lrx2nb/vocabulary.
I0706 10:33:06.706987 140677189486400 filelock.py:274] Lock 140674091466936 acquired on /tmp/tmp15lrx2nb/vocabulary/.lock
I0706 10:33:06.707335 140677189486400 filelock.py:318] Lock 140674091466936 released on /tmp/tmp15lrx2nb/vocabulary/.lock
I0706 10:33:06.707670 140677189486400 params.py:247] model.type = constituency_parser
I0706 10:33:06.707974 140677189486400 params.py:247] model.regularizer = None
I0706 10:33:06.708117 140677189486400 params.py:247] model.text_field_embedder.type = basic
Traceback (most recent call last):
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/params.py", line 237, in pop
    value = self.params.pop(key)
KeyError: 'token_embedders'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/scripts/athene/pipeline.py", line 193, in <module>
    document_retrieval(logger, args.mode)
  File "src/scripts/athene/pipeline.py", line 163, in document_retrieval
    Config.document_add_claim, Config.document_parallel)
  File "/home/jiawen/Documents/Code/Code/fact_checking/ESIM/src/athene/retrieval/document/docment_retrieval.py", line 194, in main
    method = Doc_Retrieval(database_path=db_file, add_claim=add_claim, k_wiki_results=k_wiki)
  File "/home/jiawen/Documents/Code/Code/fact_checking/ESIM/src/athene/retrieval/document/docment_retrieval.py", line 47, in __init__
    self.predictor = Predictor.from_path("https://s3-us-west-2.amazonaws.com/allennlp/models/elmo-constituency-parser-2018.03.14.tar.gz")
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/predictors/predictor.py", line 275, in from_path
    load_archive(archive_path, cuda_device=cuda_device),
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/models/archival.py", line 197, in load_archive
    opt_level=opt_level,
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/models/model.py", line 398, in load
    return model_class._load(config, serialization_dir, weights_file, cuda_device, opt_level)
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/models/model.py", line 295, in _load
    model = Model.from_params(vocab=vocab, params=model_params)
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 580, in from_params
    **extras,
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 609, in from_params
    kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 181, in create_kwargs
    cls.__name__, param_name, annotation, param.default, params, **extras
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 287, in pop_and_construct_arg
    return construct_arg(class_name, name, popped_params, annotation, default, **extras)
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 321, in construct_arg
    return annotation.from_params(params=popped_params, **subextras)
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 580, in from_params
    **extras,
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 609, in from_params
    kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 181, in create_kwargs
    cls.__name__, param_name, annotation, param.default, params, **extras
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/from_params.py", line 280, in pop_and_construct_arg
    popped_params = params.pop(name, default) if default != _NO_DEFAULT else params.pop(name)
  File "/home/jiawen/anaconda3/envs/esim/lib/python3.6/site-packages/allennlp/common/params.py", line 242, in pop
    raise ConfigurationError(msg)
allennlp.common.checks.ConfigurationError: key "token_embedders" is required at location "model.text_field_embedder."
I0706 10:33:06.710113 140677189486400 archival.py:205] removing temporary unarchived model dir at /tmp/tmp15lrx2nb

closed time in 2 days

563017732

issue comment allenai/allennlp

Transformer tokenizers cause deadlocks when dataset reader is lazy and dataloader num_workers > 0

We have lots of models that train just fine with transformers, so it definitely is not a global problem. Also, make sure you have an up-to-date release.

epwalsh

comment created time in 2 days

issue comment allenai/allennlp

Transformer tokenizers cause deadlocks when dataset reader is lazy and dataloader num_workers > 0

Just don't set num_workers > 0 with a lazy dataset.

epwalsh

comment created time in 2 days

issue comment allenai/allennlp-guide

Chapter on document ranking/retrieval

Awesome, thanks for the update!

jacobdanovitch

comment created time in 2 days

issue closed allenai/allennlp

just can stay the original and nomore same like dog place

Is your feature request related to a problem? Please describe. A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like A clear and concise description of what you want to happen.

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Additional context Add any other context or screenshots about the feature request here.

closed time in 3 days

marskong88

issue closed allenai/allennlp

BERT sliding window embedding size smaller than number of tokens

I have a question about the behaviour of embeddings generated by BERT for long token sequences.

I'm aware there is some functionality for generating embeddings using a sliding window approach (#2537) -- and in v1.0.0 of allennlp, I no longer get warnings about long sequences being truncated (so I assume that embeddings are now generated with a sliding window for long token sequences).

With a batch-size of 1, I would expect the length of the textfields and length of the embeddings generated by BERT to be equivalent. However, in a simple SQuAD style evaluation with an input sequence of 1088 tokens I am finding that my embedding size is only 1080 after using the BERT embedder. I have added this assertion which fails:

    def forward(  # type: ignore
        self,
        question_with_context: Dict[str, Dict[str, torch.LongTensor]],
        context_span: torch.IntTensor,
        answer_span: Optional[torch.IntTensor] = None,
        metadata: List[Dict[str, Any]] = None
    ) -> Dict[str, torch.Tensor]:
        embedded_text = self._text_field_embedder(question_with_context)
        assert embedded_text.shape[1] == question_with_context["tokens"]["token_ids"].shape[1], "Expected embedding size and sequence length to be equal, but instead got an embedding size of {} for {} tokens".format(embedded_text.shape[1], question_with_context["tokens"]["token_ids"].shape[1])
AssertionError: Expected embedding size and sequence length to be equal, but instead got an embedding size of 1080 for 1088 tokens

If this is expected, how would the original tokens map to the embedded tokens? If it is not expected, what could the source of the issue be? Could it be something silly I'm doing?

Thanks

closed time in 5 days

j6mes

issue comment allenai/allennlp

BERT sliding window embedding size smaller than number of tokens

Ok, got it. I wasn't really familiar with how we handled long sequences, but now I know a bit more. The key issue is this: if you look at the number of tokens in your TextField, it's the same as the number of embeddings you get out of the TextFieldEmbedder (this is the fundamental contract that AllenNLP upholds). So the reason there are extra tokens in the 'token_ids' tensor is because they were added to signal where to break up segments for the sliding window encoding. You can see this if you just decrease the max length in your config file; the lower it is, the larger the discrepancy between the two numbers you're comparing.

If you want to see the code where this happens, look here for where the extra tokens are added: https://github.com/allenai/allennlp/blob/60deece9fca2da6b66bfcde44484384bdefa3fe7/allennlp/data/token_indexers/pretrained_transformer_indexer.py#L149-L182 (and hence they show up in the "token_ids" tensor), and here for where they are removed: https://github.com/allenai/allennlp/blob/60deece9fca2da6b66bfcde44484384bdefa3fe7/allennlp/modules/token_embedders/pretrained_transformer_embedder.py#L201-L235

j6mes

comment created time in 5 days

issue closed allenai/allennlp

How to add hand-crafted features in a seq2seq model?

Hi there, I'm using allennlp for a seq2seq task, with the copynet_seq2seq model. Here is part of my config:

"type":"copynet-seq2seq", 
"source_token_indexers": {
  "tokens": {
    "type": "single_id",
    "namespace": "source_tokens",
    "lowercase_tokens": true
  }
},
 ....

I wonder: if I want to use POS (of a word) as another feature, how could I do it? Should I add another indexer, like this:

"type":"copynet-seq2seq", 
"source_pos_indexers": {
  "pos": {
    "type": "single_id",
    "namespace": "source_tokens",
    "lowercase_tokens": true
  }
},
 ....

Or could I add it together, like this?

"type":"copynet-seq2seq", 
"source_token_indexers": {
  "pos": {
    "type": "single_id",
    "namespace": "source_tokens",
    "lowercase_tokens": true
},
  "tokens": {
    "type": "single_id",
    "namespace": "source_tokens",
    "lowercase_tokens": true
  }
},
 ....

And do I have to use the same namespace for both of them?

@matt-gardner I saw your reply in #3091, but the link is out of date. Could you help me?

closed time in 6 days

wlhgtc

issue comment allenai/allennlp

How to add hand-crafted features in a seq2seq model?

Yes, there are several possible attributes of Token which can be used by the token indexer (and you can even set your own if you subclass Token). It now seems a little bit odd to me to store POS tags and such on the Token object, but we did it this way because the main tokenizer we used was spacy's tokenizer, and that had the option of producing Token objects with these attributes, so we ran with it.
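
A rough sketch of that pattern (this assumes the allennlp 1.0 Token fields and SingleIdTokenIndexer's feature_name argument; double-check against your version):

import spacy
from allennlp.data.tokenizers import Token
from allennlp.data.token_indexers import SingleIdTokenIndexer

nlp = spacy.load("en_core_web_sm")

# Attach spaCy's POS predictions to the Token objects a dataset reader produces.
tokens = [Token(text=t.text, pos_=t.pos_, tag_=t.tag_) for t in nlp("The dog ate the apple.")]

# An indexer can then index that attribute instead of the token text.
pos_indexer = SingleIdTokenIndexer(namespace="pos_tags", feature_name="tag_")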

For your vocabulary questions, see here.

wlhgtc

comment created time in 6 days

push event allenai/allennlp-guide-examples

Zessay

commit sha 213af4c14ff9622043a7676c7dd4dfd05d85931b

fix type hint error


Matt Gardner

commit sha 584b32e630d397df150692b6ad5c76412b60bd78

Merge pull request #12 from Zessay/fix-type-hint-error fix type hint error


push time in 6 days

pull request comment allenai/allennlp-guide-examples

fix type hint error

Thanks!

Zessay

comment created time in 6 days

issue comment allenai/allennlp

Type hint error

Thanks for reporting this!

Zessay

comment created time in 6 days

push event allenai/allennlp

Matt Gardner

commit sha 60deece9fca2da6b66bfcde44484384bdefa3fe7

Fix type hint in text_field.py (#4434) Fixes #4433.

view details

push time in 6 days

delete branch allenai/allennlp

delete branch : matt-gardner-patch-1

delete time in 6 days

PR merged allenai/allennlp

Fix type hint in text_field.py merge when ready

Fixes #4433.

+1 -1

0 comment

1 changed file

matt-gardner

pr closed time in 6 days

issue closed allenai/allennlp

Type hint error

On line 94 of the file allennlp/allennlp/data/fields/text_field.py, the output type should be TextFieldTensors rather than Dict[str, torch.Tensor]. Python doesn't check whether a parameter's type hint is right, but I think correct type hints are important for understanding the code.


closed time in 6 days

Zessay

PR opened allenai/allennlp

Fix type hint in text_field.py

Fixes #4433.

+1 -1

0 comment

1 changed file

pr created time in 6 days

create branch allenai/allennlp

branch : matt-gardner-patch-1

created branch time in 6 days

issue comment allenai/allennlp

How to add hand-crafted features in a seq2seq model?

See here, particularly the part where it talks about combining multiple indexers. If that isn't enough to solve your issue, feel free to ask more questions.

wlhgtc

comment created time in 6 days

issue comment allenai/allennlp

allowed_start_transitions and allowed_end_transitions don't actually enforce their constraints

Hmm, ok, I see now where I also missed something :). Looks like this code could use some better comments in general, because the logic is not obvious. I looked at it several times over the course of this discussion and missed things each time.

zhuango

comment created time in 6 days

issue comment allenai/allennlp-guide-examples

Type hint error

Yes, you're right. Care to submit a PR to fix this?

Zessay

comment created time in 6 days

push event matt-gardner/allennlp

Matt Gardner

commit sha b295cec59e7649e600ecd2a293c71bcf6cc347b7

remove prints


push time in 6 days

push event matt-gardner/allennlp

Matt Gardner

commit sha 3a235212fbdfdc5b198ec82651e1dea15eb7f0ae

tests are passing with small fixtures


push time in 6 days

issue comment allenai/allennlp

Error loading model using load_archive from local path

You need to be sure that your discourse_classifier model has been registered. Assuming you did that with a @Model.register decorator, the easiest way to do that is just to import your model class in your script. You'll likely also have to do the same for your dataset reader class.
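
Something along these lines (the project module paths here are made up; substitute your own layout):

from allennlp.models.archival import load_archive

# These imports execute the @Model.register / @DatasetReader.register decorators
# and populate the registry; the modules themselves are hypothetical.
import my_project.models.discourse_classifier  # noqa: F401
import my_project.readers.discourse_reader  # noqa: F401

archive = load_archive("/path/to/discourse_classifier/model.tar.gz")
model = archive.model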

itsmemala

comment created time in 6 days

issue comment allenai/allennlp

BERT SRL model: RuntimeError: Expected object of backend CPU but got backend CUDA for sequence element 1 in sequence argument at position #1 'tensors'

Oh, wow, is spacy.require_gpu() changing the behavior of torch.ones()? That's the only thing that makes sense to me for getting this error. That seems really bad and dangerous, because it affects non-spacy code. Or maybe that's something that torch supports natively that we're not aware of.

Either way, that line of code in viterbi_decode shouldn't just be calling torch.ones() anyway, it should be smarter about making sure it creates a tensor on the right device. That's the real problem here. #4429 brings up a different problem with the same piece of code; @zhuango, if you're going to fix #4429, can you fix the device allocation at the same time?
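
For reference, the usual device-safe pattern (a generic sketch, not the actual viterbi_decode code):

import torch

tag_sequence = torch.zeros(3, 5)  # stands in for the tensor viterbi_decode receives

# torch.ones(5) always lands on the default device, wherever tag_sequence lives;
# new_ones inherits both device and dtype from the existing tensor:
ones = tag_sequence.new_ones(5)

# Equivalent explicit form:
ones = torch.ones(5, device=tag_sequence.device, dtype=tag_sequence.dtype)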

ducalpha

comment created time in 6 days


issue comment allenai/allennlp

How does allowed_start_transitions work in the viterbi_decode function?

Ok, sorry for being obtuse earlier, I was trying to go too fast through the issues, and assumed the code was right and didn't pay enough attention to the points you made. With your additional explanations, I now understand and agree with what you're saying. I went back and looked at when that code got added, and I think it just never did what it was supposed to.

Can you submit a PR to change those sentinel values?

zhuango

comment created time in 6 days

issue comment allenai/allennlp

How does allowed_start_transitions work in the viterbi_decode function?

What you're missing is the broadcasting that's happening. Your set of equations for summed_potentials is wrong, because it doesn't account for the .unsqueeze() that happens, and the associated broadcasting. I'd recommend stepping through the execution of that code with your toy example to see exactly what happens; it's more complicated than it looks at first glance.
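
A toy version of the broadcasting in question (simplified; the real code uses topk and tracks backpointers):

import torch

path_score = torch.tensor([0.0, -1.0, -2.0])  # scores so far, one per tag
transition = torch.randn(3, 3)                # transition[i, j]: score of tag i -> tag j

# unsqueeze turns path_score into a column, so addition broadcasts it across columns:
summed = path_score.unsqueeze(1) + transition  # summed[i, j] = path_score[i] + transition[i, j]
scores, backpointers = summed.max(dim=0)       # best previous tag for each current tag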

zhuango

comment created time in 6 days

push event allenai/contrast-sets

Eric Wallace

commit sha 7db4c57cd7dfd1417f253f5b1c418446513b7d88

update imdb to fix formatting issues


Matt Gardner

commit sha e44faab7e5f77c5529c93fbc341ef208cfdbd541

Merge pull request #16 from Eric-Wallace/imdb update imdb to fix formatting issues


push time in 6 days

PR merged allenai/contrast-sets

update imdb to fix formatting issues

Fixes #15 @matt-gardner

+494 -494

1 comment

4 changed files

Eric-Wallace

pr closed time in 6 days

issue closed allenai/contrast-sets

Formatting discrepancies with IMDb data relative to originals

FYI--

When adding your IMDb data to my experiments here (https://arxiv.org/pdf/1906.01154.pdf), I noticed that there are some non-verbatim matches (on the order of ~26 out of 488) relative to the originals in https://github.com/acmi-lab/counterfactually-augmented-data (as a result of dropped special characters):

For example, Row 40 of https://github.com/allenai/contrast-sets/blob/master/IMDb/data/test_original.tsv does not exactly match line 183 of https://github.com/acmi-lab/counterfactually-augmented-data/blob/master/sentiment/orig/test.tsv because of dropped special characters.

Also, for example, see the diff with clichés->clichs in row 363 (test.tsv) vs. row 116 (test_original.tsv).

Also, as an aside, your IMDb data is double quote-escaped with depth 2, unlike the original.

closed time in 6 days

allenschmaltz

pull request comment allenai/contrast-sets

update imdb to fix formatting issues

Thanks @Eric-Wallace!

Eric-Wallace

comment created time in 6 days

push event matt-gardner/allennlp

Matt Gardner

commit sha 9f1af51abbe2268f80b77b5c94dca328fd6b4a5c

Initial test mostly passing, though things are still a bit of a mess


push time in 6 days

issue comment allenai/allennlp

srl bert model didn't register

@itsmemala, can you open a new issue? It's a lot easier for us to track. Before you do, though, can you confirm that you see this issue with the latest master? There have been some fixes along this line recently, which may not have made it into a new release yet.

jinruiyang

comment created time in 7 days

issue comment allenai/allennlp

BERT sliding window embedding size smaller than number of tokens

Hmm, it's possible there's a bug somewhere when the contexts get really long. Two thoughts:

  1. You mention the allennlp_rc repo; we do still have a reader in the allennlp-models repo; we just changed our plans on how exactly we were going to split things. Not sure if that's helpful for you or not.

  2. Can you construct an input for which the code reliably fails? If so, it would be great to have that as a failing test case that we can try to figure out. Looking at the input would likely tell us what the problem is, because this does sound like a bug to me.

j6mes

comment created time in 7 days

issue comment allenai/allennlp

BERT sliding window embedding size smaller than number of tokens

Whether that assertion should be valid depends entirely on the token indexer that you're using. Can you tell me what that is? If you're using a mismatched indexer (or, for pre-1.0 code, a BertIndexer, or whatever we called it), then this assertion will fail by design. See this chapter of the guide for how these options work.

If you're not using a mismatched indexer, then I would expect that assertion to work, though I might be missing something about the context.

j6mes

comment created time in 7 days

issue closed allenai/allennlp

How does allowed_start_transitions work in the viterbi_decode function?

I'm working on a project that uses the viterbi_decode function in allennlp.nn.util. viterbi_decode has allowed_start_transitions/allowed_end_transitions parameters, which are meant to make the viterbi decoding process take start/end timestep restrictions into account. But how this works confuses me.

Here is what I learned from the viterbi_decode code. path_score is initialized with tag_sequence[0], which is a zero vector when has_start_end_restrictions is true. When the score for the first timestep is calculated through a topk op on path_score + transition_matrix, the state of the previous timestep (paths) depends on transition_matrix and is not restricted to "start".

Consider an extreme case: no state is allowed at the first timestep, so allowed_start_transitions is all -inf. Normally, the score of the first timestep should be a vector of -inf, because no state is allowed there. But the topk op on path_score + transition_matrix can still produce a non-inf score and a non-start paths entry for the previous timestep, because there are potentials bigger than -inf, and the corresponding previous states get appended to paths even though they are not the "start" state. So is allowed_start_transitions not working? This confuses me. Please help me understand how allowed_start_transitions/allowed_end_transitions actually work.

closed time in 7 days

zhuango

issue comment allenai/allennlp

How does allowed_start_transitions work in the viterbi_decode function?

I'm going to give you a minimal explanation, but feel free to ask more questions if this isn't enough.

The place to look for an example for how this is used is our SRL model: here and here. There, you will see that we set all allowed transitions to have a score of 0 in the transition matrix, and all disallowed transitions to have a score of float('-inf'). Because we're adding log probabilities inside of viterbi_decode, a score of float('-inf') means that any path through that transition will get a score of float('-inf'), or a probability of 0.
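
Concretely, setting up such a matrix looks something like this (a sketch with a made-up 3-tag space, mirroring the pattern in the SRL model):

import torch
from allennlp.nn.util import viterbi_decode

num_tags = 3
transition_matrix = torch.zeros(num_tags, num_tags)  # 0.0 = allowed (the log-space identity)
transition_matrix[0, 2] = float("-inf")              # disallow tag 0 -> tag 2

allowed_start = torch.tensor([0.0, float("-inf"), 0.0])  # tag 1 may not start a sequence
allowed_end = torch.tensor([0.0, 0.0, float("-inf")])    # tag 2 may not end one

tag_sequence = torch.randn(4, num_tags)  # (sequence_length, num_tags) emission scores
path, score = viterbi_decode(
    tag_sequence,
    transition_matrix,
    allowed_start_transitions=allowed_start,
    allowed_end_transitions=allowed_end,
)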

zhuango

comment created time in 7 days

issue comment allenai/allennlp

AugmentedLSTM not working

Given that this has never worked and you're the first one to complain about it, I think we can safely say that this code is unused, so it's ok to change it. Can you just change both AugmentedLstmSeq2*Encoders to be BiAugmented..., and point to the BiAugmentedLstm class?

harshnarang8

comment created time in 7 days

push event allenai/allennlp-models

Allen Peng

commit sha 11c6814a1de2908ab34794640411c1c5f879705d

fix the bug of bimpm and update CHANGELOG (#87) Co-authored-by: pengshuang <jianfeng.ps@antfin.com>


push time in 7 days

PR merged allenai/allennlp-models

Fix the initializer of BIMPM model

Similar issue with allenai/allennlp#4419

BIMPM in allennlp-models still used the old version of initializer. (But the experiment.json in the test_fixtures is right, see https://github.com/allenai/allennlp-models/blob/master/test_fixtures/pair_classification/bimpm/experiment.json)

I am now upgrading our code from allennlp<=0.9.0 to allennlp==1.0.0.

Thanks for your great work in the latest version.

+11 -9

0 comment

2 changed files

pengshuang

pr closed time in 7 days

issue comment allenai/allennlp

PretrainedModelInitializer fails to initialize a model with a 0-dim tensor

Yep, you're right, this looks like a bug to me, and I agree with your fix of just using copy_. Can you submit a PR, adding your minimal repro as a test case so we don't have a regression?

reiyw

comment created time in 7 days

issue closed allenai/allennlp

QANet in allennlp-models still used the old version of regularizer

As the Upgrade guide from v0.9.0 said:

Regularization now needs another key in a config file. Instead of specifying regularization as "regularizer": [[regex1, regularizer_params], [regex2, regularizer_params]], it now must be specified as "regularizer": {"regexes": [[regex1, regularizer_params], [regex2, regularizer_params]]}(https://github.com/allenai/allennlp/releases/tag/v1.0.0).

But I find that the training_config of qanet still uses the old version of the regularizer

see: https://github.com/allenai/allennlp-models/blob/37136f8ecbc42d26d9b135d06e4042f22d0d1bee/training_config/rc/qanet.jsonnet#L114.

closed time in 7 days

pengshuang

issue comment allenai/allennlp

QANet in allennlp-models still used the old version of regularizer

Awesome, thanks.

pengshuang

comment created time in 7 days

push event allenai/allennlp-models

Allen Peng

commit sha a735ddd15e40cfacc6bba3270f61a3d91e9c1e65

Fix the regularizer of QANet model (#86) * fix the regularizer of qanet * update CHANGELOG Co-authored-by: pengshuang <jianfeng.ps@antfin.com>


push time in 7 days

PR merged allenai/allennlp-models

Fix the regularizer of QANet model

See the issue https://github.com/allenai/allennlp/issues/4419

I replace the regularizer of QANet with the latest version of allennlp==1.0.0

+11 -8

1 comment

2 changed files

pengshuang

pr closed time in 7 days

pull request comment allenai/allennlp-models

Fix the regularizer of QANet model

Thanks!

pengshuang

comment created time in 7 days

Pull request review comment allenai/allennlp

Implement LXMERT / VilBERT / something similar for NLVR2

[Diff under review: a new Nlvr2LxmertReader dataset reader (registered as "nlvr2_lxmert") that pairs NLVR2 sentences and metadata from per-split JSON files with pre-extracted visual features from tsv files, and can optionally mask or drop prepositions and verbs (found with spaCy) before BERT tokenization. The full diff is omitted here.]

Yes, each dataset would need a dataset reader. If suitable readers already exist in other places, we can re-use as much code as possible (though we'd definitely have to write at least a little bit of code to get them into allennlp's data pipeline).

matt-gardner

comment created time in 7 days

issue comment allenai/allennlp

Multi-task learning

My first pass at how I would design something for this:

You have a TextEncoder or a TextAndImageEncoder, which goes from inputs to an encoded representation that's ready to have various task heads applied. (Though, really, the TextFieldEmbedder already does what the TextEncoder would do, so we may not need anything new there, just for text+images.) Then you have a MultiTaskTextAndImageModel that takes as __init__ arguments a TextAndImageEncoder and a list of prediction heads. forward is a bit tricky, you probably have to have it accept **kwargs and make assumptions about the names of things that it gets. But assuming you can work that out, you then have the model apply whatever heads are required based on the inputs it receives, compute a joint loss, and that's it. You can configure this pretty easily to add another head just by adding another prediction head to the list, and an appropriate dataset inside a multi-task dataset reader.
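
In (very rough) code, the shape I have in mind is something like this; all of these names are hypothetical:

from typing import Dict
import torch

class MultiTaskTextAndImageModel(torch.nn.Module):
    def __init__(self, encoder: torch.nn.Module, heads: Dict[str, torch.nn.Module]):
        super().__init__()
        self.encoder = encoder
        self.heads = torch.nn.ModuleDict(heads)  # one prediction head per task

    def forward(self, task: str, **inputs) -> Dict[str, torch.Tensor]:
        encoded = self.encoder(**inputs)    # shared representation for every task
        output = self.heads[task](encoded)  # apply only the head this batch needs
        return output                       # each head computes its own loss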

dirkgr

comment created time in 7 days

issue comment allenai/allennlp

Flexible and configurable data loading and pre-processing

I think it probably makes more sense to have transforms, instead of what's done for data here. We should have a more in-depth conversation about options for doing this. The way that detectron reader works makes me think that if you want that you should just be using detectron instead of allennlp.

dirkgr

comment created time in 7 days

issue comment allenai/allennlp

Running detectron models as part of our models

Yes, I agree that the configuration you're talking about looks right to me. I would maybe rename the issue to something like Add an ImageEncoder abstraction, and a DetectronImageEncoder implementation of it.

dirkgr

comment created time in 7 days

issue comment allenai/allennlp

Can't get correct output using .saliency_interpret_from_json() from Interpret

Possibly related: https://github.com/allenai/allennlp-models/pull/85

koren-v

comment created time in 7 days

issue comment allenai/allennlp-guide

Chapter on document ranking/retrieval

Nice! On your two questions:

  1. What you have seems ok to me. I don't know enough of the details to know if there's anything in particular that could be fixed, but a useful thing to think about to convince yourself one way or the other is to compare our MatrixAttention code to its prior version. What we used to do for MatrixAttention was expand the query vector out to be the size of the matrix, then compute a similarity function on top of these expanded representations. It turns out that this is very inefficient, because of all of the intermediate computations done on large matrices. If you can instead do the operations in pieces on smaller inputs, you can gain a bunch of efficiency. It looks like you're taking the approach of our old version of MatrixAttention, but it didn't look like there was an easy way to switch to a more efficient way of doing things, because you have quite a long sequence of operations inside your RelevanceMatcher, and I'm not sure if it's even possible to do it with smaller intermediate computations without spending a lot more time looking at the code. Another thing that came to mind when I saw your code was wondering whether it would be easier conceptually and implementation-wise to use a TimeDistributed layer in there, so the RelevanceMatcher only has to score a single document and query at a time, and TimeDistributed does the batch rolling and unrolling for you. Not sure if that's a good idea or not, but it's something to think about (see the sketch below).

  2. For your second question about metrics, I think you're right that it'd be difficult to re-use existing code for what you want. I was trying to decide if rolling things into the batch dimension and handling a mask appropriately might work, but I don't think these metrics allow masks. So it seems like there are two options: (1) change the metrics to allow masks to be passed, which will let you solve your problem by rolling into the batch dimension (though this might not aggregate things how you want...), or (2) start by copying the existing metric code, and generalize it to handle lists.

Also, these are just torch.Tensors, not TextFieldTensors.
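
To make the TimeDistributed suggestion in point 1 concrete, the rolling and unrolling it handles is roughly the following (plain torch, with illustrative shapes; the per-document scorer is a stand-in):

import torch

batch, num_docs, doc_len, dim = 2, 4, 7, 16
docs = torch.randn(batch, num_docs, doc_len, dim)

# Roll num_docs into the batch dimension, score each (query, document) pair ...
flat = docs.view(batch * num_docs, doc_len, dim)
scores = flat.reshape(batch * num_docs, -1).mean(dim=1)  # stand-in for a single-document RelevanceMatcher

# ... then unroll so downstream code sees one score per document per query.
scores = scores.view(batch, num_docs)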

jacobdanovitch

comment created time in 7 days

pull request comment allenai/allennlp-models

fixes for next_token_lm

For the issue you're seeing, the weights must have changed somehow, either in huggingface code or in allennlp code. I don't remember any model changes that should have caused this on our side, but I could be forgetting something. To fix that, the easiest thing to do is to just instantiate the model in code and save it. There are no parameters that need to be trained in that model, as it's just taking an existing pretrained model and putting it into a different format.

epwalsh

comment created time in 7 days

Pull request review comment allenai/allennlp-models

fixes for next_token_lm

 def predictions_to_labeled_instances(
     ):
         new_instance = instance.duplicate()
         token_field: TextField = instance["tokens"]  # type: ignore
-        mask_targets = [Token(target_top_k[0]) for target_top_k in outputs["words"]]
+        mask_targets = [
+            Token(target_top_k_text[0], text_id=target_top_id_id)

top_id_id is a bit odd, but otherwise LGTM.

epwalsh

comment created time in 7 days

delete branch allenai/allennlp-template-config-files

delete branch : epwalsh-patch-1

delete time in 8 days

push event allenai/allennlp-template-config-files

Evan Pete Walsh

commit sha 4c98c58287dea4e94aec2ecd0f2c5d2722b8d6c4

Update my_model_trained_on_my_dataset.jsonnet


Matt Gardner

commit sha 9d603c8730ccf8198721088349e4896a8d783f49

Merge pull request #2 from allenai/epwalsh-patch-1 Update my_model_trained_on_my_dataset.jsonnet


push time in 8 days

issue comment allenai/allennlp

QANet in allennlp-models still used the old version of regularizer

Yep, this is a bug. Not sure how we missed this. Care to submit a PR to fix it?

pengshuang

comment created time in 8 days

issue comment allenai/allennlp

AugmentedLSTM not working

Yeah, this looks like a bug to me. It looks, though, like we really should be using the BiAugmentedLstm instead of the AugmentedLstm for this, and that looks like it should work. Care to submit a PR to change this?

harshnarang8

comment created time in 8 days

create branch allenai/allennlp

branch : vision

created branch time in 8 days

create branch matt-gardner/allennlp

branch : lxmert

created branch time in 8 days

pull request comment allenai/allennlp

Adds Detectron support

Maybe make a new vision branch that we can merge things into for now?

dirkgr

comment created time in 8 days

delete branch allenai/allennlp-template-config-files

delete branch : rename-config

delete time in 8 days

push event allenai/allennlp-template-config-files

Evan Pete Walsh

commit sha 113660843de2091e53708592273e78c8585d872c

Rename my_model_trained_on_my_dataset.json to my_model_trained_on_my_dataset.jsonnet


Matt Gardner

commit sha f0a4f1040969138262b073eb441fddc5ec8f3cc8

Merge pull request #1 from allenai/rename-config Change config extension to jsonnet


push time in 8 days

push event matt-gardner/allennlp

Dirk Groeneveld

commit sha 54c41fcc8f2a7ba366e848c88874199a4246385a

Adds the ability to automatically detect whether we have a GPU (#4400) * Adds the ability to automatically detect whether we have a GPU * Make `None` the default * Changelog * Fix type annotations and defaults * Actually test that the right device is used * More tests * The batches we see in the callback are never on the GPU * We have many places where we initialize the trainer.


Evan Pete Walsh

commit sha 84988b8149c2b69329136383c05ace2865adba17

Log plugins discovered and filter out transformers "PyTorch version ... available" log message (#4414) * filter out transformers pytorch log msg, log plugins * update CHANGELOG * make flake8 happy


push time in 8 days

pull request comment allenai/allennlp

Automatic file-friendly logging

I would agree that having reasonable access to stack traces in common environments is a pretty hard requirement, and it's pretty bad if this makes it impossible / dramatically harder to get them.

dirkgr

comment created time in 8 days

Pull request review comment allenai/allennlp

generalize DataLoader

 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Added the ability to pass an archive file instead of a local directory to `Vocab.from_files`.
 - Added the ability to pass an archive file instead of a glob to `ShardedDatasetReader`.
+### Changed
+
+- `allennlp.data.DataLoader` is now an abstract registrable class. The default implementation
+remains the same, but was renamed to is `allennlp.data.PyTorchDataLoader`.
remains the same, but was renamed to `allennlp.data.PyTorchDataLoader`.
epwalsh

comment created time in 8 days

Pull request review comment allenai/allennlp

generalize DataLoader

 def __init__(
         self,
         model: Model,
         optimizer: torch.optim.Optimizer,
-        data_loader: torch.utils.data.DataLoader,

There are small parts of the guide that should be updated with this change. Not sure how to keep this in sync with the guide / when to release things.

epwalsh

comment created time in 8 days

Pull request review comment allenai/allennlp

Adds the ability to automatically detect whether we have a GPU

 def from_partial_objects(
         If you're not using `FromParams`, you can just construct these arguments in the right order
         yourself in your code and call the constructor directly.
         """
+        if cuda_device is None:

Unfortunate that this has to be in both places, but it looks like that's necessary. It might work to put this into a shared method or something, though, like the existing check_for_gpu or int_to_device.
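
A sketch of what such a shared helper could look like (hypothetical, just to make the suggestion concrete):

from typing import Optional, Union
import torch

def resolve_cuda_device(cuda_device: Optional[Union[int, torch.device]]) -> torch.device:
    # None means "pick for me": the first GPU if one is available, else CPU.
    if cuda_device is None:
        return torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
    if isinstance(cuda_device, int):
        return torch.device("cpu") if cuda_device < 0 else torch.device(f"cuda:{cuda_device}")
    return cuda_device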

dirkgr

comment created time in 9 days

issue closed allenai/allennlp

ElmoEmbedder seems missing from the new version

ElmoEmbedder seems to be missing from the new version; how can I get embeddings from ELMo? (I already have the hdf5 weights.)

closed time in 9 days

pinedbean

issue comment allenai/allennlp

ElmoEmbedder seem missing from new version

See here: https://github.com/allenai/allennlp/issues/4384. If you have a more specific question that's not addressed by that, or by our intro material in the guide, please open a new issue.

pinedbean

comment created time in 9 days

pull request comment allenai/allennlp-demo

Improving Event2Mind demo UI

This demo has been removed from the AllenNLP demos, but a better version is available here: https://mosaickg.apps.allenai.org/.

aaronsarnat

comment created time in 9 days

issue comment allenai/allennlp

Allennlp Predictor code giving an error.

What if you just run import spacy, import torch, and import numpy from a python interpreter? Do they work? It looks to me like they would fail, and this is a problem with your local environment. Can you check that?

mmb1793

comment created time in 9 days

issue comment allenai/allennlp

AllenNLP crf-tagger with BERT

If there is an issue with the BERT SRL model, it's isolated to the issue in #4392. The issue here is very different. We have plenty of models that train correctly with the transformers library.

Sneriko

comment created time in 9 days

issue comment allenai/allennlp

request the code of ESIM+ELMo

ESIM has moved to the allennlp-models repo.

rzhangpku

comment created time in 9 days

issue comment allenai/allennlp

SRL predictor misses Auxiliary verb

This is due to a spacy version difference. See, e.g., https://github.com/allenai/allennlp/issues/4337, https://github.com/allenai/allennlp/issues/3418#issuecomment-585251913.

deanyan7

comment created time in 9 days

issue comment allenai/allennlp

Is there any guide on writing a script that uses a trained model to predict?

If you look at the documentation for Predictor, you'll see that there's a predictor_name argument to from_path. The predictor name for SentenceTaggerPredictor is "sentence_tagger".

But I'm a bit confused about your second line of code - that looks to me like it should return an instance of SentenceTaggerPredictor, not Predictor. Are you sure it's returning just a Predictor?
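
For reference, passing the predictor name explicitly looks like this (the archive path is a placeholder):

from allennlp.predictors import Predictor

predictor = Predictor.from_path(
    "/path/to/model.tar.gz",           # local archive or URL
    predictor_name="sentence_tagger",  # selects SentenceTaggerPredictor
)
result = predictor.predict(sentence="The dog ate the apple.")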

nine09

comment created time in 9 days

pull request comment allenai/allennlp

[FOR DISCUSSION] Automatically capture params on object construction

For vocab: Model.save() would have the same vocab saving logic that's currently spread throughout the trainer and the archival code. Model.load() would take an archive file that includes the vocabulary, as done in the archival code now.

For Dirk's question: yes, dill could work. It's inconsistent, though, and makes it harder to, e.g., support demos that were trained using a python script and saved with dill. Unless we change all of our archival logic to just use dill.

Hmm, a major problem, though, is that the model archive currently also has information about the dataset reader, which is required by the demo / predictor code. So you can't just call Model.save()...

matt-gardner

comment created time in 9 days

issue comment allenai/contrast-sets

Formatting discrepancies with IMDb data relative to originals

Thanks @Eric-Wallace!

allenschmaltz

comment created time in 9 days

pull request comment allenai/allennlp

[FOR DISCUSSION] Automatically capture params on object construction

For us to do that, we have to be able to save objects for which we don't have implementation code ...

I'm not sure I follow this, can you give me an example where we wouldn't have implementation code? Is this why we can't just use pickle?

The reason this works in huggingface's code is because everything relies on an opaque config object. We wanted to avoid this, using typed python objects directly in constructors, instead of some untyped config. The tricky thing is if you don't have a blanket config object, you need some other way of saying what state needs to be saved.

You're right that the way I said this the first time was wrong.

matt-gardner

comment created time in 9 days

PR opened allenai/allennlp

[FOR DISCUSSION] Automatically capture params on object construction

As I was writing parts of the guide, one thing that bothered me was that we don't have a good story around model saving and loading when you're not using config files. Also, I saw examples of huggingface code where you could just call .save_pretrained() on your in-python object and have things just work. For us to do that, we have to be able to save objects for which we don't have implementation code, but I think we can still do it. I got to thinking about this problem this morning, and was playing around with something that actually works, pretty simply.

Basic idea: put a single metaclass on FromParams so that all FromParams objects (including subclasses) get a _params field storing the config they could have been created from. Then the model can implement a simple .save() method that dumps _params as json and calls the archive code. We could probably even fold the archival code entirely into methods on Model, since this change would make that much simpler.
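
A minimal sketch of that capture mechanism (the metaclass name and details here are illustrative, not necessarily what the prototype does):

```python
import inspect
import json


# Intercept construction so every FromParams object records the arguments
# it was built with, before handing the instance back to the caller.
class _CaptureParams(type):
    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        # Bind the arguments to parameter names so the captured dict looks
        # like what a config file would have contained.
        bound = inspect.signature(cls.__init__).bind(instance, *args, **kwargs)
        bound.apply_defaults()
        params = dict(bound.arguments)
        params.pop("self", None)
        instance._params = params
        return instance


class FromParams(metaclass=_CaptureParams):
    pass


class Embedder(FromParams):
    def __init__(self, dim: int = 50, trainable: bool = True) -> None:
        self.dim = dim
        self.trainable = trainable


embedder = Embedder(dim=300)
print(json.dumps(embedder._params))  # {"dim": 300, "trainable": true}
```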

I implemented a quick prototype that works for simple classes. I put it up as a pull request instead of an issue so that the code can be commented on more easily. There are three major hurdles I see to making this actually work, which may or may not be surmountable:

  1. We allow subclasses to take **kwargs, and inspect the superclass to grab those arguments. This one should be pretty straightforward: we can directly reuse the existing logic that computes the parameter list, and that might be enough by itself (a rough sketch of the signature merging appears after this list).

  2. When we use non-standard constructors, like from_partial_objects or what we do in Vocabulary, we have to know a priori what things are registered that way and override them accordingly. This might be possible if we have the metaclass inspect the Registry somehow.

  3. We need to know which parameters shouldn't be saved in the config file - things like the Vocabulary for the Model object (because it's actually specified higher up in the config). We don't have a way of detecting this programmatically, and I'm not sure how to do it. The goal here is just saving models, so we could feasibly special-case a thing or two (we'd want to serialize the vocabulary separately anyway), but that's not ideal. This one needs some more thought.
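
For hurdle 1, the signature merging might look roughly like this (a hypothetical helper, separate from the existing from_params machinery):

```python
import inspect


# Hypothetical helper: when a subclass takes **kwargs, walk the MRO and
# merge the constructor signatures so captured arguments get real parameter
# names instead of landing in an opaque "kwargs" bucket.
def constructor_parameters(cls) -> dict:
    merged = {}
    for klass in reversed(cls.__mro__):
        if "__init__" not in klass.__dict__:
            continue
        for name, param in inspect.signature(klass.__init__).parameters.items():
            if name == "self" or param.kind in (
                inspect.Parameter.VAR_POSITIONAL,
                inspect.Parameter.VAR_KEYWORD,
            ):
                continue
            merged[name] = param
    return merged


class Base:
    def __init__(self, hidden_size: int = 10) -> None:
        self.hidden_size = hidden_size


class Child(Base):
    def __init__(self, dropout: float = 0.1, **kwargs) -> None:
        super().__init__(**kwargs)
        self.dropout = dropout


print(sorted(constructor_parameters(Child)))  # ['dropout', 'hidden_size']
```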

A possible side-benefit of this is being able to pickle and recover some objects for use in multi-processing using just the saved _params.
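
Continuing the Embedder sketch above, that recovery could be as simple as shipping the captured dict instead of the object itself:

```python
import pickle

# The captured params are a plain, picklable dict, even when the object
# itself might not pickle cleanly.
payload = pickle.dumps(Embedder(dim=300)._params)
rebuilt = Embedder(**pickle.loads(payload))
assert rebuilt._params == {"dim": 300, "trainable": True}
```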

@epwalsh, @dirkgr, what do you think?

+76 -0

0 comment

1 changed file

pr created time in 9 days

push eventmatt-gardner/allennlp

Matt Gardner

commit sha 5d460bfefc0286a1570a04ddbf11c7e8493693c5

Made a proof of concept

view details

push time in 9 days

push eventmatt-gardner/allennlp

Matt Gardner

commit sha 2f41cc8a58c545bff76b71f1596eecce8ca94ed1

Made a proof of concept

view details

push time in 9 days

create barnchmatt-gardner/allennlp

branch : auto_from_params

created branch time in 9 days

push eventmatt-gardner/allennlp

epwalsh

commit sha 4f70bc93d286a8bb6d563c17397573efe9257474

tick version for nightly releases

view details

Matt Gardner

commit sha 3e8a9ef606ed258295237d54f7cdae4a22382542

Add link to new template repo for config file development (#4372)

view details

Evan Pete Walsh

commit sha e52b7518c2a957cf34812855db42658566be8484

ensure transformer params are frozen at initialization when train_parameters is false (#4377) * ensure transformer params are frozen at initialization * update CHANGELOG * removed un-needed member var

view details

dependabot-preview[bot]

commit sha ebde6e8567b6e6f43aefefb8fbe8adc254e2a897

Bump overrides from 3.0.0 to 3.1.0 (#4375)

view details

Matt Gardner

commit sha bf422d5648a6b9db7332252d786d52be5f2d11dc

Add github template for using your own python run script (#4380)

view details

Evan Pete Walsh

commit sha 6852deff7a7e59f11af3f35fd0d71de34e74b98a

pin some doc building requirements (#4386) * pin some doc building requirements * tweak

view details

Matt Gardner

commit sha c2ecb7a282eb8462d0557bbfed913b1dbe213a70

Add a method to ModelTestCase for use without config files (#4381) * Add a method to ModelTestCase for use without config files * Updated changelog * add missing parameter to docstring

view details

Michael Schmitz

commit sha 85e531c240ec8df0979e01d2dc38c27b33550094

Update README.md (#4385) Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>

view details

dependabot-preview[bot]

commit sha ba79f146874b35094d8380c6b1a8fcf5c4c2d4c3

Bump mypy from 0.780 to 0.781 (#4390)

view details

Crissman Loomis

commit sha 20afe6ce8a4457a4822aaf92e9925dd2c3f8dd29

Add Optuna integrated badge to README.md (#4361)

view details

Evan Pete Walsh

commit sha ffc51843ff76931f990f17877fad820281c898be

ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371) * ensure Vocab.from_files can handle archives * handle archive with ShardedDatasetReader * through helpful ConfigurationError * update docstring

view details

dependabot-preview[bot]

commit sha 1d07cc75ade2eedda2dd681037aec9e133ba6755

Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389)

view details

epwalsh

commit sha b0ba2d4c76eb3c7081fa0a9eba52d864182554be

update version

view details

dependabot-preview[bot]

commit sha 30e5dbfc2f806eb607600bf024b4f3118e46562a

Bump mypy from 0.781 to 0.782 (#4395) Bumps [mypy](https://github.com/python/mypy) from 0.781 to 0.782. - [Release notes](https://github.com/python/mypy/releases) - [Commits](https://github.com/python/mypy/compare/v0.781...v0.782) Signed-off-by: dependabot-preview[bot] <support@dependabot.com> Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>

view details

Evan Pete Walsh

commit sha b6fd6978b507ce6118023e23f3e4dbfa334d39b5

fix sharded dataset reader (#4396) * fix sharded dataset reader * update CHANGELOG * fix

view details

Evan Pete Walsh

commit sha e104e4419157a7d41e91421db6fa8ce304082feb

Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394)

view details

Evan Pete Walsh

commit sha 7fa7531c8c3e9cb4fd0f2807fbd22e70dab7600e

fix __eq__ method of ArrayField (#4401) * fix __eq__ method of ArrayField * update CHANGELOG

view details

dependabot-preview[bot]

commit sha aa2943e50d8ee81ea86a907c72f5d56a83ddc2fc

Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398)

view details

Pengcheng YIN

commit sha eee15ca80ade6ec414162aa945aba61446698356

Assign an empty mapping array to empty fields of `NamespaceSwappingField` (#4403)

view details

Dirk Groeneveld

commit sha 96ff58514d84aeaf7e46f66756c202c713e1101c

Changes from my multiple-choice work (#4368) * Ability to ignore dimensions in the bert pooler * File reading utilities * Productivity through formatting * More reasonable defaults for the Huggingface AdamW optimizer * Changelog * Adds a test for the BertPooler * We can't run the new transformers lib yet * Pin more recent transformer version * Update CHANGELOG.md Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * Adds ability to override transformer weights * Adds a transformer cache, and the ability to override weights * Fix up this PR * Fix comment Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com>

view details

push time in 9 days

push eventallenai/allennlp-semparse

Ted Goddard

commit sha 28b7cf5aaf957e7311b501fe336c30330d398334

Update to support Allen NLP 1.0 (#25) * Update requirements.txt * Rename decode to make_output_human_readable * Rename decode to make_output_human_readable * Rename decode to make_output_human_readable * Rename decode to make_output_human_readable * remove conversion to float * fix tests * fix configs * flake * more flake... * change test names * fix more tests Co-authored-by: Matt Gardner <mattg@allenai.org>

view details

push time in 11 days

PR merged allenai/allennlp-semparse

Update to support Allen NLP 1.0

This branch turned out to contain the main functional changes from https://github.com/allenai/allennlp-semparse/pull/23

With these changes, it is possible to run a wikitables prediction using AllenNLP 1.0.

+218 -249

3 comments

47 changed files

tedgoddard

pr closed time in 11 days
