Benjamin Woodruff bgw @facebook Redwood City, CA https://benjam.info/ Currently working at Instagram.

bgw/NotVeryCleverBot 28

A Reddit bot that tries to automatically reply to people's comments.

bgw/pan-am 22

Simple CSS for Pandoc

bgw/bdw-ucode-update-tool 20

Intel i5-5675C, i7-5775C, and i7-5700HQ microcode updates extracted from MSI's UEFI updates, along with a tiny zero-dependency install script for Linux users.

bgw/ansible-playbooks 17

Custom Ansible playbooks for personal use

bgw/ansible-honeybadger 14

Because Honey Badger don't give a sh*t!

bgw/dotfiles 10

Custom "dotfiles" for personal use

bgw/Century 8

A Library to Help with Automating and Pulling Data from the University of Florida's (UF) Website

bgw/dash-button 7

Amazon Dash Button hacking tools for OpenWRT/dnsmasq

bgw/hyper-native-window-decoration 5

Native window decorations in HyperTerm

bgw/campagnol-debian 3

Debian packaging for Campagnol

Pull request review comment Instagram/LibCST

Type inference metadata

+from typing import Optional
+
+import libcst as cst
+from libcst import BatchableMetadataProvider
+from libcst.metadata import PositionProvider
+
+
+class TypeInferenceProvider(BatchableMetadataProvider[str]):
+    METADATA_DEPENDENCIES = (PositionProvider,)
+    is_cache_required = True
+
+    def __init__(self, cache: object):
+        super().__init__(cache)
+        lookup = {}
+        for item in cache:
+            location = item["location"]
+            start = location["start"]
+            end = location["stop"]
+            lookup[(start["line"], start["column"], end["line"], end["column"])] = item[
+                "annotation"
+            ]
+        self.lookup = lookup
+
+    def _parse_metadata(self, node: cst.CSTNode) -> None:
+        range = self.get_metadata(PositionProvider, node)
+        key = (range.start.line, range.start.column, range.end.line, range.end.column)
+        if key in self.lookup:
+            self.set_metadata(node, self.lookup[key])
+            self.lookup.pop(key)

This is a weird use of pop. If you don't care about the return value, the more idiomatic pattern for dicts seems to be

del self.lookup[key]

(discarding the return value of pop on a list is common, but I think that's just because there's not as good of an alternative)
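A minimal sketch of the difference, using a plain dict with made-up keys. The names `lookup` and `key` only mirror the diff above; nothing here is LibCST code.

```python
# Hypothetical stand-in for the provider's lookup table.
lookup = {(1, 0, 1, 5): "int", (2, 0, 2, 3): "str"}
key = (1, 0, 1, 5)

# pop() removes the entry and returns its value, so it fits when the value is used.
annotation = lookup.pop(key)

# del removes the entry without producing a value, which signals intent
# more clearly when the value is not needed.
other_key = (2, 0, 2, 3)
if other_key in lookup:
    del lookup[other_key]

# Snapshot of what is left after both removals.
remaining = dict(lookup)
```

Both forms raise KeyError for a missing key unless pop is given a default, so the membership check is needed either way.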

jimmylai

comment created time in 4 months

Pull request review comment Instagram/LibCST

Type inference metadata

+from textwrap import dedent
+
+import libcst as cst
+from libcst import MetadataWrapper
+from libcst.metadata.type_inference_provider import TypeInferenceProvider
+from libcst.testing.utils import UnitTest
+
+
+class TypeInferenceProviderTest(UnitTest):
+    def test_basic_class_types(self) -> None:
+        wrapper = MetadataWrapper(
+            cst.parse_module(
+                dedent(
+                    """\
+                    from typing import Sequence
+
+
+                    class Item:
+                        def __init__(self, n: int):
+                            self.number = n
+
+
+                    class ItemCollector:
+                        def get_items(self, n: int) -> Sequence[Item]:
+                            return [Item() for i in range(n)]
+
+
+                    collector = ItemCollector()
+                    items = collector.get_items()
+                    for item in items:
+                        item.number
+                    """
+                )
+            ),
+            # pyre-fixme[6]: Expected `Mapping[Type[BaseMetadataProvider[object]],
+            #  object]` for 2nd param but got `Dict[Type[TypeInferenceProvider],
+            #  List[Dict[str, Union[Dict[str, Dict[str, int]], str]]]]`.
+            cache={
+                TypeInferenceProvider: [
+                    {
+                        "location": {
+                            "start": {"line": 6, "column": 8},
+                            "stop": {"line": 6, "column": 19},
+                        },
+                        "annotation": "int",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 6, "column": 8},
+                            "stop": {"line": 6, "column": 12},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.Item",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 15, "column": 8},
+                            "stop": {"line": 15, "column": 17},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.ItemCollector",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 10, "column": 27},
+                            "stop": {"line": 10, "column": 30},
+                        },
+                        "annotation": "typing.Type[int]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 10, "column": 35},
+                            "stop": {"line": 10, "column": 49},
+                        },
+                        "annotation": "typing.Type[typing.Sequence[libcst.metadata.example_type_infer.Item]]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 11, "column": 15},
+                            "stop": {"line": 11, "column": 41},
+                        },
+                        "annotation": "typing.List[libcst.metadata.example_type_infer.Item]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 11, "column": 38},
+                            "stop": {"line": 11, "column": 39},
+                        },
+                        "annotation": "int",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 10, "column": 44},
+                            "stop": {"line": 10, "column": 48},
+                        },
+                        "annotation": "typing.Type[libcst.metadata.example_type_infer.Item]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 10, "column": 18},
+                            "stop": {"line": 10, "column": 22},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.ItemCollector",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 15, "column": 0},
+                            "stop": {"line": 15, "column": 5},
+                        },
+                        "annotation": "typing.Sequence[libcst.metadata.example_type_infer.Item]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 11, "column": 16},
+                            "stop": {"line": 11, "column": 20},
+                        },
+                        "annotation": "typing.Type[libcst.metadata.example_type_infer.Item]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 14, "column": 12},
+                            "stop": {"line": 14, "column": 27},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.ItemCollector",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 14, "column": 12},
+                            "stop": {"line": 14, "column": 25},
+                        },
+                        "annotation": "typing.Type[libcst.metadata.example_type_infer.ItemCollector]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 5, "column": 23},
+                            "stop": {"line": 5, "column": 24},
+                        },
+                        "annotation": "int",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 5, "column": 26},
+                            "stop": {"line": 5, "column": 29},
+                        },
+                        "annotation": "typing.Type[int]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 11, "column": 16},
+                            "stop": {"line": 11, "column": 22},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.Item",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 15, "column": 8},
+                            "stop": {"line": 15, "column": 29},
+                        },
+                        "annotation": "typing.Sequence[libcst.metadata.example_type_infer.Item]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 11, "column": 32},
+                            "stop": {"line": 11, "column": 37},
+                        },
+                        "annotation": "typing.Type[range]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 10, "column": 35},
+                            "stop": {"line": 10, "column": 43},
+                        },
+                        "annotation": "typing.Callable(typing.GenericMeta.__getitem__)[[typing.Type[Variable[typing._T_co](covariant)]], typing.Type[typing.Sequence[Variable[typing._T_co](covariant)]]]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 5, "column": 17},
+                            "stop": {"line": 5, "column": 21},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.Item",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 16, "column": 4},
+                            "stop": {"line": 16, "column": 8},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.Item",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 11, "column": 32},
+                            "stop": {"line": 11, "column": 40},
+                        },
+                        "annotation": "range",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 15, "column": 8},
+                            "stop": {"line": 15, "column": 27},
+                        },
+                        "annotation": "typing.Callable(libcst.metadata.example_type_infer.ItemCollector.get_items)[[Named(n, int)], typing.Sequence[libcst.metadata.example_type_infer.Item]]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 10, "column": 24},
+                            "stop": {"line": 10, "column": 25},
+                        },
+                        "annotation": "int",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 6, "column": 22},
+                            "stop": {"line": 6, "column": 23},
+                        },
+                        "annotation": "int",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 16, "column": 12},
+                            "stop": {"line": 16, "column": 17},
+                        },
+                        "annotation": "typing.Sequence[libcst.metadata.example_type_infer.Item]",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 17, "column": 4},
+                            "stop": {"line": 17, "column": 8},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.Item",
+                    },
+                    {
+                        "location": {
+                            "start": {"line": 14, "column": 0},
+                            "stop": {"line": 14, "column": 9},
+                        },
+                        "annotation": "libcst.metadata.example_type_infer.ItemCollector",
+                    },
+                ]
+            },
+        )
+        types = wrapper.resolve(TypeInferenceProvider)
+        m = wrapper.module
+        self_number_attr = cst.ensure_type(
+            cst.ensure_type(
+                cst.ensure_type(
+                    cst.ensure_type(
+                        cst.ensure_type(m.body[1].body, cst.IndentedBlock).body[0],
+                        cst.FunctionDef,
+                    ).body.body[0],
+                    cst.SimpleStatementLine,
+                ).body[0],
+                cst.Assign,
+            )
+            .targets[0]
+            .target,
+            cst.Attribute,
+        )
+        self.assertEqual(types[self_number_attr], "int")

This is a rather large example, but then you only inspect two types on it.

I'd either test more things, or strip down the size of the example.

jimmylai

comment created time in 4 months

Pull request review comment Instagram/LibCST

Type inference metadata

+from typing import Optional
+
+import libcst as cst
+from libcst import BatchableMetadataProvider
+from libcst.metadata import PositionProvider
+
+
+class TypeInferenceProvider(BatchableMetadataProvider[str]):
+    METADATA_DEPENDENCIES = (PositionProvider,)
+    is_cache_required = True
+
+    def __init__(self, cache: object):
+        super().__init__(cache)
+        lookup = {}
+        for item in cache:
+            location = item["location"]
+            start = location["start"]
+            end = location["stop"]
+            lookup[(start["line"], start["column"], end["line"], end["column"])] = item[

It'd be interesting to run this over IGSRV and log if we ever get duplicate ranges. That was one concern I had with this approach. We probably won't, but if it did happen, we might want to start considering node types in addition to positional ranges.
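One way such a check could look, as a rough sketch. It assumes the cache entries have the same `location`/`start`/`stop` shape as in the diff above; `find_duplicate_ranges` and the sample data are hypothetical and not part of LibCST.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (start line, start column, end line, end column)
RangeKey = Tuple[int, int, int, int]


def find_duplicate_ranges(cache: List[dict]) -> Dict[RangeKey, int]:
    """Return every positional range that appears more than once, with its count."""
    counts = Counter(
        (
            item["location"]["start"]["line"],
            item["location"]["start"]["column"],
            item["location"]["stop"]["line"],
            item["location"]["stop"]["column"],
        )
        for item in cache
    )
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample: two entries share the range (1, 0)-(1, 3).
sample = [
    {"location": {"start": {"line": 1, "column": 0}, "stop": {"line": 1, "column": 3}}, "annotation": "int"},
    {"location": {"start": {"line": 1, "column": 0}, "stop": {"line": 1, "column": 3}}, "annotation": "str"},
    {"location": {"start": {"line": 2, "column": 4}, "stop": {"line": 2, "column": 8}}, "annotation": "int"},
]
duplicates = find_duplicate_ranges(sample)
```

Logging the returned mapping per file would show whether the dict built in `__init__` ever silently overwrites an annotation.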

jimmylai

comment created time in 4 months

Pull request review comment Instagram/LibCST

[metadata] add cache field to metadata wrapper

 class BaseMetadataProvider(MetadataDependent, Generic[_ProvidedMetadataT]):
     # explanation.
     _computed: MutableMapping["CSTNode", _ProvidedMetadataT]
 
-    def __init__(self) -> None:
+    is_cache_required: bool = False
+
+    def __init__(self, cache: object = None) -> None:

It's probably good enough for an MVP, but I'm not particularly happy about cache being typed as object. Is there a way we could make this a typevar? Could we infer is_cache_required from the generic type parameter?
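A rough sketch of what a type-variable-based cache parameter might look like. All class names here are hypothetical and this is not how LibCST defines its providers; it also keeps `is_cache_required` as an explicit flag rather than inferring it from the type parameter.

```python
from typing import Dict, Generic, Optional, TypeVar

_CacheT = TypeVar("_CacheT")


class CachedProviderSketch(Generic[_CacheT]):
    """Hypothetical provider base whose cache type is a type parameter."""

    is_cache_required: bool = False

    def __init__(self, cache: Optional[_CacheT] = None) -> None:
        # Fail early if a subclass declares that it cannot work without a cache.
        if self.is_cache_required and cache is None:
            raise ValueError(f"{type(self).__name__} requires a cache")
        self.cache = cache


class TypeLookupSketch(CachedProviderSketch[Dict[str, str]]):
    """Hypothetical subclass: a type checker now sees the cache as Dict[str, str]."""

    is_cache_required = True


provider = TypeLookupSketch({"x": "int"})
```

With this shape a type checker can flag a mismatched cache at the call site instead of accepting any object.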

jimmylai

comment created time in 4 months

issue comment Instagram/LibCST

3.8 Support

Hey @dseevr, that sucks. Unfortunately, Thanksgiving is approaching, and a lot of people are on vacation, so we might not be able to get back to you until next week.

cooperlees

comment created time in 4 months

push event bgw/LibCST

Benjamin Woodruff

commit sha 643d097d4fdc07bed18a7ac433b462a3e55e22ad

Add an unsafe_skip_copy option to MetadataWrapper In certain cases (e.g. inside Instagram's lint framework) we know that our tree originates from the parser, so we know that there shouldn't be any duplicate nodes in our tree. MetadataWrapper exists to copy the tree, ensuring that there are no duplicate nodes. This diff provides an escape hatch on MetadataWrapper that allows us to save a little time and avoid a copy when we know that it's safe to skip the copy. As part of this, I ran into some issues with `InitVar` and pyre, so I removed `@dataclass` from the class. This means that this is technically a breaking change if someone depended on the MetadataWrapper being an actual dataclass, but I think this is unlikely. I implemented `__repr__` and added tests for hashing/equality behavior.

push time in 5 months

PR opened Instagram/LibCST

Add an unsafe_skip_copy option to MetadataWrapper

Summary

In certain cases (e.g. inside Instagram's lint framework) we know that our tree originates from the parser, so we know that there shouldn't be any duplicate nodes in our tree.

MetadataWrapper exists to copy the tree, ensuring that there are no duplicate nodes.

This diff provides an escape hatch on MetadataWrapper that allows us to save a little time and avoid a copy when we know that it's safe to skip the copy.

As part of this, I ran into some issues with InitVar and pyre, so I removed @dataclass from the class. This means that this is technically a breaking change if someone depended on the MetadataWrapper being an actual dataclass, but I think this is unlikely. I implemented __repr__ and added tests for hashing/equality behavior.
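The copy-versus-skip trade-off can be illustrated with a toy wrapper. This is only a sketch: `WrapperSketch` is hypothetical, it uses `copy.deepcopy` on a plain dict, and it does not model how the real MetadataWrapper removes duplicate nodes.

```python
import copy
from typing import Any


class WrapperSketch:
    """Hypothetical wrapper that copies its tree unless told the copy is unnecessary."""

    def __init__(self, tree: Any, unsafe_skip_copy: bool = False) -> None:
        # By default the wrapper takes its own copy of the tree.
        # With unsafe_skip_copy=True it trusts the caller and reuses the object.
        self.tree = tree if unsafe_skip_copy else copy.deepcopy(tree)


tree = {"body": [{"name": "a"}, {"name": "b"}]}
copied = WrapperSketch(tree)
shared = WrapperSketch(tree, unsafe_skip_copy=True)
```

Here `copied.tree` is an equal but distinct object, while `shared.tree` is the caller's tree itself, which is where the saved time comes from.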

Test Plan

Unit tests!

Screenshot from 2019-10-29 13-16-24

+90 -11

0 comment

3 changed files

pr created time in 5 months

create branch bgw/LibCST

branch : unsafe-skip-copy

created branch time in 5 months

issue comment Instagram/LibCST

3.8 Support

Not sure why mentioning this in a comment auto-closed on merge

:laughing:

Screenshot from 2019-10-28 17-20-48

Certain keywords, like "closes" and "fixes", can be used to auto-close issues; however, GitHub isn't context-sensitive about where they appear. https://help.github.com/en/github/managing-your-work-on-github/closing-issues-using-keywords

cooperlees

comment created time in 5 months

pull request comment Instagram/LibCST

Add a reentrant (incremental) codegen API

Sorry for all the force pushes, I realized that I was running an outdated version of black, causing my local machine to disagree with CircleCI, and had to wipe my .tox to fix it.

bgw

comment created time in 5 months

push event bgw/LibCST

Benjamin Woodruff

commit sha 0891a21ad737dc03c4d2afac7d1acb1a6b873dd1

Add a reentrant (incremental) codegen API **Context:** This is an experimental performance optimization that we're hoping to use for our internal linter at Instagram. I added some documentation, but it's unsupported, and isn't very user-friendly. This adds `ExperimentalReentrantCodegenProvider`, which tracks the codegen's internal state (indentation level, character offsets, encoding, etc.) and for each statement, it stores a `CodegenPartial` object. The `CodegenPartial` object has enough information about the previous codegen pass to run the codegen on part of a tree and patch the result back into the original module's string. In cases where we need to generate a bunch of small independent patches for the same file (and we can't just generate a new tree with each patch applied), this *should* be a faster alternative. I don't have any performance numbers because I still need to test this end-to-end with our internal codebase, but I'd be shocked if it was slower than what we're doing. This could theoretically live outside of LibCST, but it depends on a whole bunch of LibCST internals, so there's some value in making sure that this is in sync with the rest of LibCST.

Benjamin Woodruff

commit sha 16f4e7779e07bc9ff93252952097f7d16de73a61

Rename CodegenPartial.get_*_node_code to get_*_statement_code This should address @DragonMinded's review https://github.com/Instagram/LibCST/pull/132#pullrequestreview-308173941

push time in 5 months

push event bgw/LibCST

Benjamin Woodruff

commit sha 4d53aa5ee9b4630691e9942cb94c3972c1d2fb58

Add a reentrant (incremental) codegen API **Context:** This is an experimental performance optimization that we're hoping to use for our internal linter at Instagram. I added some documentation, but it's unsupported, and isn't very user-friendly. This adds `ExperimentalReentrantCodegenProvider`, which tracks the codegen's internal state (indentation level, character offsets, encoding, etc.) and for each statement, it stores a `CodegenPartial` object. The `CodegenPartial` object has enough information about the previous codegen pass to run the codegen on part of a tree and patch the result back into the original module's string. In cases where we need to generate a bunch of small independent patches for the same file (and we can't just generate a new tree with each patch applied), this *should* be a faster alternative. I don't have any performance numbers because I still need to test this end-to-end with our internal codebase, but I'd be shocked if it was slower than what we're doing. This could theoretically live outside of LibCST, but it depends on a whole bunch of LibCST internals, so there's some value in making sure that this is in sync with the rest of LibCST.

Benjamin Woodruff

commit sha 9ad2c481ffa20a910b346b5d13519e41daaa48b6

Rename CodegenPartial.get_*_node_code to get_*_statement_code This should address @DragonMinded's review https://github.com/Instagram/LibCST/pull/132#pullrequestreview-308173941

push time in 5 months

PR opened Instagram/LibCST

Add a reentrant (incremental) codegen API

Summary

Context: This is an experimental performance optimization that we're hoping to use for our internal linter at Instagram. I added some documentation, but it's unsupported, and isn't very user-friendly.

This adds ExperimentalReentrantCodegenProvider, which tracks the codegen's internal state (indentation level, character offsets, encoding, etc.) and for each statement, it stores a CodegenPartial object.

The CodegenPartial object has enough information about the previous codegen pass to run the codegen on part of a tree and patch the result back into the original module's string.

In cases where we need to generate a bunch of small independent patches for the same file (and we can't just generate a new tree with each patch applied), this should be a faster alternative.

I don't have any performance numbers because I still need to test this end-to-end with our internal codebase, but I'd be shocked if it was slower than what we're doing.

This could theoretically live outside of LibCST, but it depends on a whole bunch of LibCST internals, so there's some value in making sure that this is in sync with the rest of LibCST.
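The "patch the result back into the original module's string" idea can be sketched with plain string offsets. `PartialSketch` and its methods are hypothetical illustrations, not the `CodegenPartial` API, and the offsets are hard-coded here rather than recorded during codegen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialSketch:
    """Hypothetical record of where one statement landed in the generated code."""

    start_offset: int  # character offset where the statement's code starts
    end_offset: int  # character offset just past the statement's code

    def original_statement(self, module_code: str) -> str:
        # Slice the statement's code out of the previously generated module.
        return module_code[self.start_offset : self.end_offset]

    def patched_module(self, module_code: str, new_statement: str) -> str:
        # Splice the regenerated statement into the untouched surrounding text.
        return (
            module_code[: self.start_offset]
            + new_statement
            + module_code[self.end_offset :]
        )


module_code = "x = 1\ny = 2\nz = 3\n"
partial = PartialSketch(start_offset=6, end_offset=12)  # covers "y = 2\n"
patched = partial.patched_module(module_code, "y = 20\n")
```

Each patch only re-renders one statement and reuses the rest of the string, which is why many small independent patches could be cheaper than regenerating the whole module each time.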

Test Plan

  • Unit tests
  • Pyre

Screenshot_2019-10-28 Experimental APIs — LibCST documentation

+365 -9

0 comment

9 changed files

pr created time in 5 months

create branch bgw/LibCST

branch : reentrant-codegen

created branch time in 5 months

push event bgw/LibCST

Benjamin Woodruff

commit sha 105e4398416f8e98d36ed0dbca436f3148146621

Rename before_visit/after_leave to before_codegen/after_codegen This addresses @DragonMinded's review here: https://github.com/Instagram/LibCST/pull/126#pullrequestreview-305577895

push time in 5 months

PR opened Instagram/LibCST

Make the codegen enter/leave tracking more generic

Summary

I need to do some additional work on visit/leave to make codegen re-entrant, so this makes it more generic.

This should have the additional small benefit of creating fewer throwaway objects when we're doing codegen without position calculation.

Test Plan

pyre/unit tests/lint

+23 -17

0 comment

4 changed files

pr created time in 5 months

create branch bgw/LibCST

branch : codegen-state-tracking

created branch time in 5 months

PR opened Instagram/LibCST

Move CodegenState construction to PositionProvider

Summary

Previously, libcst.Module.code_for_node accepted a provider parameter, and would construct the appropriate CodegenState subclass based on some if/else logic.

This had a few knock-on effects:

  • A tighter circular dependency between node definitions and metadata, which was previously mitigated with an inner import.

  • Adding a new CodegenState subclass required the non-obvious task of modifying Module. I'll need to add a new CodegenState subclass to support incremental codegen.

  • What was intended to be a private implementation detail (how positions are computed by hooking into codegen) was exposed as a parameter on a public method.

This diff aims to clean up those knock-on effects. The position-related subclasses have been moved from libcst.nodes._internal into libcst.metadata.position_provider, which keeps more of the position computation logic together.

Technically this is a breaking change. If somebody was passing the second parameter into code_for_node, their code will break. However:

  • It will break in a clear and obvious way.

  • This second parameter was never documented (aside from my recent addition of some remarks telling people not to use it). There's plenty of documentation that shows how to fetch positions properly.

So it's my opinion that we shouldn't require a major version bump for this change.

Test Plan

  • pyre check
  • ran tests and lint
+250 -297

0 comment

8 changed files

pr created time in 5 months

create branch bgw/LibCST

branch : move-state

created branch time in 5 months

issue comment Instagram/LibCST

Handle deprecation in a standard and consistent way?

I've seen a lot of projects use their changelog (and by extension, the GitHub releases page) for declaring deprecations.

jimmylai

comment created time in 5 months

PR opened Instagram/LibCST

Update tutorial to use renamed PositionProvider

Summary

I missed this in #114 when renaming SyntacticPositionProvider to PositionProvider because it's a different file extension and I was only grepping for rst and py files.

Test Plan

tox -e docs
+3 -10

0 comment

1 changed file

pr created time in 5 months

create branch bgw/LibCST

branch : update-metadata-tutorial

created branch time in 5 months

PR opened Instagram/LibCST

Export CodePosition and CodeRange from metadata

Summary

While these classes are used by the codegen implementation, conceptually they're part of libcst.metadata, so we should export them from libcst.metadata instead of the top-level libcst package.

These classes are still exported from libcst for backwards compatibility, but we can remove them from libcst in the next major version bump.

I cleaned up all of the internal imports by hand with the help of ripgrep.

This work is roughly related to the position provider renames I did in #114.

Test Plan

Pyre, unit tests, lint.

+150 -97

0 comment

46 changed files

pr created time in 5 months

create branch bgw/LibCST

branch : code-position-move

created branch time in 5 months

pull request comment Instagram/LibCST

[metadata] make QualifiedNameProvider typing more specific

Seems like an improvement, LGTM.

jimmylai

comment created time in 6 months

push event bgw/LibCST

Benjamin Woodruff

commit sha ab37bdd5cd04cb679dd4b8db74075a05ef682fa2

Reorder position providers in documentation PositionProvider is the best choice in most cases, so it should come before WhitespaceInclusivePositionProvider in the documentation.

push time in 6 months

pull request comment Instagram/LibCST

Rename position provider classes

We talked about moving CodeRange to libcst.metadata along with this change. Do you plan to do that in another PR?

Yes.

bgw

comment created time in 6 months

PR opened Instagram/LibCST

Rename position provider classes

Summary

I discussed the high-level idea here with @DragonMinded a few months ago, but this isn't set in stone. If people have better ideas for names, I'd love to hear them.

Publicly-Visible Changes

  • SyntacticPositionProvider is deprecated. The new name is PositionProvider.
  • BasicPositionProvider is deprecated. The new name is WhitespaceInclusivePositionProvider.
  • Documentation is updated to better explain these renamed providers and how to use them.

The prefixes "Syntactic" and "Basic" were pretty bad because they're just concepts that we made up for LibCST.

The idea for the new names is that most users will want the behavior of the old SyntacticPositionProvider, so we should name things so that users naturally gravitate towards the correct choice.

There's some argument that we shouldn't even bother exposing WhitespaceInclusivePositionProvider, but we already need to implement it as a fallback for PositionProvider, and it might be useful for some niche use-cases.

Once we have another major version bump, we can remove the old class names. The old class names have already been removed from the documentation so that new users aren't tempted to use them.
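The PR text does not say how the old names are kept alive, so the following is only one common pattern, sketched with hypothetical class names: the old name becomes a thin subclass that warns when it is instantiated.

```python
import warnings


class PositionProviderSketch:
    """Hypothetical stand-in for the class under its new name."""


class SyntacticPositionProviderSketch(PositionProviderSketch):
    """Hypothetical old name, kept for backwards compatibility; warns when used."""

    def __init__(self) -> None:
        warnings.warn(
            "SyntacticPositionProviderSketch is deprecated; "
            "use PositionProviderSketch instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__()


# Record warnings so the deprecation is observable even if filters hide it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = SyntacticPositionProviderSketch()
```

Existing callers keep working because the old name is still a subclass of the new one, and the warning gives them time to migrate before the next major version removes it.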

Internal-Only Changes

  • PositionProvider is now _PositionProviderUnion. This type alias was never a public API (and probably never will be).
  • BasicCodegenState is now WhitespaceInclusivePositionProvidingCodegenState.
  • SyntacticCodegenState is now PositionProvidingCodegenState.

Test Plan

The existing tests pass. I also tried testing this in the python shell, since I didn't write any unit tests for backwards compatibility. (the end goal is to just delete it)

Backwards Compatibility

>>> print("\n".join(f"{type(k).__name__.ljust(20)}: {v}" for k, v in m.MetadataWrapper(cst.parse_module("((a).b)")).resolve(m.BasicPositionProvider).items()))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=1), end=CodePosition(line=1, column=1))
LeftParen           : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=1, column=1))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=2), end=CodePosition(line=1, column=2))
LeftParen           : CodeRange(start=CodePosition(line=1, column=1), end=CodePosition(line=1, column=2))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=3), end=CodePosition(line=1, column=3))
RightParen          : CodeRange(start=CodePosition(line=1, column=3), end=CodePosition(line=1, column=4))
Name                : CodeRange(start=CodePosition(line=1, column=1), end=CodePosition(line=1, column=4))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=4), end=CodePosition(line=1, column=4))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=5), end=CodePosition(line=1, column=5))
Dot                 : CodeRange(start=CodePosition(line=1, column=4), end=CodePosition(line=1, column=5))
Name                : CodeRange(start=CodePosition(line=1, column=5), end=CodePosition(line=1, column=6))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=6), end=CodePosition(line=1, column=6))
RightParen          : CodeRange(start=CodePosition(line=1, column=6), end=CodePosition(line=1, column=7))
Attribute           : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=1, column=7))
Expr                : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=1, column=7))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=7), end=CodePosition(line=1, column=7))
Newline             : CodeRange(start=CodePosition(line=1, column=7), end=CodePosition(line=2, column=0))
TrailingWhitespace  : CodeRange(start=CodePosition(line=1, column=7), end=CodePosition(line=2, column=0))
SimpleStatementLine : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=2, column=0))
Module              : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=2, column=0))
>>> print("\n".join(f"{type(k).__name__.ljust(20)}: {v}" for k, v in m.MetadataWrapper(cst.parse_module("((a).b)")).resolve(m.SyntacticPositionProvider).items()))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=1), end=CodePosition(line=1, column=1))
LeftParen           : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=1, column=1))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=2), end=CodePosition(line=1, column=2))
LeftParen           : CodeRange(start=CodePosition(line=1, column=1), end=CodePosition(line=1, column=2))
Name                : CodeRange(start=CodePosition(line=1, column=2), end=CodePosition(line=1, column=3))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=3), end=CodePosition(line=1, column=3))
RightParen          : CodeRange(start=CodePosition(line=1, column=3), end=CodePosition(line=1, column=4))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=4), end=CodePosition(line=1, column=4))
Dot                 : CodeRange(start=CodePosition(line=1, column=4), end=CodePosition(line=1, column=5))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=5), end=CodePosition(line=1, column=5))
Name                : CodeRange(start=CodePosition(line=1, column=5), end=CodePosition(line=1, column=6))
Attribute           : CodeRange(start=CodePosition(line=1, column=1), end=CodePosition(line=1, column=6))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=6), end=CodePosition(line=1, column=6))
RightParen          : CodeRange(start=CodePosition(line=1, column=6), end=CodePosition(line=1, column=7))
Expr                : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=1, column=7))
SimpleStatementLine : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=1, column=7))
SimpleWhitespace    : CodeRange(start=CodePosition(line=1, column=7), end=CodePosition(line=1, column=7))
Newline             : CodeRange(start=CodePosition(line=1, column=7), end=CodePosition(line=2, column=0))
TrailingWhitespace  : CodeRange(start=CodePosition(line=1, column=7), end=CodePosition(line=2, column=0))
Module              : CodeRange(start=CodePosition(line=1, column=0), end=CodePosition(line=2, column=0))

Updated Documentation

Screenshot from 2019-10-16 16-49-05


Screenshot from 2019-10-16 16-49-24


Screenshot from 2019-10-16 16-49-54

+119 -74

0 comment

9 changed files

pr created time in 6 months

create branch bgw/LibCST

branch : position-rename

created branch time in 6 months

PR opened Instagram/LibCST

Tweak wording/formatting of the scope provider intro

Summary

Explaining the implementation details of scopes to someone unfamiliar with compilers can be tricky. Hopefully this helps.

  • Rephrased the definition of a scope to be more applicable to Python (remove references to "blocks"), and made it use an example for (hopefully) better clarity.
  • New scopes are also created for comprehensions.
  • Set a fixed width (400px) for the scope diagram, since it was too large before.
  • Tweaked some tenses.
  • Add a final call to action: "LibCST allows you to inspect these scopes"
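
The comprehension bullet can be checked in plain Python, without LibCST. A minimal sketch (the variable names are illustrative only): the loop variable of a list comprehension lives in its own scope, so it does not overwrite a module-level name that happens to match.

```python
# Module-level (global scope) binding.
x = "outer"

# The comprehension introduces a new scope; its loop variable `x`
# shadows the global `x` only inside the comprehension.
squares = [x * x for x in range(3)]

print(squares)  # [0, 1, 4]
print(x)        # outer
```

The global `x` is untouched after the comprehension runs, which is the behavior a scope analysis has to model.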

Test Plan

Screenshot from 2019-10-15 16-51-10

+18 -9

0 comment

1 changed file

pr created time in 6 months

create branch bgw/LibCST

branch : scope-metadata-intro

created branch time in 6 months

push event bgw/LibCST

Benjamin Woodruff

commit sha 853414ddc8466e9d1b1ae218826d2aa3da19c35e

Make `BaseMetadataProvider[T]` covariant over `T` Because we'd consider `BaseMetadataProvider[int]` to be a subtype of `BaseMetadataProvider[object]`, it should be covariant over its typevar, rather than invariant. This isn't entirely correct because we have a mutable data structure (`_computed`) that depends on the typevar, and pyre points this out (though with a really confusing error message). However, it's not correct to say that `BaseMetadataProvider` is invariant either, so I think this is the lesser evil. I don't think it's practical to redesign this API to avoid the variance issue, so I'm ignoring the new type error that results from this change. I think this may resolve some of the issues we've seen internally with D17820032.

view details

push time in 6 months

PR opened Instagram/LibCST

Make `BaseMetadataProvider[T]` covariant over `T`

Summary

Because we'd consider `BaseMetadataProvider[int]` to be a subtype of `BaseMetadataProvider[object]`, it should be covariant over its typevar, rather than invariant.

This isn't entirely correct because we have a mutable data structure (`_computed`) that depends on the typevar, and pyre points this out (though with a really confusing error message). However, it's not correct to say that `BaseMetadataProvider` is invariant either, so I think this is the lesser evil.

I don't think it's practical to redesign this API to avoid the variance issue, so I'm ignoring the new type error that results from this change.

I think this may resolve some of the issues we've seen internally with D17820032.
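
For readers unfamiliar with variance, here is a minimal sketch of what a covariant type parameter permits. `Provider` and `describe` are hypothetical stand-ins, not LibCST's actual classes; the point is only that a type checker accepts `Provider[int]` where `Provider[object]` is expected once the typevar is declared covariant.

```python
from typing import Generic, TypeVar

# Declaring the typevar covariant tells type checkers that
# Provider[int] may be used wherever Provider[object] is expected.
T_co = TypeVar("T_co", covariant=True)


class Provider(Generic[T_co]):
    """Hypothetical read-only container, used only for illustration."""

    def __init__(self, value: T_co) -> None:
        self._value = value

    def get(self) -> T_co:
        return self._value


def describe(provider: Provider[object]) -> str:
    # Only reads from the provider, which is what makes covariance safe here.
    return f"provider of {provider.get()!r}"


int_provider: Provider[int] = Provider(42)
print(describe(int_provider))  # provider of 42
```

Variance is enforced by the type checker, not at runtime, so this code runs either way. Covariance is only fully sound when the typevar appears in read-only positions, which is why a mutable structure that depends on it triggers a complaint.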

Test Plan

pyre check
+21 -11

0 comment

2 changed files

pr created time in 6 months

push event bgw/LibCST

Benjamin Woodruff

commit sha 94d34fe8eb02f7a9ef0f4060e6b2f1ce414493e2

Make `BaseMetadataProvider[T]` covariant over `T` Because we'd consider `BaseMetadataProvider[int]` to be a subtype of `BaseMetadataProvider[object]`, it should be covariant over its typevar, rather than invariant. This isn't entirely correct because we have a mutable data structure (`_computed`) that depends on the typevar, and pyre points this out (though with a really confusing error message). However, it's not correct to say that `BaseMetadataProvider` is invariant either, so I think this is the lesser evil. I don't think it's practical to redesign this API to avoid the variance issue, so I'm ignoring the new type error that results from this change. I think this may resolve some of the issues we've seen internally with D17820032.

view details

push time in 6 months

push event bgw/LibCST

Benjamin Woodruff

commit sha 2497c3f5b0f5eddd4bb0f7bd75bcdaa5238fc1b9

Make `BaseMetadataProvider[T]` covariant over `T` Because we'd consider `BaseMetadataProvider[int]` to be a subtype of `BaseMetadataProvider[object]`, it should be covariant over its typevar, rather than invariant. This isn't entirely correct because we have a mutable data structure (`_computed`) that depends on the typevar, and pyre points this out (though with a really confusing error message). However, it's not correct to say that `BaseMetadataProvider` is invariant either, so I think this is the lesser evil. I don't think it's practical to redesign this API to avoid the variance issue, so I'm ignoring the new type error that results from this change. I think this may resolve some of the issues we've seen internally with D17820032.

view details

push time in 6 months

create branch bgw/LibCST

branch : covariant-metadata-provider

created branch time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "raw_mimetype": "text/restructuredtext"
+   },
+   "source": [
+    "==============\n",
+    "Scope Analysis\n",
+    "==============\n",
+    "Scope analysis keeps track of assignments and accesses which could be useful for code automatic refactoring. If you're not familiar with Scope analysis, see :doc:`Metadata <metadata>` for more detail about Scope metadata. This tutorial demonstrates some use cases of Scope analysis. \n",

If you're not familiar with Scope analysis, see :doc:`Metadata <metadata>` for more detail about Scope metadata.

  • Can you link to the "Scope Metadata" section of that page? Otherwise it's not clear what part of that page I should be looking at.
  • Scope isn't a proper noun, so unless it's a reference to a class name or the start of a sentence, it shouldn't be capitalized.
  • You might also want to link to the other metadata tutorial.
jimmylai

comment created time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "raw_mimetype": "text/restructuredtext"
+   },
+   "source": [
+    "==============\n",
+    "Scope Analysis\n",
+    "==============\n",
+    "Scope analysis keeps track of assignments and accesses which could be useful for code automatic refactoring. If you're not familiar with Scope analysis, see :doc:`Metadata <metadata>` for more detail about Scope metadata. This tutorial demonstrates some use cases of Scope analysis. \n",
+    "Given source codes, Scope analysis parses all variable :class:`~libcst.metadata.Assignment` (or a :class:`~libcst.metadata.BuiltinAssignment` if it's a builtin) and :class:`~libcst.metadata.Access` to store in :class:`~libcst.metadata.Scope` containers.\n",
+    "\n",
+    "Given the following example source code contains a couple of unused imports (``f``, ``i``, ``m`` and ``n``) and undefined variable references (``func_undefined`` and ``var_undefined``). Scope analysis helps us identifying those unused imports and undefined variables to automatically provide warnings to developers to prevent bugs while they're developing.\n",

I would probably move the bit of code that defines `source` up immediately after this paragraph.

jimmylai

comment created time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

 class Scope(abc.ABC):
     A scope has a parent scope which represents the inheritance relationship. That means
     an assignment in parent scope is viewable to the child scope and the child scope may
     overwrites the assignment by using the same name.
+
     Use ``name in scope`` to check whether a name is viewable in the scope.
     Use ``scope[name]`` to retrieve all viewable assignments in the scope.
+
+    .. warning::
+       This scope analysis module only analyzes local variable names and it doesn't handle
+       attribute names; for example, given a.b.c = 1, local variable name ``a`` is recorded
+       as an assignment instead of ``c`` or ``a.b.c``. To analyze the assignment/access of
+       arbitrary object attributes, we leave the the job to type inference metadata provider

we leave the the job

Two 'the's.

jimmylai

comment created time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

+{
+ "cells": [
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "raw_mimetype": "text/restructuredtext"
+   },
+   "source": [
+    "==============\n",
+    "Scope Analysis\n",
+    "==============\n",
+    "Scope analysis keeps track of assignments and accesses which could be useful for code automatic refactoring. If you're not familiar with Scope analysis, see :doc:`Metadata <metadata>` for more detail about Scope metadata. This tutorial demonstrates some use cases of Scope analysis. \n",
+    "Given source codes, Scope analysis parses all variable :class:`~libcst.metadata.Assignment` (or a :class:`~libcst.metadata.BuiltinAssignment` if it's a builtin) and :class:`~libcst.metadata.Access` to store in :class:`~libcst.metadata.Scope` containers.\n",
+    "\n",
+    "Given the following example source code contains a couple of unused imports (``f``, ``i``, ``m`` and ``n``) and undefined variable references (``func_undefined`` and ``var_undefined``). Scope analysis helps us identifying those unused imports and undefined variables to automatically provide warnings to developers to prevent bugs while they're developing.\n",
+    "With a parsed :class:`~libcst.Module`, we construct a :class:`~libcst.metadata.MetadataWrapper` object and it provides a :func:`~libcst.metadata.MetadataWrapper.resolve` function to resolve metadata given a metadata provider.\n",
+    ":class:`~libcst.metadata.ScopeProvider` is used here for analysing scope and there are three types of scopes (:class:`~libcst.metadata.GlobalScope`, :class:`~libcst.metadata.FunctionScope` and :class:`~libcst.metadata.ClassScope`) in this example.\n",
+    "\n",
+    ".. note::\n",
+    "   The scope analysis only handles local variable name access and cannot handle simple string type annotation forward references. See :class:`~libcst.metadata.Access`\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "nbsphinx": "hidden"
+   },
+   "outputs": [],
+   "source": [
+    "import sys\n",
+    "sys.path.append(\"../../\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import libcst as cst\n",
+    "\n",
+    "source = \"\"\"\\\n",
+    "import a, b, c as d, e as f  # expect to keep: a, c as d\n",
+    "from g import h, i, j as k, l as m  # expect to keep: h, j as k\n",
+    "from n import o  # expect to be removed entirely\n",
+    "\n",
+    "a()\n",
+    "\n",
+    "def fun():\n",
+    "    d()\n",
+    "\n",
+    "class Cls:\n",
+    "    att = h.something\n",
+    "    \n",
+    "    def __new__(self) -> \"Cls\":\n",
+    "        var = k.method()\n",
+    "        func_undefined(var_undefined)\n",
+    "\"\"\"\n",
+    "wrapper = cst.metadata.MetadataWrapper(cst.parse_module(source))\n",
+    "scopes = set(wrapper.resolve(cst.metadata.ScopeProvider).values())\n",
+    "for scope in scopes:\n",
+    "    print(scope)"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "raw_mimetype": "text/restructuredtext"
+   },
+   "source": [
+    "Warn on unused imports and undefined references\n",
+    "===============================================\n",
+    "To find all unused imports, we iterate through :attr:`~libcst.metadata.Scope.assignments` and an assignment is unused when its :attr:`~libcst.metadata.BaseAssignment.references` is empty. To find all undefined references, we iterate through :attr:`~libcst.metadata.Scope.accesses` (we focus on :class:`~libcst.Import`/:class:`~libcst.ImportFrom` assignments) and an access is undefined reference when its :attr:`~libcst.metadata.Access.referents` is empty. When reporting the warning to developer, we'll want to report the line number and column offset along with the suggestion to make it more clear. We can get position information from :class:`~libcst.metadata.SyntacticPositionProvider` and print the warnings as follows.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from collections import defaultdict\n",
+    "from typing import Dict, Union, Set\n",
+    "\n",
+    "unused_imports: Dict[Union[cst.Import, cst.ImportFrom], Set[str]] = defaultdict(set)\n",
+    "undefined_references: Dict[cst.CSTNode, Set[str]] = defaultdict(set)\n",
+    "ranges = wrapper.resolve(cst.metadata.SyntacticPositionProvider)\n",
+    "for scope in scopes:\n",
+    "    for assignment in scope.assignments:\n",
+    "        node = assignment.node\n",
+    "        if isinstance(assignment, cst.metadata.Assignment) and isinstance(\n",
+    "            node, (cst.Import, cst.ImportFrom)\n",
+    "        ):\n",
+    "            if len(assignment.references) == 0:\n",
+    "                unused_imports[node].add(assignment.name)\n",
+    "                location = ranges[node].start\n",
+    "                print(\n",
+    "                    f\"Warning on line {location.line:2d}, column {location.column:2d}: Imported name `{assignment.name}` is unused.\"\n",
+    "                )\n",
+    "\n",
+    "    for access in scope.accesses:\n",
+    "        if len(access.referents) == 0:\n",
+    "            node = access.node\n",
+    "            location = ranges[node].start\n",
+    "            print(\n",
+    "                f\"Warning on line {location.line:2d}, column {location.column:2d}: Name reference `{node.value}` is not defined.\"\n",
+    "            )\n"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "raw_mimetype": "text/restructuredtext"
+   },
+   "source": [
+    "Automatically Remove Unused Import\n",
+    "==================================\n",
+    "Unused import is a commmon code suggestion provided by lint tool like `flake8 F401 <https://lintlyci.github.io/Flake8Rules/rules/F401.html>`_ ``imported but unused``.\n",
+    "Even though reporing unused import is already useful, with LibCST we can provide automatic fix to remove unused import. That can make the suggestion more actionable and save developer's time.\n",
+    "\n",
+    "An import statement may import multiple names, we want to remove those unused names from the import statement. If all the names in the import statement are not used, we remove the entire import.\n",
+    "To remove the unused name, we implement ``RemoveUnusedImportTransformer`` by subclassing :class:`~libcst.CSTTransformer`. We overwrite ``leave_Import`` and ``leave_ImportFrom`` to modify the import statements.\n",
+    "When we find the import node in lookup table, we iterate through all ``names`` and keep used names in ``names_to_keep``.\n",
+    "If ``names_to_keep`` is empty, all names are unused and we remove the entire import node.\n",
+    "Otherwise, we update the import node and just removing partial names."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class RemoveUnusedImportTransformer(cst.CSTTransformer):\n",
+    "    def __init__(\n",
+    "        self, unused_imports: Dict[Union[cst.Import, cst.ImportFrom], Set[str]]\n",
+    "    ) -> None:\n",
+    "        self.unused_imports = unused_imports\n",
+    "\n",
+    "    def leave_import_alike(\n",
+    "        self,\n",
+    "        original_node: Union[cst.Import, cst.ImportFrom],\n",
+    "        updated_node: Union[cst.Import, cst.ImportFrom],\n",
+    "    ) -> Union[cst.Import, cst.ImportFrom, cst.RemovalSentinel]:\n",
+    "        if original_node not in self.unused_imports:\n",
+    "            return updated_node\n",
+    "        names_to_keep = []\n",
+    "        for name in updated_node.names:\n",
+    "            asname = name.asname\n",
+    "            if asname is not None:\n",
+    "                name_value = asname.name.value\n",
+    "            else:\n",
+    "                name_value = name.name.value\n",
+    "            if name_value not in self.unused_imports[original_node]:\n",
+    "                names_to_keep.append(name.with_changes(comma=cst.MaybeSentinel.DEFAULT))\n",
+    "        if len(names_to_keep) == 0:\n",
+    "            return cst.RemoveFromParent()\n",
+    "        else:\n",
+    "            return updated_node.with_changes(names=names_to_keep)\n",
+    "\n",
+    "    def leave_Import(\n",
+    "        self, original_node: cst.Import, updated_node: cst.Import\n",
+    "    ) -> cst.Import:\n",
+    "        return self.leave_import_alike(original_node, updated_node)\n",
+    "\n",
+    "    def leave_ImportFrom(\n",
+    "        self, original_node: cst.ImportFrom, updated_node: cst.ImportFrom\n",
+    "    ) -> cst.ImportFrom:\n",
+    "        return self.leave_import_alike(original_node, updated_node)\n"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {
+    "raw_mimetype": "text/restructuredtext"
+   },
+   "source": [
+    "After the transform, we use ``.code`` to generate fixed code and all unused names are fixed as expected!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fixed_module = wrapper.module.visit(RemoveUnusedImportTransformer(unused_imports))\n",
+    "print(fixed_module.code)"

You could probably use difflib here to make the transformation more obvious, similar to what you did in the first tutorial: https://libcst.readthedocs.io/en/latest/tutorial.html#Generate-Source-Code
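
As a rough illustration of that suggestion, here is a minimal sketch using only the standard library. The two source strings are made-up stand-ins for the tutorial's original and fixed code, not output from LibCST.

```python
import difflib

# Hypothetical before/after sources, standing in for the original
# module code and the code produced after the transform.
original = "import a, b\n\na()\n"
fixed = "import a\n\na()\n"

# unified_diff works on lists of lines; keepends=True preserves the
# trailing newlines so the joined diff prints cleanly.
diff = "".join(
    difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile="original",
        tofile="fixed",
    )
)
print(diff)
```

Printing a unified diff shows only the changed import lines, which makes the effect of the transform easier to spot than comparing two full listings.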

jimmylai

comment created time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

 class Access:
     """
     An Access records an access of an assignment.
+
+    .. warning::
+       This scope analysis only analyze access via a :class:`~libcst.Name` or  a :class:`~libcst.Name`

s/only analyze/only analyzes/

jimmylai

comment created time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

 class Access:
     """
     An Access records an access of an assignment.
+
+    .. warning::
+       This scope analysis only analyze access via a :class:`~libcst.Name` or  a :class:`~libcst.Name`
+       node embedded in other node like :class:`~libcst.Call` or :class:`~libcst.Attribute`.
+       It doesn't support type anontation using :class:`~libcst.SimpleString` literal for forward
+       reference. E.g. in this example, the ``"Tree"`` isn't parsed as as an access::

s/reference/references/
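
For context, a string-literal forward reference looks roughly like the sketch below. This is a hypothetical illustration, not the example from the docstring under review; it only shows that the annotation is a string in the source, which the standard library resolves to the class later.

```python
from typing import Optional, get_type_hints


class Tree:
    # "Tree" is written as a string because the class name is not yet
    # bound while the class body is still being executed.
    def __init__(self, left: Optional["Tree"] = None) -> None:
        self.left = left


# get_type_hints evaluates the string annotation against the
# function's globals, turning "Tree" into the actual class.
hints = get_type_hints(Tree.__init__)
print(hints["left"] == Optional[Tree])
```

Because the name appears only inside a string literal, an analysis that looks solely at name nodes would not see it as an access.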

jimmylai

comment created time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

 class Access:
     """
     An Access records an access of an assignment.
+
+    .. warning::
+       This scope analysis only analyze access via a :class:`~libcst.Name` or  a :class:`~libcst.Name`
+       node embedded in other node like :class:`~libcst.Call` or :class:`~libcst.Attribute`.
+       It doesn't support type anontation using :class:`~libcst.SimpleString` literal for forward

s/type anontation/type annotations/

jimmylai

comment created time in 6 months

Pull request review comment Instagram/LibCST

Add Assignments and Accesses to Scope Analysis

 class Scope(abc.ABC):
     A scope has a parent scope which represents the inheritance relationship. That means
     an assignment in parent scope is viewable to the child scope and the child scope may
     overwrites the assignment by using the same name.
+
     Use ``name in scope`` to check whether a name is viewable in the scope.
     Use ``scope[name]`` to retrieve all viewable assignments in the scope.
+
+    .. warning::
+       This scope analysis module only analyzes local variable names and it doesn't handle
+       attribute names; for example, given a.b.c = 1, local variable name ``a`` is recorded

a.b.c = 1 should probably be inside backticks so it gets formatted as code.

jimmylai

comment created time in 6 months
