Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add end_lineno and end_col_offset to nodes #1258

Merged
merged 14 commits into from
Nov 21, 2021

Conversation

cdce8p
Copy link
Member

@cdce8p cdce8p commented Nov 19, 2021

Description

In Python 3.8, end_lineno and end_col_offset where added to all AST nodes which already had lineno and col_offset information. This PR adds them to the astoid nodes. Part of pylint-dev/pylint#5336

Type of Changes

Type
✨ New feature

Todo list

  • Add docstrings for end_lineno, end_col_offset
  • Write changelog entry
  • Write tests for all nodes
  • Decide if sys.version_info >= (3, 8) checks are ok, or if getattr(node, 'end_lineno', None) should be used.
    Note: getattr can't be checked with mypy.
  • Resolve open TODOs in code
  • ...

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, I only have two minor remarks.

@@ -1607,7 +1607,7 @@ def visit_attribute(
if not isinstance(parent, nodes.ExceptHandler):
self._delayed_assattr.append(newnode)
else:
if sys.version_info >= (3, 8):
if sys.version_info >= (3, 8): # pylint: disable=else-if-used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in the other issue, I don't think it's a false positive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if not a false-positive, I would recommend to keep the pylint: disable as this preserves the symmetry with all other cases and will make removing the check once 3.8 is the default much easier.

astroid/rebuilder.py Show resolved Hide resolved
astroid/rebuilder.py Show resolved Hide resolved
astroid/rebuilder.py Show resolved Hide resolved
astroid/rebuilder.py Show resolved Hide resolved
astroid/rebuilder.py Show resolved Hide resolved
astroid/rebuilder.py Show resolved Hide resolved
astroid/rebuilder.py Show resolved Hide resolved
astroid/rebuilder.py Show resolved Hide resolved
@@ -1134,18 +1257,37 @@ def visit_decorators(
return None
# /!\ node is actually an _ast.FunctionDef node while
# parent is an astroid.nodes.FunctionDef node
if PY38_PLUS:
if sys.version_info >= (3, 8):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huhu. I'm going to stop suggesting for every test then. But why though ? I'd say it's a good thing to use PY38+ if only for the performance increase of not having to recalculate the bool everytime ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I don't really like the performance impact either. On the positive side sys.version_info is special cased by type checkers so they know how to infer it. That doesn't work with PY38_PLUS.

If performance is a real concern, the only other option, I think, would be to conditional define the visit_xxx methods itself. That would mean a lot of code duplications though. Not sure we could maintain that, at least not easily.

In the end, it's only the node tree. So the performance impact might be acceptable.

--

    if sys.version_info >= (3, 8):
        def visit_constant(self, node: "ast.Constant", parent: NodeNG) -> nodes.Const:
            """visit a Constant node by returning a fresh instance of Const"""
            return nodes.Const(
                value=node.value,
                kind=node.kind,
                lineno=node.lineno,
                col_offset=node.col_offset,
                end_lineno=node.end_lineno,
                end_col_offset=node.end_col_offset,
                parent=parent,
            )
    else:
        def visit_constant(self, node: "ast.Constant", parent: NodeNG) -> nodes.Const:
            """visit a Constant node by returning a fresh instance of Const"""
            return nodes.Const(
                node.value,
                node.lineno,
                node.col_offset,
                parent,
                node.kind,
            )

@DanielNoord
Copy link
Collaborator

Based on the documentation of ast I think only nodes.Statement and subclasses can have an end_lineno. nodes.Expr is a subclass so that is covered as well. Does it make sense to make 1) nodes.NodeNG have end_lineno = None, 2) add an __init__ to nodes.Statement and 3) handle adding end_lineno there?
This would give better IntelliSense as nodes.Module will have end_lineno: None instead of end_lineno: Optional[int]. Because of the way we visit_ nodes in pylint this information would be accessible to use. I'm not sure if we will need but as it seems to reflect reality it might be good to adhere to ast here.

(A similar point can be made for lineno and col_offset but let's not worry about that now)

@cdce8p
Copy link
Member Author

cdce8p commented Nov 19, 2021

@DanielNoord

Based on the documentation of ast I think only nodes.Statement and subclasses can have an end_lineno. nodes.Expr is a subclass so that is covered as well.

nodes.Expr is, but what about all ast.expr nodes? BoolOp, Set, Dict, List, just to name a few.

Does it make sense to make 1) nodes.NodeNG have end_lineno = None, 2) add an __init__ to nodes.Statement and 3) handle adding end_lineno there? This would give better IntelliSense as nodes.Module will have end_lineno: None instead of end_lineno: Optional[int]. Because of the way we visit_ nodes in pylint this information would be accessible to use. I'm not sure if we will need but as it seems to reflect reality it might be good to adhere to ast here.

(A similar point can be made for lineno and col_offset but let's not worry about that now)

If I understand you correctly, what you would like is for IntelliSense to know that some nodes will always have end_lineno = None (same for lineno, col_offset, and end_col_offset). That can still be archived by overriding the type. E.g. for nodes.Module:

class Module(LocalsDictNodeNG):
    lineno: None
    col_offset: None
    end_lineno: None
    end_col_offset: None
    parent: None

@DanielNoord
Copy link
Collaborator

Ah I didn't know not all ast.Expr nodes actually subclass nodes.Expr. That makes this more difficult indeed. Your latest commit is a nice solution!

@cdce8p cdce8p force-pushed the feature_end-line-column branch from 63fa73a to 309da22 Compare November 19, 2021 14:23
@cdce8p
Copy link
Member Author

cdce8p commented Nov 19, 2021

Ah I didn't know not all ast.Expr nodes actually subclass nodes.Expr.

Actually, no node subclasses ast.Expr, some subclass ast.expr but we don't represent that in astroid currently.

@cdce8p cdce8p added ast Enhancement ✨ Improvement to a component High effort 🏋 Difficult solution or problem to solve python 3.8 labels Nov 21, 2021
@cdce8p cdce8p added this to the 2.9.0 milestone Nov 21, 2021
@cdce8p cdce8p added the Needs review 🔍 Needs to be reviewed by one or multiple more persons label Nov 21, 2021
@cdce8p cdce8p marked this pull request as ready for review November 21, 2021 01:26
@cdce8p
Copy link
Member Author

cdce8p commented Nov 21, 2021

Took a moment to get this ready, especially writing the tests. I believe it's now ready for review.

Some additional comments regarding things I noticed while working on it:

  1. end_lineno and end_col_offset are usually set regardless the annotation int | None. With the exception of the few nodes that truly don't have these information.
  2. The ast module is usually considered unstable, that's even more apparent in the handling of end_lineno and end_col_offset.
  3. Sometimes there are just bugs that might or might not have been fixed from one version to the next. This could make testing a bit more interesting in pylint if we require those and they vary between versions.
  4. At least until 3.10 the lineno and col_offset information for JoinedStr and FormattedValue nodes don't make much sense.
  5. Many AssignName nodes don't have own lineno and col_offset info. Instead they fall back to that of the parent. The reason for it is the AST sometimes only stores strings (identifier) instead of ast.Name nodes. The TreeRebuilder still interprets them as AssignName but doesn't have any individual node information. E.g. MatchMapping.rest, MatchAs.name, arg.arg (argument names).
    https://github.com/PyCQA/astroid/blob/40d556070fd566b7bb43cb3ed0e498045b88f2d0/astroid/rebuilder.py#L997-L1013
  6. The handling of ast.Try nodes introduced in Python 3.3 should be adopted in TreeRebuilder. At the moment we split it into TryExcept and TryFinally the ast nodes for which have long been removed. I was thinking about adding a version argument to the TreeRebuilder initialization. That would allow us to make this a non-breaking change. Only downside being that it also adds additional maintenance and test requirements.
    https://github.com/PyCQA/astroid/blob/40d556070fd566b7bb43cb3ed0e498045b88f2d0/astroid/rebuilder.py#L1622-L1638
  7. There is already a NodeNG.tolineno property. This works by looking at all astroid child nodes and using the lineno of the last one. We could consider updating it, but that isn't required for this PR.
    https://github.com/PyCQA/astroid/blob/40d556070fd566b7bb43cb3ed0e498045b88f2d0/astroid/nodes/node_ng.py#L417-L427

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work ! Do you want to wait for 2.9.0 ? Should we use astroid 2.9.0 in pylint 2.12 ? We could bump the other 2.9.0 features in 2.10 so we can release the end column/end line.

tests/unittest_nodes_lineno.py Show resolved Hide resolved
assert (c1.keywords[0].lineno, c1.keywords[0].col_offset) == (3, 16)
assert (c1.keywords[0].end_lineno, c1.keywords[0].end_col_offset) == (3, 22)
assert (c1.body[0].lineno, c1.body[0].col_offset) == (4, 4)
assert (c1.body[0].end_lineno, c1.body[0].end_col_offset) == (4, 8)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just reviewed the content of the test, not the expected column. But it comes from the ast, if it's wrong there isn't a lot we can do, right ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, at least not without significant effort. One area I was able to fix fairly easily was Arguments. See the snipped below. We might also be able to address the AssignName issue I mentioned by parsing the source code ourselves but that is definitely an issue for another time.

Usually the worse that can happen is that the lineno and col_offset also include the parent node. This is still correct just not as accurate as we would like it to be for error highlighting.

The true value of these tests is to know when a new version changed something. Like knowing that Keywords and Slice only include end_lineno in 3.9, or the issue with Arguments.

--
https://github.com/PyCQA/astroid/blob/7dd55fe8d96846617b2827e5272582ffc865190f/astroid/rebuilder.py#L834-L840

tests/unittest_nodes_lineno.py Show resolved Hide resolved
Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @cdce8p! Writing those tests must not have been an easy task.

tests/unittest_nodes_lineno.py Show resolved Hide resolved
tests/unittest_nodes_lineno.py Show resolved Hide resolved
tests/unittest_nodes_lineno.py Show resolved Hide resolved
tests/unittest_nodes_lineno.py Show resolved Hide resolved
@cdce8p
Copy link
Member Author

cdce8p commented Nov 21, 2021

Amazing work ! Do you want to wait for 2.9.0 ? Should we use astroid 2.9.0 in pylint 2.12 ? We could bump the other 2.9.0 features in 2.10 so we can release the end column/end line.

I would consider this a new feature that probably deserves its own minor version. I.e. release 2.9.0 next and bump the other features to 2.10.0 if that's ok with you.

@Pierre-Sassoulas
Copy link
Member

I would consider this a new feature that probably deserves its own minor version. I.e. release 2.9.0 next and bump the other features to 2.10.0

Sounds good, do we still release 2.8.6 first ?

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.10.0, 2.9.0 Nov 21, 2021
@cdce8p
Copy link
Member Author

cdce8p commented Nov 21, 2021

I would consider this a new feature that probably deserves its own minor version. I.e. release 2.9.0 next and bump the other features to 2.10.0

Sounds good, do we still release 2.8.6 first ?

Could make sense if it isn't too much effort. What is your opinion?
If we decide to, I just think we should do the release from the current main and do 2.9.0 directly afterwards with just this one. Everything currently targeted for 2.8.6 could be moved to 2.9.1.

@Pierre-Sassoulas
Copy link
Member

Hmm, I forgot to deprecate python 3.6.0 and 3.6.1. I'm going release 2.8.7 with the deprecation and suggest astroid 2.6.6 (as it's the one working with pylint 2.9.3).

@cdce8p cdce8p merged commit 62a679d into pylint-dev:main Nov 21, 2021
@cdce8p cdce8p deleted the feature_end-line-column branch November 21, 2021 16:55
cdce8p added a commit that referenced this pull request Nov 21, 2021
* Add end_lineno and end_col_offset infomation
* Add annotations for fields that are always None
* Changelog entry

Requires Python 3.8
tushar-deepsource pushed a commit to tushar-deepsource/astroid that referenced this pull request Dec 20, 2021
* Add end_lineno and end_col_offset infomation
* Add annotations for fields that are always None
* Changelog entry

Requires Python 3.8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ast Enhancement ✨ Improvement to a component High effort 🏋 Difficult solution or problem to solve Needs review 🔍 Needs to be reviewed by one or multiple more persons python 3.8
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants