-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Explicit as_str (no deref), add no allocation methods
#8826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit implements the AST design to account for implicit concatenation in string nodes, specifically the `ExprFString`, `ExprStringLiteral`, and `ExprBytesLiteral` nodes.
This commit adds the new variants for the string parts to `AnyNode` and `AnyNodeRef` enums. These parts are `StringLiteral` (part of `ExprStringLiteral`), `BytesLiteral` (part of `ExprBytesLiteral`), and `FString` (part of `ExprFString`). The reason for this is to add visitor methods for these parts. This is done in the following commit. So, the visitor would visit the string as a whole first and then visit each part. ``` ExprStringLiteral - "foo" "bar" |- StringLiteral - "foo" |- StringLiteral - "bar" ``` The above tree helps understand the way visitor would work.
The visitor implementations are updated to visit each part nodes for the respective string nodes. The following example better highlights this: ``` ExprStringLiteral - "foo" "bar" |- StringLiteral - "foo" |- StringLiteral - "bar" ``` The `visit_expr` method would be use to visit the `ExprStringLiteral` while the `visit_string_literal` method would be use for the `StringLiteral` node. Similar methods are added for bytes and f-strings.
The generator is basically improved. Earlier, for an implicitly concatenated string we would produce the joined form. So, ```python "foo" "bar" "baz" ``` For the above example, the generator would give us: ```python "foobarbaz" ``` Now, as we have the information for each part, we will be producing the exact code back.
`Expr` is a general type for all expressions while `LiteralExpressionRef` is a type which includes only the literal expressions. The method is suited more for this type instead. This will also help in the formatter change.
As highlighted in the review: > If you have two `ConcatenatedStringLiteral` values where both have > equivalent values for `strings` but where one has `value` initialized > and the other does not, would you expect them to compare equal? > Semantically I think I would, since the alternative is that equality is > dependent on whether `as_str()` has been called, which seems incidental. #7927 (comment)
|
Current dependencies on/for this PR:
This stack of pull requests is managed by Graphite. |
3b36f7f to
8b261bc
Compare
|
|
What has been the main motivation for the change (just asking)? I assume it's for better performance but codespeed isn't loading for me right now. Could we keep the |
These methods are just some low hanging fruits which are simple to implement to work on each character instead of (probably) allocating a String. That said, I haven't seen any regression or improvements.
Can you provide an example where you think the code is complicated? That would help me understand your perspective. |
| if checker.enabled(Rule::StaticJoinToFString) { | ||
| flynt::rules::static_join_to_fstring(checker, expr, string); | ||
| flynt::rules::static_join_to_fstring( | ||
| checker, | ||
| expr, | ||
| string_value.as_str(), | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the previous code was easier to read than now where we have the explicit as_str call and using Deref seemed to work just fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation here is that the Deref implementation includes an allocation -- the conversion here can allocate if the string is implicitly concatenated. So Deref is essentially hiding an allocation. I prefer that we do something explicit over an implicit allocation. It's not about the lifetimes or anything like that. It's intentionally more verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should then call the method to_str rather than as_str according to https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv Because I assume that a function called as to be free (or extremely cheap), whereas to might be cheap, but it depends, and into_ consumes self
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think using to_str makes more sense. I was not aware of that. I'll create a follow-up PR to rename that.
| self.deferred.string_type_definitions.push(( | ||
| expr.range(), | ||
| value, | ||
| value.as_str(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for all these changes. If we could keep the Deref implementation in addition to the explicit as_str method, then code that doesn't need the more relaxed lifetime could be left as is.

Summary
This PR is a follow-up to the AST refactor which does the following:
Derefimplementation onStringLiteralValueand use explicitas_strcalls instead. TheDerefimplementation would implicitly perform allocations in case of implicitly concatenated strings. This is to make sure the allocation is explicit.is_emptylencharsPartialEqimplementation to compare each characterTest Plan
Run the linter test suite and make sure all tests pass.