Tokenizer/PHP: add extra hardening to the (DNF) type handling + efficiency improvement #508
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Follow up on PR #507.
Tokenizer/PHP: efficiency improvement for DNF type handling
The
PHP::processAdditional()
method walks back from the end of the file to the beginning.With that in mind, and knowing that a type can never end on an open parenthesis, and an open parenthesis can never be seen in a type before the close parenthesis has been seen (at least for valid/non-parse error types), it makes no sense to trigger the type handling logic for open parentheses.
This should make the tokenizer slightly more efficient as (open) parentheses are used a lot in code ;-)
It also prevents the type handling layer from acting on these type of invalid/parse error types, while it previously would.
Includes tests safeguarding the behaviour of the type handling layer for this type of invalid/parse error types.
Of these tests, the type for the OO constant and for the property were previously not handled consistently/correctly.
The parameter type + the return type were fine.
Tokenizer/PHP: don't retokenize tokens in a broken type DNF declaration
A DNF type always needs to have matching open and close parentheses. If the parentheses are unbalanced, we can't be sure this is a DNF type and run the risk of retokenizing non-type tokens.
Even if the code under scan was intended as a DNF type, it will be invalid code/contain a parse error if the parentheses are unbalanced.
In that case, retokenizing to the type specific tokens is likely to make the effect of this parse error on sniffs worse.
With that in mind, I'm adding extra hardening to the type handling layer in the PHP tokenizer, which will ensure that the retokenization to type specific tokens only happens when there are balanced pairs of DNF parentheses.
Includes tests safeguarding the behaviour of the type handling layer for this type of invalid/parse error types.
Safe for two, all these tests would previously fail on containing at least some type specific tokens.
Tokenizer/PHP: efficiency fix
Reminder: the PHP::processAdditional()` method walks back from the end of the file to the beginning.
The type handling retokenization layer is triggered for each
&
,|
and)
the tokenizer encounters.When something is recognized as a valid type declaration, the relevant tokens will all be retokenized in one go the first time the type handling layer is triggered, which means that - as the type tokens will have been retokenized already -, the type handling layer will not be triggered again for any of the other type related tokens in the type.
However, if the type is not recognized as a valid type, the type handling layer will keep getting retriggered and will (correctly) keep concluding this is not a valid type.
The change in this PR, prevents the type handling layer from doing any work when it is retriggered on a token which was previously already seen and concluded to be, either not part of a type or part of an invalid type.
This should make the tokenizer marginally faster for complex types containing an error, like
(A&B)|(C&D|(E&F)
.Tokenizer/PHP: minor doc fix in type handling layer
Suggested changelog entry
Related issues/external references
Loosely related to #504, #505, #507