Skip to content

Tokenizer/PHP: namespaced names as single token, mirroring PHP 8.0+ #1020

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jrfnl
Copy link
Member

@jrfnl jrfnl commented Apr 17, 2025

Description

This is a re-do of PR squizlabs/PHP_CodeSniffer#3155 with lots of updates.

In all fairness, this should be one commit for it to be atomic, but I believe that would make reviewing this PR next to impossible, which is why I've split the PR (like before) into lots of smaller, reviewable commits.


Util/Tokens: introduce new $nameTokens array

As both tests as well as sniffs will need to refer to these tokens quite a lot, providing a predefined token array to make this more straight-forward both within PHPCS itself, as well as for external standards seems appropriate.

PHP 8.0 | Tokenizer/PHP: namespaced names as single token

As per the proposal in squizlabs/PHP_CodeSniffer#3041.

This effectively backfills the new PHP 8.0 tokenization of identifier names for all supported PHP versions in PHPCS 4.x.

Note: this backfills takes into account that reserved keywords are now allowed to be used in namespaced names.
For that reason no check is done of the token type of the token between namespace separators, only on the contents.

Includes extensive unit tests to ensure the correct re-tokenization as well as that the rest of the tokenization is not adversely affected by this change.

Includes making sure that the following (re-)tokenizations are still stable:

  • Nullable vs ternary.

Tokenizer/PHP: update union/intersection/DNF type token tokenization …

…to allow for PHP 8 identifier tokens

Includes updated test expectations for the typed constants tests.

Tokenizer/PHP: update the arrow function backfill to allow for PHP 8 identifier tokens

Includes updating the unit tests for the new reality and adding some additional tests.

Note: while the tests which were testing the handling of the fn keyword when used in namespaced names are now no longer testing the polyfill for arrow functions, I do believe the test in itself still has value, which is why I've kept it and adjusted the testNotAnArrowFunction() method to allow for expecting one of the name tokens.

Tokenizer/PHP: update the short array tokenization to allow for PHP 8 identifier tokens

Includes adding some additional tests.

Tokenizer/PHP: update various tests to allow for PHP 8 identifier tokens

Note: while some tests which were testing the handling of the specific keywords when used in namespaced names are now no longer testing the polyfill for that keyword, I do believe those tests in itself still have value, which is why I've kept them and adjusted the testNot[KEYWORD]() methods to allow for expecting one of the name tokens.

Includes removing one test which really didn't have value anymore.

File::getMethodParameters(): fix method to work with the PHP 8 identifier tokens

The existing unit tests already contain tests covering this change.

Includes updating token offset expectations in pre-existing tests for the change in tokenization.

Note: the T_NAMESPACE and T_NS_SEPARATOR cases should, by rights, be removed what with the new tokenization. However, leaving them in place provides some tolerance for types interlaced with comments/whitespace as allowed in PHP < 8.0.
As this is an often used utility method, I believe the method containing such tolerance to ensure the same results PHP cross-version is warranted.

File::getMethodProperties(): fix method to work with the PHP 8 identifier tokens

The existing unit tests already contain tests covering this change.

Includes updating token offset expectations in pre-existing tests for the change in tokenization.

File::getMemberProperties(): fix method to work with the PHP 8 identifier tokens

The existing unit tests already contain tests covering this change.

Includes updating token offset expectations in pre-existing tests for the change in tokenization.

File::isReference(): fix method to work with the PHP 8 identifier tokens

The existing unit tests already contain tests covering this change.

File::findExtendedClassName(): fix method to work with the PHP 8 identifier tokens

Includes adding an additional test for a PHP 8 identifier name type previously not covered by tests.

File::findImplementedInterfaceNames(): fix method to work with the PHP 8 identifier tokens

Includes adding an additional test for a PHP 8 identifier name type previously not covered by tests.

File::find[Start|End]OfStatement(): update tests to allow for PHP 8 identifier tokens

File::getTokenAsString(): update tests to allow for PHP 8 identifier tokens

Tokens::$functionNameTokens: add the new PHP 8 identifier name tokens

Includes adding/updating tests for various sniffs using the Tokens::$functionNameTokens token array to make sure the behaviour is unchanged (without adding the new tokens to the token array, the behaviour would have been changed).

.... and then a lot of other commits updating individual sniffs to allow for the new tokenization....

Suggested changelog entry

Added:

  • Tokens::NAME_TOKENS containing an array with the tokens used for identifier names.

Changed:

  • PHPCS now uses the PHP >= 8.0 native method for tokenizing (namespaced) identifier names.
    • Before PHP 8.0, PHP would tokenize namespaced names using T_STRING and T_NS_SEPARATOR.
    • From PHP 8.0, PHP uses the tokens T_NAME_FULLY_QUALIFIED, T_NAME_RELATIVE, and T_NAME_QUALIFIED instead.
    • PHPCS now uses these new PHP 8.0 tokens no matter what version of PHP is being used to run PHPCS.
    • Custom sniffs that use T_STRING and T_NS_SEPARATOR tokens to look for namespaced names will need to be modified.
  • Tokens::$functionNameTokens now includes the identifier name tokens.

Fixed:

  • Fixed bug : File::findExtendedClassName() will no longer break on namespace relative class names.
  • Fixed bug : File::findImplementedInterfaceNames() will no longer break on namespace relative interface names.
  • Fixed bug : Various sniffs now have better support for ignoring/examining qualified function calls.

Related issues/external references

Fixes squizlabs/PHP_CodeSniffer#3041
Fixes #133

jrfnl added 30 commits April 17, 2025 17:55
As both tests as well as sniffs will need to refer to these tokens quite a lot, providing a predefined token array to make this more straight-forward both within PHPCS itself, as well as for external standards seems appropriate.
As per the proposal in squizlabs/PHP_CodeSniffer 3041.

This effectively backfills the new PHP 8.0 tokenization of identifier names for all supported PHP versions in PHPCS 4.x.

Note: this backfills takes into account that reserved keywords are now allowed to be used in namespaced names.
For that reason no check is done of the token _type_ of the token between namespace separators, only on the contents.

Includes extensive unit tests to ensure the correct re-tokenization as well as that the rest of the tokenization is not adversely affected by this change.

Includes making sure that the following (re-)tokenizations are still stable:
* Nullable vs ternary.
…to allow for PHP 8 identifier tokens

Includes updated test expectations for the typed constants tests.
…identifier tokens

Includes updating the unit tests for the new reality and adding some additional tests.

Note: while the tests which were testing the handling of the `fn` keyword when used in namespaced names are now no longer testing the polyfill for arrow functions, I do believe the test in itself still has value, which is why I've kept it and adjusted the `testNotAnArrowFunction()` method to allow for expecting one of the name tokens.
… identifier tokens

Includes adding some additional tests.
Note: while some tests which were testing the handling of the specific keywords when used in namespaced names are now no longer testing the polyfill for that keyword, I do believe those tests in itself still have value, which is why I've kept them and adjusted the `testNot[KEYWORD]()` methods to allow for expecting one of the name tokens.

Includes removing one test which really didn't have value anymore.
…fier tokens

The existing unit tests already contain tests covering this change.

Includes updating token offset expectations in pre-existing tests for the change in tokenization.

Note: the `T_NAMESPACE` and `T_NS_SEPARATOR` cases should, by rights, be removed what with the new tokenization. However, leaving them in place provides some tolerance for types interlaced with comments/whitespace as allowed in PHP < 8.0.
As this is an often used utility method, I believe the method containing such tolerance to ensure the same results PHP cross-version is warranted.
…fier tokens

The existing unit tests already contain tests covering this change.

Includes updating token offset expectations in pre-existing tests for the change in tokenization.
…fier tokens

The existing unit tests already contain tests covering this change.

Includes updating token offset expectations in pre-existing tests for the change in tokenization.
The existing unit tests already contain tests covering this change.
…tifier tokens

Includes adding an additional test for a PHP 8 identifier name type previously not covered by tests.
…P 8 identifier tokens

Includes adding an additional test for a PHP 8 identifier name type previously not covered by tests.
Includes adding/updating tests for various sniffs using the `Tokens::$functionNameTokens` token array to make sure the behaviour is unchanged (without adding the new tokens to the token array, the behaviour would have been changed).
…er tokens

The existing unit tests already contain tests covering this change.

Includes updating one test.

That test was testing that a namespace name interlaced with comments and whitespace would still be recognized as the same namespace as one without.
As namespaced names are no longer allowed to have whitespace or comments within them as of PHP 8.0 and we now use the PHP 8.0 tokenization, the sniff will no longer handle that situation.
Aside from that, the test still has value, so keeping it, but making sure the PHP 8.0+ parse error is removed.
… identifier tokens

Includes extra unit tests covering the change in so far it wasn't already covered.
…tifier tokens

Includes extra unit tests covering the change in so far it wasn't already covered.
… identifier tokens

Includes extra unit tests covering the change in so far it wasn't already covered.
…ifier tokens

Includes extra unit tests covering the change in so far it wasn't already covered.
…er tokens

The existing unit tests already cover this change.
Includes extra unit tests covering the change in so far it wasn't already covered.

Includes minor efficiency tweak.
…dentifier tokens

Includes extra unit tests covering the change in so far it wasn't already covered.

Some of the new tests cover situations previously not handled by the sniff.
The fact that those are now handled, fixes issue 133.

Fixes 133
…entifier tokens

The existing unit tests already cover this change.
…tokens

Includes extra unit tests covering the change in so far it wasn't already covered.
Includes adjusting a pre-existing test to cover this change.
…kens

The existing unit tests already contain tests covering this change.

By extension, this also fixes the `PSR12.Classes.AnonClassDeclaration` and the `Squiz.Classes.ClassDeclaration` sniffs.

Includes some updated tests for those sniffs.
… tokens

Includes some additional tests to cover the change.
…kens

The existing unit tests already contain tests covering this change.
…ifier tokens

Includes minor update to pre-existing tests to ensure all identifier name types are covered.
jrfnl added 7 commits April 17, 2025 21:38
…kens

Includes extra unit tests covering the change in so far it wasn't already covered.
…identifier tokens

Includes extra unit tests covering the change in so far it wasn't already covered.

Note: this was previously a false positive for this sniff!
…dentifier tokens

The existing unit tests already contain tests covering this change.
…identifier tokens

Includes extra unit tests covering the change in so far it wasn't already covered.
…ier tokens

The existing unit tests already contain tests covering this change.

Includes updating two tests with code which is considered a parse error in PHP 8.0.
Handling that type of code is no longer supported by this sniff.
... to make sure that function calls using the new tokens are handled correctly by the sniff.
What with the change in the namespaced names tokenization, a sniff listening for `T_NAMESPACE` will no longer receive tokens for namespace relative names as those no longer use that token.

Keeping the existing tests for this in place as those are still a good safeguard to prevent a regression slipping in.
@jrfnl jrfnl force-pushed the phpcs-4.0/feature/3041-tokenizer-php-namespaced-name-tokenization branch from d9bbbe0 to 121c58c Compare April 17, 2025 19:39
@rodrigoprimo rodrigoprimo self-requested a review April 17, 2025 19:45
@jrfnl jrfnl merged commit 7055739 into 4.x Apr 17, 2025
64 checks passed
@jrfnl jrfnl deleted the phpcs-4.0/feature/3041-tokenizer-php-namespaced-name-tokenization branch April 17, 2025 19:53
jrfnl added a commit that referenced this pull request Apr 21, 2025
This sniff checks for `if`/`elseif` which only contain "true" or "false", i.e. don't actually compare to anything.

This commit fixes the sniff to still function correctly when it encounters "fully qualified true/false" after the merge of #1020.

Includes tests.
jrfnl added a commit that referenced this pull request Apr 21, 2025
This commit fixes the sniff to handle "fully qualified true/false" the same as unqualified true/false after the merge of #1020.

Includes tests.
jrfnl added a commit that referenced this pull request Apr 21, 2025
These are the only two sniffs explicitly targetting `true`, `false` and `null`.

This commit allows these sniffs to still function correctly on "fully qualified true/false/null" after the merge of #1020.

Includes tests.

---

In the PHPCS codebase, there are only a couple more places were the `T_TRUE`, `T_FALSE` and `T_NULL` tokens are checked for.
* In most cases, this involves checking for/skipping over type declarations, and as the token arrays containing the valid tokens for types will already include the `T_NAME_FULLY_QUALIFIED` token, no changes are needed there.
* Aside from that, there are two more sniffs which could be considered "broken" for FQN true/false/null:
    1. `Generic.CodeAnalysis.UnconditionalStatement`
    2. `Generic.ControlStructures.DisallowYodaConditions`
jrfnl added a commit that referenced this pull request Apr 21, 2025
This commit fixes the sniff to handle "fully qualified true/false" the same as unqualified true/false after the merge of #1020.

Includes tests.
jrfnl added a commit that referenced this pull request Apr 21, 2025
This sniff checks for `if`/`elseif` which only contain "true" or "false", i.e. don't actually compare to anything.

This commit fixes the sniff to still function correctly when it encounters "fully qualified true/false" after the merge of #1020.

Includes tests.
jrfnl added a commit that referenced this pull request Apr 21, 2025
This commit fixes the sniff to handle "fully qualified true/false" the same as unqualified true/false after the merge of #1020.

Includes tests.
jrfnl added a commit that referenced this pull request Apr 21, 2025
These are the only two sniffs explicitly targetting `true`, `false` and `null`.

This commit allows these sniffs to still function correctly on "fully qualified true/false/null" after the merge of #1020.

Includes tests.

---

In the PHPCS codebase, there are only a couple more places were the `T_TRUE`, `T_FALSE` and `T_NULL` tokens are checked for.
* In most cases, this involves checking for/skipping over type declarations, and as the token arrays containing the valid tokens for types will already include the `T_NAME_FULLY_QUALIFIED` token, no changes are needed there.
* Aside from that, there are two more sniffs which could be considered "broken" for FQN true/false/null:
    1. `Generic.CodeAnalysis.UnconditionalStatement`
    2. `Generic.ControlStructures.DisallowYodaConditions`
jrfnl added a commit that referenced this pull request Apr 21, 2025
This commit fixes the sniff to handle "fully qualified true/false" the same as unqualified true/false after the merge of #1020.

Includes tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants