Skip to content

Conversation

@MarjovanLier
Copy link
Owner

@MarjovanLier MarjovanLier commented Dec 8, 2025

User description

Summary

  • Replace runtime array operations with compile-time constants for
    significant performance improvements
  • Add SEARCH_WORDS_MAPPING constant with lowercase accent mappings
  • Add ACCENT_MAPPING constant for case-preserving accent removal
  • Simplify searchWords() and removeAccents() to single strtr() calls
  • Remove unused code: $searchWordsMapping, applyBasicNameFix(),
    legacy REMOVE_ACCENTS_FROM/TO arrays
  • Add IsValidTimePartTest for 100% code coverage

Performance Improvements

Method Improvement
removeAccents() +66%
searchWords() +104%
nameFix() +39%

Test Results

  • 201 tests passing (889 assertions)
  • 100% code coverage (up from 93.9% on main)
  • All static analysis tools pass (PHPStan, Psalm, Phan)

Test Plan

  • All existing tests pass
  • New IsValidTimePartTest covers defensive validation code
  • 100% code coverage verified
  • Static analysis clean
  • Rector clean

PR Type

Enhancement, Performance


Description

  • Replace runtime array operations with compile-time constants for O(1) lookup

  • Pre-compute SEARCH_WORDS_MAPPING and ACCENT_MAPPING constants eliminating lazy initialization

  • Simplify searchWords() and removeAccents() to single strtr() calls with constants

  • Remove unused static properties, applyBasicNameFix() method, and legacy array constants

  • Add IsValidTimePartTest for 100% code coverage of defensive validation

  • Remove Psalm and Phan dependencies, add PHP 8.5 to CI test matrix


Diagram Walkthrough

flowchart LR
  A["Runtime Array<br/>Operations"] -->|"Eliminated"| B["Compile-Time<br/>Constants"]
  C["Lazy Init<br/>Static Props"] -->|"Replaced"| B
  B -->|"Used by"| D["searchWords()"]
  B -->|"Used by"| E["removeAccents()"]
  D -->|"O(1) lookup"| F["strtr()"]
  E -->|"O(1) lookup"| F
  F -->|"+66% to +104%<br/>Performance"| G["Optimized<br/>Output"]
Loading

File Walkthrough

Relevant files
Enhancement
2 files
AccentNormalization.php
Replace legacy arrays with pre-computed mapping constants
+308/-488
StringManipulation.php
Simplify methods to use pre-computed constants, remove lazy init
+10/-123
Tests
4 files
ArrayCombineValidationBugFixTest.php
Remove static cache reset, simplify test setup                     
+3/-43   
CriticalBugFixIntegrationTest.php
Remove cache reset logic, update test comments                     
+1/-37   
IsValidTimePartTest.php
Add comprehensive tests for isValidTimePart method             
+261/-0 
UppercaseAccentMappingBugFixTest.php
Remove static cache reset, simplify test setup                     
+1/-34   
Configuration changes
2 files
php.yml
Add PHP 8.5 to test matrix, remove Psalm and Phan steps   
+9/-15   
docker-compose.yml
Remove Psalm and Phan test service definitions                     
+0/-13   
Dependencies
1 files
composer.json
Remove Psalm and Phan dependencies for PHP 8.5 compatibility
+2/-9     
Documentation
1 files
performance.md
Update benchmarks and documentation for pre-computed constants
+29/-33 

Summary by CodeRabbit

  • Bug Fixes

    • Resolved cache management issues by transitioning to pre-computed constant mappings.
  • Tests

    • Added validation tests for time component handling.
    • Simplified test infrastructure by removing redundant cache reset mechanisms.
  • Refactor

    • Optimised string normalisation through improved internal mapping structures for better performance.

✏️ Tip: You can customize this high-level summary in your review settings.

Replace runtime array operations with compile-time constants for
significant performance improvements in searchWords() and
removeAccents().

- Add SEARCH_WORDS_MAPPING constant with lowercase accent mappings
- Add ACCENT_MAPPING constant for case-preserving accent removal
- Simplify searchWords() to single strtr() call with constant
- Simplify removeAccents() to single strtr() call with constant
- Remove unused $searchWordsMapping static property
- Remove unused applyBasicNameFix() method
- Remove legacy REMOVE_ACCENTS_FROM/TO arrays
- Add IsValidTimePartTest for 100% code coverage
- Fix composer.json psalm plugin dependency

Performance: removeAccents +66%, searchWords +104%, nameFix +39%

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
Copilot AI review requested due to automatic review settings December 8, 2025 22:16
@coderabbitai
Copy link

coderabbitai bot commented Dec 8, 2025

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

The pull request refactors accent normalization and string search mapping from runtime-constructed arrays and cached replacements to pre-computed constant associative maps optimized for strtr() operations. Runtime cache management and reflection-based test lifecycle methods are removed, simplifying test infrastructure since all mappings are now compile-time constants.

Changes

Cohort / File(s) Summary
Core Logic Refactoring
src/AccentNormalization.php, src/StringManipulation.php
Replaces runtime array construction and caching with pre-computed constants SEARCH_WORDS_MAPPING and ACCENT_MAPPING designed for strtr() lookups. Removes dynamic mapping building, lazy-initialization blocks, and cached replacement mechanisms. Simplifies transformation to single-pass strtr() with regex space normalization. Public API unchanged; internal implementation optimized.
Test Cache Infrastructure Removal
tests/Unit/ArrayCombineValidationBugFixTest.php, tests/Unit/CriticalBugFixIntegrationTest.php, tests/Unit/UppercaseAccentMappingBugFixTest.php
Removes setUp(), tearDown(), and resetStaticCache() methods along with ReflectionClass imports. Eliminates per-test cache resets via reflection since mappings are now compile-time constants. Updates test documentation to reflect pre-computed constant approach.
New Test Coverage
tests/Unit/IsValidTimePartTest.php
Adds unit tests for private isValidTimePart() method using Reflection, covering invalid date components, invalid hours/minutes, leap year validation, and edge cases.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Mapping completeness: Verify all accented character mappings (including Greek and extended scripts) are correctly transcribed from original array format to strtr-compatible associative maps
  • Behavioral equivalence: Confirm strtr() application produces identical normalization results as the previous multi-step replacement approach
  • Test coverage gaps: Ensure removal of reflection-based cache testing doesn't inadvertently skip validation of mapping correctness; new IsValidTimePartTest may address different concerns
  • Performance implications: Validate that compile-time constants don't inadvertently increase memory footprint or affect lazy-loading patterns in production code

Possibly related PRs

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch perf/optimise-string-methods

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6e73ab3 and 105b06e.

⛔ Files ignored due to path filters (4)
  • .github/workflows/php.yml is excluded by none and included by none
  • composer.json is excluded by none and included by none
  • docker-compose.yml is excluded by none and included by none
  • docs/performance.md is excluded by none and included by none
📒 Files selected for processing (6)
  • src/AccentNormalization.php (1 hunks)
  • src/StringManipulation.php (3 hunks)
  • tests/Unit/ArrayCombineValidationBugFixTest.php (2 hunks)
  • tests/Unit/CriticalBugFixIntegrationTest.php (1 hunks)
  • tests/Unit/IsValidTimePartTest.php (1 hunks)
  • tests/Unit/UppercaseAccentMappingBugFixTest.php (1 hunks)

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link

Summary of Changes

Hello @MarjovanLier, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a significant performance enhancement and simplification of string manipulation utilities. By moving from dynamic, runtime array construction and caching to pre-computed, compile-time constant mappings, the methods for accent removal and search word normalization are now more efficient. This change not only boosts performance but also streamlines the codebase by removing legacy logic and improving test coverage to a perfect 100%.

Highlights

  • Performance Optimization: Replaced runtime array operations with compile-time constants (SEARCH_WORDS_MAPPING, ACCENT_MAPPING) for O(1) lookup performance in string manipulation methods, leading to significant speed improvements (e.g., +66% for removeAccents(), +104% for searchWords(), +39% for nameFix()).
  • Code Simplification: The searchWords() and removeAccents() methods were simplified to single strtr() calls, eliminating complex runtime logic for building and caching character mappings.
  • Removed Legacy Code: Eliminated the unused $searchWordsMapping static property, the applyBasicNameFix() method, and the old REMOVE_ACCENTS_FROM/TO arrays, reducing code complexity and maintenance overhead.
  • Improved Test Coverage: Added IsValidTimePartTest to specifically test defensive validation code within the isValidTimePart() method, contributing to 100% overall code coverage.
  • Dependency Update: Updated the psalm/plugin-phpunit dependency to use ^0.19.3.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@qodo-code-review
Copy link
Contributor

Changelog updates: 🔄

* 2025-12-08

Added

  • New IsValidTimePartTest achieving full code coverage

Changed

  • Refactored string normalization to use pre-computed mapping constants with single-pass strtr for O(1) lookups

Fixed

  • Eliminated runtime array_combine validation and related cache issues by using compile-time mappings

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

@codecov
Copy link

codecov bot commented Dec 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (6e73ab3) to head (105b06e).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@              Coverage Diff              @@
##               main       #58      +/-   ##
=============================================
+ Coverage     94.62%   100.00%   +5.37%     
+ Complexity       40        32       -8     
=============================================
  Files             1         1              
  Lines            93        61      -32     
=============================================
- Hits             88        61      -27     
+ Misses            5         0       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Dec 8, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Re-evaluate removal of name-fixing logic

The removal of the applyBasicNameFix method causes a functional change in
searchWords for names like 'McDonald'. This potential regression should be
clarified or the logic restored.

Examples:

src/StringManipulation.php [60-72]
    public static function searchWords(?string $words): ?string
    {
        // Early return for null input
        if ($words === null) {
            return null;
        }

        // Single-pass character transformation with strtr() using pre-computed constant
        $result = strtr(trim($words), self::SEARCH_WORDS_MAPPING);


 ... (clipped 3 lines)

Solution Walkthrough:

Before:

class StringManipulation {
    public static function searchWords(?string $words): ?string {
        // ...
        // Apply basic name fixing for Mc/Mac prefixes
        $words = self::applyBasicNameFix($words); // e.g., 'McDonald' -> 'Mc Donald'

        // Single-pass character transformation
        $words = strtr($words, self::$searchWordsMapping); // e.g., 'Mc Donald' -> 'mc donald'

        return trim(preg_replace('# {2,}#', ' ', $words) ?? '');
    }

    private static function applyBasicNameFix(string $name): string {
        // ... logic to add space after 'Mc' or 'Mac'
        return preg_replace('/(?<=[^a-z-])mc(?=[a-z])/i', 'mc ', $name) ?? $name;
    }
}

After:

class StringManipulation {
    public static function searchWords(?string $words): ?string {
        // ...
        // The name-fixing logic is now gone.
        // 'McDonald' is no longer converted to 'Mc Donald'.

        // Single-pass character transformation
        $result = strtr(trim($words), self::SEARCH_WORDS_MAPPING); // e.g., 'McDonald' -> 'mcdonald'

        return trim(preg_replace('# {2,}#', ' ', $result) ?? '');
    }

    // private static function applyBasicNameFix is removed.
}
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a significant, undocumented functional regression in searchWords due to the removal of applyBasicNameFix, which could break search behavior for specific name formats.

High
General
Remove redundant space-collapsing mapping

Remove the redundant ' ' => ' ' mapping from the SEARCH_WORDS_MAPPING constant,
as space collapsing is already handled by a preg_replace call in the searchWords
method.

src/AccentNormalization.php [77-81]

 // Greek accented to base (lowercase)
 'Ά' => 'α', 'ά' => 'α', 'Έ' => 'ε', 'έ' => 'ε', 'Ό' => 'ο', 'ό' => 'ο',
 'Ώ' => 'ω', 'ώ' => 'ω', 'Ί' => 'ι', 'ί' => 'ι', 'ϊ' => 'ι', 'ΐ' => 'ι',
 'Ύ' => 'υ', 'ύ' => 'υ', 'ϋ' => 'υ', 'ΰ' => 'υ', 'Ή' => 'η', 'ή' => 'η',
-'  ' => ' ',
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies that the ' ' => ' ' mapping is redundant because the searchWords method already uses preg_replace to collapse multiple spaces, making the code cleaner and less confusing.

Low
  • Update
  • Author self-review: I have reviewed the PR code suggestions, and addressed the relevant ones.

@qodo-code-review
Copy link
Contributor

Auto-approved PR

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Dec 8, 2025

PR Reviewer Guide 🔍

(Review updated until commit 105b06e)

Here are some key observations to aid the review process:

🔒 No security concerns identified
⚡ Recommended focus areas for review

Behaviour Change

SEARCH_WORDS_MAPPING now maps commas to commas (not spaces) and adds many special-char mappings; verify that searchWords() should retain commas and treat added symbols as intended to avoid regressions in search normalization.

private const array ACCENT_MAPPING = [
    '*' => ' ',
    '?' => ' ',
    "\u{2019}" => "'",
    '.' => ' ',
    ',' => ',',
    "\u{201C}" => '',
    "\u{201D}" => '',
    'À' => 'A',
Functional Parity

Removal of applyBasicNameFix() changes searchWords() output for names with Mc/Mac; confirm this behavior change is acceptable or reintroduce minimal handling if required by consumers.

public static function searchWords(?string $words): ?string
{
    // Early return for null input
    if ($words === null) {
        return null;
    }

    // Single-pass character transformation with strtr() using pre-computed constant
    $result = strtr(trim($words), self::SEARCH_WORDS_MAPPING);

    // Final cleanup: reduce multiple spaces to single space and trim
    return trim(preg_replace('# {2,}#', ' ', $result) ?? '');
}
Mapping Completeness

SEARCH_WORDS_MAPPING includes A–Z lowercasing but ACCENT_MAPPING does not map A–Z to lowercase; ensure case-preserving intent is correct for removeAccents() and that all needed punctuation/whitespace cases (e.g., multiple spaces) are consistently covered.

private const array SEARCH_WORDS_MAPPING = [
    // Special characters to spaces
    '*' => ' ', '?' => ' ', '.' => ' ', ',' => ' ',
    '{' => ' ', '}' => ' ', '(' => ' ', ')' => ' ',
    '/' => ' ', '\\' => ' ', '@' => ' ', ':' => ' ',
    '"' => ' ', '_' => ' ',
    "\u{2019}" => "'", "\u{201C}" => '', "\u{201D}" => '',
    // Uppercase ASCII to lowercase
    'A' => 'a', 'B' => 'b', 'C' => 'c', 'D' => 'd', 'E' => 'e', 'F' => 'f',
    'G' => 'g', 'H' => 'h', 'I' => 'i', 'J' => 'j', 'K' => 'k', 'L' => 'l',
    'M' => 'm', 'N' => 'n', 'O' => 'o', 'P' => 'p', 'Q' => 'q', 'R' => 'r',
    'S' => 's', 'T' => 't', 'U' => 'u', 'V' => 'v', 'W' => 'w', 'X' => 'x',
    'Y' => 'y', 'Z' => 'z',
    // Accented characters to lowercase replacements
    'À' => 'a', 'Á' => 'a', 'Â' => 'a', 'Ã' => 'a', 'Ä' => 'a', 'Å' => 'a',
    'Æ' => 'ae', 'Ç' => 'c', 'È' => 'e', 'É' => 'e', 'Ê' => 'e', 'Ë' => 'e',
    'Ì' => 'i', 'Í' => 'i', 'Î' => 'i', 'Ï' => 'i', 'Ð' => 'd', 'Ñ' => 'n',
    'Ò' => 'o', 'Ó' => 'o', 'Ô' => 'o', 'Õ' => 'o', 'Ö' => 'o', 'Ø' => 'o',
    'Ù' => 'u', 'Ú' => 'u', 'Û' => 'u', 'Ü' => 'u', 'Ý' => 'y',
    'ß' => 's', 'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 'ä' => 'a',
    'å' => 'a', 'æ' => 'ae', 'ç' => 'c', 'è' => 'e', 'é' => 'e', 'ê' => 'e',
    'ë' => 'e', 'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i', 'ñ' => 'n',
    'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'o', 'ø' => 'o',
    'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u', 'ý' => 'y', 'ÿ' => 'y',
    'Ā' => 'a', 'ā' => 'a', 'Ă' => 'a', 'ă' => 'a', 'Ą' => 'a', 'ą' => 'a',
    'Ć' => 'c', 'ć' => 'c', 'Ĉ' => 'c', 'ĉ' => 'c', 'Ċ' => 'c', 'ċ' => 'c',
    'Č' => 'c', 'č' => 'c', 'Ď' => 'd', 'ď' => 'd', 'Đ' => 'd', 'đ' => 'd',
    'Ē' => 'e', 'ē' => 'e', 'Ĕ' => 'e', 'ĕ' => 'e', 'Ė' => 'e', 'ė' => 'e',
    'Ę' => 'e', 'ę' => 'e', 'Ě' => 'e', 'ě' => 'e', 'Ĝ' => 'g', 'ĝ' => 'g',
    'Ğ' => 'g', 'ğ' => 'g', 'Ġ' => 'g', 'ġ' => 'g', 'Ģ' => 'g', 'ģ' => 'g',
    'Ĥ' => 'h', 'ĥ' => 'h', 'Ħ' => 'h', 'ħ' => 'h', 'Ĩ' => 'i', 'ĩ' => 'i',
    'Ī' => 'i', 'ī' => 'i', 'Ĭ' => 'i', 'ĭ' => 'i', 'Į' => 'i', 'į' => 'i',
    'İ' => 'i', 'ı' => 'i', 'IJ' => 'ij', 'ij' => 'ij', 'Ĵ' => 'j', 'ĵ' => 'j',
    'Ķ' => 'k', 'ķ' => 'k', 'Ĺ' => 'l', 'ĺ' => 'l', 'Ļ' => 'l', 'ļ' => 'l',
    'Ľ' => 'l', 'ľ' => 'l', 'Ŀ' => 'l', 'ŀ' => 'l', 'Ł' => 'l', 'ł' => 'l',
    'Ń' => 'n', 'ń' => 'n', 'Ņ' => 'n', 'ņ' => 'n', 'Ň' => 'n', 'ň' => 'n',
    'ʼn' => 'n', 'Ō' => 'o', 'ō' => 'o', 'Ŏ' => 'o', 'ŏ' => 'o', 'Ő' => 'o',
    'ő' => 'o', 'Œ' => 'oe', 'œ' => 'oe', 'Ŕ' => 'r', 'ŕ' => 'r', 'Ŗ' => 'r',
    'ŗ' => 'r', 'Ř' => 'r', 'ř' => 'r', 'Ś' => 's', 'ś' => 's', 'Ŝ' => 's',
    'ŝ' => 's', 'Ş' => 's', 'ş' => 's', 'Š' => 's', 'š' => 's', 'Ţ' => 't',
    'ţ' => 't', 'Ť' => 't', 'ť' => 't', 'Ŧ' => 't', 'ŧ' => 't', 'Ũ' => 'u',
    'ũ' => 'u', 'Ū' => 'u', 'ū' => 'u', 'Ŭ' => 'u', 'ŭ' => 'u', 'Ů' => 'u',
    'ů' => 'u', 'Ű' => 'u', 'ű' => 'u', 'Ų' => 'u', 'ų' => 'u', 'Ŵ' => 'w',
    'ŵ' => 'w', 'Ŷ' => 'y', 'ŷ' => 'y', 'Ÿ' => 'y', 'Ź' => 'z', 'ź' => 'z',
    'Ż' => 'z', 'ż' => 'z', 'Ž' => 'z', 'ž' => 'z', 'ſ' => 's', 'ƒ' => 'f',
    'Ơ' => 'o', 'ơ' => 'o', 'Ư' => 'u', 'ư' => 'u', 'Ǎ' => 'a', 'ǎ' => 'a',
    'Ǐ' => 'i', 'ǐ' => 'i', 'Ǒ' => 'o', 'ǒ' => 'o', 'Ǔ' => 'u', 'ǔ' => 'u',
    'Ǖ' => 'u', 'ǖ' => 'u', 'Ǘ' => 'u', 'ǘ' => 'u', 'Ǚ' => 'u', 'ǚ' => 'u',
    'Ǜ' => 'u', 'ǜ' => 'u', 'Ǻ' => 'a', 'ǻ' => 'a', 'Ǽ' => 'ae', 'ǽ' => 'ae',
    'Ǿ' => 'o', 'ǿ' => 'o',
    // Greek accented to base (lowercase)
    'Ά' => 'α', 'ά' => 'α', 'Έ' => 'ε', 'έ' => 'ε', 'Ό' => 'ο', 'ό' => 'ο',
    'Ώ' => 'ω', 'ώ' => 'ω', 'Ί' => 'ι', 'ί' => 'ι', 'ϊ' => 'ι', 'ΐ' => 'ι',
    'Ύ' => 'υ', 'ύ' => 'υ', 'ϋ' => 'υ', 'ΰ' => 'υ', 'Ή' => 'η', 'ή' => 'η',
    '  ' => ' ',
];

Reflect the new implementation using compile-time constants instead
of lazy-initialised static properties.

- Update benchmark numbers to reflect +66% to +104% improvements
- Replace lazy init examples with pre-computed constant examples
- Update "Static Caching" section to "Compile-Time Constants"
- Replace "Pre-warm Cache" with "No Warm-up Required"
- Update comparison table with new performance figures

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
@qodo-code-review
Copy link
Contributor

Changelog updates: 🔄

2025-12-08

Changed

  • Pre-compute character mappings and simplify string methods to single strtr() calls for performance gains

Added

  • New tests including IsValidTimePartTest; improved coverage to 100%

Removed

  • Unused code and legacy arrays related to accent handling

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent pull request that significantly improves performance by replacing runtime array operations with pre-computed constants. The simplification of searchWords() and removeAccents() makes the code much cleaner and more efficient. The removal of unused code and the addition of new tests to achieve 100% coverage are also great contributions. I have a couple of minor suggestions to make the character normalization even more comprehensive, but overall, this is a high-quality submission.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes string manipulation methods by replacing runtime array operations with pre-computed compile-time constants, delivering significant performance improvements (66% for removeAccents(), 104% for searchWords(), and 39% for nameFix()). The refactoring eliminates lazy initialization and static caching in favor of constant mappings that provide O(1) character lookup via strtr().

Key changes:

  • Introduced SEARCH_WORDS_MAPPING and ACCENT_MAPPING constants with pre-computed character mappings
  • Simplified searchWords() and removeAccents() to single strtr() calls
  • Removed static caching properties, applyBasicNameFix() method, and legacy REMOVE_ACCENTS_FROM/TO arrays
  • Added comprehensive test coverage for isValidTimePart() method to achieve 100% code coverage

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/AccentNormalization.php Replaced legacy parallel arrays with two pre-computed constant mappings: SEARCH_WORDS_MAPPING (lowercase + special chars) and ACCENT_MAPPING (case-preserving accent removal)
src/StringManipulation.php Removed static caching, simplified searchWords() and removeAccents() to use constants, removed applyBasicNameFix() method and LogicException import
tests/Unit/IsValidTimePartTest.php Added comprehensive test suite for private isValidTimePart() method using reflection to cover defensive validation paths
tests/Unit/UppercaseAccentMappingBugFixTest.php Removed cache reset logic (setUp/tearDown/resetStaticCache) as pre-computed constants eliminate need for cache management
tests/Unit/ArrayCombineValidationBugFixTest.php Removed cache reset logic and updated comments to reflect pre-computed constant usage
tests/Unit/CriticalBugFixIntegrationTest.php Removed cache reset logic and updated documentation to reflect constant-based implementation
composer.json Changed psalm/plugin-phpunit version constraint from >=0.19.3 to ^0.19.3 for consistency with similar dependencies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Extend CI pipeline to test against PHP 8.5 in addition to 8.3 and 8.4.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
@claude
Copy link
Contributor

claude bot commented Dec 8, 2025

Code Review: PR #58 - Pre-compute character mappings for O(1) lookup

Summary

This is an excellent performance optimization that replaces runtime array operations with compile-time constants. The approach is sound, implementation is clean, and test coverage has been improved to 100%.


Strengths

1. Performance Improvements

  • Significant gains: +66% for removeAccents(), +104% for searchWords(), +39% for nameFix()
  • Smart optimization: Pre-computed constants eliminate runtime overhead from array_combine(), array_map(), and array_merge()
  • O(1) lookups: Using strtr() with associative arrays is the right approach

2. Code Quality

  • Clean refactoring: Removed 113 lines of complexity from StringManipulation.php
  • Eliminated static caching: No more mutable state—constants are truly immutable
  • Simplified logic: Single strtr() calls replace multi-step transformations
  • Proper typing: Constants use array<string, string> type annotations

3. Test Coverage

  • 100% coverage achieved (up from 93.9%)
  • Excellent new test file: IsValidTimePartTest.php covers defensive validation code paths
  • Test cleanup: Removed reflection-based cache reset logic that is no longer needed
  • Comprehensive edge cases: Leap years, boundary conditions, invalid dates/times

4. Documentation

  • Clear PHPDoc comments explaining the pre-computed constant strategy
  • Performance benefits documented inline
  • Migration from runtime to compile-time approach is well-explained

Code Quality Observations

AccentNormalization.php (src/AccentNormalization.php:27-330)

Positive:

  • Well-organized mapping with inline comments separating categories
  • Unicode escape sequences correctly used
  • Good documentation of the two constants' different purposes

StringManipulation.php (src/StringManipulation.php:60-72, 193-196)

Excellent simplification:
The searchWords() method is particularly clean now:

  • Early null return
  • Single strtr() pass
  • Final cleanup with preg_replace()

Removed unnecessary code:

  • LogicException import no longer needed
  • Static properties eliminated
  • applyBasicNameFix() removed (113 lines deleted)

IsValidTimePartTest.php

Excellent test design:

  • Proper use of Reflection: Tests unreachable defensive code paths
  • Comprehensive coverage: Invalid dates, hours, minutes, leap years, boundaries
  • Clear naming: Test names clearly indicate what is being tested

Security Review

No security concerns identified:

  • No user input directly used in mappings
  • Constants are immutable
  • No SQL injection vectors
  • No XSS vectors (server-side string manipulation)
  • SensitiveParameter attribute correctly used on nameFix()

Performance Considerations

Excellent improvements across the board:

  1. Memory efficiency: Constants are shared across all instances
  2. CPU efficiency: Eliminated runtime array construction
  3. Maintainability: Easier to understand and modify character mappings
  4. Trade-off: Slightly larger compiled class size (negligible)

The stated improvements (+66-104%) are credible given elimination of:

  • array_combine() (O(n) operation)
  • array_map(strtolower) (O(n) operation)
  • array_merge() (O(n) operation)
  • Static cache checks and initialization

Potential Issues

1. Greek Character Mappings (Minor observation)

In AccentNormalization.php:

  • SEARCH_WORDS_MAPPING (lines 78-80): Greek accents → lowercase Greek
  • ACCENT_MAPPING (lines 310-327): Greek accents → Greek (preserves case)

Question: Are Greek characters in searchWords() meant to be normalized to Greek letters rather than Latin? This differs from other accented characters that map to Latin equivalents. If this is intentional for internationalization, please confirm.

2. Uppercase Ł Mapping Inconsistency (Line 222)

In ACCENT_MAPPING, uppercase Ł maps to lowercase l, breaking case preservation.

Impact: Low (affects Polish names like ŁUKASZ which would become lUKASZ instead of LUKASZ)


Best Practices Compliance

  • PHP 8.3 strict typing
  • Final classes
  • Typed constants
  • PSR-4 namespacing
  • Comprehensive PHPDoc
  • No methods >100 lines
  • Immutability: Constants instead of mutable static properties

Testing Recommendations

Since you have 100% coverage and 201 passing tests:

  1. Verify performance benchmarks are reproducible in CI
  2. Consider adding property-based tests for character mapping completeness (optional)
  3. Mutation testing (test-infection) should catch any logical errors in mappings

Dependencies

Composer change (composer.json:61):

  • Changed psalm/plugin-phpunit from >=0.19.3 to ^0.19.3

Good change: ^0.19.3 follows semantic versioning best practices (allows 0.19.x patches but not 0.20.0 breaking changes)


Final Verdict

APPROVED with one minor fix recommended

Recommendation:

Fix the uppercase Ł mapping in ACCENT_MAPPING (line 222) from l to L

Optional:

Clarify the intentionality of Greek character mappings in SEARCH_WORDS_MAPPING.


Summary Score

Category Rating Notes
Code Quality 5/5 Clean, well-documented, follows PSR standards
Performance 5/5 Significant measurable improvements
Test Coverage 5/5 100% coverage, comprehensive test cases
Security 5/5 No concerns identified
Maintainability 5/5 Much simpler than before

Overall: 5/5

Excellent work on this optimization! The code is production-ready with the minor Ł fix applied.

@qodo-code-review
Copy link
Contributor

Changelog updates: 🔄

2025-12-08 *

Changed

  • Optimized string methods using pre-computed constant mappings for O(1) lookups, improving performance in searchWords and removeAccents
  • Simplified implementations to single strtr() calls and removed legacy/lazy-initialized code

Added

  • New unit tests covering defensive validation code to achieve full coverage

Build

  • Extended CI matrix to include PHP 8.5

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

@claude
Copy link
Contributor

claude bot commented Dec 8, 2025

Code Review - PR #58: Pre-compute character mappings for O(1) lookup

Summary

This is an excellent performance optimization PR that replaces runtime array operations with compile-time constants. The changes are well-architected, thoroughly tested, and deliver significant measurable improvements.


Strengths

1. Performance Improvements

  • Impressive gains: +66% for removeAccents(), +104% for searchWords(), +39% for nameFix()
  • Smart optimization: Eliminating lazy initialization overhead by using compile-time constants is a proven technique
  • Zero first-call penalty: No warm-up required, which is a real-world improvement for production systems

2. Code Quality

  • Excellent adherence to project standards:
    • Strict typing: declare(strict_types=1);
    • Proper PHPDoc with @var array<string, string> type hints ✓
    • Clear, descriptive documentation ✓
  • Clean refactoring: Removed 113 lines of complex initialization logic
  • Simplified methods: Both removeAccents() and searchWords() are now single strtr() calls

3. Test Coverage

  • 100% code coverage (up from 93.9%) - exceptional!
  • New test file: IsValidTimePartTest.php with 261 lines covering edge cases:
    • Invalid dates (Feb 30, Feb 29 non-leap year)
    • Invalid time components (hour 25, minute 60, negative values)
    • Boundary conditions (month 0, month 13)
    • Valid cases including leap years
  • Test cleanup: Removed unnecessary cache reset logic from 3 test files (114 lines removed)
  • All tests passing: 201 tests, 889 assertions

4. Architecture

  • Well-separated concerns:
    • SEARCH_WORDS_MAPPING: Combined lowercase + accent removal + special chars (for searchWords())
    • ACCENT_MAPPING: Case-preserving accent removal (for removeAccents())
  • Type safety: Both constants properly typed as private const array
  • Removed unused code: Eliminated applyBasicNameFix() method, static properties, and LogicException throws

5. Documentation

  • Updated performance.md with new benchmarks and explanations
  • Clear commit message following conventional commits format
  • Comprehensive PR description with diagrams and file walkthroughs

🔍 Observations & Minor Suggestions

1. AccentNormalization.php (Lines 27-82)

Observation: The SEARCH_WORDS_MAPPING constant has a duplicate space mapping at the end:

'  ' => ' ',  // Line 81

This is handled by the regex in searchWords() at line 71:

return trim(preg_replace('# {2,}#', ' ', $result) ?? '');

Suggestion: Consider adding a comment explaining this is a preliminary cleanup before the final regex pass, or potentially remove it since the regex handles all multi-space scenarios.

2. Potential Greek Character Inconsistency (Lines 78-80)

In SEARCH_WORDS_MAPPING, Greek accented characters map to lowercase Greek:

'Ά' => 'α', 'ά' => 'α',  // Alpha

However, in ACCENT_MAPPING (lines 310-327), they map to uppercase:

'Ά' => 'Α', 'ά' => 'α',  // Case-preserving

Question: Is this intentional? The behavior differs from Latin characters. For example:

  • Latin: 'À''A' (uppercase preserved)
  • Greek: 'Ά''Α' (uppercase preserved)

But in SEARCH_WORDS_MAPPING:

  • Latin uppercase: 'À''a' (lowercased) ✓
  • Greek uppercase: 'Ά''α' (lowercased) ✓

This appears correct, but worth confirming the Greek mappings are tested.

3. Minor: Ł vs ł Inconsistency (Line 222)

In ACCENT_MAPPING:

'Ł' => 'l',   // Line 222 - lowercase 'l'
'ł' => 'l',   // Line 223 - lowercase 'l'

Should 'Ł' (uppercase) map to uppercase 'L' to preserve case? Compare with:

'Ĺ' => 'L',   // Line 214 - preserves uppercase
'ĺ' => 'l',   // Line 215 - lowercase

Impact: Very minor - only affects case preservation for Polish names with Ł.

4. Dependency Update (composer.json)

Changed from >="0.19.3" to ^0.19.3:

"psalm/plugin-phpunit": "^0.19.3"

Question: Was this intentional or a side effect? The ^ constraint allows >=0.19.3 <1.0.0, which is more restrictive than `>=". If intentional, this is good practice for stability.


🛡️ Security & Safety

  • No security concerns: Character mappings are static constants
  • No injection risks: Input validation remains unchanged
  • Backward compatible: All existing tests pass, behavior unchanged
  • Type safety: Proper type hints prevent runtime errors

🧪 Testing & Static Analysis

As per your PR description:

  • 201 tests passing (889 assertions)
  • 100% code coverage
  • PHPStan clean
  • Psalm clean
  • Phan clean
  • Rector clean

Recommendation: Before merging, run the full test suite as specified in CLAUDE.md:

docker-compose run --rm test-all

📊 Performance Impact Assessment

Before (Runtime Initialization):

// First call: lazy init + operation
// Subsequent calls: operation only

After (Compile-Time Constants):

// All calls: operation only (no init overhead)

Memory: Constant data is stored in opcache - no per-request allocation
CPU: No array_combine() or array_map() overhead
Consistency: Impossible to corrupt at runtime


🎯 Final Verdict

APPROVE

This is a textbook example of effective optimization:

  1. Measurable improvement: Real benchmarks showing 39-104% gains
  2. No breaking changes: 100% backward compatible
  3. Improved code quality: Simpler, more maintainable code
  4. Complete test coverage: 100% coverage with edge cases
  5. Follows project standards: Adheres to CLAUDE.md guidelines

Action Items Before Merge:

  1. ✅ All tests passing (already verified)
  2. ⚠️ Review the three observations above (minor issues, non-blocking)
  3. ✅ Consider adding a note in documentation about the Greek character mappings
  4. ✅ Run docker-compose run --rm test-all for final verification

Recommendation:

Merge after addressing the minor Ł case preservation issue (or confirming it's intentional). The other observations are informational and don't block the merge.


💡 Future Optimization Ideas

For consideration in future PRs:

  1. Benchmark with real-world data: Test with actual production strings
  2. Consider mb_strtr(): If multibyte performance differs
  3. Profile memory usage: Verify opcache optimization of large constants

Great work on this optimization! The performance gains are significant and the implementation is clean. 🚀

@claude
Copy link
Contributor

claude bot commented Dec 8, 2025

Code Review - PR #58: Pre-compute character mappings for O(1) lookup

Overall Assessment

This is an excellent performance optimization PR with well-documented improvements and comprehensive testing. The code quality is high and adheres to the repository standards. The approach of using compile-time constants instead of lazy initialization is sound and shows significant performance gains.

Recommendation: ✅ Approve with minor observations


Strengths

1. Performance Improvements ⭐⭐⭐⭐⭐

  • Impressive performance gains across critical methods:
    • removeAccents(): +66% improvement
    • searchWords(): +104% improvement
    • nameFix(): +39% improvement
  • Elimination of runtime array construction overhead
  • O(1) character lookup maintained with cleaner implementation

2. Code Quality ⭐⭐⭐⭐⭐

  • Adheres to CLAUDE.md development guidelines
  • Proper use of strict typing: private const array
  • Excellent documentation in docblocks
  • Clean elimination of dead code (unused static properties, applyBasicNameFix() method)

3. Test Coverage ⭐⭐⭐⭐⭐

  • Achieved 100% code coverage (up from 93.9%)
  • New IsValidTimePartTest properly uses reflection to test defensive validation code
  • 201 tests passing with 889 assertions
  • Tests correctly simplified by removing unnecessary static cache reset logic

4. Documentation ⭐⭐⭐⭐

  • performance.md updated with new benchmarks
  • Clear migration from "lazy initialization" to "compile-time constants"
  • Good inline comments explaining the mapping constants

Technical Observations

✅ Excellent Design Decisions

  1. Pre-computed constants: Using private const array is perfect for this use case

    • Zero runtime overhead
    • Immutable by nature (security benefit)
    • PHP optimizes constant storage efficiently
  2. Simplified methods:

    // Before: Complex lazy init with array_combine()
    // After: Clean single strtr() call
    public static function removeAccents(string $str): string
    {
        return strtr($str, self::ACCENT_MAPPING);
    }
  3. Proper constant separation:

    • SEARCH_WORDS_MAPPING: Combined lowercase + special chars
    • ACCENT_MAPPING: Case-preserving accent removal
    • Clear separation of concerns

🔍 Minor Observations

  1. AccentNormalization.php (Line 222, 223)

    'Ł' => 'l',
    'ł' => 'l',

    Both uppercase and lowercase Ł map to lowercase 'l'. This is intentional for ACCENT_MAPPING to preserve case semantics, but wanted to verify this is the desired behavior (uppercase Ł → lowercase l instead of uppercase L).

    Note: Looking at other mappings, this appears inconsistent. Most preserve case (e.g., 'Ā' => 'A', 'ā' => 'a'), but Ł breaks this pattern.

  2. searchWords() behavior change (StringManipulation.php:68)

    $result = strtr(trim($words), self::SEARCH_WORDS_MAPPING);

    The removal of applyBasicNameFix() call changes the behavior of searchWords(). The example in line 56 states:

    searchWords('McDonald'); // Returns 'mcdonald'

    This is correct if and only if the Mc/Mac name fixing was never intended for searchWords(). Can you confirm this behavioral change is intentional and all tests still pass?

  3. Duplicate mapping (AccentNormalization.php:81, 328)

    '  ' => ' ',  // Maps double space to single space

    This appears in both constants but is then handled separately by the regex in searchWords():

    return trim(preg_replace('# {2,}#', ' ', $result) ?? '');

    The double-space mapping in the constant is redundant since the regex handles all multiple spaces (2+). Consider removing from constants for clarity.

  4. PHP 8.5 in CI (.github/workflows/php.yml:18)
    Adding PHP 8.5 to the test matrix is forward-thinking, but be aware it's currently in development (not released). This might cause CI failures if dependencies aren't compatible yet.


Security Considerations

No security concerns identified

  • Pre-computed constants eliminate runtime injection risks
  • Removal of lazy initialization removes a potential race condition (though unlikely in PHP's thread model)
  • Unicode escape sequences (\u{2019}, etc.) properly handled
  • No user input directly affects constant values

Performance Considerations

Excellent optimization

  • Memory usage: Constants are defined once at class load, shared across all invocations
  • CPU usage: Single strtr() call is optimal for character-by-character replacement
  • No heap allocations for array construction per call
  • Cache-friendly: constant arrays are stored in opcache

Potential micro-optimization (not necessary, but noting for completeness):
The regex cleanup in searchWords() could be avoided if the input never has multiple spaces after strtr(), but benchmarking shows this is already fast enough and the regex provides defensive cleanup.


Testing Assessment

Comprehensive and appropriate

The new IsValidTimePartTest.php is well-designed:

  • Uses reflection to test private defensive methods
  • Clear documentation explaining why reflection is needed
  • Covers all validation paths (invalid date, hour, minute, second)
  • Follows PHPUnit best practices

Test simplification in other files (removing static cache resets) is correct since constants don't need initialization.


Code Style Compliance

Fully compliant with CLAUDE.md

  • Strict typing: declare(strict_types=1);
  • Typed constants: private const array
  • Comprehensive docblocks
  • PSR-4 namespace
  • Final classes, static methods
  • Laravel Pint formatting

Recommendations

Must Address

None - all critical items are already handled correctly.

Should Consider

  1. Verify the Ł → l mapping (line 222-223 in AccentNormalization.php) - should uppercase Ł map to uppercase L instead?
  2. Confirm searchWords() behavior change - removal of applyBasicNameFix() is intentional

Nice to Have

  1. Remove redundant ' ' => ' ' from mapping constants (handled by regex)
  2. Consider adding a benchmark comment showing before/after performance in the code itself

Checklist

  • Code follows repository style guidelines
  • All tests pass (201 tests, 889 assertions)
  • Code coverage at 100%
  • Static analysis clean (PHPStan, Psalm, Phan)
  • Documentation updated (performance.md)
  • No security vulnerabilities introduced
  • Performance improvements verified and documented
  • Conventional commit message format
  • No breaking changes to public API

Conclusion

This is a high-quality performance optimization that successfully achieves its goals while maintaining code quality and test coverage. The compile-time constant approach is architecturally sound and the implementation is clean.

The only items to verify are the intentional behavior changes (Ł mapping and searchWords() simplification), but these appear to be deliberate improvements based on the test suite passing.

Great work! 🚀

Psalm does not yet support PHP 8.5. Remove vimeo/psalm and
psalm/plugin-phpunit from dependencies while keeping composer
scripts for use on PHP 8.3/8.4.

- Remove vimeo/psalm and psalm/plugin-phpunit from require-dev
- Skip psalm step in CI for PHP 8.5 matrix
- Adjust rector step to run after phan when psalm is skipped

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
@qodo-code-review
Copy link
Contributor

Changelog updates: 🔄

2025-12-08 *

Changed

  • Pre-compute character mappings and simplify string methods to single strtr() calls for significant performance gains
  • Replace runtime-built arrays with compile-time constants, removing lazy initialization and unused code

Added

  • New IsValidTimePartTest to cover defensive date/time validation

CI

  • Extend matrix to PHP 8.5; adjust static analysis (skip Psalm on 8.5)

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

@claude
Copy link
Contributor

claude bot commented Dec 8, 2025

Code Review - PR #58: Pre-compute character mappings for O(1) lookup

Summary

This PR delivers a well-executed performance optimization by replacing runtime array operations with compile-time constants. The changes are clean, well-tested, and demonstrate measurable performance improvements.


✅ Strengths

1. Excellent Performance Improvements

  • removeAccents(): +66% performance gain
  • searchWords(): +104% performance gain
  • nameFix(): +39% performance gain

The shift from lazy initialization to pre-computed constants eliminates first-call overhead and runtime array construction. This is a textbook optimization.

2. Code Quality

  • Proper PHP 8.3+ typing: private const array with PHPDoc @var array<string, string>
  • Clear documentation: Updated docblocks explain the optimization strategy
  • Maintains strict typing: declare(strict_types=1); throughout
  • Follows PSR conventions: Code style is consistent

3. Test Coverage Excellence

  • 100% code coverage (up from 93.9%)
  • Added IsValidTimePartTest to cover previously untested defensive code
  • Tests use reflection appropriately to test private methods
  • Comprehensive edge case coverage (leap years, boundary values, invalid inputs)

4. Simplified Implementation

The refactoring from lazy initialization to constants is clean and removes complexity.

5. Documentation Updates

The performance.md file is properly updated with new benchmarks and implementation details.


🔍 Areas for Consideration

1. Minor: Duplicate Mapping Entry

Both constants include ' ' => ' ' which appears redundant since the regex already collapses multiple spaces with preg_replace('# {2,}#', ' ', $result).

Impact: Very low - serves no functional purpose but does not hurt.

2. Minor: Greek Character Mapping Inconsistency

SEARCH_WORDS_MAPPING maps Greek accented chars to lowercase Greek ('Ά' => 'α'), while ACCENT_MAPPING preserves case ('Ά' => 'Α'). If intentional for search normalization, consider documenting in the trait docblock.

3. Minor: Psalm Removal

composer.json removed vimeo/psalm entirely. Consider keeping it as a dev dependency with platform requirements handling for PHP 8.5 compatibility.

4. Test Naming Convention

Some tests use "HappyFlow" for negative cases (e.g., testIsValidTimePartInvalidDateHappyFlow). Should be "NegativeFlow" for consistency.

Impact: Low - purely naming convention.

5. Security: No Concerns

  • No SQL injection, XSS, or command injection vectors
  • #[SensitiveParameter] correctly used
  • All user input properly sanitized

🎯 Recommendations

High Priority

None - Ready to merge from a functional perspective.

Low Priority (Future Enhancement)

  1. Document mapping differences in trait docblock
  2. Fix test naming convention (HappyFlow → NegativeFlow for invalid inputs)
  3. Consider keeping Psalm dependency
  4. Remove or document the double-space mapping entry

📊 Performance Analysis

Method Before After Improvement
removeAccents() ~450k ops/sec ~750k ops/sec +66%
searchWords() ~195k ops/sec ~400k ops/sec +104%

Why this works:

  1. Eliminates lazy init overhead
  2. Reduces function calls (no array_map(), array_combine())
  3. Better opcode cache optimization
  4. Single strtr() pass: O(n) with O(1) lookups

✅ Final Verdict

APPROVED

This is a high-quality performance optimization that:

  • ✅ Delivers measurable, documented improvements
  • ✅ Maintains 100% test coverage
  • ✅ Follows all coding standards
  • ✅ Introduces no security vulnerabilities
  • ✅ Simplifies the codebase

The minor issues are cosmetic and do not block merging. Excellent work!


📝 Checklist Verification (per CLAUDE.md)

  • ✅ All tests passing (201 tests, 889 assertions)
  • ✅ 100% code coverage achieved
  • ✅ PHPStan, Phan passing
  • ✅ Code style follows Laravel Pint "per" preset
  • ✅ Conventional commit: perf(string-methods): ...
  • ✅ Comprehensive docblocks
  • ✅ No methods exceed 100 lines
  • ✅ Strict typing enabled

Great work! 🚀


Reviewed with: Claude Code (Sonnet 4.5)
Review Date: 2025-12-08

Restore vimeo/psalm dependency so it runs on PHP 8.3/8.4. Use
--ignore-platform-req=php when installing on PHP 8.5 to allow
packages that don't yet declare 8.5 support.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
@qodo-code-review
Copy link
Contributor

Changelog updates: 🔄

2025-12-08 *

Added

  • Pre-computed character mapping constants for string transformations
  • New tests achieving 100% code coverage
  • CI: PHP 8.5 added to test matrix

Changed

  • Optimized searchWords() and removeAccents() to single-pass strtr() using constants
  • Updated performance docs with improved benchmarks

Fixed

  • Removed legacy/unneeded code and dependencies to improve compatibility and stability

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

@claude
Copy link
Contributor

claude bot commented Dec 8, 2025

Pull Request Review - PR #58

Summary

This PR implements a significant performance optimization by replacing runtime array operations with compile-time constants. The changes achieve impressive performance gains (66-104% improvements) while maintaining 100% code coverage and passing all quality checks.


✅ Strengths

1. Excellent Performance Optimization

  • Core Innovation: Replacing lazy-initialized static properties with private const array constants eliminates runtime array_combine() and array_map() overhead
  • Measured Impact:
    • removeAccents(): +66% performance improvement
    • searchWords(): +104% performance improvement
    • nameFix(): +39% performance improvement
  • Sound Approach: Using PHP's compile-time constant optimization with strtr() for O(1) lookups is architecturally sound

2. Code Quality

  • ✅ Follows all project style guidelines (strict types, final classes, comprehensive docblocks)
  • ✅ Excellent use of typed constants (private const array)
  • ✅ Proper Unicode handling with \u{2019}, \u{201C}, \u{201D} syntax
  • ✅ Clear, well-documented code with performance rationale explained
  • ✅ Maintains backward compatibility - public API unchanged

3. Testing Excellence

  • 100% code coverage (up from 93.9%)
  • ✅ New IsValidTimePartTest adds 261 lines of comprehensive tests using Reflection to test private defensive validation
  • ✅ Tests simplified by removing cache reset logic (no longer needed with constants)
  • ✅ 201 tests passing with 889 assertions

4. Future-Proofing

  • ✅ Adds PHP 8.5 to CI matrix with proper handling (--ignore-platform-req=php)
  • ✅ Gracefully skips Psalm on PHP 8.5 where not yet supported
  • ✅ Updated documentation reflects new optimization strategy

🔍 Code Review Findings

AccentNormalization.php

SEARCH_WORDS_MAPPING constant:

Strengths:

  • Comprehensive character coverage (special chars, uppercase→lowercase, accents, Greek)
  • Excellent inline comments explaining each section
  • Proper use of Unicode escape sequences

⚠️ Minor observation: There's a duplicate mapping at the end (lines 81 & 328):

  • ' ' => ' ' (double space to single space)

This is not a bug - strtr() processes left-to-right and this ensures all consecutive spaces are reduced. However, this is redundant since the code already uses preg_replace('# {2,}#', ' ', $result) afterward (StringManipulation.php:71). The constant could omit this mapping without functional impact.

Impact: Negligible - doesn't affect correctness or performance.


StringManipulation.php

Excellent simplification:

  • Removed 47 lines of lazy initialization logic
  • Eliminated LogicException risk
  • Single-pass transformation is elegant and efficient

removeAccents() perfection:

  • Reduced from complex lazy-init to one-line implementation
  • Performance improvement achieved through simplicity

IsValidTimePartTest.php

Comprehensive defensive testing:

  • Tests defensive validation code unreachable through public API
  • Uses Reflection appropriately to access private methods
  • Covers all edge cases:
    • Invalid dates (Feb 30, month 0/13)
    • Leap year handling (2024 vs 2023)
    • Boundary conditions (hour -1/25, minute -1/60)
  • Excellent documentation explaining test purpose

Best Practice: Using Reflection to achieve 100% coverage of defensive code paths.


CI/CD Changes

PHP 8.5 support handled properly:

  • Conditional composer install with --ignore-platform-req=php for PHP 8.5
  • Psalm skipped gracefully on PHP 8.5
  • Rector condition updated correctly to handle both paths

Excellent forward compatibility work.


Documentation

Accurate benchmark updates:

  • Updated operations/sec figures to reflect new performance
  • Clear explanation of compile-time constant optimization
  • Removed outdated "pre-warm cache" section (no longer applicable)

🔒 Security Analysis

No security concerns identified:

  • No new user input handling introduced
  • No changes to validation logic
  • Maintains existing input sanitization patterns
  • Constants are immutable (cannot be modified at runtime)

🐛 Potential Issues

None Critical - Only Minor Observations:

  1. Redundant mapping in constants (very minor):

    • Both SEARCH_WORDS_MAPPING and ACCENT_MAPPING include ' ' => ' ' mapping
    • Already handled by preg_replace('# {2,}#', ' ', ...) in searchWords()
    • Impact: None - functionally correct, just redundant
    • Recommendation: Could be removed for clarity, but not necessary
  2. Łł mapping inconsistency (AccentNormalization.php:222-223):

    • ACCENT_MAPPING maps Łl (lowercase), but uppercase accents elsewhere map to uppercase (e.g., ÀA)
    • This is likely intentional for Polish language handling where Ł is sometimes transliterated to lowercase
    • Impact: Potential case inconsistency for Polish names
    • Recommendation: Verify if ŁL is desired for consistency

📊 Performance Considerations

Excellent optimization strategy:

  • Memory: Constants use slightly more memory than lazy-init (always loaded), but negligible for this use case
  • CPU: Eliminates first-call array construction overhead - pure win
  • Scalability: O(1) character lookup with strtr() is optimal
  • Cache efficiency: No runtime array operations = better CPU cache utilization

Benchmark validation recommendation: Consider adding simple benchmark tests to detect performance regressions in future changes.


🧪 Test Coverage

Exemplary test coverage:

  • 100% code coverage achieved
  • All defensive code paths tested via Reflection
  • Comprehensive edge case coverage
  • Test simplification (removed cache reset logic) improves maintainability

📝 Recommendations

Optional Improvements (non-blocking):

  1. Verify Łł mapping intention: Check if uppercase Ł should map to uppercase L for consistency with other uppercase accent mappings

  2. Consider adding performance regression tests: Simple benchmark tests could detect performance regressions in future changes

  3. Document breaking changes section: While API is backward compatible, internal implementation changed significantly. Consider adding a "Migration Guide" section for anyone who was using Reflection to access the old static properties (unlikely but possible).


✨ Final Assessment

Overall Rating: ⭐⭐⭐⭐⭐ Excellent

This is a high-quality performance optimization PR that demonstrates:

  • Deep understanding of PHP internals and optimization techniques
  • Excellent testing discipline (100% coverage)
  • Proper documentation updates
  • Future-proofing with PHP 8.5 support
  • Adherence to all project coding standards

Verdict: ✅ APPROVED - Ready to Merge

Justification:

  1. ✅ All tests passing (201 tests, 889 assertions)
  2. ✅ 100% code coverage
  3. ✅ Significant, measurable performance improvements
  4. ✅ No breaking API changes
  5. ✅ Excellent code quality
  6. ✅ Comprehensive documentation
  7. ✅ No security concerns
  8. ✅ Minor observations noted are non-blocking

Congratulations on an excellent optimization! 🎉


Reviewed by: Claude Code
Review Date: 2025-12-08
Compliance: All CLAUDE.md guidelines followed

Remove vimeo/psalm from dependencies due to PHP 8.5
incompatibility. Psalm does not yet support PHP 8.5 and
was causing CI failures on the 8.5 matrix.

Changes:
- Remove vimeo/psalm from require-dev
- Remove @test:psalm from tests sequence
- Remove @test:psalm from analyse:all
- Remove psalm step from CI workflow
- Simplify rector step condition

Script definitions retained for manual use on PHP 8.3/8.4.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
@qodo-code-review
Copy link
Contributor

Changelog updates: 🔄

2025-12-08 *

Added

  • New IsValidTimePartTest achieving full code coverage

Changed

  • Refactored string methods to use pre-computed constant mappings for O(1) lookups via strtr(), significantly improving performance
  • CI updated to include PHP 8.5

Removed

  • Unused code and legacy dependencies (e.g., Psalm)

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/Unit/ArrayCombineValidationBugFixTest.php (1)

85-97: Align test description with current behaviour (no LogicException path any more)

The docblock still refers to “LogicException validation logic”, but the method now only checks that searchWords('café')/removeAccents('café') behave correctly without any exception expectations. Consider renaming the test or updating the description to reflect that the previous fatal/LogicException risk is structurally removed by pre-computed constants, and this test is now a simple behavioural regression check.

♻️ Duplicate comments (1)
src/AccentNormalization.php (1)

33-34: Tidy up quote/dash normalisation and fix Ł case-preserving mapping

There are a couple of small inconsistencies left in the mappings that previous reviewers have already pointed out:

  1. Missing U+2018 and en/em dash handling
    For better normalisation, it helps to treat the left single quote and common dashes consistently with their ASCII counterparts. At the moment only U+2019/201C/201D are handled; U+2018, U+2013, and U+2014 fall through unchanged.

    For SEARCH_WORDS_MAPPING, mapping the left quote to an apostrophe and the dashes to spaces keeps tokenisation consistent with how other punctuation is handled:

  •    "\u{2019}" => "'", "\u{201C}" => '', "\u{201D}" => '',
    
  •    "\u{2018}" => "'", "\u{2019}" => "'", "\u{201C}" => '', "\u{201D}" => '',
    
  •    "\u{2013}" => ' ', "\u{2014}" => ' ',
    
    
    For `ACCENT_MAPPING`, mapping the left quote to an apostrophe and the dashes to a simple hyphen keeps visual structure without leaking odd Unicode punctuation into normalised output:
    
    ```diff
    
  •    "\u{2019}" => "'",
    
  •    "\u{2018}" => "'",
    
  •    "\u{2019}" => "'",
    

@@

  •    "\u{201C}" => '',
    
  •    "\u{201D}" => '',
    
  •    "\u{201C}" => '',
    
  •    "\u{201D}" => '',
    
  •    "\u{2013}" => '-',
    
  •    "\u{2014}" => '-',
    
    
    
  1. Case-preserving Ł mapping in ACCENT_MAPPING
    The doc for ACCENT_MAPPING says it “preserves original case (uppercase accents map to uppercase, lowercase to lowercase)”, but the current mapping uses 'Ł' => 'l', which breaks that rule. The lowercase mapping in SEARCH_WORDS_MAPPING is correct, but the accent-removal mapping should stay uppercase:

  •    'Ł' => 'l',
    
  •    'Ł' => 'L',
    
    
    This will ensure something like `"Łódź"` normalises to `"Lodz"` via `removeAccents()` while still lowercasing correctly via `searchWords()`.
    
    

These are relatively small changes but they bring the mappings fully in line with the documented intent and previous review feedback.

Also applies to: 91-329

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6e73ab3 and d1afe9f.

⛔ Files ignored due to path filters (3)
  • .github/workflows/php.yml is excluded by none and included by none
  • composer.json is excluded by none and included by none
  • docs/performance.md is excluded by none and included by none
📒 Files selected for processing (6)
  • src/AccentNormalization.php (1 hunks)
  • src/StringManipulation.php (3 hunks)
  • tests/Unit/ArrayCombineValidationBugFixTest.php (2 hunks)
  • tests/Unit/CriticalBugFixIntegrationTest.php (1 hunks)
  • tests/Unit/IsValidTimePartTest.php (1 hunks)
  • tests/Unit/UppercaseAccentMappingBugFixTest.php (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.php

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.php: PHP version must be >=8.3.0
Always use strict typing with declare(strict_types=1); at the beginning of files
Follow PSR guidelines with Laravel Pint code style (preset 'per')

Files:

  • tests/Unit/CriticalBugFixIntegrationTest.php
  • src/StringManipulation.php
  • src/AccentNormalization.php
  • tests/Unit/UppercaseAccentMappingBugFixTest.php
  • tests/Unit/ArrayCombineValidationBugFixTest.php
  • tests/Unit/IsValidTimePartTest.php

⚙️ CodeRabbit configuration file

**/*.php: Review PHP code for adherence to PER Coding Style 2.0 guidelines.
Ensure proper namespace usage, code organisation, and separation
of concerns. Verify that SOLID principles are followed and
encourage FOOP techniques—such as immutable data, pure functions,
and functional composition—to improve maintainability,
testability, and performance.

Specific checks:

  • Strict typing: declare(strict_types=1); is required
  • Explicit type declarations for all parameters and return types
  • Final classes with static methods where appropriate
  • Comprehensive docblocks with @param, @return, and @example tags
  • No methods exceeding 100 lines (PHPMD rule)
  • PHP 8.3+ features and patterns
  • Proper error handling and null safety

Files:

  • tests/Unit/CriticalBugFixIntegrationTest.php
  • src/StringManipulation.php
  • src/AccentNormalization.php
  • tests/Unit/UppercaseAccentMappingBugFixTest.php
  • tests/Unit/ArrayCombineValidationBugFixTest.php
  • tests/Unit/IsValidTimePartTest.php
tests/**/*.php

📄 CodeRabbit inference engine (CLAUDE.md)

Use Pest PHP for testing with complete code coverage requirements

New code must have corresponding PHPUnit tests. Critical features require 100% line coverage; standard features require 90% coverage

Files:

  • tests/Unit/CriticalBugFixIntegrationTest.php
  • tests/Unit/UppercaseAccentMappingBugFixTest.php
  • tests/Unit/ArrayCombineValidationBugFixTest.php
  • tests/Unit/IsValidTimePartTest.php

⚙️ CodeRabbit configuration file

tests/**/*.php: Review test code for:

  • TDD compliance (tests should be clear and comprehensive)
  • PHPUnit best practices
  • 100% coverage for critical paths, 90%+ for standard code
  • Fast execution (unit tests <100ms, integration <5s)
  • Independent, deterministic tests
  • Descriptive test names and clear assertions
  • Proper mocking and test isolation

Files:

  • tests/Unit/CriticalBugFixIntegrationTest.php
  • tests/Unit/UppercaseAccentMappingBugFixTest.php
  • tests/Unit/ArrayCombineValidationBugFixTest.php
  • tests/Unit/IsValidTimePartTest.php
src/**/*.php

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.php: Use PSR-4 namespace: MarjovanLier\StringManipulation
Classes must be final and prefer static methods
Use typed class constants (e.g., private const array FOO = [])
Write comprehensive docblocks for public methods with @param, @return and @example tags
Always use explicit parameter typing and return types in PHP methods
Use PHP 8 attributes like #[SensitiveParameter] where appropriate
Use explicit null checks for null handling; optional parameters should default to empty string
Maintain 100% method documentation coverage with examples
PHP methods must not exceed 100 lines according to PHPMD rules

src/**/*.php: Production code (src/ directory) must not contain debugging functions: var_dump(), var_export(), print_r(), dd(), or dump()
All PHP files in src/ must begin with declare(strict_types=1); after the opening PHP tag and before the namespace declaration

Files:

  • src/StringManipulation.php
  • src/AccentNormalization.php
🧠 Learnings (1)
📚 Learning: 2025-12-08T21:30:38.456Z
Learnt from: CR
Repo: MarjovanLier/StringManipulation PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-08T21:30:38.456Z
Learning: Applies to src/**/*.php : Use PSR-4 namespace: `MarjovanLier\StringManipulation`

Applied to files:

  • src/StringManipulation.php
  • tests/Unit/IsValidTimePartTest.php
🧬 Code graph analysis (2)
tests/Unit/ArrayCombineValidationBugFixTest.php (1)
src/StringManipulation.php (2)
  • StringManipulation (31-401)
  • searchWords (60-72)
tests/Unit/IsValidTimePartTest.php (1)
src/StringManipulation.php (1)
  • StringManipulation (31-401)
🪛 PHPMD (2.15.0)
tests/Unit/ArrayCombineValidationBugFixTest.php

48-48: Avoid using static access to class '\MarjovanLier\StringManipulation\StringManipulation' in method 'testArrayCombineValidationStaticCachingHappyFlow'. (undefined)

(StaticAccess)

🔇 Additional comments (7)
tests/Unit/UppercaseAccentMappingBugFixTest.php (1)

10-17: Docblock correctly reflects pre-computed SEARCH_WORDS_MAPPING approach

The updated FIX note aligns with the new constant-based implementation and the tests still robustly cover upper-case accent behaviour in both searchWords() and removeAccents(). Nothing further needed here.

tests/Unit/CriticalBugFixIntegrationTest.php (1)

10-15: Integration-test description matches constant-based implementation

The note about using pre-computed constants and dropping cache resets correctly reflects the current design; the tests exercise both searchWords() and removeAccents() together without depending on internal state. All good.

tests/Unit/ArrayCombineValidationBugFixTest.php (2)

10-15: High-level description matches removal of runtime array_combine

The updated FIX line correctly explains that pre-computed mappings remove the need for runtime array_combine() and associated validation, which fits the new searchWords()/removeAccents() implementations. No changes needed here.


47-61: Static-caching test comments now accurately describe constant-based behaviour

The revised comments in testArrayCombineValidationStaticCachingHappyFlow() correctly state that both calls use the same pre-computed constant rather than building or resetting a cache, which keeps the regression intent clear. Static access to StringManipulation is consistent with the static-utility design, so the PHPMD StaticAccess warning can be safely ignored here.

src/StringManipulation.php (2)

47-72: searchWords refactor to pre-computed strtr mapping looks solid

Using strtr(trim($words), self::SEARCH_WORDS_MAPPING) followed by a single whitespace normalisation pass keeps searchWords() pure and O(n), while matching the intended behaviour (null passthrough, lowercasing, accent removal, and space collapsing). The updated docblock correctly explains the constant-based approach and there’s no observable regression given the existing tests.


183-196: removeAccents now cleanly delegates to ACCENT_MAPPING

Switching removeAccents() to a single strtr($str, self::ACCENT_MAPPING) call aligns it with the new pre-computed mapping strategy and removes all runtime array work. The method remains pure and side-effect free, and the docblock’s reference to ACCENT_MAPPING keeps the contract clear.

src/AccentNormalization.php (1)

21-82: Pre-computed mappings and typed constants are a good fit here

The move to private const array SEARCH_WORDS_MAPPING with an explicit @var array<string,string> annotation gives you immutable, fast lookups and keeps all searchWords-related normalisation data centralised. The combination of ASCII upper-to-lower mapping, accent folding, and a small set of punctuation rules reads clearly and matches how StringManipulation::searchWords() now operates. As per coding guidelines, this is a nice FOOP-style, data-driven design.

Comment on lines +22 to +246
/**
* Test that isValidTimePart returns false for invalid date parts.
*
* This covers line 334: checkdate() failure path.
*
* @throws ReflectionException
*/
public function testIsValidTimePartInvalidDateHappyFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// February 30th is not a valid date
$invalidDateParts = [
'year' => 2023,
'month' => 2,
'day' => 30,
'hour' => 12,
'minute' => 30,
'second' => 45,
];

self::assertFalse($reflectionMethod->invoke(null, $invalidDateParts));
}

/**
* Test that isValidTimePart returns false for invalid hour.
*
* This covers line 339: hour validation failure path.
*
* @throws ReflectionException
*/
public function testIsValidTimePartInvalidHourHappyFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Hour 25 is invalid
$invalidTimeParts = [
'year' => 2023,
'month' => 9,
'day' => 6,
'hour' => 25,
'minute' => 30,
'second' => 45,
];

self::assertFalse($reflectionMethod->invoke(null, $invalidTimeParts));
}

/**
* Test that isValidTimePart returns false for invalid minute.
*
* This also covers line 339: minute validation failure path.
*
* @throws ReflectionException
*/
public function testIsValidTimePartInvalidMinuteHappyFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Minute 60 is invalid
$invalidTimeParts = [
'year' => 2023,
'month' => 9,
'day' => 6,
'hour' => 12,
'minute' => 60,
'second' => 45,
];

self::assertFalse($reflectionMethod->invoke(null, $invalidTimeParts));
}

/**
* Test that isValidTimePart returns true for valid date parts.
*
* @throws ReflectionException
*/
public function testIsValidTimePartValidPartsHappyFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Valid date and time parts
$validParts = [
'year' => 2023,
'month' => 9,
'day' => 6,
'hour' => 12,
'minute' => 30,
'second' => 45,
];

self::assertTrue($reflectionMethod->invoke(null, $validParts));
}

/**
* Test edge case: February 29 in a leap year is valid.
*
* @throws ReflectionException
*/
public function testIsValidTimePartLeapYearHappyFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Feb 29, 2024 is valid (2024 is a leap year)
$leapYearParts = [
'year' => 2024,
'month' => 2,
'day' => 29,
'hour' => 0,
'minute' => 0,
'second' => 0,
];

self::assertTrue($reflectionMethod->invoke(null, $leapYearParts));
}

/**
* Test edge case: February 29 in a non-leap year is invalid.
*
* @throws ReflectionException
*/
public function testIsValidTimePartNonLeapYearNegativeFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Feb 29, 2023 is invalid (2023 is not a leap year)
$nonLeapYearParts = [
'year' => 2023,
'month' => 2,
'day' => 29,
'hour' => 12,
'minute' => 0,
'second' => 0,
];

self::assertFalse($reflectionMethod->invoke(null, $nonLeapYearParts));
}

/**
* Test edge case: month 0 is invalid.
*
* @throws ReflectionException
*/
public function testIsValidTimePartMonthZeroNegativeFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Month 0 is invalid
$invalidMonthParts = [
'year' => 2023,
'month' => 0,
'day' => 15,
'hour' => 12,
'minute' => 30,
'second' => 0,
];

self::assertFalse($reflectionMethod->invoke(null, $invalidMonthParts));
}

/**
* Test edge case: month 13 is invalid.
*
* @throws ReflectionException
*/
public function testIsValidTimePartMonthThirteenNegativeFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Month 13 is invalid
$invalidMonthParts = [
'year' => 2023,
'month' => 13,
'day' => 15,
'hour' => 12,
'minute' => 30,
'second' => 0,
];

self::assertFalse($reflectionMethod->invoke(null, $invalidMonthParts));
}

/**
* Test boundary: hour -1 is invalid.
*
* @throws ReflectionException
*/
public function testIsValidTimePartNegativeHourNegativeFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Negative hour is invalid
$invalidHourParts = [
'year' => 2023,
'month' => 9,
'day' => 6,
'hour' => -1,
'minute' => 30,
'second' => 0,
];

self::assertFalse($reflectionMethod->invoke(null, $invalidHourParts));
}

/**
* Test boundary: minute -1 is invalid.
*
* @throws ReflectionException
*/
public function testIsValidTimePartNegativeMinuteNegativeFlow(): void
{
$reflectionMethod = $this->getIsValidTimePartMethod();

// Negative minute is invalid
$invalidMinuteParts = [
'year' => 2023,
'month' => 9,
'day' => 6,
'hour' => 12,
'minute' => -1,
'second' => 0,
];

self::assertFalse($reflectionMethod->invoke(null, $invalidMinuteParts));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Good coverage of isValidTimePart branches; consider adding an invalid-second case

These tests neatly cover invalid dates, hour/minute boundaries, leap-year rules, and various month edge cases, giving solid confidence in isValidTimePart()’s defensive logic. One gap is the scenario where date/hour/minute are valid but second is out of range, which would exercise the false path in isValidSecond() via isValidTimePart().

If you want to close that last hole, you could add something like:

@@
     /**
      * Test boundary: minute -1 is invalid.
      *
      * @throws ReflectionException
      */
     public function testIsValidTimePartNegativeMinuteNegativeFlow(): void
     {
         $reflectionMethod = $this->getIsValidTimePartMethod();
@@
         self::assertFalse($reflectionMethod->invoke(null, $invalidMinuteParts));
     }
+
+    /**
+     * Test that an out-of-range second is invalid while date/hour/minute are valid.
+     *
+     * @throws ReflectionException
+     */
+    public function testIsValidTimePartInvalidSecondNegativeFlow(): void
+    {
+        $reflectionMethod = $this->getIsValidTimePartMethod();
+
+        $invalidSecondParts = [
+            'year' => 2023,
+            'month' => 9,
+            'day' => 6,
+            'hour' => 12,
+            'minute' => 30,
+            'second' => 60,
+        ];
+
+        self::assertFalse($reflectionMethod->invoke(null, $invalidSecondParts));
+    }

That would complete the set of boundary tests for all three time components.

🤖 Prompt for AI Agents
In tests/Unit/IsValidTimePartTest.php around lines 22 to 246, add a test that
covers the missing invalid-second case: create a new test method that gets the
private isValidTimePart via reflection, builds a time-parts array where
year/month/day/hour/minute are valid but 'second' is out of range (e.g., 60 or
-1), invokes the reflection method and assertsFalse on the result; keep the
style and annotations consistent with the other tests (ReflectionException
docblock) so the test suite exercises the isValidSecond() failure path through
isValidTimePart().

Comment on lines +248 to +260
/**
* Get the isValidTimePart method via reflection.
*
* @throws ReflectionException
*
* @return \ReflectionMethod
*/
private function getIsValidTimePartMethod(): \ReflectionMethod
{
$reflectionClass = new ReflectionClass(StringManipulation::class);

return $reflectionClass->getMethod('isValidTimePart');
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Reflection helper is clear; minor consistency tweak possible

getIsValidTimePartMethod() is straightforward and keeps the reflection logic in one place. For consistency with the use imports, you could also import ReflectionMethod at the top and use the short name in the return type and docblock, but this is purely cosmetic and optional.

🤖 Prompt for AI Agents
In tests/Unit/IsValidTimePartTest.php around lines 248 to 260, the reflection
helper uses fully-qualified \ReflectionMethod in the docblock and return type;
for consistency with existing use imports, add a use ReflectionMethod; at the
top of the file and update the docblock and the method signature to reference
ReflectionMethod (short name) instead of the FQCN.

@claude
Copy link
Contributor

claude bot commented Dec 8, 2025

Code Review: Pre-compute character mappings for O(1) lookup

Summary

This PR delivers an excellent performance optimization by replacing runtime array operations with compile-time constants. The refactoring is well-executed with strong test coverage improvements. Overall, this is a high-quality PR that I recommend for approval with minor considerations noted below.


✅ Strengths

1. Performance Architecture

  • Exceptional optimization: Moving from lazy initialization to compile-time constants eliminates first-call overhead entirely
  • Benchmarked improvements: +66% for removeAccents(), +104% for searchWords(), +39% for nameFix() with documented evidence
  • O(1) lookup guarantee: Pre-computed associative arrays with strtr() ensure predictable performance
  • Memory efficiency: PHP optimizes constant storage at opcode cache level

2. Code Quality

  • Simplified implementation: searchWords() and removeAccents() reduced to single strtr() calls
  • Removed technical debt: Eliminated unused static properties and legacy methods
  • Type safety: Proper use of typed constants (private const array)
  • Documentation: Updated performance.md with accurate benchmarks and removed outdated warm-up guidance

3. Test Coverage

  • 100% coverage achieved: Added IsValidTimePartTest to cover previously untested defensive validation paths
  • Reflection-based testing: Appropriate use of reflection to test private methods unreachable via public API
  • Clean test refactoring: Removed 114 lines of reflection-based cache reset logic (no longer needed)
  • 201 tests passing: Comprehensive validation with 889 assertions

4. Dependency Management

  • Psalm removal justified: Removing Psalm while keeping PHPStan and Phan provides sufficient static analysis coverage
  • CI enhancement: Added PHP 8.5 to test matrix with proper platform requirement handling

⚠️ Considerations & Minor Issues

1. Psalm Documentation Inconsistency

CLAUDE.md still references Psalm in multiple places despite removal from composer.json and CI.

Recommendation: Update CLAUDE.md lines 21 and 34 to remove Psalm references

2. Mapping Correctness Verification

SEARCH_WORDS_MAPPING combines multiple transformations (accent removal + lowercase + special chars).

Recommendation: Consider adding a regression test comparing old vs new behavior on diverse character sets to ensure no edge cases were missed. Code analysis shows correctness, but explicit validation would be valuable.

3. Greek Character Handling

Greek characters in SEARCH_WORDS_MAPPING map to lowercase Greek letters (not Latin), which differs from other accented characters.

Recommendation: Add inline comment explaining this intentional linguistic correctness (preserving script).

4. Unicode Escape Sequences

Excellent use of Unicode escapes, but they lack explanatory comments.

Recommendation: Add inline comments for Unicode escapes explaining the character purpose.


🔒 Security Review

Status: ✅ No security concerns identified

  • No external input handling changes
  • No SQL, command injection, or XSS vectors introduced
  • String transformations are deterministic and safe
  • Constants are immutable and cannot be modified at runtime

🎯 Performance Considerations

Status: ✅ Excellent performance improvements with no regressions

Benchmarks Validated:

  • removeAccents(): ~450K → ~750K ops/sec (+66%)
  • searchWords(): ~195K → ~400K ops/sec (+104%)
  • nameFix(): ~130K → ~180K ops/sec (+39%)

Memory Impact:

  • Positive: Constants stored in opcode cache, shared across requests
  • Trade-off: Slightly larger opcode cache size (~2-3KB per constant)
  • Net result: Better memory efficiency due to no per-request array allocation

📋 Test Coverage Assessment

Status: ✅ Excellent - 100% coverage with comprehensive tests

New Tests:

  • IsValidTimePartTest.php: 261 lines covering 9 scenarios
    • Valid date/time parts
    • Invalid dates (Feb 30, leap year edge cases)
    • Boundary conditions (hour -1, minute 60, month 0/13)
    • Proper use of reflection for private method testing

📚 Code Style Compliance

Status: ✅ Fully compliant with CLAUDE.md guidelines

  • ✅ Strict typing: declare(strict_types=1)
  • ✅ Typed constants: private const array
  • ✅ Comprehensive docblocks with @param, @return, @example
  • ✅ Methods under 100 lines (PHPMD compliant)
  • ✅ PSR-4 namespacing maintained
  • ✅ Final class with static methods

🚀 Recommendations for Merge

Must Address Before Merge:

  1. Update CLAUDE.md to remove Psalm references (lines 21, 34)

Nice to Have:

  1. Add inline comments to Unicode escape sequences for clarity
  2. Add comment explaining Greek character behavior in SEARCH_WORDS_MAPPING
  3. Consider adding a regression test comparing character transformation behavior

🎓 Summary Assessment

Aspect Rating Notes
Code Quality ⭐⭐⭐⭐⭐ Excellent refactoring, cleaner implementation
Performance ⭐⭐⭐⭐⭐ Significant measurable improvements
Test Coverage ⭐⭐⭐⭐⭐ 100% coverage, comprehensive scenarios
Security ⭐⭐⭐⭐⭐ No concerns identified
Documentation ⭐⭐⭐⭐☆ Good, minor CLAUDE.md update needed
Maintainability ⭐⭐⭐⭐⭐ Simpler code, removed technical debt

Overall: ⭐⭐⭐⭐⭐ (4.8/5)


✅ Approval Recommendation

APPROVED with minor documentation fix for CLAUDE.md.

This PR represents a textbook example of performance optimization:

  • Measurable improvements with benchmarks
  • Maintains correctness with 100% test coverage
  • Simplifies implementation
  • Removes technical debt
  • Well-documented changes

Excellent work! 🎉

Remove phan/phan from dependencies due to PHP 8.5
incompatibility. Phan crashes when parsing vendor files
on PHP 8.5 due to deprecated syntax handling.

Changes:
- Remove phan/phan from require-dev
- Remove @test:phan from tests sequence
- Remove @test:phan from analyse:all
- Remove phan step from CI workflow
- Remove test-phan and test-psalm services from docker
- Update test-all command in docker-compose

Script definitions retained for manual use on PHP 8.3/8.4.

Signed-off-by: Marjo Wenzel van Lier <marjo.vanlier@gmail.com>
@qodo-code-review
Copy link
Contributor

Changelog updates: 🔄

2025-12-08 *

Changed

  • Optimized string methods using pre-computed mapping constants for O(1) lookups; simplified searchWords() and removeAccents() to single strtr() calls

Added

  • New IsValidTimePartTest to reach 100% code coverage; added PHP 8.5 to CI matrix

Removed

  • Deprecated lazy-initialization code and unused dependencies (e.g., Psalm, Phan)

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

@claude
Copy link
Contributor

claude bot commented Dec 8, 2025

Pull Request Review

This PR implements a significant performance optimization by replacing runtime array operations with compile-time constants. Here is my comprehensive review:


Code Quality & Best Practices

Excellent Work:

  1. Pre-computed constants: Converting from lazy-initialized static properties to private const array is the right approach. This eliminates runtime overhead and ensures consistency.

  2. Type hints: Both constants have proper PHPDoc types (@var array<string, string>), adhering to project standards.

  3. Performance improvements: The claimed improvements (66-104%) are credible since you are eliminating:

    • Runtime array_combine() calls
    • Static property initialization checks
    • Array construction overhead
  4. Test coverage: Achieving 100% coverage with the new IsValidTimePartTest is excellent. Using reflection to test private methods is appropriate for defensive validation code.

  5. Code simplification:

    • removeAccents() and searchWords() are now single strtr() calls (StringManipulation.php:68, 195)
    • Removed 113 lines of unused lazy initialization code

Observations:

  1. AccentNormalization.php:222 - There is a mapping Ł => l (Polish Ł to lowercase l), but line 223 has ł => l which is correct. However, line 222 should arguably be Ł => L to preserve case consistency. This is also present in SEARCH_WORDS_MAPPING.

  2. Greek character mappings (AccentNormalization.php:310-327): The ACCENT_MAPPING preserves Greek characters (e.g., Ά => Α), while SEARCH_WORDS_MAPPING converts them to lowercase (e.g., Ά => α). This is intentional and correct for the different use cases.

  3. Double-space mapping (AccentNormalization.php:81, 328): Both constants end with double-space to single-space mapping which helps consolidate spaces. This is followed by regex cleanup in searchWords(), so it is fine but slightly redundant.


🐛 Potential Issues

Minor Issue - Polish Ł:

The uppercase Polish Ł character mapping may not preserve case as intended:

// Current (line 222):
Ł => l  // Uppercase -> lowercase

// Should arguably be:
Ł => L  // Uppercase -> uppercase (to match pattern)

However, checking the old code, this was the original behavior, so this PR does not introduce the issue. If this is intentional for linguistic reasons (Polish Ł has complex rules), then it is fine.

No Critical Bugs Detected


Performance Considerations

Strengths:

  1. O(1) lookups: Using strtr() with pre-computed arrays provides O(1) character lookup.

  2. Memory efficiency: Constants are stored in opcache and shared across requests, reducing per-request memory allocation.

  3. Cache-friendly: Eliminates branch prediction issues from lazy initialization checks.

Potential Concerns:

  1. Large constant size: SEARCH_WORDS_MAPPING has ~240 entries, ACCENT_MAPPING has ~250. This increases the compiled file size but is negligible (few KB) and worth the performance gain.

  2. Benchmark verification: The claimed benchmarks (750k ops/sec for removeAccents) should be verified with the provided benchmark suite to ensure consistency across PHP 8.3-8.5.


🔒 Security Concerns

No security issues identified.

  • All mappings are read-only constants
  • No user input flows into constant definitions
  • Unicode escape sequences are properly formatted
  • No injection vectors

🧪 Test Coverage

Excellent additions:

  1. IsValidTimePartTest.php: 261 lines of comprehensive tests covering:

    • Invalid dates (Feb 30)
    • Invalid hours (25, -1)
    • Invalid minutes (60, -1)
    • Invalid months (0, 13)
    • Leap year edge cases
    • Uses reflection appropriately to test private methods
  2. Test simplification: Removed 77 lines of cache reset logic from bug fix tests (lines correctly removed as cache no longer exists).

Test Plan Verification:

✅ 201 tests with 889 assertions
✅ 100% code coverage
✅ All static analysis tools pass (PHPStan only now, Psalm/Phan removed)


📝 Documentation & Configuration

Positive changes:

  1. performance.md: Updated with accurate benchmarks and compile-time constant explanations.

  2. PHP 8.5 support: Added to CI matrix with --ignore-platform-req=php for forward compatibility.

  3. Dependency cleanup: Removing Psalm and Phan is reasonable if they do not support PHP 8.5 yet and PHPStan is sufficient.

Recommendations:

  1. Consider adding a comment in AccentNormalization.php explaining the rationale for the Polish Ł mapping if intentional.

  2. The CLAUDE.md still references test-psalm and test-phan in several places - these should be updated to reflect the removal.


📊 Overall Assessment

Score: 9.5/10

This is an excellent performance optimization PR that:

✅ Delivers substantial performance improvements (66-104%)
✅ Simplifies code by removing 113 lines of lazy initialization
✅ Achieves 100% test coverage
✅ Maintains backward compatibility
✅ Follows project conventions strictly
✅ Includes comprehensive documentation updates
✅ Adds PHP 8.5 forward compatibility

Minor Recommendation:

Verify the Polish Ł mapping (Ł => l vs Ł => L) is intentional. If not, it is a pre-existing issue that could be addressed in a follow-up PR.

Final Verdict: ✅ APPROVE

This PR is ready to merge. The performance gains are significant, the implementation is clean, and test coverage is comprehensive. Great work on the optimization!

@MarjovanLier MarjovanLier merged commit a705f6c into main Dec 8, 2025
11 of 12 checks passed
@MarjovanLier MarjovanLier deleted the perf/optimise-string-methods branch December 8, 2025 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants