⚡️ Speed up function _replace_booleans by 18%
#92
📄 18% (0.18x) speedup for `_replace_booleans` in `pandas/core/computation/expr.py`
⏱️ Runtime: 27.4 microseconds → 23.3 microseconds (best of 115 runs)
📝 Explanation and details
The optimization introduces a pre-computed dictionary (`_BOOLEAN_OPS = {"&": "and", "|": "or"}`) whose lookup replaces the sequential if-elif chain for boolean operator replacement.

Key changes:
- Replaced `if tokval == "&": ... elif tokval == "|":` with `if tokval in _BOOLEAN_OPS:`
- Returns `_BOOLEAN_OPS[tokval]` instead of hardcoded return values

Why this is faster:
- The dictionary membership test (`tokval in _BOOLEAN_OPS`) is O(1) in the average case, versus O(k) for sequential comparisons, where k is the number of conditions

Performance characteristics from tests:
The 18% overall speedup suggests that the benchmarked token streams consist mostly of non-boolean tokens, which is the case where this optimization is most effective.
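The change described above can be sketched as follows. This is a minimal reconstruction from the description, not the actual pandas source: the function names `replace_booleans_chain` and `replace_booleans_dict` are hypothetical, and the sketch assumes each token is a `(toknum, tokval)` pair as produced by Python's `tokenize` module.

```python
import tokenize

# Lookup table named in the PR description.
_BOOLEAN_OPS = {"&": "and", "|": "or"}


def replace_booleans_chain(tok):
    """Hypothetical 'before' version: sequential if/elif comparisons."""
    toknum, tokval = tok
    if toknum == tokenize.OP:
        if tokval == "&":
            return tokenize.NAME, "and"
        elif tokval == "|":
            return tokenize.NAME, "or"
    return toknum, tokval


def replace_booleans_dict(tok):
    """Hypothetical 'after' version: one dictionary membership test."""
    toknum, tokval = tok
    if toknum == tokenize.OP and tokval in _BOOLEAN_OPS:
        return tokenize.NAME, _BOOLEAN_OPS[tokval]
    return toknum, tokval


# A small hand-built token stream: a & b | + 1
tokens = [
    (tokenize.NAME, "a"),
    (tokenize.OP, "&"),
    (tokenize.NAME, "b"),
    (tokenize.OP, "|"),
    (tokenize.OP, "+"),
    (tokenize.NUMBER, "1"),
]

# Only the "&" and "|" operator tokens are rewritten; all others pass through.
rewritten = [replace_booleans_dict(tok) for tok in tokens]
```

Both versions should return the same result for every token; the dictionary version only changes how the operator is matched, so tokens that are not `&` or `|` fall through after a single membership test.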
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-_replace_booleans-mhbnb3pj` and push.