Let mypyc optimise os.path.join (python#17949)
See python#17948.

There's one call site with varargs that I left as os.path.join; it doesn't show up in my profile. I do see the `endswith` in the profile; we could try `path[-1] == '/'` instead (it could save a few dozen milliseconds).

In my work environment, this is about a 10% speedup:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     30.842 s ±  0.119 s    [User: 26.383 s, System: 4.396 s]
  Range (min … max):   30.706 s … 30.927 s    3 runs
```

Compared to:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     34.161 s ±  0.163 s    [User: 29.818 s, System: 4.289 s]
  Range (min … max):   34.013 s … 34.336 s    3 runs
```

In the toy "long" environment mentioned in the issue, this is about a 7% speedup:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     23.177 s ±  0.317 s    [User: 20.265 s, System: 2.873 s]
  Range (min … max):   22.815 s … 23.407 s    3 runs
```

Compared to:

```
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
  Time (mean ± σ):     24.838 s ±  0.237 s    [User: 22.038 s, System: 2.750 s]
  Range (min … max):   24.598 s … 25.073 s    3 runs
```

In the "clean" environment, this is about a 1% speedup, which is below the noise floor.
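
For readers without the diff at hand, here is a rough sketch of the kind of helper this message describes: a str-only, two-argument join that a compiler like mypyc could plausibly turn into plain string operations. The names `posix_join2` and `join2` are hypothetical and not taken from the diff, and the Windows fallback is an assumption.

```python
import os.path
import posixpath
import sys


def posix_join2(path: str, b: str) -> str:
    """Hypothetical two-argument, str-only equivalent of posixpath.join."""
    if b.startswith("/") or not path:
        # An absolute second component, or an empty first one, wins outright.
        return b
    # `path` is non-empty here, so `path[-1] == "/"` would be a safe
    # alternative to `endswith`, as the message suggests trying.
    if path.endswith("/"):
        return path + b
    return path + "/" + b


def join2(path: str, b: str) -> str:
    """Use the simplified join on POSIX; defer to the stdlib on Windows (assumption)."""
    if sys.platform == "win32":
        return os.path.join(path, b)
    return posix_join2(path, b)


if __name__ == "__main__":
    # Spot-check the sketch against the stdlib on a few edge cases.
    cases = [("a", "b"), ("a/", "b"), ("a", "/b"), ("", "b"), ("a", ""), ("", "")]
    for left, right in cases:
        assert posix_join2(left, right) == posixpath.join(left, right)
```

The varargs call site mentioned above would not fit a fixed two-argument signature like this, which is consistent with leaving it on `os.path.join`.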