Improve speed of stdlib functions by replacing re uses #130167

Open
@donBarbos

Description

Many standard library modules use the re module, and in some of those places it can be replaced with simpler code. I'm not suggesting removing it everywhere: there are places where its use is appropriate, but there are also places where it is unnecessary and has the drawbacks listed below.

Drawbacks of regular expressions, and reasons to replace them with plain functions and string methods:

  1. Compiling a re pattern takes time (only once per pattern, but it is still a cost).
  2. In most cases simple string methods are faster (about 2x, according to my benchmarks).
  3. Removing import re from a module can reduce that module's import time.
  4. Additionally: for those who don't know regular expressions, the code is harder to read and therefore harder to maintain.
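
To make points 1 and 2 concrete, here is a minimal sketch of the kind of replacement being discussed. The helper names (`is_comment_re`, `is_comment_str`) and the sample inputs are hypothetical and not taken from any stdlib module; a real replacement must be checked for exact behavioural equivalence against the original pattern, and timings will vary by machine and Python version.

```python
import re
import timeit

# Hypothetical helpers (not from the stdlib): both report whether a line
# starts with "#" once leading whitespace is ignored.
_COMMENT_RE = re.compile(r"\s*#")


def is_comment_re(line: str) -> bool:
    """Regex version: optional leading whitespace, then '#'."""
    return _COMMENT_RE.match(line) is not None


def is_comment_str(line: str) -> bool:
    """String-method version of the same check."""
    return line.lstrip().startswith("#")


SAMPLES = ["# comment", "   \t# indented", "x = 1  # trailing", "", "   "]


def check_equivalence() -> bool:
    """Return True if both versions agree on every sample input."""
    return all(is_comment_re(s) == is_comment_str(s) for s in SAMPLES)


def time_both(number: int = 100_000) -> tuple[float, float]:
    """Return (regex_seconds, string_seconds) for `number` calls each."""
    line = "   \t# indented"
    t_re = timeit.timeit(lambda: is_comment_re(line), number=number)
    t_str = timeit.timeit(lambda: is_comment_str(line), number=number)
    return t_re, t_str


if __name__ == "__main__":
    print("equivalent on samples:", check_equivalence())
    t_re, t_str = time_both()
    print(f"re: {t_re:.4f}s  str: {t_str:.4f}s")
```

timeit is used here only to keep the sketch dependency-free; it does not replace the benchmark tools requested further down in this issue.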

Important

For those who want to work on the issue, please:

  • Read https://devguide.python.org/getting-started/pull-request-lifecycle/ before anything else.
  • Select one function to improve. It's easier to review and possibly backport.
  • Always report benchmarks for both execution time and import time, using pyperf, hyperfine, and tuna together with -X importtime.
  • Open a pull request with the following title: gh-130167: Improve speed of module.function by replacing re
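
As a rough illustration of the import-time side of that comparison, the sketch below runs a fresh interpreter with `-X importtime` and collects the cumulative time reported for each imported module. The function name `import_times` is hypothetical, and the parsing assumes the `import time: self | cumulative | name` line layout that CPython writes to stderr; it is meant as a quick sanity check, not a substitute for the tools named above.

```python
import subprocess
import sys


def import_times(module: str) -> dict[str, int]:
    """Return {module_name: cumulative_microseconds} for `import module`.

    Hypothetical helper: runs a fresh interpreter with -X importtime and
    parses the report that CPython writes to stderr.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    prefix = "import time:"
    times: dict[str, int] = {}
    for line in proc.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        try:
            cumulative = int(parts[1])
        except ValueError:
            continue  # header line, e.g. "self [us] | cumulative | ..."
        times[parts[2].strip()] = cumulative
    return times


if __name__ == "__main__":
    report = import_times("re")
    print("cumulative import time of re (us):", report.get("re"))
```

A single run like this is noisy; comparing a module before and after a change needs repeated runs on an otherwise idle machine.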


Labels: performance (Performance or resource usage), stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)