Do we still need asciilib for better performance? #98229

Open
@sobolevn

Feature or enhancement

asciilib was introduced in c3cec78 as a performance optimization for ASCII strings.

11 years have passed since then.

While working on #98025, @vstinner and I experimented with unicode_count, with and without the asciilib_count calls.

The results were clear: the asciilib_count calls brought no benefit on our benchmarks.
That led to this commit: df3a6d9

Later, while working on #98228, I noticed that asciilib_rsplit_whitespace also does not provide significant performance gains on my platform with my simple input data.

So maybe this deserves a deeper analysis?

Pitch

  • I think these calls should be investigated further: do we really need them?
  • We should come up with better data to make a final decision, including:
    • Which Python methods / C-API functions are affected?
    • Short and long ASCII / non-ASCII strings (because this check slows down all non-ASCII strings by adding an extra PyUnicode_IS_ASCII(str) call)
    • Windows / macOS / Linux platforms, since the results may differ between them
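
As a starting point for collecting that data, here is a minimal sketch of a pure-Python microbenchmark over the four input shapes listed above. The helper names (`make_inputs`, `bench`, `run`), the string sizes, and the choice of `str.count` and `str.rsplit()` as entry points are my assumptions for illustration. To measure the effect of asciilib, the same script would have to be run on two interpreter builds, one with and one without the asciilib calls, and a tool such as pyperf would give more reliable numbers than plain `timeit`.

```python
import timeit


def make_inputs():
    """Build short/long ASCII and non-ASCII sample strings (sizes are arbitrary)."""
    return {
        "short ascii": "a b c " * 2,
        "long ascii": "a b c " * 10_000,
        "short non-ascii": "\u0430 \u0431 \u0432 " * 2,      # Cyrillic letters
        "long non-ascii": "\u0430 \u0431 \u0432 " * 10_000,
    }


def bench(func, number=200, repeat=3):
    """Return the best observed time per call, in seconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


def run(number=200, repeat=3):
    """Time str.count and str.rsplit() on every sample string."""
    results = {}
    for label, s in make_inputs().items():
        # Bind s as a default argument so each lambda keeps its own string.
        results[label, "count"] = bench(lambda s=s: s.count(" "), number, repeat)
        results[label, "rsplit"] = bench(lambda s=s: s.rsplit(), number, repeat)
    return results


if __name__ == "__main__":
    for (label, op), secs in run().items():
        print(f"{label:16} {op:7} {secs * 1e9:12.1f} ns/call")
```

Comparing the ASCII and non-ASCII rows within a single build only shows how the two kinds of string differ today. The question raised here is answered by comparing the same row across the two builds.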

I don't have access to Windows, but I can do the research for other platforms.

If it does not provide any performance benefits, it should be removed.

Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs); performance (Performance or resource usage); type-feature (A feature request or enhancement)