Skip to content

Conversation

@neutrinoceros
Copy link
Contributor

@neutrinoceros neutrinoceros commented Jul 19, 2022

References to other Issues or PRs

None.

Brief description of what is fixed or changed

I noticed that validation of transformations takes about half of parse_expr runtime (specifically sympy.core.function.arity is very expensive)
In unyt, this function is called in an intensive loop (several thousands calls) at import time, with hard coded transformations, see

https://github.com/yt-project/unyt/blob/577fd554f3171b85589690f4e563bad30a84340c/unyt/_parsing.py#L92-L95

In the end, this builds up to a few 0.1s wasted every single time unyt is imported. For reference, it's on par with the import time of numpy.
So I propose allowing to skip validation as an opt-out.

Other comments

I'm hoping this small patch is an acceptable solution for this external problem.
If not, I'll be happy to discuss and work on other possible solutions with you.

Release Notes

  • parsing
    • Removed the validation of transformers in parse_expr, for performance purposes.

@sympy-bot
Copy link

sympy-bot commented Jul 19, 2022

Hi, I am the SymPy bot (v167). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

  • parsing
    • Removed the validation of transformers in parse_expr, for performance purposes. (#23804 by @neutrinoceros)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.12.

Click here to see the pull request description that was parsed.
#### References to other Issues or PRs

None.

#### Brief description of what is fixed or changed

I noticed that validation of transformations takes about half of `parse_expr` runtime (specifically `sympy.core.function.arity` is very expensive)
In unyt, this function is called in an intensive loop (several thousands calls) at import time, with hard coded transformations, see

https://github.com/yt-project/unyt/blob/577fd554f3171b85589690f4e563bad30a84340c/unyt/_parsing.py#L92-L95

In the end, this builds up to a few 0.1s wasted every single time unyt is imported. For reference, it's on par with the import time of numpy.
So I propose allowing to skip validation as an opt-out.

#### Other comments

I'm hoping this small patch is an acceptable solution for this external problem.
If not, I'll be happy to discuss and work on other possible solutions with you.

#### Release Notes

<!-- BEGIN RELEASE NOTES -->
* parsing
    * Removed the validation of transformers in `parse_expr`, for performance purposes.
<!-- END RELEASE NOTES -->

Update

The release notes on the wiki have been updated.

@asmeurer
Copy link
Member

Do you only use the built-in transformations? I wonder if a quick if transformation in all_transformations check would fix your performance issues.

@neutrinoceros
Copy link
Contributor Author

No we don't, we use a custom transformation defined here

https://github.com/yt-project/unyt/blob/577fd554f3171b85589690f4e563bad30a84340c/unyt/_parsing.py#L26-L69

@neutrinoceros neutrinoceros marked this pull request as ready for review July 19, 2022 21:39
@github-actions
Copy link

Benchmark results from GitHub Actions

Lower numbers are good, higher numbers are bad. A ratio less than 1
means a speed up and greater than 1 means a slowdown. Green lines
beginning with + are slowdowns (the PR is slower then master or
master is slower than the previous release). Red lines beginning
with - are speedups.

Significantly changed benchmark results (PR vs master)

Significantly changed benchmark results (master vs previous release)

Full benchmark results can be found as artifacts in GitHub Actions
(click on checks at the top of the PR).

@oscarbenjamin
Copy link
Collaborator

Other possibilities:

  1. Make arity faster.
  2. Don't check transformations at all. I'm not sure what benefit the checks provide.

@asmeurer
Copy link
Member

I guess the purpose is to help with user defined transformations. It doesn't need to be done with the ones in SymPy.

For your custom one, I support adding the ability to define assumptions to SymPy's own symbol transformer (that's useful regardless of performance).

@neutrinoceros
Copy link
Contributor Author

Other possibilities:

  1. Make arity faster.

This looks very hard because most of its cost is from inspect.signature, and I don't know of any faster replacement.

  1. Don't check transformations at all. I'm not sure what benefit the checks provide.

That would be my preferred solution actually, but I didn't think it would be received well !

I guess the purpose is to help with user defined transformations. It doesn't need to be done with the ones in SymPy.

For your custom one, I support adding the ability to define assumptions to SymPy's own symbol transformer (that's useful regardless of performance).

Wouldn't that affect the meaning of transformations='all' ? I seems to me that it would either make 'all' restricted to builtin transforms (so not actually all transforms), or modify the global state and affect any further calls, depending on how exactly it's implemented. Both of these possibilities seem problematic to me. Or am I not understanding what you mean ?

@asmeurer
Copy link
Member

'all' would still do what it does now with my suggestion. I am suggesting to add a custom transformer like

symbols_transformer_with_assumptions(positive=True) <- returns a symbol transformer function that adds `positive=True` to the symbols it creates

For the validation, there should be an internal list of all the transformers created by the sympy parser code, which it can skip validation for. This list would include any transformers created by the symbol_with_assumptions transformer.

I'm also fine with removing validation. It was added in 4839dbe by @smichr, so we should see if he has any comments.

I would prefer both of these options to adding a new keyword argument. I don't think it's worth polluting the user options for something like this. We should either be faster about how we do it, or not do it at all.

@smichr
Copy link
Member

smichr commented Jul 21, 2022

removing validation.

It is checking args count and whether it is callable. If either of those don't apply, the transformer will fail on its own so it seems like the validation could be removed.

@neutrinoceros neutrinoceros force-pushed the parsing_optional_trans_validation branch from b19e276 to 80793fa Compare July 21, 2022 13:37
@neutrinoceros neutrinoceros changed the title ENH: add option to skip transformations validation in parse_expr ENH: drop validation of transformations in parse_expr to save time when called in intensive loops Jul 21, 2022
@neutrinoceros
Copy link
Contributor Author

thank you @smichr ! I just repurposed the PR (changed the title and commit history) to drop validation altogether

@neutrinoceros neutrinoceros force-pushed the parsing_optional_trans_validation branch from 80793fa to 023dd24 Compare July 21, 2022 13:42
@neutrinoceros
Copy link
Contributor Author

(pushed again to fix flake8 issues)

@neutrinoceros
Copy link
Contributor Author

@asmeurer @oscarbenjamin @smichr is there anything I should do here ? I apologize for pinging you after just a couple days of inactivity, I am not familiar with the typical triaging timescales for sympy.

@oscarbenjamin
Copy link
Collaborator

I don't know this code so well but this looks good to me.

@asmeurer asmeurer merged commit afe4f80 into sympy:master Jul 26, 2022
@neutrinoceros neutrinoceros deleted the parsing_optional_trans_validation branch July 26, 2022 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants