Separate properties module #17590
why do you think we should do this? |
Because in upcoming |
ok this looks fine then. ping on green. |
Codecov Report
@@ Coverage Diff @@
## master #17590 +/- ##
==========================================
- Coverage 91.22% 91.2% -0.02%
==========================================
Files 163 163
Lines 49625 49625
==========================================
- Hits 45270 45261 -9
- Misses 4355 4364 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17590 +/- ##
==========================================
- Coverage 91.19% 91.18% -0.02%
==========================================
Files 163 163
Lines 49627 49627
==========================================
- Hits 45259 45250 -9
- Misses 4368 4377 +9
Continue to review full report at Codecov.
|
ping. |
can u do an import time test before and after to see if adding another module actually matters |
Let me know if there's a less hacky way of measuring this.
master --> 248.97569394111633
That last statistic is bothersome. Would it make a difference that 0.20.3 is installed in site-packages? |
Hmm looking at cProfile output I'm leaning towards disregarding the numbers above. Need to find a better way to measure this. |
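One less hacky way to measure this is to time the import in a fresh interpreter for every run, so that nothing is served from the `sys.modules` cache. The sketch below is only an illustration, not the method used in this thread: the helper name `time_import` is hypothetical, it assumes Python 3.6+, and the numbers include interpreter startup, so compare against a cheap baseline module rather than reading them as absolute import cost.

```python
import statistics
import subprocess
import sys
import time


def time_import(module, repeats=5):
    """Return wall-clock seconds for `import module`, one new interpreter per run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # A new process means no earlier import of `module` can be cached.
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # "json" is a stand-in; substitute "pandas" on each checkout being compared.
    runs = time_import("json")
    print(f"median {statistics.median(runs):.3f}s, min {min(runs):.3f}s")
```

Running this once per checkout and comparing medians (or minimums) gives a rough sense of whether a difference is larger than run-to-run noise.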
using a profile that is not the default, to avoid any startup files.
|
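OK, ran %time import pandas as pd four times apiece for master and branch (the timings and the profile are quoted in full in the email reply below).
Branch has an edge by .06s, but that's all noise. FWIW If import time is a priority, cProfile results for a single run:
~5% could be trimmed by making pytest a lazy import in util._tester. Lazy import of matplotlib would be a win. Not sure how hard that would be. I think I recall @TomAugspurger mentioning this at some point. A bunch of subprocess calls made by io.clipboards at import could probably be lazified. No idea where the regexps are getting compiled. |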
yes mpl is the big one anyway that's for another issue |
pandas/_libs/lib.pyx
@@ -67,6 +67,7 @@ import tslib
from tslib import NaT, Timestamp, Timedelta
import interval
from interval import Interval
from properties import AxisProperty, cache_readonly # noqa
you may be able to remove these imports entirely and just change where they are called from
Sure. It looks like the only direct import of lib.cache_readonly is by util._decorators, and other imports get it from there. So that would only require changing it in one place. And lib.AxisProperty is only used in core.generic, so also an easy change. Pls confirm before I make this new change.
yep those would be good
Matplotlib delayed import is at
TomAugspurger@7cc0c54
I can't recall if it's complete. Are you able to cherry pick it and take
that up? For testing I recommend sticking a `raise TypeError` somewhere in
the top level of `matplotlib/__init__.py`.
That branch also has a non-working attempt to delay importing numexpr, but
that got really messy.
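A less invasive complement to the `raise TypeError` trick is to ask a fresh interpreter whether the dependency ended up in `sys.modules` after the package was imported. This is a minimal sketch under stated assumptions: the helper name `is_imported_by` is hypothetical, it needs Python 3.7+ for `capture_output`, and the standard-library modules in the usage are stand-ins for pandas and matplotlib.

```python
import subprocess
import sys


def is_imported_by(package, dependency):
    """Return True if `import package` also loads `dependency`.

    The check runs in a new interpreter so earlier imports cannot leak in.
    """
    code = (
        "import sys\n"
        f"import {package}\n"
        f"print({dependency!r} in sys.modules)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        check=True,
        capture_output=True,
        text=True,
    )
    # Use the last line in case the package itself prints during import.
    return result.stdout.strip().splitlines()[-1] == "True"


if __name__ == "__main__":
    # Stand-ins: json/__init__.py imports json.decoder eagerly.
    print(is_imported_by("json", "json.decoder"))
    print(is_imported_by("json", "xml.dom.minidom"))
```

With the delayed-import branch checked out, the analogous call would pass `"pandas"` and `"matplotlib"`; a `False` result would suggest the import really is deferred.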
…On Thu, Sep 21, 2017 at 5:36 PM, jbrockmendel ***@***.***> wrote:
OK, ran %time import pandas as pd four times apiece for master and branch.
master:
CPU times: user 978 ms, sys: 191 ms, total: 1.17 s
Wall time: 1.35 s
CPU times: user 1.09 s, sys: 153 ms, total: 1.24 s
Wall time: 1.36 s
CPU times: user 962 ms, sys: 121 ms, total: 1.08 s
Wall time: 1.15 s
CPU times: user 1.02 s, sys: 131 ms, total: 1.15 s
Wall time: 1.22 s
branch:
CPU times: user 1.02 s, sys: 163 ms, total: 1.18 s
Wall time: 1.36 s
CPU times: user 928 ms, sys: 120 ms, total: 1.05 s
Wall time: 1.12 s
CPU times: user 1.04 s, sys: 128 ms, total: 1.17 s
Wall time: 1.23 s
CPU times: user 952 ms, sys: 121 ms, total: 1.07 s
Wall time: 1.13 s
Branch has an edge by .06s, but that's all noise.
FWIW If import time is a priority, cProfile results for a single run:
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.017 0.017 1.213 1.213 pandas/__init__.py:5(<module>)
1 0.076 0.076 0.742 0.742 pandas/core/api.py:5(<module>)
1 0.046 0.046 0.589 0.589 pandas/core/groupby.py:1(<module>)
1 0.041 0.041 0.326 0.326 pandas/core/frame.py:10(<module>)
1 0.002 0.002 0.205 0.205 pandas/core/index.py:2(<module>)
1 0.063 0.063 0.203 0.203 pandas/core/indexes/api.py:1(<module>)
1 0.001 0.001 0.171 0.171 pandas/core/series.py:3(<module>)
324/47 0.007 0.000 0.152 0.003 {__import__}
1 0.030 0.030 0.148 0.148 pandas/plotting/__init__.py:3(<module>)
1 0.090 0.090 0.116 0.116 pandas/io/api.py:3(<module>)
1 0.002 0.002 0.112 0.112 pandas/plotting/_converter.py:1(<module>)
784 0.003 0.000 0.108 0.000 [..]/re.py:230(_compile)
1 0.003 0.003 0.104 0.104 [..]/numpy/__init__.py:106(<module>)
135 0.001 0.000 0.104 0.001 [..]/sre_compile.py:567(compile)
135 0.000 0.000 0.102 0.001 [..]/re.py:192(compile)
1 0.069 0.069 0.093 0.093 pandas/core/generic.py:2(<module>)
1 0.000 0.000 0.084 0.084 [..]/numpy/add_newdocs.py:10(<module>)
1 0.002 0.002 0.083 0.083 [..]/matplotlib/__init__.py:101(<module>)
1 0.003 0.003 0.081 0.081 [..]/numpy/lib/__init__.py:1(<module>)
1 0.000 0.000 0.071 0.071 [..]/numpy/lib/type_check.py:3(<module>)
1 0.023 0.023 0.071 0.071 pandas/core/indexes/base.py:1(<module>)
1 0.027 0.027 0.070 0.070 [..]/numpy/core/__init__.py:1(<module>)
1 0.017 0.017 0.066 0.066 pandas/core/indexes/interval.py:1(<module>)
1 0.000 0.000 0.061 0.061 pandas/util/_tester.py:3(<module>)
1 0.002 0.002 0.060 0.060 [..]/pytest.py:4(<module>)
15 0.000 0.000 0.059 0.004 [..]/subprocess.py:118(_eintr_retry_call)
135 0.000 0.000 0.058 0.000 [..]/sre_compile.py:552(_code)
1 0.006 0.006 0.053 0.053 [..]/matplotlib/rcsetup.py:15(<module>)
1 0.003 0.003 0.052 0.052 pandas/compat/__init__.py:25(<module>)
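For anyone wanting to reproduce a table like the one above, the following sketch profiles a single import with `cProfile` and sorts by cumulative time. It may not be how the numbers above were gathered; `profile_import` is a hypothetical helper, and it only gives meaningful results when the target module has not already been imported in the running interpreter.

```python
import cProfile
import importlib
import io
import pstats


def profile_import(module, top=30):
    """Profile `import module` and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    # If `module` is already in sys.modules this is a cache hit and the
    # profile is meaningless, so run the script in a fresh interpreter.
    importlib.import_module(module)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    # "colorsys" is a stand-in; use "pandas" to produce a table like the one above.
    print(profile_import("colorsys"))
```

On Python 3.7 and later, running `python -X importtime -c "import pandas"` prints a per-module import-time breakdown without any profiling code.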
groupby imports frame and series, so we can't attribute anything to that
(though it does make a bunch of exec calls)
~5% could be trimmed by making pytest a lazy import in util._tester.
Lazy import of matplotlib would be a win. Not sure how hard that would be.
I think I recall @TomAugspurger <https://github.com/tomaugspurger>
mentioning this at some point.
A bunch of subprocess calls made by io.clipboards at import could
probably be lazified.
No idea where the regexps are getting compiled.
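The lazy-import idea mentioned above amounts to moving the `import` statement from module level into the function that needs it. The sketch below shows only the pattern: `to_decimal` is a hypothetical stand-in for a rarely used entry point such as a test runner, and `decimal` stands in for a heavier dependency like pytest or matplotlib.

```python
def to_decimal(text):
    """Stand-in for a rarely used entry point with a costly dependency."""
    # Deferred import: the module is loaded on the first call rather than
    # when the enclosing module is imported. Later calls are cheap because
    # the import statement finds the module in sys.modules.
    import decimal

    return decimal.Decimal(text)


if __name__ == "__main__":
    print(to_decimal("1.5"))
```

The trade-off is that a missing or broken dependency surfaces at call time rather than at import time, so the function may want to catch `ImportError` and raise a clearer message.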
|
@TomAugspurger do you have an issue for reducing import times? let's move discussion there |
thanks! |
git diff upstream/master -u -- "*.py" | flake8 --diff