-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perf check on upcoming v0.11.0 vs 0.10.1 #3326
Comments
Regression TODO (GH concurrent editing sucks, post a comment, I'll update): update: sorry GH, we still <3 u.
|
can you show the start/end commits that you used? |
|
Would be good if you could indep. confirm the results on your own machine with the script above. |
did you check series_ndarray_constructor? |
Is there an easy way to run vbench only for one benchmark, like frame_wide_repr? |
In bash, from repo root dir
|
@jreback , just bisected down to the offending commit, not the cause, go for it. |
ahh...that's what the check is for! |
#3329 should fix ctor_index_array_string (and helps on some others actually) |
you're right. dechecked it. |
Reason why frame_wide_repr takes longer, compared to 0.10.1 is that now, outside interactive mode there is never a info/wrapped repr, always full view. Probably best to run the benchmark in context of simulated interactive mode, i`ll see if i can change it.. |
@y-p any way i added the test such that a wide frame repr is tested both with/without interactive mode.
|
@lodagro maye %prun this and see where its spending the time? |
repr(df) display the concise format, how do I force it do display all rows? which I think is what this is testing? (frame_wide_repr) |
Try pd.options.display.max_rows= infinity |
@jreback i also think that on 0.10.1 repr(df) results in a concise format (given that there is no performance difference between between with/without interactive mode on v.0.10.1 -- see above), on HEAD this is no longer the case. concise formats are only used in interactive mode. |
one of the big timesinks in this is finding the max len of an array of strings, I have a function in cython to do this |
When forcing full view outside interactive mode on v.0.10.0, the ratio for frame_wide_repr drops from 50 to 1.2 (still a 20% increase though).
So yes 7db1af4 definitely has an impact. Since i made sure that concise views are only used in interactive mode (before this was not always the case - i mentioned this also in #2881, this was not discussed further though). |
ok, 7db1af4 made long frames print in full when in non-interactive mode, |
I recall complaints about un-unix like behaviour in that we previously considered terminal specifically, prior to 7db1af4 the user could opt in to full display by setting |
@y-p can you post a rerun of what you did above when you have a chance? to just confirm the fixes? |
I reran test_perf with your fixes and confirmed that the mean time for the vbench, over N runs, backed down with a fresh checkout of test_perf.py
will do 100 runs of the named vbench and will print out a Use the build cache (now in scripts/use_build_cache.py) to jump around revisions quickly. |
We need to decide where this discussion is taking place. |
#1610 was about console width and made sense (I implemented the fix), I disagree that the same |
In fact, if you read #1610, it basically states the exact opposite then what you're suggesting. |
#1610 requests that the the default in non-interactive mode is that there is no limit on the width/max_columns, this is still the case and i extended this to rows also. User can change it also. For me no problem to keep only a vertical limit, so in non-interactive mode there can be a limit on height/max_rows but not on on width/max_columns, this will fix the script mentioned in #3337. |
The problem was that we were relying on terminal dimensions in non-interactive mode. From #1610 I agree. |
So both max_rows/max_columns should always be used and options display.width, display.height, display.expand_frame_repr only in interactive mode. |
max_rows and max_columns should always be respected, just like in v0.10.1. height and width are new options, we're free to choose their meaning. terminal height/width detection should play no role in non-interactive mode. That's the breakdown as I see it. |
remaining issue to be discussed/closed in #3395 |
Here are all the vbenches which differ by more then 15 percent,
best of 5.
I hope to have this automated and realtime by the time the
next release comes around, along with bisection.
1st run
2nd run
Until test_perf gets validated in it's compare mode, re instability
The text was updated successfully, but these errors were encountered: