Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: GH14882 Incorrect index label displayed on MultiIndex DataFrame #14975

Closed
wants to merge 1 commit into from

Conversation

Dr-Irv
Copy link
Contributor

@Dr-Irv Dr-Irv commented Dec 23, 2016

@codecov-io
Copy link

Current coverage is 84.67% (diff: 100%)

Merging #14975 into master will increase coverage by 0.01%

@@             master     #14975   diff @@
==========================================
  Files           144        144          
  Lines         51047      51057    +10   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43216      43230    +14   
+ Misses         7831       7827     -4   
  Partials          0          0          

Powered by Codecov. Last update 6bea827...7d47c21

@jreback
Copy link
Contributor

jreback commented Dec 24, 2016

hmm, this seems to duplicate lots of code. I think a better soln is to borrow what @TomAugspurger
did in pandas.formats.style where this all works as expected (maybe need to move some code from there to a more common location, e.g. _get_level_lengths though)

@jreback jreback added IO HTML read_html, to_html, Styler.apply, Styler.applymap Output-Formatting __repr__ of pandas objects, to_string labels Dec 24, 2016
@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Dec 27, 2016

@jreback Not sure I agree that it works as expected in pandas.format.style. If you do pd.set_option('display.max_rows', r) for some value of r, Dataframe.style ignores the value of max_rows. The fix in this PR addresses the issue of truncation due to the value of max_rows.

@jreback
Copy link
Contributor

jreback commented Dec 27, 2016

@Dr-Irv that's the point, you may need to adjust .style and/or to_html to fix this. otherwise just creating even more code.

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Dec 27, 2016

@jreback The current PR fixes the truncation issue for to_html. Shouldn't the lack of support for display.max_rows in .style be a separate issue? Then there is how to have code that is not duplicated between how to_html does things and how .style does things (and I'm guessing you've discussed having the .style implementation replace the current to_html implementation in the future??).

@jreback
Copy link
Contributor

jreback commented Dec 27, 2016

@Dr-Irv yes, my point is that rather than fixing 'to_html', I would rather try to write new code in .style and just fix the issue that way. otherwise this will just go on, creating more and more code.

IIRC @TomAugspurger didn't think this was very difficult

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Dec 27, 2016

@jreback I guess we should hear what @TomAugspurger has to say? You're suggesting that the implementation in .style will eventually replace the current to_html implementation, if I understand you correctly. IMHO, that's fine for the future, but at least for now, this PR can fix the bug, and then getting max_rows support in .style, as well as removing the current to_html implementation to use .style instead (which reduces the size of the code) should be handled as a separate issue.

How about this idea. You merge in this PR, and I'll create a new issue for max_rows and max_columns support in .style, and I can separately work on figuring that out. And maybe there should be another issue to reduce code size by removing the current to_html code and using .style instead??

@jorisvandenbossche
Copy link
Member

Didn't look at the code, but regarding the "implement in style or not" issue: IMO as long as .style has not replaced our default html formatter, separate fixes to the html formatter should be OK. As fixing it in .style will not yet fix our notebook repr.

I don't know if we are ready for such a switch (@TomAugspurger ? do you have an idea on what would be the hurdles? missing features?), but I would see this as a separate issue.

@TomAugspurger
Copy link
Contributor

Yes, separate issues.

I don't think there are any blockers for rewriting to_html with the style stuff, other than formally adding jinja2 as a dependency.

@jreback
Copy link
Contributor

jreback commented Jan 2, 2017

The issue I have with this is we are adding lots of code (with if-thens), which is simply going to be ripped out when this is converted to .style.

So, is it easier to convert to .style, then fix this?

@jorisvandenbossche
Copy link
Member

The issue I have with this is we are adding lots of code (with if-thens), which is simply going to be ripped out when this is converted to .style.

Not only those added lines, but the full html formatting code will be ripped out, which is a lot longer as the added lines.

So, is it easier to convert to .style, then fix this?

The thing is, converting the repr to be based on .style is a larger PR I think. And indeed this would make this fix not needed. But the work still has to be done. And do we have the guarantee that someone will do it before 0.20?
Until then, this PR fixes a very obvious bug (totally wrong displayed index)

I don't think there are any blockers for rewriting to_html with the style stuff, other than formally adding jinja2 as a dependency.

@TomAugspurger Does style already support truncated output? (I don't see a way to get this)

Is jinja2 in a possibly problematic dependency?

@TomAugspurger
Copy link
Contributor

The thing is, converting the repr to be based on .style is a larger PR I think

Agreed, which means I'm +1 for merging this PR, even if it's obsolete in the near future. In any event, the tests from this PR would be great to have.

And do we have the guarantee that someone will do it before 0.20?

I'm not going to have time to work on it before 0.20.

@TomAugspurger Does style already support truncated output?

No, not yet. It wouldn't be terribly difficult code-wise. I just don't know if it's appropriate for .style (what if you highlihgt the max, and the max is in the truncated section? But that's a separate issue).

Is jinja2 in a possibly problematic dependency?

I'm OK with it (and its dependency on MarkupSafe), but willing to be convinced otherwise.

@jreback
Copy link
Contributor

jreback commented Jan 3, 2017

ok then. @jorisvandenbossche or @TomAugspurger pls merge when satisfied

though I would add a big TODO in general on the HTMLFormatter class (about this being converted eventually).

@jreback jreback added this to the 0.20.0 milestone Jan 3, 2017
@jorisvandenbossche
Copy link
Member

Seems we have already an issue for this: #11700. Will move some of the discussion from here over there

@jreback
Copy link
Contributor

jreback commented Jan 21, 2017

thanks @Dr-Irv !

thought we had merged this a while back........

@Dr-Irv Dr-Irv deleted the issue14882 branch January 23, 2017 15:59
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO HTML read_html, to_html, Styler.apply, Styler.applymap Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect index label displayed on MultiIndex DataFrame
5 participants