DateParseError on bar plot #8803

Open
vfilimonov opened this issue Nov 13, 2014 · 8 comments

@vfilimonov
Contributor

DateParseError is raised when a Series with a DatetimeIndex is drawn as a bar plot on an axis that already contains another plot:

import pandas as pd
df = pd.Series([4,2,1,3,5], index=pd.to_datetime(['2004','2005','2006','2007','2008']))
df.plot()
df.plot(kind='bar')

Each plot works on its own, but calling them one after another raises DateParseError: day is out of range for month

If they are called in the opposite order:

df.plot(kind='bar')
df.plot()

no exception is raised, but the x-axis scale is incorrect.

pandas version: 0.15.1
numpy version: 1.9.1
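
Because the outcome reported in this thread depends on the installed versions, a small script that tries both call orders and reports what happens may help when comparing results. This is only a sketch: the helper name `try_order` is made up for illustration, the `Agg` backend is chosen so it runs without a display, and it catches any exception because the exception type has differed between reports.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series([4, 2, 1, 3, 5],
              index=pd.to_datetime(['2004', '2005', '2006', '2007', '2008']))


def try_order(kinds):
    """Plot `s` once per kind on one fresh axes; return the xlim or the error text.

    `try_order` is a hypothetical helper, not part of pandas.
    """
    fig, ax = plt.subplots()
    try:
        for kind in kinds:
            s.plot(kind=kind, ax=ax)
        return ax.get_xlim()
    except Exception as exc:  # the exception type has varied across versions
        return "{}: {}".format(type(exc).__name__, exc)
    finally:
        plt.close(fig)


print("pandas", pd.__version__, "matplotlib", matplotlib.__version__)
print("line then bar:", try_order(["line", "bar"]))
print("bar then line:", try_order(["bar", "line"]))
```

No expected output is given here, since it depends on the pandas and matplotlib versions in use.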

@jreback added the Visualization and Compat labels on Nov 14, 2014
@jreback
Contributor

jreback commented Nov 14, 2014

cc @onesandzeroes @TomAugspurger

@vfilimonov hmm, can you tell if this is coming from matplotlib or pandas? (I don't think pandas knows that it is plotting more than once on the same axes.)

@jorisvandenbossche
Member

These are Series, so they are plotted on the current existing axes (so in this case the same one).

I think the reason for the error is (simplified) that they are plotted differently: df.plot() is plotted as a timeseries (which has custom axis formatting, and index values are converted to periods if regular), while df.plot(kind='bar') just plots the actual values in the index. These two kinds of plot are not compatible, which gives rise to that error.

@vfilimonov
Contributor Author

Right, it's a difference in plotting, and the x values for these plots are different:

import matplotlib.pyplot as plt
df.plot()
print(plt.gca().get_xlim())

results in (34.0, 38.0) and

df.plot(kind='bar')
print(plt.gca().get_xlim())

results in (-0.5, 4.5)
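
The two sets of limits are consistent with the explanation given earlier in the thread. As a rough check, assuming the line plot converts this yearly index to annual periods (the `'Y'` frequency alias here is an assumption about what pandas infers), the period ordinals count years since 1970, while the bar plot uses plain positions 0..4:

```python
import pandas as pd

idx = pd.to_datetime(['2004', '2005', '2006', '2007', '2008'])

# Annual periods: the ordinal is the number of years since 1970.
periods = idx.to_period('Y')
ordinals = [p.ordinal for p in periods]
print(ordinals)  # [34, 35, 36, 37, 38]

# Bar plots place bars at integer positions instead.
positions = list(range(len(idx)))
print(positions)  # [0, 1, 2, 3, 4]
```

So (34.0, 38.0) spans the ordinals exactly, and (-0.5, 4.5) spans the positions with half a bar of padding on each side, which is why the two plots cannot share one x-axis as is.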

@TomAugspurger
Contributor

@vfilimonov what version of matplotlib are you using? I don't get a ValueError, but the scale is off.

@TomAugspurger
Contributor

As for a solution, I'm hoping to clean up a bunch of the DatetimeIndex plotting code soon.

@vfilimonov
Contributor Author

My matplotlib version is 1.4.1

I get a ValueError if I call df.plot(kind='bar') after df.plot(). If I plot them the other way round (df.plot() after df.plot(kind='bar')), there's no error, but the scale is off.

@wesm
Member

wesm commented Jul 6, 2018

This is still a bit borked. The first example works correctly now, but the reverse order plotting is not correct (it is still a bar plot after the second df.plot())

@eipiminus1

@wabu I cannot reproduce what you are describing. I don't get an error, but I also don't see the expected behaviour when running the example provided above:

image

While there is no error, I expect to see both the line and the bar chart. Maybe there has been an improvement in the latest releases? I am running pandas 0.23.1 right now.

I had a look into the code and I assume the core problem is in line https://github.com/pandas-dev/pandas/blob/master/pandas/plotting/_core.py#L1197

So the x axis of a bar chart is just enumerated instead of keeping the actual scale. This is useful especially if the index of data is trivial to map to a real axis. It also simplifies the calculation of the position and width of the bars significantly. But it is annoying if you try to join the plot with a line plot. Also, this only works if the original index is equidistant (something you usually want in a bar chart).

The best I could come up with for joining a multicolumn DataFrame bar plot with a line plot is to rescale the line plot to match the enumeration of the bar chart.

import numpy as np
import pandas as pd
df = pd.DataFrame({'a':[4,2,1,3,5], 'b':[4,5,2,3,1]}, index=pd.to_datetime(['2004','2005','2006','2007','2008']))
ax = df.plot(kind='bar')
# draw the line at the bars' integer positions 0..len(df)-1
ax.plot(np.arange(len(df)), df.a, 'g')

image

So I guess we need to handle data with a DatetimeIndex separately. I don't have the time (and expertise) to come up with a PR right now but I would be interested in your thoughts on how to tackle this, to make any effort easier.
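
One possible direction, sketched here with matplotlib directly rather than pandas internals, is to put both the bars and the line on a real date axis so they share one scale. This is not how pandas does it and not a proposed patch; the bar width of 120 days is an arbitrary choice for yearly data, and the `Agg` backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'a': [4, 2, 1, 3, 5], 'b': [4, 5, 2, 3, 1]},
                  index=pd.to_datetime(['2004', '2005', '2006', '2007', '2008']))

# Convert the dates to matplotlib's float date numbers (unit: days).
x = mdates.date2num(df.index.to_pydatetime())

fig, ax = plt.subplots()
bar_width = 120  # in days; arbitrary value picked for yearly spacing

# Two bars per date, shifted left and right of the date itself.
ax.bar(x - bar_width / 2, df['a'], width=bar_width, label='a')
ax.bar(x + bar_width / 2, df['b'], width=bar_width, label='b')

# The line uses the same date numbers, so it lines up with the bars.
ax.plot(x, df['a'], 'g', label='a (line)')

ax.xaxis_date()  # format the numeric x-axis as dates
ax.legend()
```

Unlike the enumeration approach, this keeps the actual time scale, so it also works when the index is not equidistant, at the cost of having to pick bar widths by hand.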

@mroeschke added the Bug label and removed the Compat label on Apr 10, 2020