Datetime resolution coercion #10249

Merged
merged 1 commit into from
Jun 8, 2015
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.16.2.txt
@@ -152,3 +152,4 @@ Bug Fixes

 - Bug to handle masking empty ``DataFrame``(:issue:`10126`)
 - Bug where MySQL interface could not handle numeric table/column names (:issue:`10255`)
+- Bug where ``read_csv`` and similar failed if making ``MultiIndex`` and ``date_parser`` returned ``datetime64`` array of other time resolution than ``[ns]``.
Contributor

add the issue number at the end
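
As a rough illustration of the coercion this changelog entry describes (a sketch only, not the code path in `pandas/io/parsers.py`): a `date_parser` may hand back a NumPy array typed `datetime64[s]`, and passing it through the public `pandas.to_datetime` yields a `DatetimeIndex`. The sample values mirror the test added in this PR, minus the trailing `Z`. Which resolution the resulting index reports depends on the installed pandas version, so the sketch makes no claim about it.

```python
import numpy as np
import pandas as pd

# Second-resolution array, like the one a custom date_parser might return.
raw = np.array(["2013-11-03T19:00:00"] * 3, dtype="datetime64[s]")

# Wrapping the parser output in to_datetime hands pandas a DatetimeIndex
# rather than a bare NumPy array of arbitrary resolution.
coerced = pd.to_datetime(raw)

print(raw.dtype)               # datetime64[s]
print(type(coerced).__name__)
print(coerced[0])
```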

12 changes: 7 additions & 5 deletions pandas/io/parsers.py
@@ -2057,18 +2057,20 @@ def converter(*date_cols):
                     infer_datetime_format=infer_datetime_format
                 )
             except:
-                return lib.try_parse_dates(strs, dayfirst=dayfirst)
+                return tools.to_datetime(
+                    lib.try_parse_dates(strs, dayfirst=dayfirst))
         else:
             try:
-                result = date_parser(*date_cols)
+                result = tools.to_datetime(date_parser(*date_cols))
                 if isinstance(result, datetime.datetime):
                     raise Exception('scalar parser')
                 return result
             except Exception:
                 try:
-                    return lib.try_parse_dates(_concat_date_cols(date_cols),
-                                               parser=date_parser,
-                                               dayfirst=dayfirst)
+                    return tools.to_datetime(
+                        lib.try_parse_dates(_concat_date_cols(date_cols),
+                                            parser=date_parser,
+                                            dayfirst=dayfirst))
                 except Exception:
                     return generic_parser(date_parser, *date_cols)
Contributor

Technically we should put the to_datetime around this too (though I don't know of anything that hits this line). Can you check whether anything does (e.g. put a halt there and run the test suite for the parsers)?

Contributor Author

Are you sure? It fails if I do so:

$ py.test pandas/io/tests/test_date_converters.py
============================= test session starts =============================
platform win32 -- Python 2.7.6 -- py-1.4.27 -- pytest-2.7.1
rootdir: c:\Users\Christer\code\python\pandas, inifile:
plugins: cov, mock
collected 6 items

pandas\io\tests\test_date_converters.py ...F..

================================== FAILURES ===================================
_________________________ TestConverters.test_generic _________________________

self = <pandas.io.tests.test_date_converters.TestConverters testMethod=test_generic>

    def test_generic(self):
        data = "year, month, day, a\n 2001, 01, 10, 10.\n 2001, 02, 1, 11."
        datecols = {'ym': [0, 1]}
        dateconverter = lambda y, m: date(year=int(y), month=int(m), day=1)
        df = read_table(StringIO(data), sep=',', header=0,
                        parse_dates=datecols,
                        date_parser=dateconverter)
        self.assertIn('ym', df)
>       self.assertEqual(df.ym.ix[0], date(2001, 1, 1))
E       AssertionError: Timestamp('2001-01-01 00:00:00') != datetime.date(2001, 1, 1)

pandas\io\tests\test_date_converters.py:120: AssertionError
===================== 1 failed, 5 passed in 0.63 seconds ======================

Contributor

ok, I guess that is some old behavior. datetime.date is generally not used / supported, so this is a very inefficient way of doing things.
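
The `test_generic` failure quoted above comes down to plain Python equality: `pandas.Timestamp` subclasses `datetime.datetime`, and a `datetime.datetime` never compares equal to a `datetime.date`, even at midnight on the same day. A minimal standard-library sketch, using `datetime.datetime` as a stand-in for `Timestamp` (an assumption made only to keep the example dependency-free):

```python
from datetime import date, datetime

# What the test's dateconverter lambda returns for the first row.
parsed = date(2001, 1, 1)
# Stand-in for Timestamp('2001-01-01 00:00:00') after coercion.
coerced = datetime(2001, 1, 1, 0, 0, 0)

# Same calendar day, but datetime vs. date equality is always False.
print(coerced == parsed)         # False
# Comparing only the date part does match.
print(coerced.date() == parsed)  # True
```

So wrapping the `generic_parser` fallback in `to_datetime` would change the type callers see, which is the behavior the existing test pins down.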


27 changes: 26 additions & 1 deletion pandas/io/tests/test_date_converters.py
@@ -11,7 +11,7 @@
 import numpy as np
 from numpy.testing.decorators import slow

-from pandas import DataFrame, Series, Index, isnull
+from pandas import DataFrame, Series, Index, MultiIndex, isnull
 import pandas.io.parsers as parsers
 from pandas.io.parsers import (read_csv, read_table, read_fwf,
                                TextParser)
@@ -119,6 +119,31 @@ def test_generic(self):
         self.assertIn('ym', df)
         self.assertEqual(df.ym.ix[0], date(2001, 1, 1))

+    def test_dateparser_resolution_if_not_ns(self):
+        # issue 10245
+        data = """\
Contributor

add the issue number here

+date,time,prn,rxstatus
+2013-11-03,19:00:00,126,00E80000
+2013-11-03,19:00:00,23,00E80000
+2013-11-03,19:00:00,13,00E80000
+"""
+
+        def date_parser(date, time):
+            datetime = np.array(date + 'T' + time + 'Z', dtype='datetime64[s]')
+            return datetime
+
+        df = read_csv(StringIO(data), date_parser=date_parser,
+                      parse_dates={'datetime': ['date', 'time']},
+                      index_col=['datetime', 'prn'])
+
+        datetimes = np.array(['2013-11-03T19:00:00Z']*3, dtype='datetime64[s]')
+        df_correct = DataFrame(data={'rxstatus': ['00E80000']*3},
+                               index=MultiIndex.from_tuples(
+                                   [(datetimes[0], 126),
+                                    (datetimes[1], 23),
+                                    (datetimes[2], 13)],
+                                   names=['datetime', 'prn']))
+        assert_frame_equal(df, df_correct)

 if __name__ == '__main__':
     import nose