Improve handling of the freq argument in msno.matrix() #106

@wangsen992

This is probably a very simple bug to fix. If nobody tackles it, I will probably fix it myself when I have time and open a pull request. When the freq argument of msno.matrix() is set, the code in missingno.py (shown below) builds a date_range that starts at the beginning of the first day. If one of those values cannot be found by df.index.get_loc(value), the resulting KeyError is caught and re-raised with a generic message, and the operation halts.

The problem is that time series data often does not begin or end on a full-day boundary, i.e. at 00:00. Simply clipping the generated range to the range of the input df might solve the problem.

if freq:
    ts_list = []

    if type(df.index) == pd.PeriodIndex:
        ts_array = pd.date_range(df.index.to_timestamp().date[0],
                                 df.index.to_timestamp().date[-1],
                                 freq=freq).values

        ts_ticks = pd.date_range(df.index.to_timestamp().date[0],
                                 df.index.to_timestamp().date[-1],
                                 freq=freq).map(lambda t:
                                                t.strftime('%Y-%m-%d'))

    elif type(df.index) == pd.DatetimeIndex:
        ts_array = pd.date_range(df.index.date[0], df.index.date[-1],
                                 freq=freq).values

        ts_ticks = pd.date_range(df.index.date[0], df.index.date[-1],
                                 freq=freq).map(lambda t:
                                                t.strftime('%Y-%m-%d'))
    else:
        raise KeyError('Dataframe index must be PeriodIndex or DatetimeIndex.')
    try:
        for value in ts_array:
            ts_list.append(df.index.get_loc(value))
    except KeyError:
        raise KeyError('Could not divide time index into desired frequency.')
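
To make the failure concrete, here is a minimal sketch using only pandas. The hourly data starting at 06:00 and the helper name `tick_positions` are made up for illustration and are not part of missingno. The sketch first shows that a tick at midnight is not in the index, then shows one possible fix: start the tick range at the first timestamp of the index and map each tick to the nearest row. It assumes a sorted DatetimeIndex without duplicates.

```python
import pandas as pd


def tick_positions(index, freq):
    """Hypothetical helper: integer row positions for ticks spaced by `freq`.

    Ticks start at the first timestamp of `index` instead of at midnight,
    and each tick is mapped to the nearest row, so a missing exact match
    does not raise KeyError. Assumes a sorted, unique DatetimeIndex.
    """
    ticks = pd.date_range(index[0], index[-1], freq=freq)
    positions = index.get_indexer(ticks, method="nearest")
    return ticks, positions


# Illustrative hourly data that starts and ends in the middle of a day.
idx = pd.date_range("2020-01-01 06:00", "2020-01-03 18:00",
                    freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"a": range(len(idx))}, index=idx)

# Same construction as the quoted code: ticks begin at 00:00 of the first day.
midnight_ticks = pd.date_range(df.index.date[0], df.index.date[-1], freq="D")
try:
    df.index.get_loc(midnight_ticks[0])
    midnight_found = True
except KeyError:
    # 2020-01-01 00:00 is not in the index, which starts at 06:00.
    midnight_found = False

# Clipped version: ticks begin at the first row of the data.
ticks, positions = tick_positions(df.index, "D")
print(midnight_found, positions.tolist())
```

Mapping to the nearest row is only one option. Using exact lookups on a range that starts at `df.index[0]` would also avoid the midnight problem, but it would still fail for irregularly sampled data.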

PS: Hopefully the format of this issue is clear. This is my first time raising an issue, so any suggestions for improving it are welcome.

And great work with this project!
