[FEATURE] Add support for monthly and yearly data in TimeGapSplit

First of all: `TimeGapSplit` is a super useful feature!

Unfortunately, if I'm not mistaken, `TimeGapSplit` currently does not support monthly or yearly data. This follows from the design choice to use `timedelta` to construct the train and validation sets: `timedelta` only accepts fixed-length units (weeks, days, hours, and smaller), so it has no `months` or `years` argument. For example:

```
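from datetime import timedelta

import numpy as np
import pandas as pd
from sklego.model_selection import TimeGapSplit

# 30 rows of random data, one row per day of January 2018 (dates descending)
df = (
    pd.DataFrame(
        data=np.random.randint(0, 30, size=(30, 4)),
        columns=list('ABCy')
    )
    .assign(
        date=pd.date_range(start='1/1/2018', end='1/30/2018')[::-1]
    )
)

# timedelta has no `months` keyword, so this call fails before any split is made
cv = TimeGapSplit(
    df=df,
    date_col='date',
    train_duration=timedelta(months=1),
    valid_duration=timedelta(months=1),
)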
```

This raises `TypeError: 'months' is an invalid keyword argument for __new__()`.
It could maybe be fixed by using `pd.DateOffset` instead of `timedelta`, since `pd.DateOffset` accepts `months` and `years`.
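
As a rough illustration of the idea (not how `TimeGapSplit` is implemented), here is a minimal sketch of calendar-aware window boundaries built with `pd.DateOffset`. The variable names (`start`, `train_end`, `valid_end`, the masks) are made up for this example and are not part of the scikit-lego API:

```python
import pandas as pd

# Hypothetical example data: one date per day, January through April 2018
dates = pd.Series(pd.date_range(start="2018-01-01", end="2018-04-30", freq="D"))

# pd.DateOffset accepts `months` (and `years`), unlike datetime.timedelta
one_month = pd.DateOffset(months=1)

# Adding the offset to a Timestamp moves by calendar months,
# so the window length follows the actual month length
start = pd.Timestamp("2018-01-01")
train_end = start + one_month       # 2018-02-01
valid_end = train_end + one_month   # 2018-03-01

# Half-open windows: [start, train_end) for training, [train_end, valid_end) for validation
train_mask = (dates >= start) & (dates < train_end)
valid_mask = (dates >= train_end) & (dates < valid_end)

print(train_mask.sum(), valid_mask.sum())
```

Here the training window covers all of January and the validation window all of February, even though the two months differ in length. Whether `TimeGapSplit` can accept such offsets without further changes is something I have not checked.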

I will probably have time to look into this tomorrow, so any guidance or feedback would be very much appreciated @kayhoogland @stephanecollot.
