The documentation at Advanced: Tutorial to create a custom dataset describes how custom datasets can be created. However, the documentation still lacks some details on how to "register" your own dataset, so it can be imported and used by the catalog.load method.
After configuring the catalog.yml as described in the docs and running the catalog.load method in a Jupyter notebook, the custom dataset wasn't recognized. The following error occurred:
DatasetError: An exception occurred when parsing config for dataset 'catalogname': Class 'projectname.datasets.CustomDataset' not found, is this a typo?
Possible steps to consider for the docs:
Create the custom dataset according to the recent docs in src/projectname/datasets/datasetname.py with the __init__.py alongside.
Exclude a possible pitfall by explicitly mentioning the (in my case required) lines in __init__.py; this file probably needs
from .datasetname import CustomDataset
__all__ = ["CustomDataset"]
cd projectname
pip install .
In conf/base/catalog.yml the type should be projectname.datasets.CustomDataset
Here, projectname and CustomDataset are placeholders that have to be replaced with the corresponding names from the respective project.
With these steps I was able to successfully call catalog.load within the Jupyter notebook.
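The layout and the re-export in __init__.py can be tried out in isolation with a small sketch like the one below. It is only an illustration under several assumptions: it builds a throwaway directory instead of a real Kedro project, the CustomDataset here is a plain stand-in class (a real one would subclass Kedro's AbstractDataset), and putting src on sys.path merely imitates what an installed package provides.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the layout from the steps above.
# "projectname", "datasetname" and "CustomDataset" are the placeholders used in this issue.
root = Path(tempfile.mkdtemp())
datasets_dir = root / "src" / "projectname" / "datasets"
datasets_dir.mkdir(parents=True)

(root / "src" / "projectname" / "__init__.py").write_text("")

# Stand-in class only: a real dataset would subclass Kedro's AbstractDataset.
(datasets_dir / "datasetname.py").write_text(
    "class CustomDataset:\n"
    "    def __init__(self, filepath):\n"
    "        self.filepath = filepath\n"
)

# Re-export the class so that projectname.datasets.CustomDataset resolves.
(datasets_dir / "__init__.py").write_text(
    "from .datasetname import CustomDataset\n"
    "\n"
    '__all__ = ["CustomDataset"]\n'
)

# Imitate an importable package by putting "src" on the import path.
sys.path.insert(0, str(root / "src"))
importlib.invalidate_caches()

module = importlib.import_module("projectname.datasets")
dataset_class = getattr(module, "CustomDataset")
print(dataset_class.__module__)
```

Without the import line in __init__.py, the getattr call would raise AttributeError, because the class would only be reachable as projectname.datasets.datasetname.CustomDataset.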
@avonarret Could you create a minimal project that we can use to reproduce this? I have done this many times and it doesn't require pip install ., so I would be surprised if it does not work now. I tested it quickly with the current main branch and it works as expected.
If the modules are importable, datasets is just one of the modules, so there is nothing special about it. In other words, is 'projectname.datasets.CustomDataset' an importable object if you run it from kedro ipython?
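One way to answer that question from a kedro ipython or Jupyter session is a small check like the following. This is a simplified sketch of resolving a dotted class path with importlib, not Kedro's actual lookup code; resolve_class is a hypothetical helper name introduced here for illustration.

```python
import importlib


def resolve_class(dotted_path: str):
    """Return the object named by 'package.module.ClassName', or None if it cannot be found."""
    module_path, _, class_name = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_path)
    except (ImportError, ValueError):
        # ImportError: the package or module is not on the import path.
        # ValueError: the path contained no module part at all.
        return None
    # Returns None when the module imports fine but does not expose the name,
    # for example when __init__.py does not re-export the class.
    return getattr(module, class_name, None)


# A standard-library class resolves; the placeholder path from this issue
# only resolves if the project package is importable in the current session.
print(resolve_class("collections.OrderedDict"))
print(resolve_class("projectname.datasets.CustomDataset"))
```

If the second call prints None, the "Class ... not found" error from the catalog is to be expected, and the cause is either an import path problem or a missing re-export in __init__.py.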
Description
The documentation at Advanced: Tutorial to create a custom dataset describes how custom datasets can be created. However, the documentation still lacks some details on how to "register" your own dataset, so it can be imported and used by the catalog.load method.
Documentation page (if applicable)
https://docs.kedro.org/en/stable/data/how_to_create_a_custom_dataset.html
Context
According to the current documentation I have created the following file structure:
After configuring the catalog.yml as described in the docs and running the catalog.load method in a Jupyter notebook, the custom dataset wasn't recognized. The following error occurred:
DatasetError: An exception occurred when parsing config for dataset 'catalogname': Class 'projectname.datasets.CustomDataset' not found, is this a typo?
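When using this structure, according to @astrojuanlu a pip install . is required (Slack Conversation).
Possible steps to consider for the docs:
Create the custom dataset according to the recent docs in src/projectname/datasets/datasetname.py with the __init__.py alongside.
Exclude a possible pitfall by explicitly mentioning the (in my case required) lines in __init__.py; this file probably needs
from .datasetname import CustomDataset
__all__ = ["CustomDataset"]
cd projectname
pip install .
In conf/base/catalog.yml the type should be projectname.datasets.CustomDataset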
Here, projectname and CustomDataset are placeholders that have to be replaced with the corresponding names from the respective project.
With these steps I was able to successfully call catalog.load within the Jupyter notebook.