Commit

Validate code snippets in datasets documentation - Part 1 (#1962)
* Release/0.18.3 (#1856)

* Update release version and release notes

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Update missing release notes

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* update version

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* update release notes

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Remove comment from code example

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Remove more comments

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add YAML formatting

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add missing import

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Remove even more comments

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Remove more even more comments

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add pickle requirement to extras_require

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Try fix YAML docs

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Try fix YAML docs pt 2

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Fix code snippets in docs (#1876)

* Fix code snippets

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Separate code blocks

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Lint

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Fix issue with specifying format for SparkHiveDataSet (#1857)

Signed-off-by: jstammers <jimmy.stammers@cgastrategy.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update RELEASE.md (#1883)

* Update RELEASE.md

* fix broken link

* Update RELEASE.md

Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com>

Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Deprecate `kedro test` and `kedro lint` (#1873)

* Deprecating `kedro test` and `kedro lint`

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Deprecate commands

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Make kedro looks prettier

* Update Linting

Signed-off-by: Nok <nok_lam_chan@mckinsey.com>

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok_lam_chan@mckinsey.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Fix micro package pull from PyPI (#1848)

Signed-off-by: Florian Gaudin-Delrieu <florian.gaudindelrieu@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update Error message for `VersionNotFoundError` to handle Permission related issues better (#1881)

* Update message for VersionNotFoundError

Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>

* Add test for VersionNotFoundError for cloud protocols

* Update test_data_catalog.py

Update NoVersionFoundError test

* minor linting update

* update docs link + styling changes

* Revert "update docs link + styling changes"

This reverts commit 6088e00.

* Update test with styling changes

* Update RELEASE.md

Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com>

Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update experiment tracking documentation with working examples (#1893)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add NHS AI Lab and ReSpo.Vision to companies list (#1878)

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Document how users can use pytest instead of kedro test (#1879)

* Add best_practices.md with introductory sections

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add pytest and pytest-cov sections

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add pytest-cov coverage report

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add sections on pytest-cov

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add automated_testing to index.rst

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Reformat third-party library names and clean grammar.

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add link to virtual environment docs

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add example of good test naming

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Improve link accessibility

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Improve pytest docs link accessibility

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Add reminder link to virtual environment docs

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fix formatting in link to coverage docs

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Remove reference to /src under 'Run your tests'

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Modify references to <project_name> to <package_name>

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fix sentence structure

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

* Fix broken databricks doc link

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>

Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Capitalise Kedro-Viz in the "Visualize layers" section (#1899)

* Capitalised kedro-viz

Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in>

* capitalised Kedro viz

Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in>

* Updated set_up_experiment_tracking.md

Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in>

Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Fix linting on automated test page (#1906)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add _SINGLE_PROCESS property to CachedDataSet (#1905)

Signed-off-by: Carla Vieira <carlaprv@hotmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update the tutorial of "Visualise pipelines" (#1913)

* Change a file extension to match the previous article

Signed-off-by: dinotuku <kuan.tung@epfl.ch>

* Add a missing import

Signed-off-by: dinotuku <kuan.tung@epfl.ch>

* Change both preprocessed datasets to parquet files

Signed-off-by: dinotuku <kuan.tung@epfl.ch>

* Change data type to ParquetDataSet for parquet files

Signed-off-by: dinotuku <kuan.tung@epfl.ch>

* Add a note for installing seaborn if it is not installed

Signed-off-by: dinotuku <kuan.tung@epfl.ch>

Signed-off-by: dinotuku <kuan.tung@epfl.ch>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Document how users can use linting tools instead of `kedro lint` (#1904)

* Add documentation for linting tools

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Revert changes to commands_reference.md

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update linting docs with suggestions

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

* Update linting doc

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Make core config accessible in dict get way  (#1870)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Create dependabot.yml configuration file for version updates (#1862)

* Create dependabot.yml configuration file

* Update dependabot.yml

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* add target-branch

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* Update dependabot.yml

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* limit dependabot to just dependency folder

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* Update test_requirements.txt

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* Update MANIFEST.in

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* fix e2e

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* Update continue_config.yml

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* Update requirements.txt

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* Update requirements.txt

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* fix link

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* revert

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* Delete requirements.txt

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update dependabot config (#1928)

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update robots.txt (#1929)

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* fix broken link (#1950)

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update dependabot.yml config  (#1938)

* Update dependabot.yml

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* pin jupyterlab_services to requirements

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

* lint

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update setup.py Jinja2 dependencies (#1954)

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update pip-tools requirement from ~=6.5 to ~=6.9 in /dependency (#1957)

Updates the requirements on [pip-tools](https://github.com/jazzband/pip-tools) to permit the latest version.
- [Release notes](https://github.com/jazzband/pip-tools/releases)
- [Changelog](https://github.com/jazzband/pip-tools/blob/master/CHANGELOG.md)
- [Commits](jazzband/pip-tools@6.5.0...6.9.0)

---
updated-dependencies:
- dependency-name: pip-tools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Update toposort requirement from ~=1.5 to ~=1.7 in /dependency (#1956)

Updates the requirements on [toposort]() to permit the latest version.

---
updated-dependencies:
- dependency-name: toposort
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Add deprecation warning to package_name argument in session create() (#1953)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Remove redundant `resolve_load_version` call (#1911)

* remove a redundant function call

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Remove redundant resolve_load_version & fix test

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Fix HoloviewWriter tests with more specific error message pattern & Lint

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

* Rename tests

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Make docstring in test starter match real starters (#1916)

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>

* Try to fix formatting error

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Specify pickle import

Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: jstammers <jimmy.stammers@cgastrategy.com>
Signed-off-by: Nok <nok_lam_chan@mckinsey.com>
Signed-off-by: Florian Gaudin-Delrieu <florian.gaudindelrieu@gmail.com>
Signed-off-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Signed-off-by: ankatiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com>
Signed-off-by: yash6318 <yash.agrawal.cse21@iitbhu.ac.in>
Signed-off-by: Carla Vieira <carlaprv@hotmail.com>
Signed-off-by: dinotuku <kuan.tung@epfl.ch>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Nok <mediumnok@gmail.com>
Co-authored-by: Jimmy Stammers <jimmy.stammers@gmail.com>
Co-authored-by: Merel Theisen <49397448+MerelTheisenQB@users.noreply.github.com>
Co-authored-by: Florian Gaudin-Delrieu <9217921+FlorianGD@users.noreply.github.com>
Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com>
Co-authored-by: Yetunde Dada <43755008+yetudada@users.noreply.github.com>
Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com>
Co-authored-by: Yash Agrawal <96697569+yash6318@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Carla Vieira <carlaprv@hotmail.com>
Co-authored-by: Kuan Tung <kuan.tung@epfl.ch>
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Co-authored-by: Merel Theisen <merel.theisen@quantumblack.com>
16 people authored Nov 9, 2022
1 parent 4f42708 commit f52249b
Showing 20 changed files with 19 additions and 38 deletions.
1 change: 0 additions & 1 deletion kedro/extras/datasets/email/message_dataset.py
@@ -50,7 +50,6 @@ class EmailMessageDataSet(
 >>> msg["From"] = '"sin studly17"'
 >>> msg["To"] = '"strong bad"'
 >>>
->>> # data_set = EmailMessageDataSet(filepath="gcs://bucket/test")
 >>> data_set = EmailMessageDataSet(filepath="test")
 >>> data_set.save(msg)
 >>> reloaded = data_set.load()
5 changes: 1 addition & 4 deletions kedro/extras/datasets/geopandas/geojson_dataset.py
@@ -41,10 +41,7 @@ class GeoJSONDataSet(
 >>>
 >>> data = gpd.GeoDataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]}, geometry=[Point(1,1), Point(2,4)])
->>> # data_set = GeoJSONDataSet(filepath="gcs://bucket/test.geojson",
->>> save_args=None)
->>> data_set = GeoJSONDataSet(filepath="test.geojson",
->>> save_args=None)
+>>> data_set = GeoJSONDataSet(filepath="test.geojson", save_args=None)
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
 >>>
5 changes: 0 additions & 5 deletions kedro/extras/datasets/json/json_dataset.py
@@ -32,17 +32,13 @@ class JSONDataSet(AbstractVersionedDataSet[Any, Any]):
 >>> json_dataset:
 >>> type: json.JSONDataSet
 >>> filepath: data/01_raw/location.json
->>> load_args:
->>> lines: True
 >>>
 >>> cars:
 >>> type: json.JSONDataSet
 >>> filepath: gcs://your_bucket/cars.json
 >>> fs_args:
 >>> project: my-project
 >>> credentials: my_gcp_credentials
->>> load_args:
->>> lines: True
 Example using Python API:
 ::
@@ -51,7 +47,6 @@ class JSONDataSet(AbstractVersionedDataSet[Any, Any]):
 >>>
 >>> data = {'col1': [1, 2], 'col2': [4, 5], 'col3': [5, 6]}
 >>>
->>> # data_set = JSONDataSet(filepath="gcs://bucket/test.json")
 >>> data_set = JSONDataSet(filepath="test.json")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/csv_dataset.py
@@ -64,7 +64,6 @@ class CSVDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = CSVDataSet(filepath="gcs://bucket/test.csv")
 >>> data_set = CSVDataSet(filepath="test.csv")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/excel_dataset.py
@@ -63,7 +63,6 @@ class ExcelDataSet(
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = ExcelDataSet(filepath="gcs://bucket/test.xlsx")
 >>> data_set = ExcelDataSet(filepath="test.xlsx")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/feather_dataset.py
@@ -41,7 +41,6 @@ class FeatherDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = FeatherDataSet(filepath="gcs://bucket/test.feather")
 >>> data_set = FeatherDataSet(filepath="test.feather")
 >>>
 >>> data_set.save(data)
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/generic_dataset.py
@@ -78,7 +78,6 @@ class GenericDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = GenericDataSet(filepath="s3://test.csv", file_format='csv')
 >>> data_set = GenericDataSet(filepath="test.csv", file_format='csv')
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/hdf_dataset.py
@@ -49,7 +49,6 @@ class HDFDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = HDFDataSet(filepath="gcs://bucket/test.hdf", key='data')
 >>> data_set = HDFDataSet(filepath="test.h5", key='data')
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/json_dataset.py
@@ -56,7 +56,6 @@ class JSONDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = JSONDataSet(filepath="gcs://bucket/test.json")
 >>> data_set = JSONDataSet(filepath="test.json")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/parquet_dataset.py
@@ -68,7 +68,6 @@ class ParquetDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = ParquetDataSet(filepath="gcs://bucket/test.parquet")
 >>> data_set = ParquetDataSet(filepath="test.parquet")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/pandas/xml_dataset.py
@@ -39,7 +39,6 @@ class XMLDataSet(AbstractVersionedDataSet[pd.DataFrame, pd.DataFrame]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = XMLDataSet(filepath="gcs://bucket/test.xml")
 >>> data_set = XMLDataSet(filepath="test.xml")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
2 changes: 0 additions & 2 deletions kedro/extras/datasets/pickle/pickle_dataset.py
@@ -57,13 +57,11 @@ class PickleDataSet(AbstractVersionedDataSet[Any, Any]):
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
 >>>
->>> # data_set = PickleDataSet(filepath="gcs://bucket/test.pkl")
 >>> data_set = PickleDataSet(filepath="test.pkl", backend="pickle")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
 >>> assert data.equals(reloaded)
 >>>
->>> # Add "compress_pickle[lz4]" to requirements.txt
 >>> data_set = PickleDataSet(filepath="test.pickle.lz4",
 >>> backend="compress_pickle",
 >>> load_args={"compression":"lz4"},
1 change: 0 additions & 1 deletion kedro/extras/datasets/pillow/image_dataset.py
@@ -30,7 +30,6 @@ class ImageDataSet(AbstractVersionedDataSet[Image.Image, Image.Image]):
 >>> from kedro.extras.datasets.pillow import ImageDataSet
 >>>
->>> # data_set = ImageDataSet(filepath="gcs://bucket/test.png")
 >>> data_set = ImageDataSet(filepath="test.png")
 >>> image = data_set.load()
 >>> image.show()
27 changes: 14 additions & 13 deletions kedro/extras/datasets/plotly/plotly_dataset.py
@@ -27,21 +27,22 @@ class PlotlyDataSet(JSONDataSet):
 the JSON file directly from a pandas DataFrame through ``plotly_args``.
 Example configuration for a PlotlyDataSet in the catalog:
-::
+.. code-block:: yaml
 >>> bar_plot:
->>> type: plotly.PlotlyDataSet
->>> filepath: data/08_reporting/bar_plot.json
->>> plotly_args:
->>> type: bar
->>> fig:
->>> x: features
->>> y: importance
->>> orientation: h
->>> layout:
->>> xaxis_title: x
->>> yaxis_title: y
->>> title: Test
+>>> type: plotly.PlotlyDataSet
+>>> filepath: data/08_reporting/bar_plot.json
+>>> plotly_args:
+>>> type: bar
+>>> fig:
+>>> x: features
+>>> y: importance
+>>> orientation: h
+>>> layout:
+>>> xaxis_title: x
+>>> yaxis_title: y
+>>> title: Title
 """

 # pylint: disable=too-many-arguments
1 change: 1 addition & 0 deletions kedro/extras/datasets/redis/redis_dataset.py
@@ -47,6 +47,7 @@ class PickleDataSet(AbstractDataSet[Any, Any]):
 ::
 >>> from kedro.extras.datasets.redis import PickleDataSet
+>>> import pandas as pd
 >>>
 >>> data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
 >>> 'col3': [5, 6]})
1 change: 0 additions & 1 deletion kedro/extras/datasets/text/text_dataset.py
@@ -31,7 +31,6 @@ class TextDataSet(AbstractVersionedDataSet[str, str]):
 >>>
 >>> string_to_write = "This will go in a file."
 >>>
->>> # data_set = TextDataSet(filepath="gcs://bucket/test.md")
 >>> data_set = TextDataSet(filepath="test.md")
 >>> data_set.save(string_to_write)
 >>> reloaded = data_set.load()
1 change: 0 additions & 1 deletion kedro/extras/datasets/tracking/json_dataset.py
@@ -25,7 +25,6 @@ class JSONDataSet(JDS):
 >>>
 >>> data = {'col1': 1, 'col2': 0.23, 'col3': 0.002}
 >>>
->>> # data_set = JSONDataSet(filepath="gcs://bucket/test.json")
 >>> data_set = JSONDataSet(filepath="test.json")
 >>> data_set.save(data)
1 change: 0 additions & 1 deletion kedro/extras/datasets/tracking/metrics_dataset.py
@@ -27,7 +27,6 @@ class MetricsDataSet(JSONDataSet):
 >>>
 >>> data = {'col1': 1, 'col2': 0.23, 'col3': 0.002}
 >>>
->>> # data_set = MetricsDataSet(filepath="gcs://bucket/test.json")
 >>> data_set = MetricsDataSet(filepath="test.json")
 >>> data_set.save(data)
1 change: 0 additions & 1 deletion kedro/extras/datasets/yaml/yaml_dataset.py
@@ -32,7 +32,6 @@ class YAMLDataSet(AbstractVersionedDataSet[Dict, Dict]):
 >>>
 >>> data = {'col1': [1, 2], 'col2': [4, 5], 'col3': [5, 6]}
 >>>
->>> # data_set = YAMLDataSet(filepath="gcs://bucket/test.yaml")
 >>> data_set = YAMLDataSet(filepath="test.yaml")
 >>> data_set.save(data)
 >>> reloaded = data_set.load()
3 changes: 3 additions & 0 deletions setup.py
@@ -78,6 +78,7 @@ def _collect_requirements(requires):
 "pandas.XMLDataSet": [PANDAS, "lxml~=4.6"],
 "pandas.GenericDataSet": [PANDAS],
 }
+pickle_require = {"pickle.PickleDataSet": ["compress-pickle[lz4]~=2.1.0"]}
 pillow_require = {"pillow.ImageDataSet": ["Pillow~=9.0"]}
 plotly_require = {
 "plotly.PlotlyDataSet": [PANDAS, "plotly>=4.8.0, <6.0"],
@@ -121,6 +122,7 @@ def _collect_requirements(requires):
 "holoviews": _collect_requirements(holoviews_require),
 "networkx": _collect_requirements(networkx_require),
 "pandas": _collect_requirements(pandas_require),
+"pickle": _collect_requirements(pickle_require),
 "pillow": _collect_requirements(pillow_require),
 "plotly": _collect_requirements(plotly_require),
 "redis": _collect_requirements(redis_require),
@@ -135,6 +137,7 @@ def _collect_requirements(requires):
 **holoviews_require,
 **networkx_require,
 **pandas_require,
+**pickle_require,
 **pillow_require,
 **plotly_require,
 **spark_require,

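The setup.py change registers the pickle requirement in three places: a per-dataset map, a group-level extra, and the merged dataset-level extras. The sketch below shows how these pieces might fit together. It is illustrative only: the body of `_collect_requirements` is not shown in the diff, so the `collect_requirements` helper here is an assumed implementation (flatten, de-duplicate, sort), and the requirement maps are trimmed to two entries.

```python
from itertools import chain

# Trimmed copies of two per-dataset requirement maps from the diff.
pickle_require = {"pickle.PickleDataSet": ["compress-pickle[lz4]~=2.1.0"]}
pillow_require = {"pillow.ImageDataSet": ["Pillow~=9.0"]}


def collect_requirements(requires):
    """Flatten a {dataset: [requirements]} map into a sorted, unique list.

    Assumed behaviour: the real helper's body is not part of this diff.
    """
    return sorted(set(chain.from_iterable(requires.values())))


extras_require = {
    # Group-level extras: one key per dataset family.
    "pickle": collect_requirements(pickle_require),
    "pillow": collect_requirements(pillow_require),
    # Dataset-level extras: one key per dataset class.
    **pickle_require,
    **pillow_require,
}

print(extras_require["pickle"])
print(extras_require["pickle.PickleDataSet"])
```

Under these assumptions, both `pip install "kedro[pickle]"` and `pip install "kedro[pickle.PickleDataSet]"` would pull in `compress-pickle[lz4]`, which the second PickleDataSet docstring example depends on.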