ci: port lint, unit test, and e2e tests to Actions #155

ankatiyar · 2023-04-03T13:22:55Z

Description

Resolves kedro-org/kedro#2153

Development notes

The migration from CircleCI to GitHub Actions will have to be done in parts. This PR addresses the following parts:

Unit tests on Linux for kedro-docker, kedro-datasets, kedro-telemetry & kedro-airflow
Lint test on Linux for kedro-docker, kedro-datasets, kedro-telemetry & kedro-airflow
[EDITED TO ADD] end to end test on Linux for kedro-docker, kedro-telemetry & kedro-airflow
[EDITED TO ADD] windows tests for all plugins - I've been able to modify the unit-test job to work with both linux and windows

TO DOs

The following will be added in separate PRs

~~Unit tests on windows for kedro-docker, kedro-datasets, kedro-telemetry & kedro-airflow~~
~~End-to-end tests for kedro-docker, kedro-datasets, kedro-telemetry & kedro-airflow~~
sync and release workflows - To be done in a follow up task - Migrate the release workflow from CircleCI to GitHub Actions #176

Notes

I've been experimenting with Github Actions on a fork of this repository. You can check the demo PRs on that repository that trigger these tests - https://github.com/ankatiyar/kedro-plugins/pulls
The lint test is only run on Python 3.8. This is what we had on CircleCI, and the lint tests fail for any other version of Python (for kedro-datasets) because of this line in kedro-datasets/test_requirements.txt
I've also added kedro-datasets/tests/tensorflow/test_tensorflow_model_dataset.py to trufflehog-ignore.txt because secret-scan was complaining about this part
Also pushing some temporary changes to the README.md files of all plugins to trigger these tests and will revert them before merging.
[Question] Adding macOS test to this would be fairly trivial - just adding macos-latest to matrix.os list in the unit-tests job. Should we do this? The Github action workflows are slow mostly for kedro-datasets (at least the way it's set up now) but they run in parallel.
[Question] Should we also remove the corresponding CircleCI workflows as a part of the same PR?
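
For the macOS question above, a minimal sketch of what extending the matrix might look like. The `unit-tests` job name and the `matrix.os` key come from the description; the Python version list and the exact steps are assumptions for illustration, not the PR's actual workflow.

```yaml
# Sketch only: Python versions and steps are assumed, not copied from the PR.
jobs:
  unit-tests:
    strategy:
      matrix:
        # macos-latest is the proposed addition to the existing OS list
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: ["3.8", "3.9", "3.10"]  # assumed version list
    runs-on: ${{ matrix.os }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
```

Each OS and Python version pair becomes its own job, so adding one OS adds one job per Python version.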

Checklist

Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Added a description of this change in the relevant RELEASE.md file
Added tests to cover my changes

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

merelcht · 2023-04-04T08:59:24Z

FYI @SajidAlamQB and @AhdraMeraliQB

noklam

Few questions:

Are the unit tests triggered for all dataset? Can we trigger it conditionally with file changes like the current setup?
For the secret-scan, is there a way to disable by line level instead of file level? (similar to pylint: disable=xxx)
Not entirely sure why lint fails only 3.8, is that because of missing imports? Can you share the related CI run.

[Question] Adding macOS test to this would be fairly trivial - just adding macos-latest to matrix.os list in the unit-tests job. Should we do this? The Github action workflows are slow mostly for kedro-datasets (at least the way it's set up now) but they run in parallel.
I'd love to see them but exclude it from our merge requirements.

[Question] Should we also remove the corresponding CircleCI workflows as a part of the same PR?
No strong opinion about this, we can keep it running in parallel for a sprint if we want to be on the safe side, as long as it's not slowing us down. (I don't think it will anyway)

Haven't looked into the config in detail, so I just have high-level questions. I will have another look at this.

ankatiyar · 2023-04-06T08:37:27Z

Thanks @noklam for taking a look.

Are the unit tests triggered for all dataset? Can we trigger it conditionally with file changes like the current setup?

The current setup with CircleCI also runs the tests for all datasets if the changes are made in kedro-datasets. I think it might be too complicated to only run specific tests for the datasets changed, but we can look into it when we iterate over the GitHub Actions setup once it's done.

For the secret-scan, is there a way to disable by line level instead of file level? (similar to pylint: disable=xxx)

I couldn't find a way to disable it for that specific line.

Not entirely sure why lint fails only 3.8, is that because of missing imports? Can you share the related CI run.

The lint test on CircleCI is currently also only set up to run with Python 3.8. For any other version, snowflake-snowpark-python is not installed and the lint test then complains about not being able to import snowflake.snowpark. See this CI run. The error can be disabled, though. Either way, it makes sense to run the lint test for only one Python version instead of all four.

[Question] Adding macOS test to this would be fairly trivial - just adding macos-latest to matrix.os list in the unit-tests job. Should we do this? The Github action workflows are slow mostly for kedro-datasets (at least the way it's set up now) but they run in parallel.

I'd love to see them but exclude it from our merge requirements.

I'll add it. As a side note, I think none of the tests currently are required to pass before merging, only two approvals are required. This is part of the repo settings not CI config.

merelcht

Is it necessary to have a separate CI file for each plugin? The content is pretty much the same apart from the specific plugin name, so I'm wondering if it's possible to generalise and just trigger the right plugin build depending on the changes made?

If that's not straightforward, I'd be happy to get it setup like this first, and then we can try optimise later 🙂

ankatiyar · 2023-04-06T09:55:03Z

Is it necessary to have a separate CI file for each plugin? The content is pretty much the same apart from the specific plugin name, so I'm wondering if it's possible to generalise and just trigger the right plugin build depending on the changes made?

If that's not straightforward, I'd be happy to get it setup like this first, and then we can try optimise later 🙂

Thanks @merelcht!
From what I understand, GitHub Actions is supposed to have one config file per workflow, as opposed to CircleCI, which has the config for all workflows in one config.yml file. So my logic was to separate out the workflow/jobs which can be reused by all plugins in unit-test.yml.
I'm sure there is a way to do path filtering in the same workflow, but doing it with separate workflows and filtering paths under on.push.paths seemed like the way it is intended to be done in GitHub Actions. I'm happy to look into it, but I think it makes sense to also get the e2e tests, Windows tests and the sync & release workflows in first and then optimise the whole thing.
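
To make the layout described above concrete, here is a rough sketch of a per-plugin workflow that filters on paths and calls the shared workflow. The file names `airflow-ci.yml` and `unit-test.yml` and the `plugin` input appear elsewhere in this PR; the workflow name, triggers, path globs and steps are assumptions.

```yaml
# .github/workflows/airflow-ci.yml (caller; triggers and globs are assumed)
name: Run checks on kedro-airflow
on:
  push:
    paths:
      - "kedro-airflow/**"
  pull_request:
    paths:
      - "kedro-airflow/**"
jobs:
  unit-tests:
    # Call the reusable workflow stored in the same repository
    uses: ./.github/workflows/unit-test.yml
    with:
      plugin: kedro-airflow
```

```yaml
# .github/workflows/unit-test.yml (reusable workflow; steps are illustrative)
on:
  workflow_call:
    inputs:
      plugin:
        type: string
        required: true
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run unit tests
        run: |
          cd ${{ inputs.plugin }}
          pytest tests
```

With this split, only the small caller file is repeated per plugin, and a change under `kedro-airflow/` triggers only that plugin's checks.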

noklam · 2023-04-06T10:08:24Z

Path filtering is very much possible, I can help with that later.

ankatiyar · 2023-04-06T14:12:07Z

Update: Thanks to @SajidAlamQB for getting the end to end tests running! 🎉 I've also added them to this PR!

deepyaman · 2023-04-06T14:39:14Z

Is it necessary to have a separate CI file for each plugin? The content is pretty much the same apart from the specific plugin name, so I'm wondering if it's possible to generalise and just trigger the right plugin build depending on the changes made?
If that's not straightforward, I'd be happy to get it setup like this first, and then we can try optimise later 🙂

Thanks @merelcht! From what I understand, GitHub Actions is supposed to have one config file per workflow, as opposed to CircleCI, which has the config for all workflows in one config.yml file. So my logic was to separate out the workflow/jobs which can be reused by all plugins in unit-test.yml. I'm sure there is a way to do path filtering in the same workflow, but doing it with separate workflows and filtering paths under on.push.paths seemed like the way it is intended to be done in GitHub Actions. I'm happy to look into it, but I think it makes sense to also get the e2e tests, Windows tests and the sync & release workflows in first and then optimise the whole thing.

~~You can have a reusable workflow to test a single, parametrized plugin and call that for each plugin; see https://docs.github.com/en/actions/using-workflows/reusing-workflows.~~ Never mind, I see you already have a reusable workflow defined. Ignore me.

deepyaman · 2023-04-06T14:41:57Z

Can we make sure the job names are unique? Can see https://futurestud.io/tutorials/github-actions-customize-the-job-name for an example of how to use a variable in the name (matrix already does that for you). This may also resolve itself if using a reusable workflow (as in #155 (comment)), so maybe can delay doing the name overrides until then.

(But I think it is important they don't have these autogenerated suffixes to avoid conflicts, because that is very difficult to work with, and is why it's hard to modify more than one plugin right now even in our current setup; if you remove changes from a plugin, you still have that artifact of failed jobs potentially.)

I'm dumb/haven't had my coffee; this is the existing CircleCI. 🤦 Well, glad the move to GitHub Actions fixes that.

deepyaman · 2023-04-06T14:55:47Z

.github/workflows/unit-test.yml

+          pip install git+https://github.com/kedro-org/kedro@main
+          pip install -r test_requirements.txt


Suggested change

-          pip install git+https://github.com/kedro-org/kedro@main
-          pip install -r test_requirements.txt
+          pip install git+https://github.com/kedro-org/kedro@main -r test_requirements.txt

Didn't check if this is also there in the existing CI, but we should not have multiple pip install commands, because they can override dependencies installed in a previous step.

The current CI setup does it this way as well, kedro is installed first and then the test_requirements.txt for the plugin being tested. I tried making this change in my forked repo but the tests start failing at the "Installing dependencies" stage for kedro-datasets because of dependency resolution conflict. (See this failed run)

I think this is a critical problem that needs resolving--whether it's in scope of this PR or not is a separate issue.

What this means is that, in reality, we don't have resolvable requirements, and we're only able to get to a resolvable state by overwriting some of the previously-installed requirements. Some of the stuff installed in the pip install -r test_requirements.txt are not actually going to be compatible with Kedro on main, it seems.

This is definitely something we need to look into more. I'd suggest creating a separate ticket.

deepyaman · 2023-04-06T14:56:20Z

.github/workflows/unit-test.yml

+            pip install git+https://github.com/kedro-org/kedro@main
+            pip install -r test_requirements.txt


Same

Suggested change

-            pip install git+https://github.com/kedro-org/kedro@main
-            pip install -r test_requirements.txt
+            pip install git+https://github.com/kedro-org/kedro@main -r test_requirements.txt

deepyaman · 2023-04-06T15:24:45Z

.github/workflows/unit-test.yml

+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+      - name: Set up Python 3.8


Nit: Any reason not to lint on the latest supported version instead of 3.8?

The current setup also runs lint on 3.8. The lint tests fail for other versions of Python for kedro-datasets because snowflake-snowpark-python is only installed for Python 3.8, so pylint throws import errors. We can change the Python version to 3.10 and suppress the error.

Thanks for answering, I'm fine with either route (or doing this in a separate task in the future); I was just wondering. :)

deepyaman · 2023-04-06T15:30:58Z

Sorry for haphazardly dropping comments; I'm sure I've violated all sorts of PR review etiquette! Looks great overall. :) Left a number of comments, but I think a few that are particularly important to address (IMO):

Make sure the cache keys are not all the same across plugins, jobs (they seem to only be differentiated by Python version right now)
Make sure requirements for a plugin are installed in one go, rather than successive pip install commands

merelcht · 2023-04-13T09:50:23Z

@ankatiyar To answer your questions on the PR description:

[Question] Adding macOS test to this would be fairly trivial - just adding macos-latest to matrix.os list in the unit-tests job. Should we do this? The Github action workflows are slow mostly for kedro-datasets (at least the way it's set up now) but they run in parallel.

What is the difference between the macOS and Ubuntu tests? Does Kedro behave significantly differently on these systems, like we've seen with Windows? And what do you think the pros of adding this are?

[Question] Should we also remove the corresponding CircleCI workflows as a part of the same PR?

I would do this in a separate PR to make it easier to review. It would be nice if you could do a small show and tell for the team to talk about what's different and how we access the builds etc 🙂

ankatiyar · 2023-04-18T15:14:00Z

Update: I've been able to make the Windows tests work by modifying the unit-test job instead of creating a new job for the Windows tests, so I've added it to this.
I've created follow-up tickets.

@merelcht There is really no difference between the setup for macOS and Ubuntu. There is, however, a limit of 5 concurrent jobs on macOS.

merelcht · 2023-04-19T10:48:20Z

@merelcht There is really no difference between the setup for macOS and Ubuntu. There is, however, a limit of 5 concurrent jobs on macOS.

If there's no difference I don't really see a point in adding them to be honest.

merelcht

Great work @ankatiyar and also @SajidAlamQB for making the Windows tests work 👍

SajidAlamQB

This is awesome work @ankatiyar! 🌟 I believe this is in a good state for now. We can optimise things like pip install overwrites in subsequent issues.

deepyaman

LGTM overall! Added some comments, but I think the main thing that would be quick/easy to try and add some value is adding pytables to the requirements file with an environment marker. (Also, if you could update the PR title to not be Linux specific now, or I can do it.)

deepyaman · 2023-04-19T12:34:32Z

.github/workflows/check-plugin.yml

+      - name: Cache python packages for Linux
+        if: matrix.os == 'ubuntu-latest'
+        uses: actions/cache@v3
+        with:
+          path: ~/.cache/pip
+          key: ${{inputs.plugin}}-${{matrix.os}}-python-${{matrix.python-version}}
+          restore-keys: ${{inputs.plugin}}
+      - name: Cache python packages for Windows
+        if: matrix.os == 'windows-latest'
+        uses: actions/cache@v3
+        with:
+          path: ~\AppData\Local\pip\Cache
+          key: ${{inputs.plugin}}-${{matrix.os}}-python-${{matrix.python-version}}
+          restore-keys: ${{inputs.plugin}}


https://github.com/actions/cache/blob/main/tips-and-workarounds.md#cross-os-cache looks cool, but not sure it would work
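
Separately from the cross-OS cache idea linked above, one way the two OS-specific cache steps might be collapsed into one is to ask pip for its cache directory at run time. This is a sketch of a different technique, not what the PR does; the step id `pip-cache` and output name `dir` are hypothetical, while the key and restore-keys are copied from the diff.

```yaml
# Sketch only: replaces the separate Linux/Windows cache steps with one.
- name: Get pip cache dir
  id: pip-cache          # hypothetical step id
  shell: bash            # bash is available on both Linux and Windows runners
  run: echo "dir=$(pip cache dir)" >> "$GITHUB_OUTPUT"
- name: Cache python packages
  uses: actions/cache@v3
  with:
    # Resolves to ~/.cache/pip on Linux and the AppData pip cache on Windows
    path: ${{ steps.pip-cache.outputs.dir }}
    key: ${{inputs.plugin}}-${{matrix.os}}-python-${{matrix.python-version}}
    restore-keys: ${{inputs.plugin}}
```

Because `matrix.os` stays in the key, caches are still kept per OS; only the duplicated step definition goes away.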

deepyaman · 2023-04-19T12:37:45Z

.github/workflows/check-plugin.yml

+      - name: Install pytables (only for kedro-datasets on windows)
+        if: matrix.os == 'windows-latest' && inputs.plugin == 'kedro-datasets'
+        run: pip install tables


Just thinking about it: if using pip, should this not be part of the test_requirements.txt, with an environment marker (like https://stackoverflow.com/a/54281345)?

I quickly tried it. The test_requirements.txt for kedro-datasets has this -

tables~=3.6.0; platform_system == "Windows" and python_version < '3.9'
tables~=3.6; platform_system != "Windows"

Installing tables 3.6 does not work on Windows with Python 3.10. I can check in a separate PR whether the version of PyTables can be safely bumped, so that this step can be removed after this is merged in.

deepyaman · 2023-04-19T12:43:03Z

.github/workflows/check-plugin.yml

+      - name: Run unit tests for Windows / kedro-airflow, kedro-docker, kedro-telemetry
+        if: matrix.os == 'windows-latest' && inputs.plugin != 'kedro-datasets'
+        run: |
+          cd ${{ inputs.plugin }}
+          pytest tests
+      - name: Run unit tests for Windows / kedro-datasets / no spark sequential
+        if: matrix.os == 'windows-latest' && inputs.plugin == 'kedro-datasets' && matrix.python-version == '3.10'
+        run: |
+          make test-no-spark-sequential
+      - name: Run unit tests for Windows / kedro-datasets / no spark parallel
+        if: matrix.os == 'windows-latest' && inputs.plugin == 'kedro-datasets' && matrix.python-version != '3.10'
+        run: |
+          make test-no-spark


Maybe an item outside of the scope of the PR, but is this the same behavior on the current CircleCI build? Seems so convoluted. Maybe it makes more sense to abstract most (all?) of these differences in the Makefile, if using one anyway.

deepyaman · 2023-04-19T12:43:57Z

.github/workflows/check-plugin.yml

+          cd ${{ inputs.plugin }}
+          pip install git+https://github.com/kedro-org/kedro@main
+          pip install -r test_requirements.txt
+          pip freeze


Nit: You have pip freeze as a separate step above, and as part of requirements installation in this workflow.

Moved this to a separate step here as well.

This reverts commit 8203daa.

* Add unit test + lint test on GA
* trigger GA - will revert
* Fix lint
* Add end to end tests
* Add cache key
* Add cache action
* Rename workflow files
* Lint + add comment + default bash
* Add windows test
* Update workflow name + revert changes to READMEs
* Add kedro-telemetry/RELEASE.md to trufflehog ignore
* Add pytables to test_requirements remove from workflow
* Revert "Add pytables to test_requirements remove from workflow" (This reverts commit 8203daa.)
* Separate pip freeze step

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Signed-off-by: Tingting_Wan <tingting_wan@mckinsey.com>
Kedro bound for `kedro-datasets` (#140) * Less strict pin on Kedro for datasets Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * ci: don't run checks on both `push`/`pull_request` (#192) * ci: don't run checks on both `push`/`pull_request` * ci: don't run checks on both `push`/`pull_request` * ci: don't run checks on both `push`/`pull_request` * ci: don't run checks on both `push`/`pull_request` Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * chore: delete extra space ending check-release.yml (#210) Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * ci: Create merge-gatekeeper.yml to make sure PR only merged when all tests checked. (#215) * Create merge-gatekeeper.yml * Update .github/workflows/merge-gatekeeper.yml --------- Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * ci: Remove the CircleCI setup (#209) * remove circleci setup files and utils * remove circleci configs in kedro-telemetry * remove redundant .github in kedro-telemetry * Delete continue_config.yml * Update check-release.yml * lint * increase timeout to 40 mins for docker e2e tests Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * feat: Dataset API add `save` method (#180) * [FEAT] add save method to APIDataset Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [ENH] create save_args parameter for api_dataset Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [ENH] add tests for socket + http errors Signed-off-by: <jmcdonnell@fieldbox.ai> Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [ENH] check save data is json Signed-off-by: <jmcdonnell@fieldbox.ai> Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [FIX] clean code Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [ENH] handle different data types Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [FIX] test coverage for exceptions Signed-off-by: jmcdonnell 
<jmcdonnell@fieldbox.ai> * [ENH] add examples in APIDataSet docstring Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * sync APIDataSet from kedro's `develop` (#184) * Update APIDataSet Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync ParquetDataSet Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync Test Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Linting Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Revert Unnecessary ParquetDataSet Changes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Sync release notes Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> --------- Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [FIX] remove support for delete method Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [FIX] lint files Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [FIX] fix conflicts Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [FIX] remove fail save test Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [ENH] review suggestions Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [ENH] fix tests Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> * [FIX] reorder arguments Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> --------- Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> Signed-off-by: <jmcdonnell@fieldbox.ai> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Co-authored-by: jmcdonnell <jmcdonnell@fieldbox.ai> Co-authored-by: Nok Lam Chan <mediumnok@gmail.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * ci: Automatically extract release notes for GitHub Releases (#212) * ci: Automatically extract release notes Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * fix lint Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Raise exceptions Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Lint Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> * Lint 
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> --------- Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * feat: Add metadata attribute to datasets (#189) * Add metadata attribute to all datasets Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * feat: Add ManagedTableDataset for managed Delta Lake tables in Databricks (#206) * committing first version of UnityTableCatalog with unit tests. This datasets allows users to interface with Unity catalog tables in Databricks to both read and write. Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * renaming dataset Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * adding mlflow connectors Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * fixing mlflow imports Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * cleaned up mlflow for initial release Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * cleaned up mlflow references from setup.py for initial release Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * fixed deps in setup.py Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * adding comments before intiial PR Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * moved validation to dataclass Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * bug fix in type of partition column and cleanup 
Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * updated docstring for ManagedTableDataSet Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * added backticks to catalog Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * fixing regex to allow hyphens Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah 
<danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/test_requirements.txt Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * adding backticks to catalog Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Require pandas < 2.0 for compatibility with spark < 3.4 Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Replace use of walrus operator Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add test coverage for 
validation methods Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Remove unused versioning functions Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Fix exception catching for invalid schema, add test for invalid schema Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add pylint ignore Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add tests/databricks to ignore for no-spark tests Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Nok Lam Chan <mediumnok@gmail.com> * Update kedro-datasets/kedro_datasets/databricks/managed_table_dataset.py Co-authored-by: Nok Lam Chan <mediumnok@gmail.com> * Remove spurious mlflow test dependency Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add explicit check for database existence Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Remove character limit for table names Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Refactor validation steps in ManagedTable Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Remove spurious checks for table and schema name existence Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> --------- Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Co-authored-by: Danny Farah <danny.farah@quantumblack.com> Co-authored-by: Danny Farah <danny_farah@mckinsey.com> Co-authored-by: Nok Lam Chan <mediumnok@gmail.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * docs: Update APIDataset docs and refactor (#217) * Update APIDataset docs and refactor * Acknowledge community contributor * Fix more broken doc Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> * Lint Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Fix release notes of upcoming kedro-datasets --------- Signed-off-by: 
Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * feat: Release `kedro-datasets` version `1.3.0` (#219) * Modify release version and RELEASE.md Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Add proper name for ManagedTableDataSet Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> * Update kedro-datasets/RELEASE.md Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Revert lost semicolon for release 1.2.0 Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> --------- Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * docs: Fix APIDataSet docstring (#220) * Fix APIDataSet docstring Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Add release notes Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> * Separate [docs] extras from [all] in kedro-datasets Fix gh-143. 
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> --------- Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * Update kedro-datasets/tests/spark/test_spark_streaming_dataset.py Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * Update kedro-datasets/kedro_datasets/spark/spark_streaming_dataset.py Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * Update kedro-datasets/setup.py Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> * fix linting issue Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> --------- Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com> Signed-off-by: Tingting_Wan <tingting_wan@mckinsey.com> Signed-off-by: Juan Luis Cano Rodríguez <hello@juanlu.space> Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com> Signed-off-by: Nok Chan <nok.lam.chan@quantumblack.com> Signed-off-by: Tom Kurian <tom_kurian@mckinsey.com> Signed-off-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com> Signed-off-by: jmcdonnell <jmcdonnell@fieldbox.ai> Signed-off-by: <jmcdonnell@fieldbox.ai> Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com> Signed-off-by: Danny Farah <danny_farah@mckinsey.com> Signed-off-by: Jannic Holzer <jannic.holzer@quantumblack.com> Co-authored-by: Juan Luis Cano Rodríguez <hello@juanlu.space> Co-authored-by: Tingting Wan <110382691+Tingting711@users.noreply.github.com> Co-authored-by: Nok Lam Chan <nok.lam.chan@quantumblack.com> Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu> Co-authored-by: Nok Lam Chan <mediumnok@gmail.com> Co-authored-by: Ankita Katiyar <110245118+ankatiyar@users.noreply.github.com> Co-authored-by: Juan Luis Cano Rodríguez 
<juan_luis_cano@mckinsey.com> Co-authored-by: Tom Kurian <tom_kurian@mckinsey.com> Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com> Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com> Co-authored-by: McDonnellJoseph <90898184+McDonnellJoseph@users.noreply.github.com> Co-authored-by: jmcdonnell <jmcdonnell@fieldbox.ai> Co-authored-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com> Co-authored-by: Jannic <37243923+jmholzer@users.noreply.github.com> Co-authored-by: Danny Farah <danny.farah@quantumblack.com> Co-authored-by: Danny Farah <danny_farah@mckinsey.com> Co-authored-by: kuriantom369 <116743025+kuriantom369@users.noreply.github.com>

ankatiyar added 2 commits April 3, 2023 17:16

Add unit test + lint test on GA

5f7c890

trigger GA - will revert

8ce58f6

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

ankatiyar requested review from deepyaman and noklam April 3, 2023 13:31

Fix lint

1167fc4

Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>

ankatiyar mentioned this pull request Apr 3, 2023

Migrate kedro-plugins from CircleCI to Github Actions kedro-org/kedro#2153

Closed

ankatiyar requested a review from merelcht April 4, 2023 08:06

ankatiyar requested review from SajidAlamQB and AhdraMeraliQB April 4, 2023 09:09

noklam reviewed Apr 5, 2023

merelcht reviewed Apr 6, 2023

Add end to end tests

4ad9683

ankatiyar changed the title ~~Github Actions Migration (Part 1) : Unit tests (linux) and lint test~~ Github Actions Migration (Part 1) : Unit tests (linux)+ lint + e2e test Apr 6, 2023