Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MVP for "add-ons" flow within kedro new CLI command #2987

Merged
merged 92 commits into from
Oct 10, 2023

Conversation

SajidAlamQB
Copy link
Contributor

@SajidAlamQB SajidAlamQB commented Aug 30, 2023

NOTE: Kedro datasets are moving from kedro.extras.datasets to a separate kedro-datasets package in
kedro-plugins repository. Any changes to the dataset implementations
should be done by opening a pull request in that repository.

Notes for reviewers - @AhdraMeraliQB

Thank you for taking the time to review almost 600 new lines of code - we hope you enjoy your stay.

Just a few notes before diving right in:

1. This is the first iteration implementation of the add-ons flow (see: #2837 and #2850). It is important to note that the design of what the final workflow will look like is still in progress, and as a result, this PR may deviate somewhat from the descriptions in the aforementioned issues. Any changes/additions to the design will be addressed in #3047.

2. Not all changes are created equal. Here's a quick walkthrough of the changed files and what implementation lives where (ordered by degree of change):

Implementation

kedro/templates/project/hooks/post_gen_project.py: The code run by cookiecutter after the project template is generated.

kedro/templates/project/hooks/utils.py: Functions called by post_gen_project.py to setup the add-ons. Includes parsing user input, tear-down of unnecessary files, injection of necessary requirements, and alphabetic sorting of requirements.

kedro/templates/project/prompts.yml: The new add-on related prompt output when running kedro new with no starter and no config path.

kedro/framework/cli/starters.py: Additions to the output generated after project creation and regex validation for add-ons passed through from config

kedro/templates/project/{{ cookiecutter.repo_name }}/pyproject.toml, kedro/templates/project/{{ cookiecutter.repo_name }}/src/pyproject.toml, and kedro/templates/project/{{ cookiecutter.repo_name }}/src/requirements.txt: Removal of testing, linting, and documentation requirements from the base template.

kedro/templates/project/cookiecutter.json: Updated to include add-ons

kedro/templates/project/{{ cookiecutter.repo_name }}/conf/logging.yml: The return of the logging config

pyproject.toml: A fix to a somewhat recurring issue; see this stackoverflow post

Tests

tests/framework/cli/test_starters.py: Modified old tests to work with new flow, added requirements check helper function, and added tests to check success of including add-ons via CLI prompt and config file.

features/*: Any tests that make use of kedro new or any processes that are now part of add-ons have been updated in accordance to the new flow

3. If you've got the time, try and break it! After checking out this PR and installing the local version of Kedro, testing the changes is as simple as running kedro new. This is what it should look like if you do:

(kedrotest) a/path/on/your/computer % kedro new

Project Add-Ons
===============
Here you can select which add-ons you'd like to include. By default, none are included.
To read more about these add-ons and what they do visit: kedro.org/{insert-documentation}

Add-Ons
1) Linting : Provides a basic linting set up with Black and ruff
2) Testing : Provides basic testing set up with pytest
3) Custom Logging : Provides more logging options
4) Documentation: Provides basic documentations setup with Sphinx
5) Data Structure: Provides a directory structure for storing data

Which add-ons would you like to include in your project? [1-5/1,3/all/none]:
 [none]:

At which point you can select whichever combination of add-ons you wish to include (in this examples I chose none, the default). You'll then be prompted to give this project a name:

Project Name
============
Please enter a human readable name for your new project.
Spaces, hyphens, and underscores are allowed.
 [New Kedro Project]:

In this example I chose test add-ons. You should then see the following text:

You have selected no add-ons

The project name 'test add-ons' has been applied to: 
- The project title in  a/path/on/your/computer/test-add-ons/README.md 
- The folder created for your project in a/path/on/your/computer/test-add-ons 
- The project's python package in  a/path/on/your/computer/test-add-ons/src/test_add_ons

A best-practice setup includes initialising git and creating a virtual environment before running 'pip install -r src/requirements.txt' to install project-specific dependencies. Refer to the Kedro documentation: https://kedro.readthedocs.io/

(kedrotest) ahdra_merali@M-C02G2076ML87 Desktop % 

Exploring the file structure of the template generated, you should be able to determine if the add-on selection was successful (hopefully it is 🤞 ).

4. Opinion time! What do you think of this as a first implementation?

Any small pieces you think would look better changed around? (E.g Should the flow prompt for the project name before the add-ons?) Do you think the implementation is perfect as it is and you'd like to see this in ASAP? Let us know!

The rest of the notes below are the original PR description.


Description

Following #2758, this PR will seek to implement #2850

Update the kedro new command to allow users to select "add-ons" from:

testing
linting
logging
data structure
documentation

Development notes

This PR will also be partly completing #2837 as the design of the CLI is tightly coupled with implementation.

  • CLI flow has been implemented, prompts.yml and cookiecutter.json updated.
  • post_gen_update.py hook for cookiecutter added left placeholder code for rest of implementation. (Should the implementation live here? Maybe it should go into starters.py?
    Note: This PR does not implement the --add-on flag proposed by Add addon flags to kedro new command  #2873

TODO:

  • Add testing and coverage.
  • Complete implementation.
  • Update docs and release notes.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
@SajidAlamQB SajidAlamQB linked an issue Aug 30, 2023 that may be closed by this pull request
Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
@SajidAlamQB
Copy link
Contributor Author

SajidAlamQB commented Sep 1, 2023

I've looked into boolean_variables on cookiecutter and I'm not so sure they are the best approach here. Our options are multi-choices if we opt with using the boolean_variables it would mean each add-on then becomes a separate prompt for a yes/no decision. I feel this would be a detriment to user experience as it would take users longer to answer a series of yes/no questions rather than making a single multi-choice selection.

The drawback is if we go with single multi-choice option is the implementation on post_gen_project.py will be bit more complex.

Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
@SajidAlamQB
Copy link
Contributor Author

SajidAlamQB commented Sep 4, 2023

@merelcht @amandakys @lrcouto @deepyaman, this is still very much a WIP but its in a usable state. I was hoping to get some early feedback on it before going further? This PR is essentially doing both #2850 and #2837 as it made more sense to do these in tandem.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have had a quick look and left some initial comments. In general, I think the approach looks good 👍

kedro/templates/project/prompts.yml Outdated Show resolved Hide resolved
kedro/templates/project/prompts.yml Outdated Show resolved Hide resolved
kedro/templates/project/hooks/post_gen_project.py Outdated Show resolved Hide resolved
kedro/templates/project/hooks/post_gen_project.py Outdated Show resolved Hide resolved
Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: SajidAlamQB <90610031+SajidAlamQB@users.noreply.github.com>
@SajidAlamQB
Copy link
Contributor Author

SajidAlamQB commented Sep 6, 2023

As we're also implementing the Cookiecutter hooks in this PR I had some thoughts on some approaches we could take. Specifically to handle the dependencies in requirements.txt and pyproject.toml in our template. I would love to get any feedback/opinions on them:

Approach A (Reactive):

Start with a requirements.txt and pyproject.toml that includes dependencies for all add-ons by default as we do now.
The post-generation hook script will then remove the dependencies for the add-ons that the user didn't select.

Adv:

  • Easier to manage if we only occasionally add or remove add-ons.

Cons:

  • More likely have errors as if the files change somehow in the future, the script might fail or introduce weird errors.

Approach B (Proactive):

Start with a barebones requirements.txt and pyproject.toml that only contain the essential dependencies.
The post-generation hook script will append necessary dependencies based on the add-ons the user selects.

Adv:

  • Less error-prone since we're adding known dependencies rather than trying to remove potentially unknown ones.
  • Keeps the base files clean and simple.

Cons:

  • If we frequently add/change dependencies the script might also have to be updated frequently (More burden?).

Which approach do you think is more better for our needs? I am inclined to Approach B but also open to hear everyone's thoughts or opinions.

PS, maybe an even more radical idea we have separate dependency files for each add-on: e.g., requirements-linting.txt, requirements-testing.txt, etc and we use the hook script to concatenate the selected add-on dependency files into the main requirements.txt post-generation.

@AhdraMeraliQB AhdraMeraliQB self-assigned this Sep 8, 2023
Ahdra Merali and others added 6 commits September 8, 2023 18:14
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
@AhdraMeraliQB AhdraMeraliQB mentioned this pull request Oct 9, 2023
7 tasks
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Ahdra Merali added 2 commits October 9, 2023 13:31
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
@AhdraMeraliQB AhdraMeraliQB removed the request for review from stichbury October 9, 2023 12:34
Copy link
Contributor Author

@SajidAlamQB SajidAlamQB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall left some comments. I think its almost ready to go in.

kedro/framework/cli/starters.py Outdated Show resolved Hide resolved
kedro/framework/cli/starters.py Outdated Show resolved Hide resolved
kedro/framework/cli/starters.py Outdated Show resolved Hide resolved
Ahdra Merali added 6 commits October 9, 2023 13:47
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it out once more and it all looks great! Much better error messages 👍

I would add this change to the release notes already, so we don't forget for 0.19.0!

kedro/framework/cli/starters.py Show resolved Hide resolved
AhdraMeraliQB and others added 8 commits October 9, 2023 15:54
Co-authored-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Copy link
Contributor

@ankatiyar ankatiyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manually tested and it works great! 💯
The only thing I noticed is that it doesn't accept different capitalisations of "all" or "none". Maybe it's a design decision? But we could easily allow any capitalisation of "all" or "none" and simple convert to lowercase before checking!

@AhdraMeraliQB
Copy link
Contributor

Manually tested and it works great! 💯 The only thing I noticed is that it doesn't accept different capitalisations of "all" or "none". Maybe it's a design decision? But we could easily allow any capitalisation of "all" or "none" and simple convert to lowercase before checking!

Noticed that as well! I think it's fine for the MVP, but definitely something to consider with design.

CC: @amandakys

@AhdraMeraliQB AhdraMeraliQB merged commit ab576bc into develop Oct 10, 2023
52 of 58 checks passed
@AhdraMeraliQB AhdraMeraliQB deleted the feat/add-ons-cli-flow branch October 10, 2023 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add "add-ons" flow to kedro new CLI command
9 participants