Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor tests for the new project creation flow in test_starters.py #3594

Open
merelcht opened this issue Feb 5, 2024 · 1 comment
Open
Labels
Component: Testing Issue/PR that addresses tests

Comments

@merelcht
Copy link
Member

merelcht commented Feb 5, 2024

Description

The test_starters.py file has grown too much. It contains a huge amount of tests and when adding a new test it's not very clear where it should go. There's also inconsistency in how the test classes are divided. Some test valid and invalid only scenarios others contain a mix.

Possible Implementation

This a suggestion of how the test_starters.py could be refactored:

  • Split by user flow:
    1. project creation through prompts
    2. project creation through CLI flags
    3. project creation through a config.yml file
    4. the remaining test_starters.py contains test related to the starters flow e.g. parsing of tools and the starters list command.
  • Move all shared functions into conftest.py
  • Mock all git interactions, see Problems below

This PR shows this break down of the test_starters.py tests: #3550

Problems

❗ Currently this setup is not working because depending on how the tests are run (individually vs as part of the full test suite) there's an issue with the .cookiecutter cache.

E         Traceback (most recent call last):
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/shutil.py", line 664, in _rmtree_safe_fd
E             os.rmdir(entry.name, dir_fd=topfd)
E         OSError: [Errno 39] Directory not empty: 'pack'
E         
E         During handling of the above exception, another exception occurred:
E         
E         Traceback (most recent call last):
E           File "/home/runner/work/kedro/kedro/kedro/framework/cli/starters.py", line 895, in _create_project
E             result_path = cookiecutter(template=template_path, **cookiecutter_args)
E                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/cookiecutter/main.py", line 80, in cookiecutter
E             base_repo_dir, cleanup_base_repo_dir = determine_repo_dir(
E                                                    ^^^^^^^^^^^^^^^^^^^
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/cookiecutter/repository.py", line 107, in determine_repo_dir
E             cloned_repo = clone(
E                           ^^^^^^
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/cookiecutter/vcs.py", line 97, in clone
E             clone = prompt_and_delete(repo_dir, no_input=no_input)
E                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/cookiecutter/utils.py", line 95, in prompt_and_delete
E             rmtree(path)
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/cookiecutter/utils.py", line 33, in rmtree
E             shutil.rmtree(path, onerror=force_delete)
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/shutil.py", line 732, in rmtree
E             _rmtree_safe_fd(fd, path, onerror)
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/shutil.py", line 660, in _rmtree_safe_fd
E             _rmtree_safe_fd(dirfd, fullname, onerror)
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/shutil.py", line 660, in _rmtree_safe_fd
E             _rmtree_safe_fd(dirfd, fullname, onerror)
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/shutil.py", line 666, in _rmtree_safe_fd
E             onerror(os.rmdir, fullname, sys.exc_info())
E           File "/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/cookiecutter/utils.py", line 25, in force_delete
E             func(path)
E         OSError: [Errno 39] Directory not empty: '/home/runner/.cookiecutters/kedro-starters/.git/objects/pack'
E         
I've seen this locally several times as well.

Idea on how to fix this

The original starter tests added before any of the tools work was done mocked all the git interactions. Maybe the tools tests should be setup in a different way. We already have e2e tests to test the full flow. I'm thinking maybe these shouldn't be interacting with the full on starter and git interaction but just unit test the inputs.

Possible Alternatives

Move all tests that need starters to the kedro-starters repo.

@merelcht merelcht added the Component: Testing Issue/PR that addresses tests label Feb 5, 2024
@merelcht merelcht mentioned this issue Feb 5, 2024
7 tasks
@merelcht merelcht added this to the Project tools follow up milestone Feb 5, 2024
@noklam
Copy link
Contributor

noklam commented Feb 5, 2024

Copying some comments to here:

The original starter tests added before any of the tools work was done mocked all the git interactions. Maybe the tools tests should be setup in a different way. We already have e2e tests to test the full flow. I'm thinking maybe these shouldn't be interacting with the full on starter and git interaction but just unit test the inputs.

I agree most of it in principles, the tests that we have currently is slightly duplicate and arguably not a real "unit test". In addition, the tests runs way too slow, because it requires network travel to fetch the actual starters, which is why the original tests mock git and fetch the local one instead. This comes with additional cons because you cannot run the tests without internet (seems violating what unit tests should be).

I expected there will still be some coupling because the post_gen_hook.py logic is defined in starters side. I think this is the main problem and there is something that we can do about it.

See this post_gen_hook in spaceflights-pyspark and post_gen_hook in spaceflights-pysparks-viz. They are both executing the exact same main function, I am pretty sure this is redundant because at this stage the correct starter has already been selected, so for sure viz and pyspark are selected if spaceflights-pyspark-viz is cloned.

If we can unified this logic then maybe we can just keep one single mock_repo in kedro instead of kedro-starters, this will likely make development much easier too.
Bonus point: I wonder if the main logic can moved into kedro

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Testing Issue/PR that addresses tests
Projects
Status: No status
Development

No branches or pull requests

2 participants