Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow default retry count to be determined by environment variable #1467

Closed
1 task
AetherUnbound opened this issue Aug 18, 2022 · 11 comments · Fixed by WordPress/openverse-catalog#806
Closed
1 task
Assignees
Labels
🤖 aspect: dx Concerns developers' experience with the codebase ✨ goal: improvement Improvement to an existing user-facing feature good first issue New-contributor friendly help wanted Open to participation from the community 🟩 priority: low Low priority and doesn't need to be rushed 🐍 tech: python Involves Python

Comments

@AetherUnbound
Copy link
Contributor

Description

Most of our DAGs share default arguments, which can be found here:

https://github.com/WordPress/openverse-catalog/blob/492ae8b521dd2cf7356b1295c68c72feaf7d491c/openverse_catalog/dags/common/constants.py#L14-L23

By default (and unless explicitly stated at the DAG or task level), DAGs which use these arguments have their retries set to 2, which means that the task will run 3 total times before it eventually fails.

For local development, these retries can be cumbersome and time-consuming. It would be great if we could have that value configured from an environment variable which defaults to 2 if not set.

Implementation

  • 🙋 I would be interested in implementing this feature.
@AetherUnbound AetherUnbound added good first issue New-contributor friendly help wanted Open to participation from the community ✨ goal: improvement Improvement to an existing user-facing feature 🐍 tech: python Involves Python 🟩 priority: low Low priority and doesn't need to be rushed 🤖 aspect: dx Concerns developers' experience with the codebase labels Aug 18, 2022
@ramadanomar
Copy link
Contributor

I would be interested in implementing this feature

@AetherUnbound
Copy link
Contributor Author

Awesome, I've assigned this to you @ramadanomar!

@Pmeet
Copy link
Contributor

Pmeet commented Oct 12, 2022

Hi @AetherUnbound , I'd like to work on this issue. Can you please assign this to me?
Thank you.

@AetherUnbound
Copy link
Contributor Author

Sure! I'll assign it to you 🙂 Let us know if you have any questions about getting set up or the issue itself.

@Pmeet
Copy link
Contributor

Pmeet commented Oct 14, 2022

Hi @AetherUnbound , thank you for assigning this to me. I will.

I have completed all the installations. I was working the whole day yesterday to install the pre-requisites. My Macbook Air is fairly new and I always used windows so it took little time to do all this.

I will update you about my work on this issue by end of weekend.

@Pmeet
Copy link
Contributor

Pmeet commented Oct 14, 2022

I found 2 ways I could solve this.

  1. Setting up environment variable through the .env file.
  2. Directly setting environment variable using os.environ and set and get variable

I'd like to use first method so it will use the existing framework of using dotenv to set up the environment.

Let me know which way you want me to set it up and I will implement it.

Also can you please share documentation for this with me? I will need to add how to change the retries value in there and I'd also like to see how users can set the retries right now.

@AetherUnbound
Copy link
Contributor Author

Sorry that you've been running into issues @Pmeet! We have some instructions for setting up a python environment for this project in our README.md. Were you able to use those steps to get you going? Are there other issues you're running to? If you're able to share specifics about the errors you're encountering, we'd be happy to help you get up and running 😄

The python environment should only be necessary for development with an IDE so that it knows which packages are installed and can give appropriate suggestions; the project is not intended to be run locally outside of the Docker stack.

@AetherUnbound
Copy link
Contributor Author

Oh forgive me, I must have seen an earlier message that's no longer present and was responding to that.

Ideally we would like the value set in the .env file, which is read in when loading up the Docker stack. An example of this is the DATA_REFRESH_POKE_INTERVAL variable:

https://github.com/WordPress/openverse-catalog/blob/3217ed5ecf3f9babeb96929e8883110e5da1c199/env.template#L106

https://github.com/WordPress/openverse-catalog/blob/3ee97b6cbe45337bec6289bd77b9a57512a82522/openverse_catalog/dags/data_refresh/data_refresh_task_factory.py#L143

Putting it in the .env file then accessing it using os.getenv or os.environ would work great!

@AetherUnbound
Copy link
Contributor Author

As for documentation, you can see an explanation of the retries value in the Airflow docs here: https://airflow.apache.org/docs/apache-airflow/1.10.6/_api/airflow/operators/index.html?highlight=retry#airflow.operators.BaseOperator

However, we should only need to add the os.getenv call to the dictionary defined in the description, since that's where the value is pulled from for most DAGs:

https://github.com/WordPress/openverse-catalog/blob/492ae8b521dd2cf7356b1295c68c72feaf7d491c/openverse_catalog/dags/common/constants.py#L19

@Pmeet
Copy link
Contributor

Pmeet commented Oct 15, 2022

Hi @AetherUnbound, thank you for clarifying all that.

I have edited the code and I tried to look for a way to test it but I couldn't so can you please guide me on how to the changes I made?

I used just build, just up and just test and all worked well. They returned the same output as they did before I made any changes but if there is another way to test whether my code works or not then I'd like to try that too.

Thank you.

@AetherUnbound
Copy link
Contributor Author

That's great @Pmeet! Since this is a module level variable, that makes it quite cumbersome to test (we'd have to monkeypatch the environment change then reload the module for each test). This is certainly doable, but for a change like this I don't believe tests are necessary. Do you mind submitting your changes as a PR? 😄

If you would be particularly interested in setting up tests for this nonetheless, we'd be happy to help you along with that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖 aspect: dx Concerns developers' experience with the codebase ✨ goal: improvement Improvement to an existing user-facing feature good first issue New-contributor friendly help wanted Open to participation from the community 🟩 priority: low Low priority and doesn't need to be rushed 🐍 tech: python Involves Python
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants