Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Status of testing Providers that were prepared on February 19, 2024 #37534

Closed
13 of 16 tasks
eladkal opened this issue Feb 19, 2024 · 23 comments
Closed
13 of 16 tasks

Status of testing Providers that were prepared on February 19, 2024 #37534

eladkal opened this issue Feb 19, 2024 · 23 comments
Labels
kind:meta High-level information important to the community testing status Status of testing releases

Comments

@eladkal
Copy link
Contributor

eladkal commented Feb 19, 2024

Body

I have a kind request for all the contributors to the latest provider packages release.
Could you please help us to test the RC versions of the providers?

The guidelines on how to test providers can be found in

Verify providers by contributors

Let us know in the comment, whether the issue is addressed.

Those are providers that require testing as there were some substantial changes introduced:

Provider cncf.kubernetes: 8.0.0rc3

All users involved in the PRs:
@MaksYermak @pankajkoti @Taragolis @potiuk @kacpermuda @VladaZakharova @hussein-awala @romsharon98 @vchiapaikeo @dirrao @pankajastro

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@eladkal eladkal added kind:meta High-level information important to the community testing status Status of testing releases labels Feb 19, 2024
@eladkal
Copy link
Contributor Author

eladkal commented Feb 19, 2024

I marked all items that were tested in RC2 #37504 as tested here too but feel free to retest them.

airflow-oss-bot added a commit to astronomer/astronomer-providers that referenced this issue Feb 19, 2024
@pankajkoti
Copy link
Member

#37514 works as expected. Thanks Elad for quick RC3.

@pankajastro
Copy link
Member

Tested my changes. looking good. Thanks for the RC's

@vchiapaikeo
Copy link
Contributor

#37508 looks good. Thank you! 🙇

@raphaelauv
Copy link
Contributor

logging_interval is at None by default in the 8.0.0 is that voluntary ?

would be more friendly for users that we have this , wdyt ( I was surprise to not see logs anymore in airflow for my pod ) ?

if get_logs:
    if logging_interval is None:
        logging_interval = 10

@pankajastro
Copy link
Member

I was surprise to not see logs anymore in airflow for my pod

That should not be the case. If the logging interval is none then you will see logs at the end of a task execution.

@raphaelauv
Copy link
Contributor

sorry I was not explicit

I was surprise to not see logs anymore in airflow for my pod during the run of the pod cause I have the option get_logs at true

@pankajkoti
Copy link
Member

@raphaelauv I believe get_logs was only meant to fetch logs at the end of the pod completion and so is also the case now.

I guess what you mean is the missing logs in trigger showing that the container is not yet finished and that the trigger is polling with sleep that we were seeing earlier. I have created a PR to show those logs #37546.

The good point now is that with setting logging_interval, we now would also be able to periodically fetch logs from the pod itself to show what's happening inside the pod.
I understand we now have less logs at the moment by default, however, I am unsure if we should call for cancelling the RC for these missing informative logs. WDYT?

@vchiapaikeo
Copy link
Contributor

Agree with @pankajkoti that this was also the behavior previously. For deferrable tasks, logs were fetched and sent at the end (on pod completion) to the task. New commits included in this RC now allow us to modify this behavior and fetch logs periodically.

If deferrable is set to false, the behavior of get_logs is synchronous and logs will be streamed to the task.

@raphaelauv
Copy link
Contributor

in 7.1.4 with only get_logs =true

it fetch the logs regularly when the pod is running

in 8.0.0rc3 with only get_logs =true

it fetch the logs only at the end


users will be surprise

@pankajkoti
Copy link
Member

pankajkoti commented Feb 19, 2024

Yes, I agree there will be no informative logs in RC3 from the triggerer saying the container is still running.

However, the trigger logs are independent of get_logs (even if we set it to False we will see the same diff between 7.1.4 and 8.0.0RC3).

In the case of deferrable mode, with 7.1.4, get_logs is relevant only for fetching logs from within the pod and at the end of pod completion - not while pod is running. It is not relevant while the pod is still running. While the pod is still running, we get logs from trigger saying the container is still running and it's going to sleep. This is independent of get_logs.

@pankajastro
Copy link
Member

pankajastro commented Feb 19, 2024

@raphaelauv I'm a bit confused now. Sorry!. Do you mean the container log is not visible regularly or the log statement something like "container is still running/pending" when the trigger loop is running?

  • We have just logging_interval in this RC and it is None by default.
  • Container logs were not available earlier regularly
  • Logging statements like "container is still running/pending" when the trigger is running should not have anything to do with param get_logs

If you are talking about just logs statement when the trigger loop is running then yeah it is important if the pod is running for a long time and we should add it but I'll not call it a breaking change

@raphaelauv
Copy link
Contributor

"Container logs was not available earlier regularly"

That's wrong, the kpo with get_logs , get the container logs regularly

@pankajkoti
Copy link
Member

"Container logs was not available earlier regularly"

That's wrong, the kpo with get_logs , get the container logs regularly

@raphaelauv It was not possible to get logs from within the container previously in deferrable mode while the pod was still running. They only appear after the pod completes it's execution. I think we're not on the same page when we say "container logs"

with 8.0.0RC3 you might want to try below example task in your DAG

create_k8s_pod = KubernetesPodOperator(
        task_id="create_k8s_pod",
        namespace=namespace,
        in_cluster=in_cluster,
        config_file=config_file,
        name="k8s_test_pod",
        deferrable=True,
        logging_interval=10,
        # startup_timeout_seconds=60,
        image="ubuntu",
        cmds=[
            "bash",
            "-cx",
            (
                "i=0; "
                "while [ $i -ne 150 ]; "
                "do i=$(($i+1)); "
                "echo $i; "
                "sleep 1; "
                "done; "
                "mkdir -p /airflow/xcom/; "
                'echo \'{"message": "good afternoon!"}\' > /airflow/xcom/return.json'
            ),
        ],
        do_xcom_push=True,
    )

and you can try to run the same task without the logging_interval parameter with 7.14.0 (as 7.14.0 did not have this param).

You will observe that you won't see logs from the echo command above periodically in 7.14.0 while the pod is running. You will only see them at the end.

@raphaelauv
Copy link
Contributor

I'm not speaking about the deferrable mode

@pankajkoti
Copy link
Member

pankajkoti commented Feb 19, 2024

I'm not speaking about the deferrable mode

Okay, :). The logging_interval param that you pointed at earlier that has been introduced only gets used for deferrable mode. It does not concern synchronous/non-deferrable mode.

And I checked both 7.14.0 and 8.0.0RC3 for synchronous (non-deferrable mode) and I do not see any diff, I can see that 8.0.0RC3 is able to show pod logs when the pod is still running.

Screenshot 2024-02-20 at 12 55 29 AM Screenshot 2024-02-20 at 12 55 46 AM

@pankajastro
Copy link
Member

pankajastro commented Feb 19, 2024

Still not sure, logging_interval is a new param in 8.0.0rc and it should not impact the sync version of the operator.
here code for the sync kpo
https://github.com/apache/airflow/blame/main/airflow/providers/cncf/kubernetes/operators/pod.py#L572-L637

@raphaelauv would you like to share some example DAG/logs

@raphaelauv
Copy link
Contributor

@pankajastro you are correct I mixed the settings between asyn and non async when I checked the RC.

@eladkal
Copy link
Contributor Author

eladkal commented Feb 20, 2024

@raphaelauv @pankajastro @pankajkoti If i get it right the resolution we are good to go with release?

@pankajkoti
Copy link
Member

@raphaelauv @pankajastro @pankajkoti If i get it right the resolution we are good to go with release?

@eladkal IMO, yes, we're good to go with release.

@pankajastro
Copy link
Member

@eladkal yes, we are good to go with the release. Thanks!

@VladaZakharova
Copy link
Contributor

Hi !
#36847 and #37072 work as expected, thank you :)

@eladkal
Copy link
Contributor Author

eladkal commented Feb 20, 2024

Thank you everyone. Providers are released.

I invite everyone to help improve providers for the next release, a list of open issues can be found here.

@eladkal eladkal closed this as completed Feb 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:meta High-level information important to the community testing status Status of testing releases
Projects
None yet
Development

No branches or pull requests

6 participants