Skip to content

Conversation

@hankehly
Copy link
Contributor

@hankehly hankehly commented Nov 8, 2022

Closes #27396

Summary

The CloudWatch task log handler does not default to local logs (like the documentation says it should) when remote logs are unavailable. This PR fixes the issue.

Todo

  • Update CloudWatch log handler
  • Add/update unit tests
  • Manual check (see below for details)

Manual check

Procedure

  1. Run a test DAG with AWS CloudWatch logging configured
  2. Verify that logs are sent to CloudWatch
  3. Delete CloudWatch log stream associated with task
  4. Create log file locally echo "test" >> {log file path}
  5. Refresh the task log page

Expectations

  • "Unable to read remote logs from Cloudwatch" should appear in the log because we deleted the log stream associated with the task.
  • "Reading local file: ..." should appear in the log to show that the log handler falls back to local logs.
  • The contents of the local log file should appear (in this case the contents are "test")

Screenshot

Screenshot 2022-11-09 at 7 09 49

Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@hankehly
Copy link
Contributor Author

hankehly commented Nov 9, 2022

@vincbeck
Thanks for the early review. Once I've handled the failing tests I'll click the "Ready for review" button.
Ready for review

[{"end_of_log": True}],
)

def test_read_wrong_log_stream(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted these 2 test cases for the following reasons.

  • They overlap with the one I added (testing for generic failure to fetch remote logs)
  • AWS error messages are now propagated to the task log. These strings could change in the future and break our unit tests.

start_from_head=True,
)
return "\n".join(self._event_to_str(event) for event in events)
except Exception:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am propagating the AWS error message to the task log. I think this will be useful for debugging IAM/connection issues.

f"*** Reading local file: {self.local_log_location}/{self.remote_log_stream}\n"
)
assert log == expected_log
expected_log_pos = 26 + len(self.local_log_location) + len(self.remote_log_stream)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log position seems irrelevant in the context of the CloudWatch handler because we fetch/return all logs in one request. I'm following the example I found in the GCS task handler here.

assert metadata == {"end_of_log": False, "log_pos": 31 + len(self.local_log_location)}

@hankehly hankehly marked this pull request as ready for review November 9, 2022 22:41
@hankehly hankehly requested a review from eladkal as a code owner November 9, 2022 22:41
@potiuk potiuk merged commit c490a32 into apache:main Nov 11, 2022
@hankehly hankehly deleted the feature/issue-27396 branch November 11, 2022 01:08
@o-nikolas
Copy link
Contributor

🚀

Adityamalik123 pushed a commit to Adityamalik123/airflow that referenced this pull request Nov 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CloudWatch task handler doesn't fall back to local logs when Amazon CloudWatch logs aren't found

4 participants