Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved logging #480

Merged
merged 15 commits into from
Feb 1, 2024
Merged

Improved logging #480

merged 15 commits into from
Feb 1, 2024

Conversation

lieut-data
Copy link
Member

@lieut-data lieut-data commented Jan 30, 2024

Summary

Take a sledgehammer to our current logging strategy in order to align around a simple set of ideas:

  • Error logs must be human actionable (core misconfiguration, migration failure, panic in job)
  • Warning logs are for most "errors": in bulk we may consider them urgent (detected via monitoring)
  • Info logs for low-volume transitional events (jobs starting, synthetic user promotion, etc.)
  • Debug logs sparingly, and tending towards 0.
  • snake_case, consistent parameters (e.g. user_id and teams_user_id and error)
  • avoid high-volume logs
  • avoid low-quality logs

As part of this, I've gutted all testing of logs: we just accept anything during a test now. I've also fixed the spurious mutex_cron_monitoring_system failures that keep breaking the build.

There are discrete commits if there's interest in reviewing as such. I've also pulled out a separate ticket to investigate high-volume "no results found" type of logs that aren't errors in most cases, but are hard to distinguish from real errors: https://mattermost.atlassian.net/browse/MM-56719.

Ticket Link

Fixes: https://mattermost.atlassian.net/browse/MM-55515
Fixes: https://mattermost.atlassian.net/browse/MM-55516
Fixes: https://mattermost.atlassian.net/browse/MM-55741

@lieut-data lieut-data added 2: Dev Review Requires review by a core committer QA/Review Not Required labels Jan 30, 2024
@lieut-data lieut-data added this to the v1.7.0 milestone Jan 30, 2024
Copy link
Member

@sbishel sbishel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the changes to .Maybe()? Seems the mocked functions should still be called.

@lieut-data
Copy link
Member Author

Why the changes to .Maybe()? Seems the mocked functions should still be called.

Some of these are tied to the removal of log changes (i.e. we don't always need to call GetAPI() in order to log something.) But in a practical sense, none of these really "assert" anything except to pass down the mocked implementation at times, so whether they are called or not isn't needful to test, just to "wire up".

This is all going to go away once we abandon (most) mocks in favour of test containers and "real" integration tests.

Copy link
Member

@jespino jespino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I haven't read too much the specific of each log line, but we can polish that over time anyway.

@jespino
Copy link
Member

jespino commented Feb 1, 2024

My only concern is about the proposal of removal the debug logs, but I see a lot of info a warnings replacing the original debug logs, so I guess that is not reduce much our capacity of debugging things. Anyway I still hesitant of that specific point, but to be honest, I think it deserves a try. Let's add and remove debug log as they are needed during the initial phases of development of the future and keep cleaning then up short after. If we don't see ourself struggling with debugging, we can keep that approach.

@jespino jespino removed the 2: Dev Review Requires review by a core committer label Feb 1, 2024
@lieut-data lieut-data merged commit 5150282 into main Feb 1, 2024
7 checks passed
@lieut-data lieut-data deleted the improved-logging branch February 1, 2024 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants