Fix empty logs and status messages for mmless ingestion #15527
Conversation
My one hesitation with writing the log files directly is that some users may choose to configure the format of the logs written via their log4j.xml. If we just write this log line using StringUtils.format, users may not be able to parse their logs.
Have you considered this? Is there a way we could have this be written in the same format as the peon logs?
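The reviewer's concern can be illustrated with a small sketch. The class name, pattern, and task id below are hypothetical, and plain `String.format` stands in for Druid's `StringUtils.format`; the point is that a directly written line carries only the message, while a logger call would pass through whatever layout the user configured in log4j2.xml (e.g. a `PatternLayout` such as `%d %p [%t] %c - %m%n`), so log parsers expecting that layout would choke on the raw line.

```java
import java.time.Instant;

public class LogFormatSketch {
    // Simulates what a log4j PatternLayout like "%d %p [%t] %c - %m%n" would
    // produce (illustrative only; real layouts come from the user's log4j2.xml).
    static String withLayout(String level, String logger, String msg) {
        return Instant.now() + " " + level + " [main] " + logger + " - " + msg;
    }

    public static void main(String[] args) {
        // Plain String.format stands in for Druid's StringUtils.format here.
        String msg = String.format(
            "Peon for task [%s] did not report any logs.", "demo-task");

        // Writing the bare message to the log file skips the configured layout:
        System.out.println("raw:      " + msg);
        // A logger call would emit the message through the layout instead:
        System.out.println("layouted: " + withLayout("WARN", "KubernetesPeonLifecycle", msg));
    }
}
```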
None of my comments are blockers, but it would be good to double-check the log levels of the messages.
@@ -288,7 +288,11 @@ private TaskStatus getTaskStatus(long duration)
        TaskStatus.class
    );
} else {
    taskStatus = TaskStatus.failure(taskId.getOriginalTaskId(), "task status not found");
    log.info(
- log.info(
+ log.warn(
Why info and not warn? This seems like it's not normal operations
FileUtils.writeStringToFile(
    file.toFile(),
    StringUtils.format(
        "Peon for task [%s] did not report any logs. Check k8s metrics and events for the pod to see what happened.",
Should the log line on 335 be updated to include this information? Also, is there a reason that log message is not warn?
Description
It is occasionally possible for a task running with the KubernetesTaskRunner to fail to push its status.json, or for K8s to delete the task's log before the overlord can pull it.
This can happen if a pod is evicted due to memory or disk pressure. The right response is to check K8s events/metrics, but the error messages in Druid are not very clear.
Additionally, when the overlord is unable to pull task logs, it uploads an empty log file which never gets deleted, because the log file deleter ignores empty files.
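The empty-log-file bug can be sketched in a few lines. The class and method names below are hypothetical and the cleaner is reduced to a single size check; the sketch shows why a zero-byte upload is never reclaimed by a cleaner that skips empty files, and how writing a diagnostic line (as this PR does) makes the file non-empty so it is cleaned up normally.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class EmptyLogSketch {
    // Mimics a log cleaner that skips zero-length files: only files with
    // content are eligible for deletion (hypothetical simplification).
    static boolean eligibleForDeletion(Path logFile) throws IOException {
        return Files.size(logFile) > 0;
    }

    public static void main(String[] args) throws IOException {
        Path emptyLog = Files.createTempFile("task-log-empty", ".log");
        Path fixedLog = Files.createTempFile("task-log-fixed", ".log");

        // The fix: when the peon reported nothing, write a diagnostic line
        // instead of uploading an empty file.
        String taskId = "index_parallel_demo"; // hypothetical task id
        Files.writeString(fixedLog, String.format(
            "Peon for task [%s] did not report any logs. "
            + "Check k8s metrics and events for the pod to see what happened.%n",
            taskId));

        System.out.println("empty log eligible for deletion: " + eligibleForDeletion(emptyLog));
        System.out.println("fixed log eligible for deletion: " + eligibleForDeletion(fixedLog));
    }
}
```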
Fixed the bug ...
Renamed the class ...
Added a forbidden-apis entry ...
Release note
Key changed/added classes in this PR
KubernetesPeonLifecycle
This PR has: