-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add interval time check #16951
base: master
Are you sure you want to change the base?
Add interval time check #16951
Conversation
Add tests for Time interval Add check to disallow intervals later than current time Resolve NPE Update interval to accept default of start = now, end = one year later Refactor logic in constructor Add boolean check Rename and allow testing pass and fail case Remove unnecessary JSON annotations Rename boolean to make it more user-friendly Adhere to Druid code style conventions Update ingestion-spec docs to reflect functionality Add restricted start and end time Change test case constructors, revamp Equal checks Adhere to style checks and conduct test build Update start safety interval to check violations before timestamp Fix equal bug in intervals Update hashcode function
...rc/test/java/org/apache/druid/segment/indexing/granularity/ArbitraryGranularitySpecTest.java
Fixed
Show fixed
Hide fixed
...rc/test/java/org/apache/druid/segment/indexing/granularity/ArbitraryGranularitySpecTest.java
Fixed
Show fixed
Hide fixed
...rc/test/java/org/apache/druid/segment/indexing/granularity/ArbitraryGranularitySpecTest.java
Fixed
Show fixed
Hide fixed
.../src/test/java/org/apache/druid/segment/indexing/granularity/UniformGranularitySpecTest.java
Fixed
Show fixed
Hide fixed
.../src/test/java/org/apache/druid/segment/indexing/granularity/UniformGranularitySpecTest.java
Fixed
Show fixed
Hide fixed
.../src/test/java/org/apache/druid/segment/indexing/granularity/UniformGranularitySpecTest.java
Fixed
Show fixed
Hide fixed
Bump @kfaraz 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, @GWphua !
I don't think touching GranularitySpec
for this requirement is the correct approach.
Also, I am not sure if the change is even needed. We already support validation of intervals:
- Streaming supervisors have
lateMessageRejectionPeriod
andearlyMessageRejectionPeriod
.
Docs: https://druid.apache.org/docs/latest/ingestion/supervisor#io-configuration - MSQ queries can declare a range of
__time
in the WHERE clause - Batch specs can define filters on the
__time
column.
Docs: https://druid.apache.org/docs/latest/querying/filters/#filtering-on-the-timestamp-column
Do you want to raise an alert when an out-of-range record is encountered or just filter out such records?
Thanks for the review @kfaraz, From the raised issue, it seems that there are cases where year of data is incorrectly labelled. I chose to make the system throw an error when facing an out-of-range data to notify the user. Noted on the support of validation. If such capabilities are already supported in Druid, I can offer to close this PR 😄 |
Thanks, @GWphua ! |
Hi @kfaraz, Reopen pull request based on the explanation in the relevant Issue. I will explain the concept behind the check. Taking the example of daily intervals, in normal circumstances, tasks will be initialized with the following intervals: (Supposing task of size 100G per day) Assuming we create the tasks today (12 Sep 2024). Based on my understanding, if payloads are misconfigured with 20240, they will be initialized with the following: 2024-06-11/20240-06-12 // 9200G This is encountered during ingestion. The implementations you raised deal with supervisors during ingestion, and not with the Granularity. |
@GWphua , I am not sure if this is actually the case in the current code. Could you please share the logs of the ingestion that you tried where you encountered this? Also, IIUC, the issue alludes to validating task payloads. The right way to do that would be to add new validations in the method |
Fixes #16945
Description
Add interval check functionality
safeInput
(boolean),safeStart
(string),safeEnd
(string) respectively.ingestion-spec.md
.equals
andhashcode
function, adhering to rule that objects that are equal must have the same hashcode.Unit test changes
Release note
Add functionality to check your ingestion intervals, giving early reporting to erroneous time intervals during ingestion.
This PR has: