Write metadata cache data to mappings _meta with refresh time update #805
base: main
…rch-project#744)
* write mock metadata cache data to mappings _meta
* Enable write to cache by default
* bugfix: _meta.latestId missing when create index
* set and unset config in test suite
* fix: use member flintSparkConf
---------
Signed-off-by: Sean Kao <seankao@amazon.com>
add label to backport to the nexus branch.
5f3af3b to 7a8e1f3
 * Handles refresh for refresh mode AUTO, which is used exclusively by auto refresh index with
 * internal scheduler.
 */
private def refreshIndexAuto(
Why don't we update for auto refresh?
 * @throws IllegalArgumentException if the schedule string is invalid
 */
public static IntervalSchedule parse(String scheduleStr) {
public static Long parseMillis(String scheduleStr) {
parseMillis sounds like it parses a millisecond string. Should we call it parseAndConvertToMillis?
Description
Metadata Cache Writer
For the most part, same as
In addition to the regular metadata storage using FlintIndexMetadataService, we're dual-writing additional fields, defined by FlintMetadataCache, to the index mappings _meta field. It's intended for frontend users to access some crucial metadata for an index quickly without invoking another backend API call.
This PR adds the following fields for all indexes if the Spark config spark.flint.metadataCacheWrite.enabled is set to true:
* _meta.properties.metadataCacheVersion: "1.0"
* _meta.properties.refreshInterval: Integer. Refresh interval of an index, measured in seconds. This field is added only if the index refresh type is auto refresh and refresh_interval is set.
* _meta.properties.sourceTables: Array of Strings. For now, it's mocked data. An update is coming in a later PR.
* _meta.properties.lastRefreshTime: Long. Timestamp in milliseconds of the last refresh. This field is added only if the index has been refreshed at least once.
Last Refresh Time
Added two new fields, lastRefreshStartTime and lastRefreshCompleteTime, to FlintMetadataLogEntry and bumped the version of its JSON doc from 1.0 to 1.1 (new fields are added, but existing fields are unchanged).
These fields are accurate only for manual refresh (full, incremental) and for auto refresh with the external scheduler.
For the internal scheduler, jobStartTime (or createTime in FlintMetadataLogEntry) is used to track the streaming job start time.
I'm not reusing createTime because the two should be updated at different times:
* createTime (for the internal scheduler) is updated during refreshIndex, recoverIndex, and updateIndexManualToAuto.
* lastRefreshStartTime and lastRefreshCompleteTime (for manual refresh and the external scheduler) are updated only in refreshIndex.
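To make the description above more concrete, here is a minimal Java sketch of the two ideas it covers: which operation touches which timestamp, and which cache properties end up under _meta.properties. It is not the actual Flint code. The class MetadataCacheSketch, its nested types, and the method names are hypothetical, and two points are assumptions rather than facts from this PR: that lastRefreshTime is taken from lastRefreshCompleteTime, and that a value of 0 means "never refreshed".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only: names mirror the PR description, not real Flint classes.
final class MetadataCacheSketch {

    /** The operations named in the description as places where timestamps change. */
    enum Operation { REFRESH_INDEX, RECOVER_INDEX, UPDATE_INDEX_MANUAL_TO_AUTO }

    /** Stand-in for the timestamp fields of FlintMetadataLogEntry (0 = never set, an assumption). */
    static final class LogEntryTimes {
        long createTime;              // streaming job start time (internal scheduler)
        long lastRefreshStartTime;    // manual refresh / external scheduler
        long lastRefreshCompleteTime; // manual refresh / external scheduler
    }

    /** Records the start of an operation at the given epoch-millisecond timestamp. */
    static void recordStart(LogEntryTimes t, Operation op, boolean internalScheduler, long nowMillis) {
        if (internalScheduler) {
            // createTime moves for all three operations.
            t.createTime = nowMillis;
        } else if (op == Operation.REFRESH_INDEX) {
            // The last-refresh fields move only in refreshIndex.
            t.lastRefreshStartTime = nowMillis;
        }
    }

    /** Records the completion of an operation at the given epoch-millisecond timestamp. */
    static void recordComplete(LogEntryTimes t, Operation op, boolean internalScheduler, long nowMillis) {
        if (!internalScheduler && op == Operation.REFRESH_INDEX) {
            t.lastRefreshCompleteTime = nowMillis;
        }
    }

    /** Builds the map that would sit under _meta.properties in the index mappings. */
    static Map<String, Object> cacheProperties(
            boolean autoRefresh,
            Optional<Integer> refreshIntervalSeconds,
            List<String> sourceTables,
            LogEntryTimes t) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("metadataCacheVersion", "1.0");
        // refreshInterval: only for auto refresh with an interval configured.
        if (autoRefresh) {
            refreshIntervalSeconds.ifPresent(s -> props.put("refreshInterval", s));
        }
        props.put("sourceTables", sourceTables); // mocked data for now, per the description
        // lastRefreshTime: only once the index has been refreshed at least once.
        // Assumption: the completion timestamp is the one exposed to the frontend.
        if (t.lastRefreshCompleteTime > 0) {
            props.put("lastRefreshTime", t.lastRefreshCompleteTime);
        }
        return props;
    }
}
```

With a fresh LogEntryTimes, cacheProperties(true, Optional.of(600), ...) includes refreshInterval but omits lastRefreshTime. After recordStart and recordComplete are called for REFRESH_INDEX with internalScheduler set to false, lastRefreshTime appears while createTime stays untouched, which is the reason the description gives for keeping the fields separate.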
Related Issues
_meta as read cache for frontend user to access #746
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.