-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[Core] support scan metrics #2170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9b30db8
to
93c0223
Compare
93c0223
to
c825b34
Compare
7a48528
to
3911f93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also bridge this metrics with Flink, just like CommitMetrics
?
Iterable<ManifestEntry> entries = | ||
ParallellyExecuteUtils.parallelismBatchIterable( | ||
files -> | ||
files.parallelStream() | ||
.filter(this::filterManifestFileMeta) | ||
.flatMap(m -> readManifest.apply(m).stream()) | ||
.filter(this::filterByStats) | ||
.peek(e -> cntEntries.getAndIncrement()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not directly increase cntEntries
by the size of the collected list? If you use peek
, cntEntries
will be increased by every element in this list. Note that this stream is parallelized, so there will be many concurrent writes which affect the performance.
@@ -247,6 +248,12 @@ public SnapshotReader withBucketFilter(Filter<Integer> bucketFilter) { | |||
return this; | |||
} | |||
|
|||
@Override | |||
public SnapshotReader withMetricRegistry(MetricRegistry registry) { | |||
// won't register metric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? AuditLogTable
also scans from data file.
This closes apache#2170.
Purpose
Linked issue: close #2000