[INLONG-11340][Sort] Added new source metrics for sort-connector-pulsar-v1.15 #11341

Open
wants to merge 1 commit into
base: master

PeterZh6
Contributor

Fixes [Feature][Sort] Added new source metrics for sort-connector-pulsar-v1.15 #11340

Motivation

The purpose of this PR is to enhance observability for the sort-connector-pulsar-v1.15 by introducing new source metrics. These metrics will provide detailed insights into the deserialization process and checkpoint management, facilitating better monitoring, troubleshooting, and optimization for users.

Modifications

The modifications follow essentially the same approach as #11130.

Deserialization Metrics:
Added counters to track successful and failed deserialization attempts (numDeserializeSuccess, numDeserializeError).
Added a latency gauge to measure the time taken for deserialization (deserializeTimeLag).
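
As an illustration of the deserialization metrics described above, here is a minimal, self-contained sketch. It is not the code from this PR: the class `DeserializeMetricsSketch` and its `deserialize` wrapper are hypothetical, plain `AtomicLong` fields stand in for the connector's real metric objects, and recording the lag in milliseconds on both success and failure is an assumption. Only the metric names come from the description.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

// Hypothetical stand-in for the connector's metric holder (not the PR's actual class).
class DeserializeMetricsSketch {
    final AtomicLong numDeserializeSuccess = new AtomicLong();
    final AtomicLong numDeserializeError = new AtomicLong();
    // Gauge-style value: duration of the most recent deserialization, in milliseconds (assumed unit).
    volatile long deserializeTimeLag = 0L;

    // Wraps an arbitrary deserializer and updates the counters and the gauge around it.
    <T> T deserialize(byte[] message, Function<byte[], T> deserializer) {
        long start = System.nanoTime();
        try {
            T result = deserializer.apply(message);
            numDeserializeSuccess.incrementAndGet();
            return result;
        } catch (RuntimeException e) {
            numDeserializeError.incrementAndGet();
            throw e;
        } finally {
            // Recorded for both outcomes; the real connector may choose differently.
            deserializeTimeLag = (System.nanoTime() - start) / 1_000_000L;
        }
    }

    public static void main(String[] args) {
        DeserializeMetricsSketch metrics = new DeserializeMetricsSketch();
        String value = metrics.deserialize(
                "hello".getBytes(StandardCharsets.UTF_8),
                bytes -> new String(bytes, StandardCharsets.UTF_8));
        System.out.println(value + " success=" + metrics.numDeserializeSuccess.get()
                + " error=" + metrics.numDeserializeError.get());
    }
}
```

In the connector itself these values would presumably be registered with Flink's metric group rather than held in plain fields.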

SnapshotState Metrics:
Added counters for the number of snapshots created (numSnapshotCreate) and errors encountered during snapshot operations (numSnapshotError).

NotifyComplete Metrics:
Added a counter to track completed snapshots (numCompletedSnapshots).
Added a latency gauge for the time between snapshot creation and checkpoint completion (snapshotToCheckpointTimeLag).
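
The snapshot and checkpoint-completion metrics could be tracked along the following lines. This is a sketch under stated assumptions, not the PR's implementation: `SnapshotMetricsSketch`, its method signatures, and the per-checkpoint timestamp map are hypothetical. Timestamps are passed in explicitly so the lag computation is easy to follow. Only the metric names come from the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; method names loosely echo Flink's snapshotState / notifyCheckpointComplete hooks.
class SnapshotMetricsSketch {
    final AtomicLong numSnapshotCreate = new AtomicLong();
    final AtomicLong numSnapshotError = new AtomicLong();
    final AtomicLong numCompletedSnapshots = new AtomicLong();
    // Gauge-style value: millis between snapshot creation and checkpoint completion.
    volatile long snapshotToCheckpointTimeLag = 0L;

    // Remembers when each checkpoint's snapshot was created (assumed bookkeeping).
    private final Map<Long, Long> snapshotCreatedAt = new HashMap<>();

    void snapshotState(long checkpointId, long nowMillis, Runnable doSnapshot) {
        try {
            doSnapshot.run();
            numSnapshotCreate.incrementAndGet();
            snapshotCreatedAt.put(checkpointId, nowMillis);
        } catch (RuntimeException e) {
            numSnapshotError.incrementAndGet();
            throw e;
        }
    }

    void notifyCheckpointComplete(long checkpointId, long nowMillis) {
        Long createdAt = snapshotCreatedAt.remove(checkpointId);
        if (createdAt != null) {
            numCompletedSnapshots.incrementAndGet();
            snapshotToCheckpointTimeLag = nowMillis - createdAt;
        }
    }

    public static void main(String[] args) {
        SnapshotMetricsSketch metrics = new SnapshotMetricsSketch();
        long created = System.currentTimeMillis();
        metrics.snapshotState(1L, created, () -> { });
        metrics.notifyCheckpointComplete(1L, System.currentTimeMillis());
        System.out.println("completed=" + metrics.numCompletedSnapshots.get()
                + " lagMillis=" + metrics.snapshotToCheckpointTimeLag);
    }
}
```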

Verifying this change

(Please pick either of the following options)

  • This change is a trivial rework/code cleanup without any test coverage.

  • This change is already covered by existing tests, such as:
    (please describe tests)

  • This change added tests and can be verified as follows:

The verification approach from #11130 also works here; however, a simpler way is described below.

Use an End-to-End test called inlong-sort/sort-end-to-end-tests/sort-end-to-end-tests-v1.15/src/test/java/org/apache/inlong/sort/tests/Pulsar2StarRocksTest.java.

Add a while loop after all the checks in testPulsarToStarRocks to stop the testing container from being torn down.
Add

'inlong.metric.labels' = 'groupId=pulsarGroup&streamId=pulsarStream&nodeId=pulsarNode'

in the source connection option of inlong-sort/sort-end-to-end-tests/sort-end-to-end-tests-v1.15/src/test/resources/flinkSql/pulsar_test.sql.
Run the maven test.
Wait until all the tests are done (this should take a while), then visit localhost:8081, which is the URL of the Flink Web Dashboard.

Click the operator, and check the metrics column.
The result should be like this:
[Screenshot: PulsarSourceMetrics]
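
The keep-alive loop mentioned in the steps above could look roughly like this. It is a sketch only: `KeepAliveSketch.keepAlive` is a hypothetical helper, and the bounded deadline is an assumption added so that the loop eventually returns. The description only asks for a while loop that prevents the test containers from being torn down while the dashboard is inspected.

```java
// Hypothetical helper; in the end-to-end test the loop would sit after the final assertions.
class KeepAliveSketch {

    // Blocks for roughly totalMillis, waking every pollMillis; returns the number of wake-ups.
    static int keepAlive(long totalMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + totalMillis;
        int polls = 0;
        while (System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            polls++;
        }
        return polls;
    }

    public static void main(String[] args) {
        // Short duration for demonstration; a manual inspection session would use a much longer one.
        int polls = keepAlive(200L, 50L);
        System.out.println("woke up " + polls + " times");
    }
}
```

Such a loop is meant for local verification only and would be removed again before committing the test.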

Documentation

@dockerzhang dockerzhang changed the title [INLONG-11340][Sort]Added new source metrics for sort-connector-pulsar-v1.15 [INLONG-11340][Sort] Added new source metrics for sort-connector-pulsar-v1.15 Oct 13, 2024
@dockerzhang dockerzhang requested review from XiaoYou201, vernedeng and EMsnap and removed request for XiaoYou201 and vernedeng October 13, 2024 13:58
Contributor

@XiaoYou201 XiaoYou201 left a comment


LGTM~
