
add metrics otelcol_exporter_queue_capacity #5475

Merged (10 commits) on Jul 19, 2022

Conversation

@fatsheep9146 (Contributor)

Signed-off-by: Ziqi Zhao zhaoziqi9146@gmail.com

Description:
Fix #4902, add metric otelcol_exporter_queue_capacity

Link to tracking Issue:
#4902

Testing:

Documentation:
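As an illustration of the new metric, it surfaces on the collector's Prometheus telemetry endpoint roughly as follows; the exporter name, value, and help text here are placeholders:

# HELP otelcol_exporter_queue_capacity Fixed capacity of the retry queue (in batches)
# TYPE otelcol_exporter_queue_capacity gauge
otelcol_exporter_queue_capacity{exporter="otlp"} 5000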

@github-actions (bot)

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 19, 2022
@fatsheep9146 (Contributor, Author)

@bogdandrutu Could you help remove the stale label or review this PR? I saw that @jpkrohling is on leave now.

@codeboten (Contributor) left a comment


The change looks good, please add a test to TestQueuedRetry_QueueMetricsReported to ensure the metric is reported
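For context, the kind of check such a test makes can be sketched against a standalone go.opencensus.io registry; this sketch is illustrative only and does not use the collector's actual test helpers in queued_retry_test.go:

package example

import (
    "testing"

    "go.opencensus.io/metric"
    "go.opencensus.io/metric/metricdata"
)

// TestQueueCapacityReported is a hypothetical stand-in for the requested check:
// register the derived gauge, then read the registry back and assert the metric
// is reported with the expected value.
func TestQueueCapacityReported(t *testing.T) {
    registry := metric.NewRegistry()

    gauge, err := registry.AddInt64DerivedGauge(
        "exporter/queue_capacity",
        metric.WithDescription("Fixed capacity of the retry queue (in batches)"),
        metric.WithLabelKeys("exporter"),
        metric.WithUnit(metricdata.UnitDimensionless))
    if err != nil {
        t.Fatal(err)
    }
    if err := gauge.UpsertEntry(func() int64 { return 5000 },
        metricdata.NewLabelValue("test")); err != nil {
        t.Fatal(err)
    }

    // "The metric is reported" boils down to it showing up in a registry read.
    for _, m := range registry.Read() {
        if m.Descriptor.Name == "exporter/queue_capacity" {
            if got := m.TimeSeries[0].Points[0].Value.(int64); got != 5000 {
                t.Fatalf("unexpected capacity: %v", got)
            }
            return
        }
    }
    t.Fatal("metric exporter/queue_capacity not reported")
}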

@codeboten (Contributor) left a comment


Please also add an entry to the changelog noting the new metric
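For reference, at the time such an entry was a one-line bullet under the unreleased section of CHANGELOG.md, something like this (wording illustrative):

- Add `otelcol_exporter_queue_capacity` metric (#5475)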

@codecov (bot) commented Jun 26, 2022

Codecov Report

Merging #5475 (bc61d26) into main (0509360) will decrease coverage by 0.05%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##             main    #5475      +/-   ##
==========================================
- Coverage   91.53%   91.48%   -0.06%     
==========================================
  Files         192      192              
  Lines       11398    11410      +12     
==========================================
+ Hits        10433    10438       +5     
- Misses        770      775       +5     
- Partials      195      197       +2     
Impacted Files                                      Coverage Δ
exporter/exporterhelper/queued_retry_inmemory.go    91.89% <50.00%> (-3.70%) ⬇️
exporter/exporterhelper/obsreport.go               100.00% <100.00%> (ø)
pdata/internal/common.go                            91.85% <0.00%> (-0.75%) ⬇️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0509360...bc61d26.

@fatsheep9146 (Contributor, Author)

The change looks good, please add a test to TestQueuedRetry_QueueMetricsReported to ensure the metric is reported

Done

@fatsheep9146 (Contributor, Author)

Please also add an entry to the changelog noting the new metric

@codeboten Done

@fatsheep9146 (Contributor, Author)

@codeboten It's really weird: the report says TestQueuedRetry_QueueMetricsReported failed, but when I run the test myself on exactly the same branch, on both macOS and Linux, it passes. Do you know what could cause this, or what the difference is between CI and local development?

@TylerHelmuth (Member) left a comment

Left a small nit but otherwise LGTM.

@@ -55,6 +56,12 @@ func newInstruments(registry *metric.Registry) *instruments {
 	metric.WithLabelKeys(obsmetrics.ExporterKey),
 	metric.WithUnit(metricdata.UnitDimensionless))
 
+	insts.queueCapacity, _ = registry.AddInt64DerivedGauge(
+		obsmetrics.ExporterKey+"/queue_capacity",
+		metric.WithDescription("Current capacity of the retry queue (in batches)"),
@TylerHelmuth (Member)

I feel like "Current" implies the capacity can change at some point, but until the collector is restarted it is a fixed size, right? Maybe something like "Fixed capacity..." would be clearer?

@fatsheep9146 (Contributor, Author)

You are right, I will fix this
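For readers following the diff above: a derived gauge gets its value from a callback that the registry evaluates on every read, so the queue code only has to register the callback once. A minimal sketch, assuming a hypothetical boundedQueue type rather than the collector's actual internals:

package example

import (
    "go.opencensus.io/metric"
    "go.opencensus.io/metric/metricdata"
)

// boundedQueue is a hypothetical stand-in for the exporter helper's queue.
type boundedQueue struct {
    items    chan interface{}
    capacity int
}

// registerQueueCapacity ties the derived gauge to the queue: the callback is
// evaluated whenever the metrics registry is read, so the gauge always reports
// the queue's fixed capacity, labeled with the exporter name.
func registerQueueCapacity(gauge *metric.Int64DerivedGauge, exporterName string, q *boundedQueue) error {
    return gauge.UpsertEntry(
        func() int64 { return int64(q.capacity) },
        metricdata.NewLabelValue(exporterName))
}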

@TylerHelmuth (Member)

the report says TestQueuedRetry_QueueMetricsReported failed, but when I run the test myself on exactly the same branch, on both macOS and Linux, it passes. Do you know what could cause this, or what the difference is between CI and local development?

@fatsheep9146 the latest runs show TestQueuedRetry_QueueMetricsReported passing. Do you see the same?

@fatsheep9146 (Contributor, Author)

the report says TestQueuedRetry_QueueMetricsReported failed, but when I run the test myself on exactly the same branch, on both macOS and Linux, it passes. Do you know what could cause this, or what the difference is between CI and local development?

@fatsheep9146 the latest runs show TestQueuedRetry_QueueMetricsReported passing. Do you see the same?

No, I didn't see the same result. The result I saw is as follows:

=== RUN   TestQueuedRetry_QueueMetricsReported
    queued_retry_test.go:515: 
        	Error Trace:	queued_retry_test.go:515
        	            				queued_retry_test.go:345
        	Error:      	could not find metric exporter/queue_capacity with tags [{{exporter} test}] reported
        	Test:       	TestQueuedRetry_QueueMetricsReported
--- FAIL: TestQueuedRetry_QueueMetricsReported (0.00s)

Yet when I run TestQueuedRetry_QueueMetricsReported locally, the test case passes. I'm really confused.

@TylerHelmuth (Member)

You are correct, I see the failure you are seeing. I'm not sure why it happens.

@fatsheep9146 (Contributor, Author)

You are correct, I see the failure you are seeing. I'm not sure why it happens.

@TylerHelmuth
I found the reason: I hadn't added the metric to queued_retry_experimental, so when go test runs with the enable_unstable tag, the test case fails. I've already fixed it.
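For anyone who hits the same confusion: the exporter helper ships two queued-retry implementations selected by the enable_unstable build tag (roughly, queued_retry_inmemory.go builds without the tag and queued_retry_experimental.go with it), so CI exercises both while a plain local go test only exercises one. Assuming the commands are run from the repository root:

# default build, compiles queued_retry_inmemory.go
go test ./exporter/exporterhelper/...

# with the tag, compiles queued_retry_experimental.go instead
go test -tags enable_unstable ./exporter/exporterhelper/...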

@fatsheep9146 (Contributor, Author)

@codeboten @jpkrohling pinging for another review of this PR, thanks :-)

@TylerHelmuth (Member)

@fatsheep9146 @codeboten does docs/monitoring.md need to be updated as part of adding this new metric? That doc seems out of date in general.

fatsheep9146 added 7 commits (Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>)
@fatsheep9146 (Contributor, Author)

@TylerHelmuth @jpkrohling The monitoring doc is also updated. Please review this again =D

@TylerHelmuth (Member) left a comment

2 small nits in the new documentation, otherwise LGTM

docs/monitoring.md (2 review comments, outdated and resolved)
fatsheep9146 and others added 3 commits July 19, 2022 22:51
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>