Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename indexer metrics related to get operations #853

Merged
merged 8 commits into from
Jul 27, 2022

Conversation

mrodm
Copy link
Contributor

@mrodm mrodm commented Jul 26, 2022

Rename indexer get metrics (histogram) to be specific to each type of indexer.
According to the current indexers, this refers to be able to split Get operations depending on the file system, zip and storage indexers.

Calculations of these metrics have been moved to their specific implementations (Get method).

Relates #797

Before this PR, the metrics exposed for Get operations were:

# HELP epr_storage_indexer_get_duration_seconds A histogram of latencies for get processes run by the indexer.
# TYPE epr_storage_indexer_get_duration_seconds histogram
epr_storage_indexer_get_duration_seconds_bucket{le="0.005"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="0.01"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="0.025"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="0.05"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="0.1"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="0.25"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="0.5"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="1"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="2.5"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="5"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="10"} 90
epr_storage_indexer_get_duration_seconds_bucket{le="+Inf"} 90
epr_storage_indexer_get_duration_seconds_sum 2.2363000000000006e-05
epr_storage_indexer_get_duration_seconds_count 90

With this patch proposed, the metrics exposed will be like this:

# HELP epr_indexer_get_duration_seconds A histogram of latencies for get processes run by the indexer.
# TYPE epr_indexer_get_duration_seconds histogram
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="0.005"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="0.01"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="0.025"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="0.05"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="0.1"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="0.25"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="0.5"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="1"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="2.5"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="5"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="10"} 161
epr_indexer_get_duration_seconds_bucket{indexer="FileSystemIndexer",le="+Inf"} 161
epr_indexer_get_duration_seconds_sum{indexer="FileSystemIndexer"} 0.001622019999999999
epr_indexer_get_duration_seconds_count{indexer="FileSystemIndexer"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="0.005"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="0.01"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="0.025"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="0.05"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="0.1"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="0.25"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="0.5"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="1"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="2.5"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="5"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="10"} 161
epr_indexer_get_duration_seconds_bucket{indexer="ZipFileSystemIndexer",le="+Inf"} 161
epr_indexer_get_duration_seconds_sum{indexer="ZipFileSystemIndexer"} 0.0005851810000000002
epr_indexer_get_duration_seconds_count{indexer="ZipFileSystemIndexer"} 161

Rename indexer get metrics (histogram) to be specific to each
type of indexer. Currently, file system, zip and storage indexers.
@mrodm mrodm added the Team:Ecosystem Label for the Packages Ecosystem team label Jul 26, 2022
@mrodm mrodm self-assigned this Jul 26, 2022
@elasticmachine
Copy link

elasticmachine commented Jul 26, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-07-27T15:09:15.977+0000

  • Duration: 5 min 22 sec

Test stats 🧪

Test Results
Failed 0
Passed 213
Skipped 0
Total 213

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

},
)

StorageIndexerGetDurationSeconds = prometheus.NewHistogram(
IndexerGetDurationSeconds = prometheus.NewHistogramVec(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, I am wondering how useful is this metric. Looking at the sample metrics in the description, they seem to be all quite fast as these operations happen mostly in memory, and they end up all in the lower bucket.
We also have metrics in the middleware for /search, that may be more useful for final users, as it includes the time of the whole response.

So I wonder if maybe we can remove this metric. Or if we leave it, lower the boundaries of each bucket so they don't end up all in the lower bucket.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently I don't have data about these get operations in the storage indexer.
I was expecting that this metric would be more significant in that storage indexer than in the file system indexers. I was thinking that maybe those get ops against storage indexer would be slower (just hypothesis, I don't have data about this).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, current implementations are only filtering a list in memory. But it is true that the operation in the storage indexer is in a mutex whose lock can have some wait time while the index is updated. We may revisit the bucket thresholds when we have more data.

@@ -68,7 +68,7 @@ var (
func init() {
flag.BoolVar(&printVersionInfo, "version", false, "Print Elastic Package Registry version")
flag.StringVar(&address, "address", "localhost:8080", "Address of the package-registry service.")
flag.StringVar(&metricsAddress, "metrics-address", "", "Address to expose the Prometheus metrics.")
flag.StringVar(&metricsAddress, "metrics-address", "", "Address to expose the Prometheus metrics (experimental). ")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor change here to add this Prometheus metric endpoint is subject to changes.
Do you think it is worthy to add it ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, we may redefine something else while we start using it.

CHANGELOG.md Outdated
Comment on lines 13 to 14
* Rename indexer metrics related to get operations. [#853](https://github.com/elastic/package-registry/pull/853)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be moved to another section in the changelog ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can fit as a bugfix 🙂 Though it could be also seen as a breaking change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bugfix is fine :)

@mrodm mrodm requested a review from a team July 26, 2022 16:42
@mrodm mrodm marked this pull request as ready for review July 26, 2022 16:46
@mrodm mrodm requested a review from jsoriano July 26, 2022 16:46
CHANGELOG.md Outdated
Comment on lines 13 to 14
* Rename indexer metrics related to get operations. [#853](https://github.com/elastic/package-registry/pull/853)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bugfix is fine :)

@@ -43,6 +46,7 @@ func NewIndexer(storageClient *storage.Client, options IndexerOptions) *Indexer
return &Indexer{
storageClient: storageClient,
options: options,
label: "StorageIndexer",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.. could it be a const instead of a structure field?

Copy link
Contributor Author

@mrodm mrodm Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to follow the same structure that it is used for FileSystemIndexers (link) where the label is part of the struct. But it is true that for storage indexer it is not needed to add that label as a field struct. So, I'll move it to be a const.

},
)

StorageIndexerGetDurationSeconds = prometheus.NewHistogram(
IndexerGetDurationSeconds = prometheus.NewHistogramVec(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, current implementations are only filtering a list in memory. But it is true that the operation in the storage indexer is in a mutex whose lock can have some wait time while the index is updated. We may revisit the bucket thresholds when we have more data.

@mrodm mrodm requested a review from mtojek July 27, 2022 12:11
@@ -48,7 +44,7 @@ func (c CombinedIndexer) Get(ctx context.Context, opts *packages.GetOptions) (pa
packages = packages.Join(p)
}

if !opts.Filter.AllVersions {
if opts != nil && opts.Filter != nil && !opts.Filter.AllVersions {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was needed after updating this branch with main.
This Get method is called with opts as nil from https://github.com/elastic/package-registry/blob/main/main.go#L290

@mrodm mrodm requested a review from jsoriano July 27, 2022 15:12
@mrodm mrodm merged commit a69ff8a into elastic:main Jul 27, 2022
@mrodm mrodm deleted the update_storage_metrics branch July 27, 2022 16:02
mrodm added a commit to elastic/integrations that referenced this pull request Jul 28, 2022
…s (get operations) (#3863)

Update field mappings, ingest pipeline and dashboard accordingly with
the changes done in elastic/package-registry#853
Specially, get operations related to indexers are now reporting the values
per each indexer (using labels).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Ecosystem Label for the Packages Ecosystem team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants