Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADR: 013] Observability #5188

Merged
merged 18 commits into from
Mar 5, 2020
Merged

[ADR: 013] Observability #5188

merged 18 commits into from
Mar 5, 2020

Conversation

tac0turtle
Copy link
Member

  • Inital ADR for metrics with in the SDK
  • extending metrics to be module specific as well

ref #4782

Signed-off-by: Marko Baricevic marbar3778@yahoo.com

  • Targeted PR against correct branch (see CONTRIBUTING.md)

  • Linked to github-issue with discussion and accepted design OR link to spec that describes this work.

  • Wrote tests

  • Updated relevant documentation (docs/)

  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md

  • Re-reviewed Files changed in the github PR explorer


For Admin Use:

  • Added appropriate labels to PR (ex. wip, ready-for-review, docs)
  • Reviewers Assigned
  • Squashed all commits, uses message "Merge pull request #XYZ: [title]" (coding standards)

@tac0turtle tac0turtle added the C: telemetry Issues and features pertaining to SDK telemetry. label Oct 14, 2019
@tac0turtle tac0turtle self-assigned this Oct 14, 2019
@fedekunze fedekunze added the T: ADR An issue or PR relating to an architectural decision record label Oct 16, 2019
@stale
Copy link

stale bot commented Nov 13, 2019

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 13, 2019
@tac0turtle
Copy link
Member Author

tac0turtle commented Nov 14, 2019

@stale stale bot removed the stale label Nov 14, 2019
Part B of this ADR is geared more towards modules. Extending `AppModuleBasic` to support registering of metrics could be the best path forward.

```go
type AppModuleBasic interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting is off

@alexanderbez
Copy link
Contributor

alexanderbez commented Dec 2, 2019

Can we rename this ADR to Telemetry or Observability?

@jackzampolin jackzampolin changed the title [ADR: 013] Metrics [ADR: 013] Observability Dec 3, 2019
Copy link
Member

@jackzampolin jackzampolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@fedekunze
Copy link
Collaborator

is this R4R @marbar3778 ?

@stale
Copy link

stale bot commented Jan 18, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added stale and removed stale labels Jan 18, 2020

### A

There has been discussion around exposing more metrics to users and node operators about the application. Currently there is only a way to expose metrics from Tendermint and not the application itself.To bring more visibility into applications, I would like to propose reporting of metrics through [Prometheus](https://prometheus.io/).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

itself.To -> itself. To


| Name | type | Description |
| --------------------- | :-------: | ----------: |
| iavl_io | Gauage | |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gauge, I think we do need description and also what units are we measuring io as well.

@fedekunze
Copy link
Collaborator

@marbar3778 bump

@tnachen
Copy link
Contributor

tnachen commented Jan 29, 2020

@marbar3778 updated based on our discussion, let me know if this looks good or not.
@alexanderbez @fedekunze PTAL

@tnachen tnachen marked this pull request as ready for review January 30, 2020 18:50
tac0turtle and others added 10 commits February 3, 2020 15:00
- Inital ADR for metrics with in the SDK
- extending metrics to be module specific as well

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
Co-Authored-By: Jack Zampolin <jack.zampolin@gmail.com>
Copy link
Collaborator

@fedekunze fedekunze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. CreateMetrics should be general instead of specific to one metric.

Transactions metrics.Counter
}

func CreateMetrics(namespace string, labelsAndValues ...string) *Metrics {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the CreateMetrics should be generic so that it can be called by the module manager. In this case, you are creating a function only for the specific case of the Size field. Say you want to initialize also the Counter for Transactions, with the following function:

NewCounterFrom(opts prometheus.CounterOpts, labelNames []string) *Counter

How do you know whichlabelsAndValues correspond to which metric? This also applies for Histogram and Summary

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this comes from what Tendermint did and we could change this for sure, but in TM the same labels are being added to every metric that the module wants to expose.

So the workflow is that module owner will just add Metrics they want to track in the metric struct, and to add more metrics you simple add more fields in that struct, and not only tied to one metric per say.

So if we want the flexibility that every metric can have some custom labels that gets added when pulled from Prometheus then we need to perhaps create a different API for module to define these metrics, since you need the metric type, name and the labels associated with it, potentially even global labels that gets applied to every metric like what we have here.

@fedekunze do you know if we need to have this flexibility to definte custom labels on each metrics in the same module? Or having it apply to every metric in that module is good enough?

func CreateMetrics(namespace string, labelsAndValues ...string) *Metrics {
labels := make([]string, len(labelsAndValues/2))
for i := 0; i < len(labelsAndValues); i += 2 {
labels[i/2] = labelsAndValues[i]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to define the labelNames explicitly as params

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean CreateMetrics(namespace string, labelNames...string, labelValues ...string)?
It is much easier to iterate if it's one array, alternatively we can do map[string]string too which probably is more safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it's the prometheus API that takes labelsAndValues list as defined above, so if we change it to two lists or even a map, then every module writer needs to convert it themselves. We could provide a helper method but I feel like it's a bit overkill.
@fedekunze are you fine keeping this? you can look at TM example here https://github.com/tendermint/tendermint/blob/c9ef824ddf623a873ca226b98ddb3539c1de1ebc/p2p/metrics.go#L44

@tnachen tnachen removed the WIP label Feb 21, 2020
@tnachen
Copy link
Contributor

tnachen commented Feb 27, 2020

This only added one markdown file so I don't think codecov is accurate here.

@fedekunze fedekunze added R4R A:automerge Automatically merge PR once all prerequisites pass. labels Feb 27, 2020
@mergify mergify bot merged commit 6c1e3e5 into master Mar 5, 2020
@mergify mergify bot deleted the adr/metrics branch March 5, 2020 16:02
@tac0turtle
Copy link
Member Author

dayuummmm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A:automerge Automatically merge PR once all prerequisites pass. C: telemetry Issues and features pertaining to SDK telemetry. T: ADR An issue or PR relating to an architectural decision record
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants