Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RPC stats for the client #90

Merged
merged 1 commit into from
Jun 12, 2017
Merged

Conversation

roclark
Copy link
Contributor

@roclark roclark commented Jun 6, 2017

This PR closes #69.

Signed-Off-By: Robert Clark robert.d.clark@hpe.com

@roclark roclark requested a review from joehandzik June 6, 2017 14:49
@@ -320,6 +325,14 @@ func (s *lustreProcfsSource) generateClientMetricTemplates() error {
{"stats", "inode_permission_total", inodePermissionHelp, s.counterMetric},
{"xattr_cache", "xattr_cache_enabled", "Returns '1' if extended attribute cache is enabled", s.gaugeMetric},
},
"mdc/*": {
{"rpc_stats", "rpc_in_flight", rpcsInFlightHelp, s.gaugeMetric},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So...I see that this metric is named differently than the 'rpcs_in_flight' value in osc/*. Maybe we just need to name these mdc_rpcs_in_flight and osc_rpcs_in_flight?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was actually just a typo as I meant to put rpcs_in_flight, but I can certainly do osc* and mdc* if you desire!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, rpcs_in_flight would cause metric conflicts, wouldn't it? Or can mdc and osc not be populated at the same time? What do the labels end up looking like for those metrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. You are right, we would need to prepend the values with the location. The labels would be identical since they are both on the same node and the nodeType will always be client for these two. Also, the metrics can and likely always will be populated at the same time, so we definitely need to change them.

@roclark roclark force-pushed the wip-add-client-rpc-stats branch 3 times, most recently from 9c8dd0a to fae3d95 Compare June 6, 2017 20:35
@joehandzik
Copy link
Contributor

@knweiss @mjtrangoni Do either of you have strong opinions about the metric names for this PR? We're trying to decided if the rpcs_in_flight value should have the osc/mdc text prepended to it, or if we should try to add a new label to the rpcs_in_flight metric for what type of client rpc the value represents.

@joehandzik
Copy link
Contributor

@roclark I'll merge this sometime after 1 PM if I haven't heard anyone say that I shouldn't!

@mjtrangoni
Copy link
Contributor

Hi @joehandzik,
I would prefer having a new label here {osc,mdc,mdt} since it is much better for Grafana`s graph. With different variables you will have to define more than just one query for "rpc_stats".

@joehandzik
Copy link
Contributor

@mjtrangoni Thanks, I figured that would be the answer but wanted to confirm before sending Robert down that path.

@roclark Rename the metrics back to what you originally had for rpcs_in_flight, and add some sort of label (node/type/???). You should be able to reuse the extra value fields I added for my converged stats PR.

@roclark
Copy link
Contributor Author

roclark commented Jun 9, 2017

On it! Thanks @mjtrangoni and @joehandzik!

Signed-Off-By: Robert Clark <robert.d.clark@hpe.com>
@roclark
Copy link
Contributor Author

roclark commented Jun 9, 2017

Done! Though I did notice the client label this time which shows what it is targeting (ie. lustrefs-OST0000-osc-ffffxxxxxxxxxxxx). This will make sure we don't overlap identical metrics on different targets, but I still like @mjtrangoni's suggestion so that we can easily parse the type in Grafana.

@joehandzik
Copy link
Contributor

LGTM! We can wait until Monday if necessary to get full buy-in from @mjtrangoni

@knweiss
Copy link

knweiss commented Jun 9, 2017

@joehandzik Sorry, for not answering earlier. I'm busy and didn't have time to test or review these PR yet. It doesn't help that it's hard to review the patches without access to a real Lustre system (i.e. at home).

Did you ever think about a test suite with copies of proc files (one of each) from a real Lustre OSS, MDS, and client respectively? I.e. a test suite that allows simulating the lustre_exporter output for each of these machine types?

This would make before/after comparisons of PRs much easier and faster. Also, it would be great help for end-to-end testing and benchmarking/performance tuning. I think @mjtrangoni's issue #93 goes into the same direction.

(E.g. see node_exporter's end-to-end-test.sh with its reference output e22-output.txt. However, please also notice prometheus/node_exporter#478). There's also the remaining problem of negative tests that must fail...)

@joehandzik
Copy link
Contributor

@knweiss No worries! I was just asking to make sure that we had the right design, not necessarily expecting you to actually do a full-blown review of the code or functionality. We'll definitely see what we can add with respect to #93.

@joehandzik
Copy link
Contributor

LGTM, merging.

@joehandzik joehandzik merged commit 95b81a8 into master Jun 12, 2017
@roclark roclark deleted the wip-add-client-rpc-stats branch June 14, 2017 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add client statistics
4 participants