Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an optional health metric for the sqlserver input plugin #8544

Merged
merged 9 commits into from
Mar 11, 2021
Merged

Add an optional health metric for the sqlserver input plugin #8544

merged 9 commits into from
Mar 11, 2021

Conversation

coquagli
Copy link
Contributor

@coquagli coquagli commented Dec 11, 2020

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

Add the ability to collect a per-connection health metric for the sqlserver input plugin. This health metric is optional and is disabled by default. Users can leverage this metric to detect if their SQL Server monitoring is not working as intended. The details below are fairly comprehensive and are copied from the updated README.

Health Metric
All collection versions (version 1, version 2, and database_type) support an optional plugin health metric called sqlserver_telegraf_health. This metric tracks if connections to SQL Server are succeeding or failing. Users can leverage this metric to detect if their SQL Server monitoring is not working as intended.

In the configuration file, toggling health_metric to true will enable collection of this metric. By default, this value is set to false and the metric is not collected. The health metric emits one record for each connection specified by servers in the configuration file.

The health metric emits the following tags:

  • sql_instance - Name of the server specified in the connection string. This value is emitted as-is in the connection string. If the server could not be parsed from the connection string, a constant placeholder value is emitted
  • database_name - Name of the database or (initial catalog) specified in the connection string. This value is emitted as-is in the connection string. If the database could not be parsed from the connection string, a constant placeholder value is emitted

The health metric emits the following fields:

  • attempted_queries - Number of queries that were attempted for this connection
  • successful_queries - Number of queries that completed successfully for this connection
  • database_type - Type of database as specified by database_type in the configuration file. If database_type from the configuration file is empty, the QueryVersion and AzureDB fields are concatenated instead

If attempted_queries and successful_queries are not equal for a given connection, some metrics were not successfully gathered for that connection. If successful_queries is 0, no metrics were successfully gathered.

* First iteration of health metric changes

* Update tests and add new column

* Revert config

* Update central config docs

* Update README

* Update README again
@coquagli coquagli changed the title Add an optional health metric for the sqlserver input plugin (#1) Add an optional health metric for the sqlserver input plugin Dec 11, 2020
@sjwang90 sjwang90 added area/sqlserver feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin labels Dec 11, 2020
Copy link
Contributor

@ssoroka ssoroka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is largely covered already by the regular output stats collected by inputs.internal

@masree
Copy link
Contributor

masree commented Jan 5, 2021

I think this is largely covered already by the regular output stats collected by inputs.internal

Thanks @ssoroka ! I took a look at inputs.internal and it looks like these metrics are reported either at the agent level or at the plugin level. However, we're looking to gather metrics per connection (i.e server/database combination). Having this data point will help us check which connection failed vs which connection succeeded in a plugin and also number of queries that were successfully executed per connection. From my understanding, I think it is not possible to gather such metrics per connection from inputs.internal as of today. It would be great if you can help clarify and let us know your thoughts. Thanks!

@coquagli coquagli requested a review from ssoroka January 14, 2021 18:13
@masree
Copy link
Contributor

masree commented Jan 14, 2021

I think this is largely covered already by the regular output stats collected by inputs.internal

I think this is largely covered already by the regular output stats collected by inputs.internal

Thanks @ssoroka ! I took a look at inputs.internal and it looks like these metrics are reported either at the agent level or at the plugin level. However, we're looking to gather metrics per connection (i.e server/database combination). Having this data point will help us check which connection failed vs which connection succeeded in a plugin and also number of queries that were successfully executed per connection. From my understanding, I think it is not possible to gather such metrics per connection from inputs.internal as of today. It would be great if you can help clarify and let us know your thoughts. Thanks!

@ssoroka Please let us know if per connection level metric is possible from inputs.internal. From the docs, we could not its usage for such granularity. Thanks!

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @coquagli thanks for submitting this PR. I think the code looks quite good already. I have one major remark about using explicit returns. In telegraf we use explicit returns instead of implicit ones, so please change those.

@srebhan srebhan self-assigned this Feb 3, 2021
@coquagli
Copy link
Contributor Author

coquagli commented Feb 4, 2021

Thanks for the review @srebhan! I've pushed another iteration to address your comments.

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only remaining thing you have to look into is the first return in getConnectionIdentifiers(), not sure but you might want to return the defaults!?

All other comments are suggestions or a matter of taste, so please change or keep as you like. Just leave me a comment on those what your decision is and resolve the resolved comments. :-)

@coquagli
Copy link
Contributor Author

coquagli commented Feb 4, 2021

Hey, @srebhan thanks for the great suggestions. I pushed some more changes.

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks perfect to me. Thank you very much for the nice collaboration!

@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Feb 4, 2021
@coquagli
Copy link
Contributor Author

coquagli commented Mar 2, 2021

@sjwang90 @srebhan @jagularr @ssoroka what else is needed to merge this?

Copy link
Contributor

@ssoroka ssoroka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, looks good. Just needs a merge conflict fixed.

@coquagli
Copy link
Contributor Author

coquagli commented Mar 3, 2021

Should be good now @ssoroka.

@helenosheaa helenosheaa merged commit 30e189d into influxdata:master Mar 11, 2021
jblesener pushed a commit to jblesener/telegraf that referenced this pull request Apr 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/sqlserver feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants