Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use counter time in win perf counters #4267

Merged
merged 11 commits into from
Jun 30, 2018

Conversation

vlastahajek
Copy link
Contributor

Trying to fix #4250: added possibility to use timestamp from performance counters

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

@danielnelson danielnelson added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services) labels Jun 12, 2018
@danielnelson danielnelson added this to the 1.8.0 milestone Jun 12, 2018
@matthenning
Copy link

matthenning commented Jun 13, 2018

@danielnelson I'm not sure this can be classified as an enhancement.
If it indeed fixes #4250 this is a bugfix. I hope this makes it in 1.7.1.

Copy link
Contributor

@danielnelson danielnelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also rebase or merge master to bring in a build fix?

}

return nil
}

//returns true if err is an error we count with
func isKnowError(err error) bool {
if phderr, ok := err.(*PdhError); ok && (phderr.ErrorCode == PDH_INVALID_DATA || phderr.ErrorCode == PDH_CALC_NEGATIVE_VALUE || phderr.ErrorCode == PDH_CSTATUS_INVALID_DATA) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you line wrap this around 78 chars?

}

return nil
}

//returns true if err is an error we count with
func isKnowError(err error) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename this function and redo the comment? I think based on your comment above maybe you could name it isCounterDataError. You can probably remove the comment above if this name is more precise.

- refactoring common measurement creation code to a function
@@ -98,6 +103,10 @@ func TestWinPerformanceQueryImpl(t *testing.T) {
require.NoError(t, err)

arr, err := query.GetFormattedCounterArrayDouble(hCounter)
if phderr, ok := err.(*PdhError); ok && phderr.ErrorCode != PDH_INVALID_DATA && phderr.ErrorCode != PDH_CALC_NEGATIVE_VALUE {
time.Sleep(time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a red flag for me because it indicates the test may be timing dependent, why do we need to sleep?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid data error happens randomly, but can be seen several times in an hour. It generally means that some counter provided invalid value or it was unable to compute value from samples gathered during provided period. If we give it more time (which happens automatically during next gather cycle in normal Telegraf run), it solves the problem.

When using wildcards expansion, each counter is queried for value separately is it is filtered in my code on that level.
When using _PdhGetFormattedCounterArray, it queries multiple counters at once and it seems that if any counter returns status about invalid data, the function fails completely.

The code historically didn't bothered about errors, they were totally ignored. Maybe it is the way.

It makes sense for me to tell about unexpected errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this can happen in the actual code then we should probably ignore or ignore and log. In the tests though this will actually be more problematic, we don't want to miss failures but we also don't want slow tests or intermittent failures.

The best fix would be to mock the calls to the library, in the meantime mark this test to skip if -short is set during testing, which I just added in ee6e4b0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is already skipped in the short mode.

Do you mean to log such error as warning ? Not as error as it is now..

Regarding commit ee6e4b0. Are all tests (not the short mode) run some check during a pull request checking?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I think we are good then, obviously timing based stuff is problematic but since it is skipped on short it is okay. We always only run go test -short for CI unit tests, which means tests like this are almost always skipped unless a developer runs them by hand.

@danielnelson danielnelson changed the title Attempt to fix #4250 Use counter time in win perf counters Jun 30, 2018
@danielnelson danielnelson merged commit ed2bc11 into influxdata:master Jun 30, 2018
rgitzel pushed a commit to rgitzel/telegraf that referenced this pull request Oct 17, 2018
otherpirate pushed a commit to otherpirate/telegraf that referenced this pull request Mar 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/windows Related to windows plugins (win_eventlog, win_perf_counters, win_services) feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1.7.0-rc1 PerfMon % Processor Time discrepancy
3 participants