-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use counter time in win perf counters #4267
Conversation
@danielnelson I'm not sure this can be classified as an enhancement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also rebase or merge master to bring in a build fix?
} | ||
|
||
return nil | ||
} | ||
|
||
//returns true if err is an error we count with | ||
func isKnowError(err error) bool { | ||
if phderr, ok := err.(*PdhError); ok && (phderr.ErrorCode == PDH_INVALID_DATA || phderr.ErrorCode == PDH_CALC_NEGATIVE_VALUE || phderr.ErrorCode == PDH_CSTATUS_INVALID_DATA) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you line wrap this around 78 chars?
} | ||
|
||
return nil | ||
} | ||
|
||
//returns true if err is an error we count with | ||
func isKnowError(err error) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename this function and redo the comment? I think based on your comment above maybe you could name it isCounterDataError
. You can probably remove the comment above if this name is more precise.
- refactoring common measurement creation code to a function
@@ -98,6 +103,10 @@ func TestWinPerformanceQueryImpl(t *testing.T) { | |||
require.NoError(t, err) | |||
|
|||
arr, err := query.GetFormattedCounterArrayDouble(hCounter) | |||
if phderr, ok := err.(*PdhError); ok && phderr.ErrorCode != PDH_INVALID_DATA && phderr.ErrorCode != PDH_CALC_NEGATIVE_VALUE { | |||
time.Sleep(time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a red flag for me because it indicates the test may be timing dependent, why do we need to sleep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invalid data error happens randomly, but can be seen several times in an hour. It generally means that some counter provided invalid value or it was unable to compute value from samples gathered during provided period. If we give it more time (which happens automatically during next gather cycle in normal Telegraf run), it solves the problem.
When using wildcards expansion, each counter is queried for value separately is it is filtered in my code on that level.
When using _PdhGetFormattedCounterArray, it queries multiple counters at once and it seems that if any counter returns status about invalid data, the function fails completely.
The code historically didn't bothered about errors, they were totally ignored. Maybe it is the way.
It makes sense for me to tell about unexpected errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this can happen in the actual code then we should probably ignore or ignore and log. In the tests though this will actually be more problematic, we don't want to miss failures but we also don't want slow tests or intermittent failures.
The best fix would be to mock the calls to the library, in the meantime mark this test to skip if -short
is set during testing, which I just added in ee6e4b0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is already skipped in the short mode.
Do you mean to log such error as warning ? Not as error as it is now..
Regarding commit ee6e4b0. Are all tests (not the short mode) run some check during a pull request checking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think we are good then, obviously timing based stuff is problematic but since it is skipped on short it is okay. We always only run go test -short
for CI unit tests, which means tests like this are almost always skipped unless a developer runs them by hand.
Trying to fix #4250: added possibility to use timestamp from performance counters