-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an input plugin to monitor basic info of Windows Services #3023
Conversation
…m state and fixed startup config
Measurement name made configurable Error service is indicated with state -1
- snake case names of fields/tag - simplified configuration - state and startup mode as string - better error reporting
- added sample TICK script
This reverts commit 3b4df68.
- Better script alert message
It produces: | ||
``` | ||
* Plugin: inputs.win_services, Collection 1 | ||
> win_services,host=WIN2008R2H401,display_name=Server,service_name=LanmanServer state="running",startup_mode="auto_start" 15 00040669000000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a stray space in 15 00040669000000000
``` | ||
It produces: | ||
``` | ||
* Plugin: inputs.win_services, Collection 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove everything in this section up through this line, so the only items are the example line protocol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, np.
I made it this way cause it's done so in the example plugins you advised me (ngnix, kapacitor).
"github.com/influxdata/telegraf" | ||
"github.com/influxdata/telegraf/plugins/inputs" | ||
"golang.org/x/sys/windows/svc/mgr" | ||
"fmt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure you are using go fmt
or the tests won't pass.
] | ||
` | ||
|
||
var description = "Input plugin to report Windows services info: service name, display name, state, startup mode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't mention the specific fields, this will be enough: "Input plugin to report Windows services info"
serviceInfos[i].DisplayName = srvCfg.DisplayName | ||
serviceInfos[i].StartUpMode = int(srvCfg.StartType) | ||
} else { | ||
serviceInfos[i].Error = fmt.Errorf("Could not get config of service '%s': %s", srvName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If either Query or Config fail the ServiceInfo could have zero values on some of the fields. This will cause tags/fields to be empty strings which we don't want. I would append the ServiceInfo only if there are no errors.
For error handling this will probably require you to pass the accumulator into this function. Also, you probably only want to report the first error per service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relates to the issue above. A weird state value will reported as an error, but for example empty display name is a valid value, as Windows API allows this.
var KnownServices = []string{"LanmanServer", "TermService"} | ||
|
||
func TestList(t *testing.T) { | ||
services, err := listServices(KnownServices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to add any tests that require special permissions or certain services to running (unless it can also set them up). It would be better to use an interface with a test version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what is the best goal here.
I can easily mock listServices to test the Gather method, but to properly test listServices function I will need to mock Mgr and Service from mgr package. As there is type dependency, as Mgr.OpenService returns mgr.Service, I can not just create an interface which will be satisfied by existing structs in mgr package and their methods.
I would need to create proxies of both to satisfy new interface in real implementation and this will complicate the code. Did you mean this?
Direct API calls are always harder co mock than RPC calls most current plugins use, but I don't mean I don't want to do it, in fact, it will be fun.. ;),
Cause current tests use services available on every Windows edition. I would rather create a special for test purposes, but that would be more E2E test and that's probably not suitable for such test suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chronograf uses a nice pattern for mocks that can handle this. I wrote an example and added it to this gist https://gist.github.com/danielnelson/79908f0bc7145d3feb7a91e0ef56d88c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly what I meant.
You have to create on struct (proxy) that satisfies interface and redirect call to real implementation,
a lot boilerplate code .
But if it has to be that way...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need to have unit tests. The integration tests can remain but should be guarded with a testing.Short()
test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
What is the best practise to store unit tests and integration test. I would separate them into two files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have win_services_test.go
and win_services_integration_test.go
.
tags["service_name"] = service.ServiceName | ||
|
||
fields["state"] = ServiceStatesMap[service.State] | ||
fields["startup_mode"] = ServiceStartupModeMap[service.StartUpMode] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to do something smart if the value returned is not in the map, probably return an error and skip the service. This could be tested earlier on in listServices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussion in the issue leads to using numbers I will use directly what I get from Windows API and report unusual values as error, ok?
@@ -0,0 +1,76 @@ | |||
# Telegraf Plugin: win_services | |||
Input plugin to report Windows services info: service name, display name, state, startup mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't list the tags/fields here, so we won't need to keep this updated if we add anything in the future.
- Fixed few mistakes
- Extended values validation - Reporting first found service error
I implemented all changes except changing the testing approach. |
if err == nil { | ||
state := int(srvStatus.State) | ||
if !checkState(state) { | ||
serviceInfo.Error = fmt.Errorf("Uknown state of Service %s: %d", serviceName, state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo in Uknown -> Unknown
return | ||
} | ||
|
||
//returns true of state is in valid range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't super important to me, but go has the convention of starting the comment with the name of the function. https://blog.golang.org/godoc-documenting-go-code, ex:
// checkState returns true if state is valid
tags := make(map[string]string) | ||
|
||
tags["display_name"] = service.DisplayName | ||
tags["service_name"] = service.ServiceName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tags should not contain the empty string, since InfluxDB will not accept them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the best practice for storing a measurement in case of a tag has empty value?
- Skip the tag? Store just the service_name tag and fields?
- Insert a place holder text, e.g. "" or "<empty_display_name>"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either skip the tag or omit the entire point, depending on if the field needs to be required. For display name I would skip, since it is optional. For service name I would omit the point.
return serviceInfos, nil | ||
} | ||
|
||
func collectServiceInfo(scmgr *mgr.Mgr, serviceName string) (serviceInfo ServiceInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend returning (ServiceInfo, error) then you can remove the Error field from ServiceInfo.
serviceInfos := make([]ServiceInfo, len(serviceNames)) | ||
|
||
for i, srvName := range serviceNames { | ||
serviceInfos[i] = collectServiceInfo(scmgr, srvName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Easiest way to deal with zero value tags/fields IMO is to return an (ServiceInfo, error) and append if err != nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would then need to propagate a service error to Gather, or pass Accumulator here.
I want to keep the collecting info separated from plugin API.
The ServiceInfo structure is a domain model of a service, any error occurred during the collecting of service info is a property of such model.
This allows future enhancement in case a user will want to record also service errors into db. Because I still think that if user wants to monitor a particular service and on some hosts it is not possible due to an error, user has to look into the telegraf log instead into the db.
But maybe not, we will see.
|
||
//While getting service info there could a theoretically a lot of errors on different places. | ||
//However in reality if there is a problem with a service then usually openService fails and if it passes, other calls will most probably be ok | ||
//So, following error checking is just for sake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this comment, basically applies to almost any error :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Network, or more generally communication, errors are for example more probable than subsequent errors from Windows Service API. I still think that checking valid ranges of state and startup_mode is not necessary cause if invalid value would be return than the function will return error anyway.
This comment was intended to explain this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could get rid of the checkState logic since we are now operating on ints, but we need to check all errors from the external api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, checking errors of functions is must, no doubts.
|
||
var description = "Input plugin to report Windows services info." | ||
|
||
type Win_Services struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename this WinServices
var KnownServices = []string{"LanmanServer", "TermService"} | ||
|
||
func TestList(t *testing.T) { | ||
services, err := listServices(KnownServices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chronograf uses a nice pattern for mocks that can handle this. I wrote an example and added it to this gist https://gist.github.com/danielnelson/79908f0bc7145d3feb7a91e0ef56d88c
- better name of main struct - better methods description - removed useless checking - added empty service description checking
Hi Daniel, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add this plugin to the test-windows makefile target? You might need to rebase since we have modified it recently.
@@ -0,0 +1,180 @@ | |||
// +build windows | |||
|
|||
//this test must be run under administrator account |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still required?
Both done |
Info
Input plugin to report Windows services info: service name, display name, state, startup mode
Tested on Windows Server 2008 R2 Enterprise, Windows 10 Enterprise
It requires Telegraf must run under the administrator privileges
Readme includes sample TICK script
Required for all PRs: