Description
Describe the bug
This is not a traditional bug report; it is more a write-up of how to solve the issue described below.
If the maintainers would like a PR that expands the docs with some of this information (I am not sure which parts), I could try to contribute.
If not, this can also be considered a rant and be closed. Maybe it helps somebody who lands here from Google.
I spent several HOURS finding a solution for something that should be trivial to do.
The Situation:
- Running otelcol.exe on Windows as a Windows service (supported natively by the binary)
  - Otelcol spams the Windows Application event log with messages (is this config documented anywhere?)
- The event logs are somewhat broken? (see screenshots below)
- It is not possible to "bypass" otelcol telemetry logs to an exporter to send them off directly? (e.g. using loki)
The Goal:
- Write the log messages of otelcol to files (to tail and ship to a central logging server)
  - This is necessary to observe issues with otelcol
- Use json as the log format
- Use log rotation so the file does not grow huge and fill up the hard disk
  - Log rotation is a must-have feature in production ⚡
Screenshots: buggy Windows Application event log entries from otelcol
The journey:
The docs at https://opentelemetry.io/docs/collector/configuration/#service describe the telemetry logs settings and link to https://github.com/open-telemetry/opentelemetry-collector/blob/7666eb04c30e5cfd750db9969fe507562598f0ae/config/service.go:
// Encoding sets the logger's encoding.
// Example values are "json", "console".
Encoding string `mapstructure:"encoding"`
// OutputPaths is a list of URLs or file paths to write logging output to.
// The URLs could only be with "file" schema or without schema.
// The URLs with "file" schema must be an absolute path.
// The URLs without schema are treated as local file paths.
// "stdout" and "stderr" are interpreted as os.Stdout and os.Stderr.
// see details at Open in zap/writer.go.
// (default = ["stderr"])
OutputPaths []string `mapstructure:"output_paths"`
The Windows event log is not mentioned, however.
Setting encoding: json worked when testing in the console. But stdout / stderr are not available in a Windows service and can't be redirected either (afaik). So I tried using a file path:
telemetry:
  logs:
    level: info
    encoding: json
    output_paths: ["otelcol.log.json"]
I then realized that relative paths are a bad idea: Windows services use C:\Windows\System32 as their working directory (doh), so your logs end up there.
Since our installation directory is fixed, I then tried using an absolute file path for output_paths, which was unsuccessful.
After some googling I found this: uber-go/zap#621
For nearly 4 years, logging to absolute file paths on Windows has been broken in zap ¯\_(ツ)_/¯
I nearly managed a workaround by using this relative path to get out of System32:
output_paths: ["../../Program Files/OtelCollector/otelcol.log.json"]
This created the log file in the correct directory, but no logs were written to it. I could not figure out why.
Also, the otelcol source does not seem to mention a log file size limit or log rotation (?), so this would be unsuitable anyway.
Workaround service wrapper
Logging natively did not seem possible. So the next attempt was to use stdout and redirect it to a file with a third-party tool.
https://nssm.cc/ is no longer maintained and looks dubious, so I tried https://github.com/winsw/winsw
When winsw ran as a service and started otelcol as a child process, otelcol crashed with the following message:
The service process could not connect to the service controller
After searching the GitHub repo I found this:
The process may fail to start in a Windows Docker container with the following error: The service process could not connect to the service controller. In this case the NO_WINDOWS_SERVICE=1 environment variable should be set to force the collector to be started as if it were running in an interactive terminal, without attempting to run as a Windows service.
This was the final puzzle piece. Setting that env variable in the winsw config kept otelcol from crashing.
Logs from stdout were picked up by winsw and written to a file, which enables file rotation through winsw.
(The files are then tailed by Grafana promtail and sent to loki, since logging in OTEL is still unstable.)
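
For anyone landing here, the wrapper setup described above might look roughly like the following WinSW configuration. This is a minimal sketch, not the exact file I used: the service id, display name, install layout, collector config file name, and rotation values are all assumptions for illustration. The env element, the roll-by-size log mode, sizeThreshold, keepFiles, and the %BASE% variable come from the WinSW v2 XML configuration format; please check the WinSW docs for the version you run.

```xml
<!-- Hypothetical WinSW v2 config, placed next to the (renamed) WinSW executable -->
<service>
  <id>otelcol</id>
  <name>OpenTelemetry Collector</name>
  <description>Runs otelcol as a child process of WinSW.</description>

  <!-- Assumed layout: otelcol.exe and config.yaml live in the same directory as this file -->
  <executable>%BASE%\otelcol.exe</executable>
  <arguments>--config="%BASE%\config.yaml"</arguments>

  <!-- Keeps otelcol from trying to connect to the service controller itself -->
  <env name="NO_WINDOWS_SERVICE" value="1" />

  <!-- WinSW captures the console output of the child and rotates the files by size -->
  <logpath>%BASE%\logs</logpath>
  <log mode="roll-by-size">
    <sizeThreshold>10240</sizeThreshold> <!-- in KB, example value -->
    <keepFiles>8</keepFiles>             <!-- rotated files to keep, example value -->
  </log>
</service>
```

With a wrapper like this, the collector's telemetry logs can stay on console output (for example with encoding: json and no output_paths), and WinSW writes them to files under the log path.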
What did you expect to see?
An easier way to log otelcol log-messages to a file.
What version did you use?
Version: (e.g., v0.48.0)
Environment
OS: Windows 10 / Windows Server 2019