Description
What version of Go are you using (go version
)?
go version go1.9.2 windows/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env
)?
set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Users\XXXXX\d\go
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\XXXXX\AppData\Local\Temp\go-build933228923=/tmp/go-build -gno-record-gcc-switches
set CXX=g++
set CGO_ENABLED=1
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
What did you do?
Configured our Go language application to start as a Windows Service, with start type of 'Automatic'. The service is expected to start when the system boots up. Sometimes the service process will not be running after the system has booted up and the 'Windows Logs / 'System' logs in Event Viewer show two logs:
Event ID: 7009
Level: Error
Text:
A timeout was reached (30000 milliseconds) while waiting for the <service name> service to connect.
Event ID: 7000
Level: Error
Text:
The <service name> service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
These events do not appear in the logs when the service successfully starts.
It appears that Windows terminates the service process when the service fails to respond to the Windows Service start event within the 30 s time frame. Our application contains the Windows Service code based on the sample code from sys/windows/svc/example
.
This timeout issue occurs both with 64-bit and 32-bit versions of our executable.
What did you expect to see?
The Go language Windows service start reliably every time the system boots up.
What did you see instead?
The Go language service does not start reliably on some systems on boot-up.
Some observations from systems where the service often fails to start during boot-up:
- Manually starting the service once the system is up and running has never failed to start correctly. Timestamps from the logs show the process starting quickly (no more than a few seconds).
- Setting the service's start type to 'Automatic (Delayed Start)', where Windows starts the service 120 s after the 'Automatic' services, works without fail. The user is able to login to the desktop well before the service has been started. It is possible to change this global Windows setting to a lower value (eg 60 s) to make it less noticeable to the user, but because it will affect all delayed start services, it is not an acceptable solution. Some search results claim a per-service delay value can be specified, but we were not able to make this work.
- Changing the 30000 ms (30 s) Windows service timeout value to larger values (60 s or 90 s) allowed the service to startup. However, this is a global Windows setting and it is not possible to set the timeout per-service. Again, this is not an acceptable solution.
- We experimented with using the Windows Service "Recovery" settings to restart our service in an attempt to mitigate this problem. However, it appears that these recovery settings have no effect for services that fail to start in time.
- We instrumented our service with timestamped logs when the first imported package is initialized and when
main()
starts. We found thatmain()
was often not reached when the timeout occurred. Sometimes even the first imported package's initialization was not reached. - A stripped-down Windows service that just logged would often fail to start, but it was less likely to happen compared to our full-fledged service.
- Using a simple C++ Windows Service (code here) modified to launch our executable (running as a 'normal' application, not as a Windows Service) always launched without fail (no timeouts) during boot-up. Logs within our executable, as well as in the C++ service (just prior to the
CreateProcess()
call that started our executable) often showed large time deltas. The logs indicated it often took a long time for the Go executable to get launched, and also to reach themain()
.
While we have some systems (VirtualBox VMs) that reliably show these startup problems, the issue is harder to reproduce on other systems. We have VirtualBox VMs where the problem occurs only occasionally and we have yet to see the timeout occur on non-virtual Windows 10 installations. Timestamp-instrumented loads on those systems sometimes show time deltas of up to 10 s during startup. While this is not long enough to trigger the 30 s timeout on a powerful real system, the concern is that in virtualized environments and/or slower systems the timeout will be reached and the service will not start.
Sample Programs
Sample code can be found in this GitHub repository, including instructions for building and installing (README.md).
Both the launchserv.exe
and winserv.exe
programs contain the Go Windows Service code (again, based one the sample code). The Go Windows Service sample code was combined into a single file in both cases in order to 'flatten' the main package for tracking imported package initialization.
It is intended that only one of the two should run as a Windows Service. The recommended configuration is to install launchserv.exe
as a Windows Service ("my service"), and it will in turn launch winserv.exe
as a child process (winserv.exe
will run as a 'normal' program, not a service). launchserv.exe
writes a timestamped log just prior to starting winserv.exe
, which allows measuring the time it takes for winserv.exe
to start.
winserv.exe
contains a large number of standard package imports, to mimic a real program. These imports are marked as 'unused' and as such only their initializations run (variable initialization and init()
).
The VirtualBox version of the Win10 MSEdge test VM can be downloaded from Microsoft here to run these test programs. The following changes may help to increase the likelihood of seeing the issue with this, or other systems:
- Increase the number of CPU cores from 1 (to say 2 or 4). This might increase the concurrent activity when the system is starting up.
- Install the Guest Additions.
- Install other software that starts services when the system boots. For example, one VM that shows the issue very reliably has two VNC server packages, OpenVPN client and an SSL VPN client (Array Networks) (as well as other software) installed.
Multiple boot-ups of the system may be required to see the problem. In many cases, a VM may sometimes demonstrate the problem, but then not do so at other times. YMMV.
When delays are seen, the two most common places are:
- Launching
winserv.exe
; that is, between the lastlaunchserv.exe
log and the firstwinserv.exe
log. Time deltas as high as 26 s have been seen (and as low as less than 400 ms) have been seen on an Win10/MSEdge VM. - The 'lots of imports' in
winserv.exe
main.go
, between the 'b' and 'c' logs. These are the large number of unused standard package imports. Times as high as 9 s (and less than 100 ms on the low end) have been seen on the same Win10/MSEdge VM.
This service is an important component of our product; we are adding more functionality all the time. As the amount of code increases, it seems even more likely that this problem will occur. If there is anything else we can do to help resolve this issue (experiments, measurements, debugging, etc), please let us know.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status