Description
Description
Service startup is noticeably slower with Ruby 3.
I'm talking about the time to reach the Running
state. In other words, the time to finish StartServiceCtrlDispatcher.
With Ruby 2, it takes about 5 seconds, but with Ruby 3, it can take more than 20 seconds.
The application implementation is irrelevant.
Windows service needs to finish StartServiceCtrlDispatcher in 30 seconds by default.
With Ruby 3 or later, this problem can cause timeout errors.
EventID 7000
The testservice service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
EventID 7009
A timeout was reached (30000 milliseconds) while waiting for the testservice service to connect.
This problem occurs when both of the following conditions are met.
- Ruby 3 (Ruby 3.0.0 ~ 3.3.0)
- Mulitple CPU
- This problem does not occur with single CPU (such as EC2 t2.small type)
Gem Version
2.3.2
Windows Version
I confirm this problem in the following:
- Windows 10 Home
- Windows Server 2019 Base EC2 Instance (t2.medium, t2.large, c5.large)
Replication Case
Here is a graph of the number of seconds to start a sample service for each Ruby version:
(I will explain 322 with patch
at the end)
Measured using a minimum implementation service:
require "win32/daemon"
class TestService < Win32::Daemon
def service_main
while running?
sleep 10
end
end
end
TestService.new.mainloop
Environment:
- Windows 10 Home 64bit
- 4 core CPU, 8 logical processors
- Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz 1.99GHz
- 16GB RAM
Steps:
- Register the sample service above (
C:\test\run.rb
)
require "win32/service"
# Specify ruby bin path
ruby_path = ""
Win32::Service.create(
service_name: "testservice",
binary_path_name: "\"#{ruby_path}\" C:\\test\\run.rb",
display_name: "testservice",
)
- Measure the time with the following PowerShell script
while($true)
{
$datetime = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
echo "$datetime, Start launching the service"
Start-Service testservice
$datetime = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
echo "$datetime, Finish launching the service"
Start-Sleep -seconds 10
Stop-Service testservice
Start-Sleep -seconds 10
}
- After measuring, delete the service.
require "win32/service"
Win32::Service.delete("testservice")
Additional Information
- I confirm the same tendency in the AWS EC2 instance Windows Server 2019 Base t2.medium, t2.large, c5.large
- Interestingly, this problem does not occur in t2.small
- I think it occurs in environments with multiple CPUs
- Maybe some implementation about threading has changed with Ruby 3 series, and this seems to be influenced by that
Possible causes and solutions
Maybe the problem is on Ruby's side, but I think it could be avoided by implementation on this side.
In the current implementation, it switches between multiple Ruby threads and non-Ruby threads.
This complex process may be unsuitable for Ruby 3.
https://github.com/chef/win32-service/blob/win32-service-2.3.2/lib/win32/daemon.rb#L257
CreateThread
makes a non-Ruby thread, and it calls Ruby Proc, and it makes a Ruby thread, and ...
Also, StartServiceCtrlDispatcher
creates a non-Ruby thread, and it makes a new Ruby thread again and executes Service_Main
Ruby Proc...
Although this is still PoC, I found that using Ruby thread from the beginning, as shown below, was very fast.
(It is 322 with patch
data in the above graph).
lib\win32\daemon.rb
@@ -254,26 +254,21 @@
raise SystemCallError.new("CreateEvent", FFI.errno)
end
- hThread = CreateThread(nil, 0, ThreadProc, Service_Main, 0, nil)
-
- if hThread == 0
- raise SystemCallError.new("CreateThread", FFI.errno)
- end
-
- events = FFI::MemoryPointer.new(:pointer, 2)
- events.put_pointer(0, FFI::Pointer.new(hThread))
- events.put_pointer(FFI::Pointer.size, FFI::Pointer.new(@@hStartEvent))
-
- while (index = WaitForMultipleObjects(2, events, 0, 1000)) == WAIT_TIMEOUT
- end
-
- if index == WAIT_FAILED
- raise SystemCallError.new("WaitForMultipleObjects", FFI.errno)
- end
-
- # The thread exited, so the show is off.
- if index == WAIT_OBJECT_0
- raise "Service_Main thread exited abnormally"
+ hThread = Thread.new(Service_Main) do |service_proc|
+ ste = FFI::MemoryPointer.new(SERVICE_TABLE_ENTRY, 2)
+
+ s = SERVICE_TABLE_ENTRY.new(ste[0])
+ s[:lpServiceName] = FFI::MemoryPointer.from_string("")
+ s[:lpServiceProc] = service_proc
+
+ s = SERVICE_TABLE_ENTRY.new(ste[1])
+ s[:lpServiceName] = nil
+ s[:lpServiceProc] = nil
+
+ StartServiceCtrlDispatcher(ste)
+ end
+
+ while WaitForSingleObject(@@hStartEvent, 1000) == WAIT_TIMEOUT
end
thr = Thread.new do
@@ -322,6 +317,7 @@
end
thr.join
+ hThread.join
end
# Returns the state of the service (as an constant integer) which can be any