windows: handle pending control actions#268
Conversation
|
Marked as a draft; there's a similar issue around This will cause problems for some of the other controls which don't check or wait for pending state. On windows As such, I'm going to rework this patchset to make |
ea2447d to
a0e7b19
Compare
|
This seems better now, and likely ready for review. Originally I could run my own pkg's control tests once, sometimes passing and sometimes not. I'm going to investigate this more later. |
67cfdf2 to
7b85551
Compare
|
I reworked the patchset so that communication with the service manager happens during Some of the constants used are arbitrary, like the time value's and attempt count. Some of the tests from my pkg:Each test was run 100 times without failure. func TestServiceControl(t *testing.T) {
t.Run("bad sequence", func(t *testing.T) {
for _, test := range []struct {
controlAction string
shouldError bool
}{
{"invalid control action", true},
{"uninstall", true},
{"install", false},
{"start", false},
{"start", true},
{"stop", false},
{"stop", true},
{"uninstall", false},
} {
var (
controlAction = test.controlAction
shouldError = test.shouldError
)
t.Run(controlAction, func(t *testing.T) {
err := issueControlRequest(controlAction)
if shouldError &&
err == nil {
t.Errorf("control \"%s\" was supposed to return an error, but did not",
controlAction)
}
if !shouldError &&
err != nil {
t.Errorf("control \"%s\" returned error: %s",
controlAction, err)
}
})
}
waitForUninstall(t)
})
t.Run("good sequence", func(t *testing.T) {
for _, testControl := range []string{
"install",
"start",
"restart",
"stop",
"uninstall",
} {
controlAction := testControl
t.Run(controlAction, func(t *testing.T) {
if err := issueControlRequest(controlAction); err != nil {
t.Error(err)
}
})
}
waitForUninstall(t)
})
}Supplemental notes: I further tested the Windows service checkpoint+waithint mechanisms. However, this package does not have a means for a service implementation to do so. While testing this out, I had to modify Go's sys patch:diff --git a/windows/svc/mgr/service.go b/windows/svc/mgr/service.go
index 7d735ca..cb7ff4f 100644
--- a/windows/svc/mgr/service.go
+++ b/windows/svc/mgr/service.go
@@ -54,8 +54,12 @@ func (s *Service) Control(c svc.Cmd) (svc.Status, error) {
return svc.Status{}, err
}
return svc.Status{
- State: svc.State(t.CurrentState),
- Accepts: svc.Accepted(t.ControlsAccepted),
+ State: svc.State(t.CurrentState),
+ Accepts: svc.Accepted(t.ControlsAccepted),
+ CheckPoint: t.CheckPoint,
+ WaitHint: t.WaitHint,
+ Win32ExitCode: t.Win32ExitCode,
+ ServiceSpecificExitCode: t.ServiceSpecificExitCode,
}, nil
}
@@ -70,6 +74,8 @@ func (s *Service) Query() (svc.Status, error) {
return svc.Status{
State: svc.State(t.CurrentState),
Accepts: svc.Accepted(t.ControlsAccepted),
+ CheckPoint: t.CheckPoint,
+ WaitHint: t.WaitHint,
ProcessId: t.ProcessId,
Win32ExitCode: t.Win32ExitCode,
ServiceSpecificExitCode: t.ServiceSpecificExitCode,After that, I just hardcoded sleeps and checkpoint responses within |
|
Also, I previously made Partial client side implementation I'm using for the test posted above:// The service uninstall control may return
// before the system service is fully stopped and deleted.
// As such, we want to explicitly wait between tests
// until the service control manager finishes removing the service,
// or we give up.
func waitForUninstall(t *testing.T) {
t.Helper()
const (
checkInterval = 100 * time.Millisecond
checkTimeout = 10 * time.Second
)
var (
ticker = time.NewTicker(checkInterval)
timeout = time.NewTimer(checkTimeout)
)
defer func() {
ticker.Stop()
timeout.Stop()
}()
for {
t.Logf("waiting %s for uninstall...", checkInterval)
select {
case <-ticker.C:
status, err := issueStatusRequest()
if err != nil {
t.Fatal(err)
}
if svcErr := status.ControllerError; svcErr != nil &&
errors.Is(svcErr, hostservice.ErrNotInstalled) {
return
}
case <-timeout.C:
t.Fatalf("uninstall control did not finish in time (%s)", checkTimeout)
}
}
} |
|
Small update, this still seems to be fine. However I think the original code and this patch still have an issue. I encountered this once due to an implementation error on my end, but didn't need to deal with it. |
`IsAnInteractiveSession` is deprecated
tl;dr this should fix: #30
When running my own tests, I was encountering random failure/success related to sequences of
service.Controlfunctions.Specifically the sequence
["install", "start", "restart"]failed more often than the rest.I tracked it back to a call to
ControlwithinstopWait.Before; if the service was in a
StartPendingstate, that call toControlwould fail.Now; We query the status first, and wait for
StartPendingservices to reachRunning, before issuing a "stop" request for them.And still as before; wait for the service to go from
StopPendingtoStoppedafter before returning.In addition, there were 2 other bugs that may have been associated with this:
breakwithin the timeout'sselectscope, breaks out of theselectscope, but was likely meant to break out of the outerfor's scope.As a result, I don't think these operations ever timed out.
Run's signal channel was unbuffered, meaning it could potentially miss the signal from the OS.Leaving services running indefinitely.
(The signal test was also calling test methods in a non-test routine - a goroutine within a test routine)
The wait loop was rewritten to use an upper and lower bound that matched the service example provided on MSDN.
And should use timeout hints if they're provided by the service.
The os signal was buffered, and the signal test was re-ordered so that the test failure happens inside the test routine scope.
In any case, my own tests of
Controlare passing consistently now. And the signal test passes too.With the exception of any review remarks, this is probably good to go.
(I wrote this rather quickly, so things might be wrong or something could probably be expressed more elegantly - just let me know what should be changed and I'll push them through)