Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

many: implement "refresh-mode: {restart,endure,...}" for services #4617

Merged
merged 13 commits into from
Feb 19, 2018

Conversation

mvo5
Copy link
Contributor

@mvo5 mvo5 commented Feb 6, 2018

This PR adds a new "refresh-mode" to apps/services. The supported
values are: {restart,survive} with restart being the default.

Any service that has "refresh-mode: survive" will not get restarted
during a snap refresh.

See also https://forum.snapcraft.io/t/140/21

Thanks for helping us make a better snapd!
Have you signed the license agreement and read the contribution guide?

This PR adds a new "refresh-mode" to apps/services. The supported
values are: {restart,survive} with restart being the default.

Any service that has "refresh-mode: survive" will not get restarted
during a snap refresh.

See also https://forum.snapcraft.io/t/140/21
@mvo5 mvo5 requested a review from niemeyer February 6, 2018 11:56
@zyga
Copy link
Contributor

zyga commented Feb 6, 2018

++ snap pack /home/gopath/src/github.com/snapcore/snapd/tests/lib/snaps/test-snapd-service /home/gopath/src/github.com/snapcore/snapd/tests/lib/snaps/test-snapd-service
2018/02/06 12:09:45.808651 container.go:242: in snap "test-snapd-service": path "bin/stop" does not exist
error: cannot pack "/home/gopath/src/github.com/snapcore/snapd/tests/lib/snaps/test-snapd-service": snap is unusable due to missing files

@@ -55,7 +55,7 @@ type managerBackend interface {
CopySnapData(newSnap, oldSnap *snap.Info, meter progress.Meter) error
LinkSnap(info *snap.Info) error
StartServices(svcs []*snap.AppInfo, meter progress.Meter) error
StopServices(svcs []*snap.AppInfo, meter progress.Meter) error
StopServices(svcs []*snap.AppInfo, reason string, meter progress.Meter) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stop reason should be a distinct type with some constants, to avoid and deter typos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, when I wrote this I was wondering this too. I dislike the extra verbosity but I agree about the usefulness of the static checking.

@@ -3,8 +3,14 @@ version: 1.0
apps:
test-snapd-service:
command: bin/start
stop-command: bin/stop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

Copy link
Contributor

@chipaca chipaca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One question though: wouldn't it make sense to fire a reload to survive services, once the new revno is in place?

snap/validate.go Outdated
return fmt.Errorf(`"refresh-mode" field contains invalid value %q`, app.RefreshMode)
}
if app.RefreshMode != "" && app.Daemon == "" {
return fmt.Errorf(`"refresh-mode" can only be used for services not in %q`, app.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"refresh-mode" cannot be used for %q, only for services or something like that?

Copy link
Contributor

@zyga zyga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicky name suggestions, mostly. Otherwise +1

@@ -62,7 +63,7 @@ func patch5(st *state.State) error {
continue
}

err = wrappers.StopServices(svcs, log)
err = wrappers.StopServices(svcs, snap.ServiceStopReasonRefresh, log)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe snap.StopForRefresh?

@@ -142,6 +142,7 @@ func doInstall(st *state.State, snapst *SnapState, snapsup *SnapSetup, flags int
if snapst.IsInstalled() {
// unlink-current-snap (will stop services for copy-data)
stop := st.NewTask("stop-snap-services", fmt.Sprintf(i18n.G("Stop snap %q services"), snapsup.Name()))
stop.Set("stop-reason", "refresh")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snap.StopForRefresh

@@ -1085,6 +1086,7 @@ func Disable(st *state.State, name string) (*state.TaskSet, error) {

stopSnapServices := st.NewTask("stop-snap-services", fmt.Sprintf(i18n.G("Stop snap %q (%s) services"), snapsup.Name(), snapst.Current))
stopSnapServices.Set("snap-setup", &snapsup)
stopSnapServices.Set("stop-reason", "disable")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snap.StopDisabled

@@ -1246,6 +1248,7 @@ func Remove(st *state.State, name string, revision snap.Revision) (*state.TaskSe

stopSnapServices := st.NewTask("stop-snap-services", fmt.Sprintf(i18n.G("Stop snap %q services"), name))
stopSnapServices.Set("snap-setup", snapsup)
stopSnapServices.Set("stop-reason", "remove")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snap.StopRemoved

snap/validate.go Outdated
@@ -414,6 +414,17 @@ func ValidateApp(app *AppInfo) error {
return err
}

// validate refresh-mode
switch app.RefreshMode {
case "", "restart", "survive":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could those be constans rather than literals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, we don't use constants for e.g. app.Daemon - but if you have strong opinions I can use constants here.

@mvo5
Copy link
Contributor Author

mvo5 commented Feb 7, 2018

@chipaca re reload> I'm inclined to leave leave this to the snap author by using e.g. the post-refresh hook.

@codecov-io
Copy link

codecov-io commented Feb 7, 2018

Codecov Report

Merging #4617 into master will increase coverage by 0.13%.
The diff coverage is 89.09%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4617      +/-   ##
==========================================
+ Coverage   78.42%   78.55%   +0.13%     
==========================================
  Files         464      464              
  Lines       33144    33350     +206     
==========================================
+ Hits        25994    26199     +205     
  Misses       5008     5008              
- Partials     2142     2143       +1
Impacted Files Coverage Δ
snap/info.go 84.38% <ø> (ø) ⬆️
snap/types.go 82.35% <ø> (ø) ⬆️
overlord/snapstate/backend/link.go 56.16% <0%> (ø) ⬆️
overlord/patch/patch5.go 6.66% <0%> (ø) ⬆️
overlord/snapstate/snapstate.go 82.23% <100%> (+0.05%) ⬆️
snap/validate.go 96.35% <100%> (+0.06%) ⬆️
snap/info_snap_yaml.go 93.14% <100%> (+0.02%) ⬆️
systemd/systemd.go 84.72% <100%> (+0.15%) ⬆️
overlord/snapstate/handlers.go 70.08% <50%> (-0.12%) ⬇️
wrappers/services.go 84.58% <97.05%> (+1.44%) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa2bbdc...f463b38. Read the comment docs.

@mvo5 mvo5 changed the title many: implement "refresh-mode: survive" for services many: implement "refresh-mode: {restart,keep,...}" for services Feb 8, 2018
@mvo5 mvo5 force-pushed the dont-kill-me branch 2 times, most recently from 8ae3fe5 to 0d9da90 Compare February 8, 2018 08:58
@mvo5 mvo5 added this to the 2.32 milestone Feb 12, 2018
Copy link
Contributor

@niemeyer niemeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, and LGTM when you're happy. Thank you!

// Skip stop on refresh when refresh mode is "keep"
if reason == snap.StopReasonRefresh {
switch app.RefreshMode {
case "keep":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for the back and forth on this term, Michael. I'm still concerned about the current one.. the problem with both survive and keep is that it feels to innocent. I'd like to have a term that gives people a strange feeling about using it, and make them look for details on why that is so. As hopefully my last suggestion here, how about "endure". This comes close to "survive", but it has that feeling that it might not be that easy to keep it up, which is realistic given the constraints we've discussed for this (things changing behind the process, write access being lost, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I'm happy about the attention to this. I changed this now in my PR to endure.

sysd.Kill(app.ServiceName(), "USR1", "main")
continue
case "sigusr1-all":
sysd.Kill(app.ServiceName(), "USR1", "all")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change ended up more contained than I expected, thank you!


trap "echo got sigusr1" SIGUSR1
trap "echo got sighup" SIGHUP
trap "echo got sigterm" SIGTERM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

case "sighup-all":
sysd.Kill(app.ServiceName(), "HUP", "all")
continue
case "sigusr1":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not support usr2 as well? Those are the typical signals I can think of for implementing custom behavior (hup, usr1, usr2).. any others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I added sigusr2 now as well. I think (hup,usr1,usr2) are the ones used. We can easily add more if needed but it seems the other ones are really exotic (or inappropriate).

@mvo5 mvo5 changed the title many: implement "refresh-mode: {restart,keep,...}" for services many: implement "refresh-mode: {restart,endure,...}" for services Feb 13, 2018
if err := validateAppOrderNames(app, app.After); err != nil {
return err
}

// validate refresh-mode
switch app.RefreshMode {
case "", "endure", "restart", "sigterm", "sigterm-all", "sighup", "sighup-all", "sigusr1", "sigusr1-all", "sigusr2", "sigusr2-all":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This begs to be documented somewhere!

@mvo5 mvo5 merged commit 410afe8 into canonical:master Feb 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants