-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
daemon: Exit on idle after txn #606
Conversation
This code all works, but marking as WIP. |
@petervo From looking at the cockpit code, it seems like this would result in an error message? If so, how hard would it be to adjust the code to handle it? |
@cgwalters way back when, the original WIP version of this had this in, but we took it out in review because of races. I think cockpit will handle this correctly and shouldn't cause an error. In any case I agree that's it's good to have so when it hits centos continuous, i'll make sure everything continues to work as expected. |
I think I need to revisit https://lists.freedesktop.org/archives/dbus/2015-May/016671.html and at least try to adapt things here for that. |
☔ The latest upstream changes (presumably f9944e6) made this pull request unmergeable. Please resolve the merge conflicts. |
Ok, dependencies in #660 This should be good to go now. Removing WIP. |
☔ The latest upstream changes (presumably 7cf3664) made this pull request unmergeable. Please resolve the merge conflicts. |
OK, rebased again. Not totally sure whether to land this right now though...maybe we make it a config option? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, wait we're missing a transition to APPSTATE_FLUSHING
here, no?
diff --git a/src/app/rpmostree-builtin-start-daemon.c b/src/app/rpmostree-builtin-start-daemon.c
index 9bbcbd1..91dd817 100644
--- a/src/app/rpmostree-builtin-start-daemon.c
+++ b/src/app/rpmostree-builtin-start-daemon.c
@@ -382,6 +387,8 @@ rpmostree_builtin_start_daemon (int argc,
* https://github.com/cgwalters/test-exit-on-idle
*/
sd_notify (FALSE, "STOPPING=1");
+ state_transition (APPSTATE_FLUSHING);
g_dbus_connection_call (bus,
"org.freedesktop.DBus", "/org/freedesktop/DBus",
"org.freedesktop.DBus", "ReleaseName",
Also, it doesn't seem like STOPPING=1
causes systemd to start up a new instance. E.g. if I do:
diff --git a/src/app/rpmostree-builtin-start-daemon.c b/src/app/rpmostree-builtin-start-daemon.c
index 9bbcbd1..81eccc7 100644
--- a/src/app/rpmostree-builtin-start-daemon.c
+++ b/src/app/rpmostree-builtin-start-daemon.c
@@ -372,6 +372,11 @@ rpmostree_builtin_start_daemon (int argc,
g_debug ("Entering main event loop");
rpmostreed_daemon_run_until_idle_exit (rpm_ostree_daemon);
+ /* note that STOPPING=1 was already done in the idle exit */
+ sd_journal_print (LOG_INFO, "sleeping for race");
+ sleep (10);
+
if (bus)
{
/* We first tell systemd we're stopping, so it knows to activate a new instance
And run rpm-ostree status
during the race window, I get blocked until the daemon exits (which is expected), and then get:
error: Rejected send message, 1 matched rules; type="method_call", sender=":1.117" (uid=0 pid=22345 comm="rpm-ostree status ") interface="org.projectatomic.rpmostree1.Sysroot" member="RegisterClient" error name="(unset)" requested_reply="0" destination=":1.116" (uid=0 pid=22316 comm="/usr/bin/rpm-ostree start-daemon ")
OTOH, if I instead put the race window after ReleaseName
, I still get blocked, but I then get served by the new instance, which I think is what we want to happen before as well, right?
src/daemon/rpmostreed-daemon.c
Outdated
|
||
sd_notifyf (0, "STOPPING=1\nSTATUS=Exiting due to idle"); | ||
self->running = FALSE; | ||
g_main_context_wakeup (NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually need this though? This is already running as part of the main context, so isn't it already woken up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, we don't need it here. However, if one is doing multithreading, the main loop does need to be woken up. So I always add this as part of the "manage mainloop termination condition" pattern. It's just a write()
to an eventfd()
underneath so it's cheap.
Urgh, yes. Good catch. I'm going to need to squash together the idle exit/mainloop logic into one file really. I guess |
RE the dbus error...I think the problem is bus policy won't let messages through to us anymore since we don't own the name. To fix this we need to do #745 |
☔ The latest upstream changes (presumably 675066a) made this pull request unmergeable. Please resolve the merge conflicts. |
Note to self: from looking at |
8849a4c
to
338ec2a
Compare
This closes a race condition for having the daemon idle exit. After the daemon has released its bus name, the dbus-daemon will no longer allow messages through that targeted its unique name. Since the intention of the `RegisterClient` method is to be the "knock on the door", fix this by directly sending a message to the well-known name. Second, we need to handle the case where the daemon exits without replying; @jlebon added a `sleep(10)` invocation after the daemon mainloop quit but before we `ReleaseName`, and I verified these two things combine to fix that case.
Now that we have the ability to both track clients and our active transaction, and the `RegisterClient` call acts "atomically", let's start doing exit-on-idle and return the RAM to the people.
338ec2a
to
bb0d5bb
Compare
Rebased 🏄 and fixed up! I figured out a way to close the race you found; see the first commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor nits, otherwise looks good! I tried add a sleep
in the same places and got appropriate behaviour in all of them.
src/app/rpmostree-dbus-helpers.c
Outdated
break; /* Success! */ | ||
else | ||
{ | ||
if (g_dbus_error_is_remote_error (local_error)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we can de-indent all of this by just dropping the else
.
* FLUSHING right now, let's just forcibly do so if it hasn't been done | ||
* already. | ||
*/ | ||
if (appstate <= APPSTATE_FLUSHING) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should technically be strictly <
, right?
src/daemon/rpmostreed-daemon.c
Outdated
|
||
g_clear_pointer (&self->idle_exit_source, (GDestroyNotify)g_source_unref); | ||
|
||
sd_notifyf (0, "STOPPING=1\nSTATUS=Exiting due to idle"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just drop the STOPPING=1
here? We already do this once we exit the loop and it covers all other possible non-idle-exit scenarios as well.
Nice review, thanks! Fixups ⬆️ |
Nice! @rh-atomic-bot r+ 2cda8d7 |
⚡ Test exempted: merge already tested. |
Now that we have the ability to both track clients and our active transaction, and the `RegisterClient` call acts "atomically", let's start doing exit-on-idle and return the RAM to the people. Closes: #606 Approved by: jlebon
So...we have some bad memory leaks somewhere. We go from 14MB RSS after being
activated from
atomic host status
to 54MB after a dummyrpm-ostree install foo
, 80MB for a successful install, and I've seen it as high as 140MB.Exiting on idle is what PackageKit has done forever. One problem particularly
for both of us is that libdnf and all of the libraries underneath it are used
almost exclusively by command line clients that exit when done, so memory leaks
aren't particularly important to them.
This may break Cockpit though, since it won't be able to monitor for sysroot
changes. I considered instead running txns in a subprocess, which is possible
but harder.