cmd/run: Ensure underlying container is stopped when toolbox is killed #1207
Force-pushed from f28ca61 to af0fc8b.
@matthiasclasen I think this is probably a fix for the shutdown issue you told me about yesterday.
Build failed. ❌ unit-test FAILURE in 24m 23s
Right now "toolbox enter" creates a container on the fly, but then lets it linger after the foreground toolbox process is killed (for instance, from a terminal hangup).
Not killing the underlying container has the negative side effect of stalling shutdown if a toolbox shell is running.
This commit addresses that problem by detecting when the toolbox process is signaled, and then in response, kills off the entire cgroup associated with the underlying container.
Closes containers#1157
Signed-off-by: Ray Strode <[email protected]>
Force-pushed from db770e5 to 2c0ec31.
Build succeeded. ✔️ unit-test SUCCESS in 27m 38s
Just for the sake of posterity ...
I think the blocked shutdown problem was reported as:
... and might have been fixed by:
Note that while containers/podman#14531 points at containers/podman#16785, I don't think it had anything to do with fixing blocked shutdowns caused by lingering
Thanks for working on this @halfline - my apologies for not looking at it sooner.
I don't think the changes in this pull request are working as expected, and the commit message suggests that there might be some misunderstanding.
Longer explanation follows ...
First, `toolbox enter` doesn't quite create a container on the fly. It is confusing because there's some syntactic sugar where `toolbox enter` will offer to create a container if you had none. You could think of it as an embedded `toolbox create`.
The canonical flow is that a container is created with `toolbox create`, which wraps `podman create`. Then that container is entered with `toolbox enter`, which wraps `podman start` and `podman exec`. The `podman start` launches the container's entry point process, which would have been PID 1 inside the container had we been using a PID namespace, but we don't. For Toolbx containers, this entry point is `toolbox init-container`. The `podman exec` launches the process that the user actually interacts with, e.g., a CLI shell, Emacs, Vim, etc.
The same container can be entered multiple times from multiple terminals at the same time. When that happens, the `podman start` is basically a NOP because a container can have only one entry point and it's already running.
Neither the entry point nor the foreground processes are parented by `podman start` or `podman exec`. Each of them is parented by a separate `conmon(8)` process, which is double forked from `podman(1)` and parented by the systemd user instance.
The blocked shutdown problem was that when `SIGTERM` was sent to the container's entry point, either through `podman stop` or the shutdown sequence, the processes launched by `podman exec` wouldn't be terminated. This would end up blocking shutdown. My understanding is that this was caused by Toolbx containers not having a separate PID namespace and was fixed by containers/podman#17025.
This uncovers another problem. How did those processes launched by `podman exec` manage to outlive their respective terminals? This is #1204. I suspect this happens because the signal sent by the terminal only reaches up to the wrapper `podman exec` and doesn't get to the corresponding `conmon(8)` process.
With the changes in this branch, if I have multiple graphical terminals open with `toolbox enter` against the same container, then closing any one of those terminals with the cross button terminates the container's entry point and the foreground container process in all the other terminals. This shouldn't happen. Closing a terminal should only terminate its own foreground container process.
I don't understand cgroups very well, so I don't know why this is happening.
Toolbx containers also use the host's cgroup namespace, and, as far as I know, each terminal created by GNOME Terminal has its own cgroup. For each of the foreground container processes, I get:
$ cat /proc/56151/cgroup
0::/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-2b4111a9-f4a9-4f2b-8acd-4adc9e673bac.scope
$
$ cat /proc/55919/cgroup
0::/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-9bf0516d-bd6f-4094-9504-0be37b1ce6bf.scope
$
$ cat /proc/56363/cgroup
0::/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-6ecb9e7e-aa06-42d6-b9b6-bb9d6cdb44f4.scope
So, I have no idea why terminating the processes in one cgroup is affecting the others.
Sorry, it's been a while since I looked at this, and even when I was working on it I only put it together in a few hours after Matthias visited and mentioned the problem offhand, so I may be off base here... I haven't done much deep diving and I may have made wrong assumptions. Having said that, my impression from your message is that you believe I'm sending the hangup signal to the terminal's cgroup, but that's not the intent. The idea was to send the signal to the cgroup of the bash running in the container. I got (at least) two things wrong, I think:
╎❯ cat /proc/self/cgroup in all my terminals. I find this really surprising. I do see crun has a --cgroup option, but indeed looking in ps output, it's not getting used.
Right, when a terminal is closed, a SIGHUP is sent to all processes in the terminal "session" (in the setsid() sense of the word). These sessions are inherited from their parents, unless a child creates a fresh session. There's no way in Unix, afaik, to move a child into a pre-existing session; it either uses the one it started with, or it starts its own. conmon isn't part of the child hierarchy of podman-exec, so it won't inherit the terminal session. This means the hangup signal needs to be ferried either from podman-exec or toolbox to the process started by conmon (or, if children had their own cgroup like I thought, we could just send the signal to the whole cgroup).
I managed to confuse myself as well. GNOME Terminal puts every
That's why I didn't realize that all the
Right. Thanks for the hint, because I had missed it before.
nod
nod I don't see an obvious way to get the PID of the Otherwise, we can insert a process that we control between This intermediate process under our control sounds a lot like the toolbox shell idea that I had before. We don't have to call it
I think it probably is possible to get the right pid. Just as an experiment:
╎❯ bundle_path=$(podman inspect 4e141e468f1a756b43407f64f0273b5c18fcf5b784ea2a13050d0a9f4e1528a | jq -r '.[0].StaticDir')
There may be a more direct command for it. Not sure. libpod has an API for it:
though it looks like toolbx doesn't use libpod?
I guess the complication is that the exec_id is only printed for background exec invocations. You can fish it out of podman inspect output, but you would have to diff the list of exec IDs before and after to find the right one, which is a little racy.
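The before/after diff mentioned above could be sketched in Go roughly as follows. This only shows the set difference; how the two lists are obtained (for example, by parsing `podman inspect` output) is left out, and the function name and sample IDs are invented for illustration.

```go
package main

import "fmt"

// newExecIDs returns the IDs that appear in after but not in before,
// preserving their order in after. (Hypothetical helper.)
func newExecIDs(before, after []string) []string {
	seen := make(map[string]struct{}, len(before))
	for _, id := range before {
		seen[id] = struct{}{}
	}
	var added []string
	for _, id := range after {
		if _, ok := seen[id]; !ok {
			added = append(added, id)
		}
	}
	return added
}

func main() {
	// Made-up exec IDs, standing in for snapshots taken before and
	// after launching a new exec session.
	before := []string{"aaa111"}
	after := []string{"aaa111", "bbb222"}
	fmt.Println(newExecIDs(before, after))
}
```

The race shows up when another terminal enters the same container between the two snapshots: the result then contains more than one ID, and the caller has no way to tell which one belongs to its own exec session.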
Cool. I didn't know that exec IDs are a thing. I think I never tried
When we rewrote Toolbx in Go, we wanted to use libpod and the other backend Go packages directly instead of going through These days, for other reasons, I am considering implementing some small things directly inside If we can't get the PID inside the container through a Podman interface, do you think the alternative of inserting a shim is sane?
Interesting. Right, when the terminal is closed, podman exec would be receiving a hang up signal (SIGHUP) except:
You can see that, regardless, whatever is splicing input between the two ttys (podman? conmon?) should be getting a hang up in poll, so it should know to tear things down, I guess.
Just browsing around a little, I think the splicing happens here:
So maybe something like this would fix it