Script to upgrade from focal to noble #7406
base: develop
Force-pushed from 843ac2a to ce32f1e.
I think the script is basically complete at this point, but I haven't actually tried it yet. So I need to do that, and then figure out how we're going to do CI on it. I think we should ideally be able to take the focal staging environment, upgrade it, and then re-run testinfra (noble) checks on it.
Force-pushed from b72bef2 to 6e433aa.
Fixed a number of issues found by actual test runs, currently hit:
Will get to that tomorrow.
Interesting wrinkle: because the systemd unit is installed by the Debian package, it smartly wants to restart the service during package upgrade. Except that kills the apt-get process upgrading the package, which totally breaks everything. I'm trying to figure out how to stop that restart, but it doesn't seem like dh_systemd_start's
Alternatively, we could have the script fork in a way that it doesn't get killed when the service stops, similar to what unattended-upgrades does.
Force-pushed from 6e433aa to e1396bb.
I'm thinking of getting rid of the --without-new-pkgs step and just doing a single full-upgrade step. It'll avoid a lot of the dependency constraint weirdness that keeps manifesting in weird ways. For example, I dropped the apparmor-utils dependency, which meant that Python 3.12 got pulled in later, and then app-code got upgraded too early.
Force-pushed from 19c39ae to 5bf94fe.
I think this is the way to go, but with a slightly different variant. We should just ensure the apt-get/dpkg processes don't get killed. It's okay if the upgrade script dies, as long as apt-get keeps going. The commands are all idempotent, so when the script gets restarted by the timer, it'll re-run the apt-get command, which will do nothing and not kill the script, and then keep moving on. Plus, by just keeping apt-get alive, we don't need to implement any locking, etc. ourselves, because dpkg already takes care of all of that in a battle-tested manner.
Force-pushed from 5bf94fe to 087599e.
The changes I just pushed introduce a new To visualize:
One gotcha here is that we can no longer reliably capture stdout/stderr, because if the parent is killed, the output goes nowhere. So we will presumably need to send it to a file.
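To make the gotcha concrete, here is a minimal Python sketch (the migration script itself is Rust) of starting a long-running command so that its output goes to a file rather than to a pipe held by the parent. The function name `run_detached` and the log path are hypothetical and not taken from the PR. Note that a new session on its own would not escape systemd's default behaviour of killing everything left in a stopped unit's control group, so this only illustrates the logging side of the idea, not the mechanism the PR actually uses.

```python
import subprocess
import sys
from pathlib import Path


def run_detached(cmd, log_path):
    """Start cmd in its own session, appending stdout/stderr to log_path.

    Hypothetical helper for illustration only. Because the child writes
    straight to a file, its output survives even if this parent process dies.
    """
    with open(log_path, "ab") as log:
        # The child inherits its own copy of the file descriptor, so closing
        # our handle when the `with` block exits does not affect it.
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX only: detach from our session
        )


if __name__ == "__main__":
    # Stand-in for the real apt-get invocation, which is not shown in the thread.
    proc = run_detached([sys.executable, "-c", "print('upgrading')"], Path("upgrade.log"))
    proc.wait()
```

Because the commands are described as idempotent, a restarted parent could simply issue the same command again rather than trying to reattach to the earlier child.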
Force-pushed from 087599e to 14ebc22.
I mostly got through a full automated migration; something is going wrong during the installation of iptables-persistent/nftables-persistent and /etc/iptables/rules.{v4,v6} are blank, so there's no firewall up, causing the integrity check to fail. Once I bypassed that, it reached the done stage. \o/
Force-pushed from 239e0fd to bf03fa6.
I successfully completed a fully-automated app migration today, so I'm marking this as ready for review. I still need to write up a more comprehensive test plan and stuff but at least the code can begin to be looked at.
I've written up the full test plan now.
Also as far as the code review goes, I'll try to split this up into more commits to simplify review. I also want to write a brief architecture document that explains how it's all supposed to work.
https://github.com/freedomofpress/securedrop/wiki/noble-upgrade-architecture
Force-pushed from bf03fa6 to 6753f8a.
I thought it was going to be more, but in the end it's two commits: 1) move some logic out of the check.rs file into a new Rust lib.rs and 2) the migration script and everything else.
As part of the upgrade script, we want to run the check one last time to ensure that everything is ready to go. Instead of shelling out to it, move the logic into a Rust library that can be shared by both binaries.
Force-pushed from 6753f8a to d9ccbd2.
(Rebased on top of the Rust upgrade)
Force-pushed from faf3129 to 87d6e1a.
This is terrific work, @legoktm. The upgrade bucketing and state machine are very elegant.
My comments here are intended (a) to clarify things and (b) to see if there are simplifications worth adopting. If not, that's fine too: none of these suggestions are blocking, and I'll begin testing tomorrow either way. If there's anything that you'd like to discuss, rather than just either addressing or dismissing, let's chat to save us some back-and-forth here!
securedrop/debian/config/lib/systemd/system/securedrop-noble-migration-upgrade.service
Thanks for the review, I'll push fixes in a bit. I didn't respond to most of the comments in lib.rs because that code isn't new: it was already shipped in check.rs and I just moved it into lib.rs so it can be reused by upgrade.rs. The only new part should be the is_ready_except_apt you commented on. Sorry for not making this clear.
Given that, do you think the modifications you suggested, which were all reasonable, are still worth it? On one hand, it's good that they're getting more scrutiny now that they're being used in a different and more critical context. On the other, we know that the current implementations work in production, so I'm not sure changing them is worth it.
That's very fair. Let's leave
The script is split into various stages where progress is tracked on-disk. The script is able to resume where it was at any point, and needs to, given multiple reboots in the middle. The new noble-upgrade.json file shipped in the securedrop-config package is used to control the upgrade process. Further details of the script are explained inline and at <https://github.com/freedomofpress/securedrop/wiki/noble-upgrade-architecture>. Fixes #7332.
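The resumable, on-disk stage tracking described above can be sketched roughly as follows. This is a Python illustration, not the PR's Rust code. The stage names are only the ones mentioned in this thread (the real script presumably has more), the use of JSON null for "nothing finished yet" is an assumption, and `run_once` and the handler convention are hypothetical.

```python
import json
from pathlib import Path

# Only the stages named in this thread; the real list is likely longer.
STAGES = ["PendingUpdates", "Reboot", "Done"]


def load_state(path):
    """Read the state file, or start fresh if it does not exist yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"finished": None}  # assumption: null means no stage finished


def save_state(path, state):
    """Write via a temporary file so an interrupted run cannot leave half a file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    tmp.replace(path)


def next_stage(state):
    """Return the stage after the last finished one, or None when all are done."""
    finished = state["finished"]
    if finished is None:
        return STAGES[0]
    idx = STAGES.index(finished) + 1
    return STAGES[idx] if idx < len(STAGES) else None


def run_once(path, handlers):
    """Run stages until one asks to stop (e.g. for a reboot) or none remain.

    Each handler returns True to continue in this run, or False to stop here.
    Returns the stage that requested the stop, or None if everything is done.
    """
    state = load_state(path)
    while (stage := next_stage(state)) is not None:
        keep_going = handlers[stage]()
        state["finished"] = stage
        save_state(path, state)  # record progress before moving on
        if not keep_going:
            return stage
    return None
```

Since progress is saved after every stage, a timer can call `run_once` repeatedly: each call picks up after the last recorded stage, and a call made after the final stage does nothing.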
Force-pushed from 6c8f3b4 to 36e9554.
I believe I've addressed everything except for the nit regarding imperative naming; I'm down to do it, just trying to address the more important stuff first.
Here's testing on production VMs: success! A few notes inline.
Preparation
- Build focal and noble packages off this PR (`UBUNTU_VERSION=focal make build-debs` and `UBUNTU_VERSION=noble make build-debs`).
  - Until "Publish noble OSSEC packages" (#7411) is resolved, you will also need to build noble ossec packages: `UBUNTU_VERSION=noble make build-debs-ossec`
- Copy the respective focal and noble folders to an accessible website
- Create a `dists` folder alongside focal and noble, then run `apt-ftparchive packages ./focal > dists/focal/main/binary-amd64/Packages` and `apt-ftparchive packages ./noble > dists/noble/main/binary-amd64/Packages` (you'll need to create the intermediate directories in dists/ manually)
  - As an example, you can look at how I've set up https://legoktm.com/apt/ - but don't use these packages, they're out of date and volatile!
- Set up a 2.11.1 staging/prod install (n.b. I've only tested this on physical hardware, nothing should stop it from working on a virtualized setup though)
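The repository layout step above can be scripted. The following is a rough Python sketch that creates the intermediate directories and then runs the same `apt-ftparchive packages` commands from the repository root. The function names are hypothetical, and it assumes the focal and noble folders already sit directly under that root.

```python
import subprocess
from pathlib import Path


def plan_packages_indexes(repo_root, suites=("focal", "noble")):
    """Create dists/<suite>/main/binary-amd64/ and return (argv, output path) pairs."""
    plan = []
    for suite in suites:
        index_dir = repo_root / "dists" / suite / "main" / "binary-amd64"
        index_dir.mkdir(parents=True, exist_ok=True)
        # Same command as in the test plan, minus the shell redirect.
        argv = ["apt-ftparchive", "packages", f"./{suite}"]
        plan.append((argv, index_dir / "Packages"))
    return plan


def write_packages_indexes(repo_root, suites=("focal", "noble")):
    """Run apt-ftparchive for each suite, writing its output to the Packages file."""
    for argv, out_path in plan_packages_indexes(repo_root, suites):
        with open(out_path, "wb") as out:
            # cwd is the repo root so the relative ./focal and ./noble paths resolve.
            subprocess.run(argv, cwd=repo_root, stdout=out, check=True)
```

Calling `write_packages_indexes(Path("/path/to/repo"))` requires `apt-ftparchive` to be installed; `plan_packages_indexes` on its own only prepares the directories.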
The next set of steps needs to be applied to both app and mon:
- Add your custom apt repository to the `/etc/apt/sources.list.d/apt_freedom_press.list` file with a line like `deb [trusted=yes] https://example.org/apt focal main`. The `[trusted=yes]` bypasses PGP signature checking so we don't need to also fiddle with signing the temporary packages and installing the keyring.
- Run `sudo apt-get update && sudo unattended-upgrade` to upgrade to the 2.12.0-rc0 packages.
Afterwards:
root@app-prod:/home/vagrant# apt-cache policy securedrop-app-code
securedrop-app-code:
Installed: 2.11.1+focal
Candidate: 2.12.0~rc1+focal
Version table:
2.12.0~rc1+focal 500
500 https://homes.cs.washington.edu/~cfmyers/securedrop focal/main amd64 Packages
*** 2.11.1+focal 500
500 https://apt.freedom.press/ focal/main amd64 Packages
100 /var/lib/dpkg/status
I had to use `apt upgrade`.
- Edit `/lib/systemd/system/securedrop-noble-migration-upgrade.service` to add the line `Environment=EXTRA_APT_SOURCE="deb [trusted=yes] https://example.org/apt noble main"` (note that this says noble and not focal! also keep an eye on the quotes)
- Reboot.
Upgrading
This should be done twice: once for app and then, once that's done, for mon.
- Verify `/etc/securedrop-noble-migration-state.json` was created with `{"finished": None, "bucket": 1-5}`.
- Open a background window that runs `journalctl -f`, primarily to follow along with the progress.
- Edit `/usr/share/securedrop/noble-upgrade.json` to set `app.enabled = true` and `app.bucket = 5` (or mon if that's what you're upgrading).
- Wait for the securedrop-noble-migration-upgrade systemd timer to start (no more than 3 minutes). You can also initiate it manually with `sudo systemctl start securedrop-noble-migration-upgrade`.
- The server should reboot (you'll need to restart your `journalctl -f` window). Once it comes back, `/etc/securedrop-noble-migration-state.json` should now have finished PendingUpdates.
- Wait again for the securedrop-noble-migration-upgrade systemd timer to start. You should see apt's progress in the journalctl window; it'll take a while depending on internet and hardware speed.
  NB. One needs to restart `journalctl -f` after systemd and journald restart with `Journal stopped`.
- If this is the app migration, you should be able to verify that the SI/JI are unreachable (apache is masked).
- Eventually it should reboot again. Once it's back, `/etc/securedrop-noble-migration-state.json` should be at Reboot.
- Wait for the securedrop-noble-migration-upgrade systemd timer to start again; once it does, it should pretty quickly reach the `Done` stage.
- `cat /etc/os-release` should output noble.
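The bucket and enabled settings used in the steps above suggest a staged-rollout gate. Here is a minimal Python sketch of one plausible rule: a host upgrades when its section is enabled and its own bucket number is no higher than the configured one. That comparison is an assumption inferred from setting `app.bucket = 5` against a state bucket of 1-5, not something confirmed in this thread. The nested config shape and the name `should_upgrade` are likewise hypothetical.

```python
def should_upgrade(config, state, host):
    """Decide whether this host may start the migration.

    config: parsed noble-upgrade.json, assumed shaped like
            {"app": {"enabled": true, "bucket": 5}, "mon": {...}}
    state:  parsed migration state file, assumed to hold this host's "bucket"
    host:   "app" or "mon"
    """
    host_cfg = config.get(host, {})
    if not host_cfg.get("enabled", False):
        return False
    # Assumed rule: raising the configured bucket admits more hosts over time.
    return state["bucket"] <= host_cfg.get("bucket", 0)


if __name__ == "__main__":
    config = {"app": {"enabled": True, "bucket": 5}, "mon": {"enabled": False, "bucket": 0}}
    print(should_upgrade(config, {"bucket": 3}, "app"))
    print(should_upgrade(config, {"bucket": 3}, "mon"))
```

Under this assumed rule, setting the bucket to 5 admits every host regardless of which bucket it was assigned, which would explain why the test plan uses that value.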
Verification
After the upgrade:
- Running `./securedrop-admin verify` should pass.
  This is still running (slowly, per #7428); I'll check up on it in the morning.
- Basic SI/JI functionality should work
- OSSEC notifications are coming through like expected
  ...which were suspended during the upgrade on `mon` but not on `app`, as expected.
Status
Ready for review
Description of Changes
The script is split into various stages where progress is tracked on-disk. The script is able to resume where it was at any point, and needs to, given multiple reboots in the middle.
The new noble-upgrade.json file shipped in the securedrop-config package is used to control the upgrade process.
Fixes #7332.