Migrate SecureDrop servers to Ubuntu 16.04 (Xenial) #3204
Comments
Config tests are passing against Xenial. Will run through a few repeated runs on both Trusty and Xenial to make sure there aren't any test flakes. On the subject of flakes, over in #3206 (comment) I noted:
Pleased to report that was a false alarm! The corrupted messages were caused by my running a separate
Hey, so this came up when chatting with @redshiftzero recently, and I figure I'll leave my thoughts here. We had a timeboxed upgrade attempt from Trusty to Xenial, but I think we should not do this on production boxes.

I have a few times upgraded laptops from one Debian version to the immediate next using Where things didn't work in cases where a piece of software was included as the default in version It's also possible to do a bunch of manual fiddling to get the system into the next state by installing, removing, updating alternatives, and updating the default configs, but this is a lot of work.

I hesitate to suggest we do this because one problem we have with SD is that right now it's treated like a package/service when in reality it's more an appliance. Which is to say, we don't have full control or even introspection into the base OS, so if things get a little out of whack, we just don't know.

I say we should tell admins that they need to do a full reinstall and give them one (two?) releases of SD in which to do that, during which we support both Trusty and Xenial. We already have the backup / restore script that does the magic of replacing the files, so this is not so burdensome.

My biggest concern isn't that we botch the Trusty to Xenial upgrade, and in fact I bet we could totally nail it. The concern is that it's complex and that knowledge will be lost quickly (people forget, and likely only 2-3 people on the team will really fully understand it), and then future changes will have to account for systems being either fresh Xenials or upgraded Xenials when considering all ops / app related problems.

Another advantage of just mandating that we do reinstalls is that it's less engineering effort on our side, and this deadline feels very close. Given this constraint, it seems like a safer choice even if we disregard the long term complexities I mentioned above.

Also, I'm acknowledging that I haven't been paying super close attention to this ticket/epic and we may be too far down the line to change this now. Or we may have addressed these concerns already, and what I'm saying here is out of date.
If we have to ask for a reinstall, this is a good time to think about other options on the server side too. For example, the read-only file system of Atomic (the PoC I did).
This might take development effort away, but then it puts that effort onto administrators and FPF through support (i.e. some core engineering staff would need to travel to assist with reinstalls). Many administrators already reinstalled fresh late last year, which significantly slowed development for about a month amid significant travel from the core team installing SecureDrops at major news organizations. In my opinion, we should not do reinstalls unless there is an unavoidable technical reason (we haven't come across one yet), as I don't want us to largely halt development in order to travel around and do these reinstalls. Unfortunately, if we don't assist people (ignoring the fact that we have support contracts with a bunch of news organizations) and we don't provide an upgrade path, it means that a significant fraction of instances could be on an EOL OS or stop using SecureDrop altogether, either of which is a bad outcome for sources in my opinion.
We need to complete the Xenial transition in the next few months. In SecureDrop's current state, I don't see how we could be ready to move to a read-only file system before Trusty reaches EOL. What do you think?
A reinstall and restore is definitely the easier target to hit for us, but I'm loath to recommend it as I feel it puts a lot of responsibility on admins, especially those who didn't install the system in the first place (because they inherited it or because FPF did it for them). There are a lot of steps where things can go wrong, and it would be hard for us to know how they went about doing the upgrade if things did go wrong. @heartsucker, given that we're talking about systems where we have a pretty good idea of their current state, how much of the upgrade complexity you mention could be identified ahead of time and reproduced in an Ansible playbook or similar?
I am not saying we should do this before moving to Xenial, but we should keep options open for the future. And a reinstall can be a part of that story (in the long term).
Also, I feel that no matter what, this won't be an unattended update. Workflow could go something like:
(Actually the os-update task should probably just do the backup automatically, or prompt them.)

(One thing that would be cool to have from the support perspective is a way to verify the OS in use by a given instance, so we could have a Nagios check for it like the one for the SD version. That could have security implications, though it's not as if it isn't common knowledge which OSes an SD instance is likely to be running.)
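For illustration, an OS check along those lines could look roughly like the sketch below. It is only a local check under stated assumptions: it reads `/etc/os-release` (present on both Trusty and Xenial) and maps the result to the usual Nagios plugin exit codes. The function names and the policy (Xenial is OK, Trusty is a warning, anything else is critical) are hypothetical, and how the result would reach a monitoring server without leaking information is not addressed here.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical policy for this sketch: Xenial is the target,
# Trusty is tolerated during the transition window.
EXPECTED = "16.04"
DEPRECATED = {"14.04"}


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def check_os(fields):
    """Return (exit_code, message) for the parsed os-release fields."""
    name = fields.get("ID")
    version = fields.get("VERSION_ID")
    if name is None or version is None:
        return UNKNOWN, "UNKNOWN - could not determine OS version"
    label = "{} {}".format(name, version)
    if name != "ubuntu":
        return CRITICAL, "CRITICAL - unexpected OS: " + label
    if version == EXPECTED:
        return OK, "OK - " + label
    if version in DEPRECATED:
        return WARNING, "WARNING - " + label + " should be upgraded"
    return CRITICAL, "CRITICAL - unexpected OS: " + label


def main(path="/etc/os-release"):
    """Read the os-release file, print a status line, return the exit code."""
    try:
        with open(path) as f:
            fields = parse_os_release(f.read())
    except OSError:
        print("UNKNOWN - cannot read " + path)
        return UNKNOWN
    code, message = check_os(fields)
    print(message)
    return code
```

To use it as a plugin, the script would end with `raise SystemExit(main())` so that the monitoring agent sees the exit code.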
@conorsch here's the playbook I mentioned earlier: https://www.jeffgeerling.com/blog/2018/ansible-playbook-upgrade-all-ubuntu-1204-lts-hosts-1404-or-1604-1804-etc
Tagging myself to get this on my radar.
I removed #3208 from this epic because it needs further discussion and does not need to be coupled to this issue. Closing this epic, as the other work has been completed.
Ubuntu Trusty is reaching EOL in April 2019. We should upgrade SecureDrop servers to Ubuntu Xenial (16.04) before then. This will also unblock some blocked issues.
(Please keep discussion about moving to 18.04 out of scope of this issue. We will consider the best path to 18.04, but will not immediately go from 14.04 to 18.04.)
Initially this epic captures preliminary work only; we will update it as we discover more work. The preliminary work must only impact the development environment and must not have production consequences.
Tasks:
- `Unknown status message: u'IMPORT_OK'` related test failures - [xenial] fix `Unknown status message: u'IMPORT_OK'` related test failures #4038
- `securedrop-admin logs` command ([xenial] Add package/version information to securedrop-admin logs command #3967)

In current sprint
Stretch goals